
Harvesting audio streaming content and making it accessible for research and academic purposes

Other authors: 
Thomas Drugeon; Félicien Vallet

Since the law on legal deposit was extended to broadcast material in 1992, Ina has been recording, around the clock, the programming of 20 national radio stations along with 100 TV channels. When another law applied the same legal framework to the web in 2006 and made Ina responsible for collecting online broadcast-related content, it became apparent that harvesting streaming radio would enrich and complement the terrestrial radio recordings.
Ina R&D teams have developed and implemented tools that automatically collect, organize and index online radio streaming content. In parallel, they have worked on speech and music detection in broadcast archives and have applied a speech/music discrimination tool to web audio content.
This presentation will outline the technical approach to collecting online streaming audio content, as well as the interface and tools that enable structured access to these collections.
The whole process is still experimental; it aims to provide researchers who access these collections with the most efficient and intuitive search and navigation affordances.