Audio source separation techniques including novel time-frequency representation tools

Cancela, Pablo

Supervisor(es): Sapiro, Guillermo - Randall, Gregory

Resumen:

The thesis explores the development of tools for audio representation with applications in Audio Source Separation and in the Music Information Retrieval (MIR) field. A novel constant Q transform was introduced, called IIR-CQT. The transform allows a flexible design and achieves low computational cost. Also, an independent development of the Fan Chirp Transform (FChT) with the focus on the representation of simultaneous sources is studied, which has several applications in the analysis of polyphonic music signals. Dierent applications are explored in the MIR field, some of them directly related with the low-level representation tools that were analyzed. One of these applications is the development of a visualization tool based in the FChT that proved to be useful for musicological analysis . The tool has been made available as an open source, freely available software. The proposed Transform has also been used to detect and track fundamental frequencies of harmonic sources in polyphonic music. Also, the information of the slope of the pitch was used to define a similarity measure between two harmonic components that are close in time. This measure helps to use clustering algorithms to track multiple sources in polyphonic music. Additionally, the FChT was used in the context of the Query by Humming application. One of the main limitations of such application is the construction of a search database. In this work, we propose an algorithm to automatically populate the database of an existing Query by Humming, with promising results. Finally, two audio source separation techniques are studied. The first one is the separation of harmonic signals based on the FChT. The second one is an application for which the fundamental frequency of the sources is assumed to be known (Score Informed Source Separation problem).


Detalles Bibliográficos
2015
Procesamiento de Señales
Inglés
Universidad de la República
COLIBRI
http://hdl.handle.net/20.500.12008/20197
Acceso abierto
Licencia Creative Commons Atribución – No Comercial – Sin Derivadas (CC - By-NC-ND)
Resumen:
Sumario:The thesis explores the development of tools for audio representation with applications in Audio Source Separation and in the Music Information Retrieval (MIR) field. A novel constant Q transform was introduced, called IIR-CQT. The transform allows a flexible design and achieves low computational cost. Also, an independent development of the Fan Chirp Transform (FChT) with the focus on the representation of simultaneous sources is studied, which has several applications in the analysis of polyphonic music signals. Dierent applications are explored in the MIR field, some of them directly related with the low-level representation tools that were analyzed. One of these applications is the development of a visualization tool based in the FChT that proved to be useful for musicological analysis . The tool has been made available as an open source, freely available software. The proposed Transform has also been used to detect and track fundamental frequencies of harmonic sources in polyphonic music. Also, the information of the slope of the pitch was used to define a similarity measure between two harmonic components that are close in time. This measure helps to use clustering algorithms to track multiple sources in polyphonic music. Additionally, the FChT was used in the context of the Query by Humming application. One of the main limitations of such application is the construction of a search database. In this work, we propose an algorithm to automatically populate the database of an existing Query by Humming, with promising results. Finally, two audio source separation techniques are studied. The first one is the separation of harmonic signals based on the FChT. The second one is an application for which the fundamental frequency of the sources is assumed to be known (Score Informed Source Separation problem).