Conference paper, published

Visual music transcription of clarinet video recordings trained with audio-based labelled data

Zinemanas, Pablo - Arias, Pablo - Haro, Gloria - Gomez, Emilia

Abstract:

Automatic transcription is a well-known task in the music information retrieval (MIR) domain, and consists of computing a symbolic music representation (e.g. MIDI) from an audio recording. In this work, we address the automatic transcription of video recordings when the audio modality is missing or is of insufficient quality, and therefore analyze the visual information. We focus on the clarinet, which is played by opening and closing a set of holes and keys. We propose a method for automatic visual note estimation that detects the fingertips of the player and measures their displacement with respect to the holes and keys of the clarinet. To this end, we track the clarinet and determine its position in every frame. The relative positions of the fingertips are used as features for a machine learning algorithm trained for note pitch classification. For that purpose, a dataset is built in a semiautomatic way by estimating pitch information from the audio signals of an existing collection of 4.5 hours of video recordings of six different songs performed by nine different players. Our results confirm that visual transcription is more difficult than audio transcription, mainly due to motion blur and occlusions that cannot be resolved from a single view.
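The two data-preparation ideas in the abstract (pitch labels derived from audio, and fingertip positions measured relative to the clarinet) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 2D image coordinates, the clarinet pose given as an origin plus an axis angle, and the hole positions expressed in the clarinet's own frame are all assumptions made for illustration. Only the Hz-to-MIDI conversion (A4 = 440 Hz = note 69) is a standard formula.

```python
import math


def hz_to_midi(f0_hz: float) -> int:
    """Round a fundamental frequency in Hz to the nearest MIDI note number.

    Standard equal-temperament mapping: A4 = 440 Hz = MIDI note 69.
    """
    return round(69 + 12 * math.log2(f0_hz / 440.0))


def relative_features(fingertips, hole_centers, origin, axis_angle):
    """Fingertip-to-hole displacements in a frame aligned with the clarinet.

    Hypothetical inputs (assumptions, not taken from the paper):
      fingertips   -- list of (x, y) image coordinates, one per finger
      hole_centers -- list of (x, y) hole positions in the clarinet frame
      origin       -- (x, y) image position of the clarinet reference point
      axis_angle   -- angle in radians of the clarinet axis in the image
    """
    # Rotating by -axis_angle maps the clarinet axis onto the +x axis.
    cos_a, sin_a = math.cos(-axis_angle), math.sin(-axis_angle)

    def to_clarinet_frame(point):
        dx, dy = point[0] - origin[0], point[1] - origin[1]
        return (dx * cos_a - dy * sin_a, dx * sin_a + dy * cos_a)

    features = []
    for tip, hole in zip(fingertips, hole_centers):
        tx, ty = to_clarinet_frame(tip)
        features.extend([tx - hole[0], ty - hole[1]])
    return features
```

A feature vector built this way for each frame, paired with the MIDI label obtained from the audio pitch estimate at the same instant, would form one training example for a generic pitch classifier.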

Bibliographic Details
Publication date:	2017
Topics:	Visualization; Kalman filters; Feature extraction; Instruments; Video recording; Signal processing
Language:	English
Institution:	Universidad de la República
Repository:	COLIBRI
Link(s):	https://hdl.handle.net/20.500.12008/43537
Access level:	Open access
License:	Creative Commons Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND 4.0)

Summary:	Paper presented at the International Conference on Computer Vision Workshops (ICCVW), Venice, Italy, 22-29 Oct. 2017.
