Urban sound & sight : Dataset and benchmark for audio-visual urban scene understanding

Fuentes, Magdalena - Steers, Bea - Zinemanas, Pablo - Rocamora, Martín - Bondi, Luca - Wilkins, Julia - Shi, Qianyi - Hou, Yao - Das, Samarjit - Serra, Xavier - Bello, Juan Pablo

Abstract:

Automatic audio-visual urban traffic understanding is a growing area of research with many potential applications of value to industry, academia, and the public sector. Yet the lack of well-curated resources for training and evaluating models in this area hinders their development. To address this, we present a curated audio-visual dataset, Urban Sound & Sight (Urbansas), developed for investigating the detection and localization of sounding vehicles in the wild. Urbansas consists of 12 hours of unlabeled data along with 3 hours of manually annotated data, including bounding boxes with classes and unique IDs for vehicles, and strong audio labels that identify vehicle types and indicate off-screen sounds. We discuss the challenges presented by the dataset and how to use its annotations for the localization of vehicles in the wild through audio models.


Bibliographic Details
2022
Location awareness
Training
Industries
Annotations
Conferences
Signal processing
Benchmark testing
Audio-visual
Urban research
Traffic
Dataset
English
Universidad de la República
COLIBRI
https://ieeexplore.ieee.org/document/9747644
https://hdl.handle.net/20.500.12008/31397
Open access
Creative Commons Attribution - NonCommercial - NoDerivatives License (CC - By-NC-ND 4.0)
_version_ 1807522899299401728
author Fuentes, Magdalena
author2 Steers, Bea
Zinemanas, Pablo
Rocamora, Martín
Bondi, Luca
Wilkins, Julia
Shi, Qianyi
Hou, Yao
Das, Samarjit
Serra, Xavier
Bello, Juan Pablo
author2_role author
author
author
author
author
author
author
author
author
author
author_facet Fuentes, Magdalena
Steers, Bea
Zinemanas, Pablo
Rocamora, Martín
Bondi, Luca
Wilkins, Julia
Shi, Qianyi
Hou, Yao
Das, Samarjit
Serra, Xavier
Bello, Juan Pablo
author_role author
collection COLIBRI
dc.contributor.filiacion.none.fl_str_mv Fuentes Magdalena, New York University, New York, NY
Steers Bea, New York University, New York, NY
Zinemanas Pablo, Universitat Pompeu Fabra, Barcelona, Spain
Rocamora Martín, Universidad de la República (Uruguay). Facultad de Ingeniería.
Bondi Luca, Bosch Research, Pittsburgh, PA, USA
Wilkins Julia, New York University, New York, NY
Shi Qianyi, New York University, New York, NY
Hou Yao, New York University, New York, NY
Das Samarjit, Bosch Research, Pittsburgh, PA, USA
Serra Xavier, Universitat Pompeu Fabra, Barcelona, Spain
Bello Juan Pablo, New York University, New York, NY
dc.creator.none.fl_str_mv Fuentes, Magdalena
Steers, Bea
Zinemanas, Pablo
Rocamora, Martín
Bondi, Luca
Wilkins, Julia
Shi, Qianyi
Hou, Yao
Das, Samarjit
Serra, Xavier
Bello, Juan Pablo
dc.date.accessioned.none.fl_str_mv 2022-05-03T12:01:35Z
dc.date.available.none.fl_str_mv 2022-05-03T12:01:35Z
dc.date.issued.none.fl_str_mv 2022
dc.description.abstract.none.fl_txt_mv Automatic audio-visual urban traffic understanding is a growing area of research with many potential applications of value to industry, academia, and the public sector. Yet the lack of well-curated resources for training and evaluating models in this area hinders their development. To address this, we present a curated audio-visual dataset, Urban Sound & Sight (Urbansas), developed for investigating the detection and localization of sounding vehicles in the wild. Urbansas consists of 12 hours of unlabeled data along with 3 hours of manually annotated data, including bounding boxes with classes and unique IDs for vehicles, and strong audio labels that identify vehicle types and indicate off-screen sounds. We discuss the challenges presented by the dataset and how to use its annotations for the localization of vehicles in the wild through audio models.
dc.format.mimetype.es.fl_str_mv application/pdf
dc.identifier.citation.es.fl_str_mv Fuentes, M., Steers, B., Zinemanas, P. et al. Urban sound & sight : Dataset and benchmark for audio-visual urban scene understanding [online]. In: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23-27 May, pp. 141-145. Piscataway, NJ : IEEE, 2022. DOI 10.1109/ICASSP43922.2022.9747644
dc.identifier.doi.none.fl_str_mv 10.1109/ICASSP43922.2022.9747644
dc.identifier.uri.none.fl_str_mv https://ieeexplore.ieee.org/document/9747644
https://hdl.handle.net/20.500.12008/31397
dc.language.iso.none.fl_str_mv en
eng
dc.publisher.es.fl_str_mv IEEE
dc.relation.ispartof.es.fl_str_mv ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23-27 May 2022, pp. 141-145.
dc.rights.license.none.fl_str_mv Creative Commons Attribution - NonCommercial - NoDerivatives License (CC - By-NC-ND 4.0)
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
dc.source.none.fl_str_mv reponame:COLIBRI
instname:Universidad de la República
instacron:Universidad de la República
dc.subject.es.fl_str_mv Location awareness
Training
Industries
Annotations
Conferences
Signal processing
Benchmark testing
Audio-visual
Urban research
Traffic
Dataset
dc.title.none.fl_str_mv Urban sound & sight : Dataset and benchmark for audio-visual urban scene understanding
dc.type.es.fl_str_mv Conference paper
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
dc.type.version.none.fl_str_mv info:eu-repo/semantics/publishedVersion
eu_rights_str_mv openAccess
format conferenceObject
id COLIBRI_b40ec9819e3b28e2609b257d4e269a52
instacron_str Universidad de la República
institution Universidad de la República
instname_str Universidad de la República
language eng
language_invalid_str_mv en
network_acronym_str COLIBRI
network_name_str COLIBRI
oai_identifier_str oai:colibri.udelar.edu.uy:20.500.12008/31397
publishDate 2022
reponame_str COLIBRI
repository.mail.fl_str_mv mabel.seroubian@seciu.edu.uy
repository.name.fl_str_mv COLIBRI - Universidad de la República
repository_id_str 4771
rights_invalid_str_mv Creative Commons Attribution - NonCommercial - NoDerivatives License (CC - By-NC-ND 4.0)
status_str publishedVersion
title Urban sound & sight : Dataset and benchmark for audio-visual urban scene understanding
topic Location awareness
Training
Industries
Annotations
Conferences
Signal processing
Benchmark testing
Audio-visual
Urban research
Traffic
Dataset
url https://ieeexplore.ieee.org/document/9747644
https://hdl.handle.net/20.500.12008/31397