Data freshness and data accuracy :a state of the art

Peralta Costabel, Veronika del Carmen

Resumen:

In a context of Data Integration Systems (DIS) providing access to large amounts of data extracted and integrated from autonomous data sources, users are highly concerned about data quality. Traditionally, data quality is characterized via multiple quality factors. Among the quality dimensions that have been proposed in the literature, this report analyzes two main ones: data freshness and data accuracy. Concretely, we analyze the various definitions of both quality dimensions, their underlying metrics and the features of DIS that impact their evaluation. We present a taxonomy of existing works proposed for dealing with both quality dimensions in several kinds of DIS and we discuss open research problems.


Detalles Bibliográficos
2006
Data freshness
Data accuracy
Data quality evaluation
Data integration systems
Universidad de la República
COLIBRI
http://hdl.handle.net/20.500.12008/3532
Acceso abierto
Licencia Creative Commons Atribución – No Comercial – Sin Derivadas (CC BY-NC-ND 4.0)
_version_ 1807522944297992192
author Peralta Costabel, Veronika del Carmen
author_facet Peralta Costabel, Veronika del Carmen
author_role author
bitstream.checksum.fl_str_mv 528b6a3c8c7d0c6e28129d576e989607
9833653f73f7853880c94a6fead477b1
4afdbb8c545fd630ea7db775da747b2f
9da0b6dfac957114c6a7714714b86306
4363be81deb28ba419588a47e23a8d8b
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
bitstream.url.fl_str_mv http://localhost:8080/xmlui/bitstream/20.500.12008/3532/5/license.txt
http://localhost:8080/xmlui/bitstream/20.500.12008/3532/2/license_text
http://localhost:8080/xmlui/bitstream/20.500.12008/3532/3/license_url
http://localhost:8080/xmlui/bitstream/20.500.12008/3532/4/license_rdf
http://localhost:8080/xmlui/bitstream/20.500.12008/3532/1/TR0613.pdf
collection COLIBRI
dc.creator.none.fl_str_mv Peralta Costabel, Veronika del Carmen
dc.date.accessioned.none.fl_str_mv 2014-12-02T16:07:39Z
dc.date.available.none.fl_str_mv 2014-12-02T16:07:39Z
dc.date.issued.es.fl_str_mv 2006
dc.date.submitted.es.fl_str_mv 20141202
dc.description.abstract.none.fl_txt_mv In a context of Data Integration Systems (DIS) providing access to large amounts of data extracted and integrated from autonomous data sources, users are highly concerned about data quality. Traditionally, data quality is characterized via multiple quality factors. Among the quality dimensions that have been proposed in the literature, this report analyzes two main ones: data freshness and data accuracy. Concretely, we analyze the various definitions of both quality dimensions, their underlying metrics and the features of DIS that impact their evaluation. We present a taxonomy of existing works proposed for dealing with both quality dimensions in several kinds of DIS and we discuss open research problems.
dc.format.extent.es.fl_str_mv 36 p.
dc.format.mimetype.es.fl_str_mv application/pdf
dc.identifier.citation.es.fl_str_mv PERALTA COSTABEL, V. "Data freshness and data accuracy : a state of the art". Reportes Técnicos 06-13. UR. FI – INCO, 2006.
dc.identifier.issn.es.fl_str_mv 0797-6410
dc.identifier.uri.none.fl_str_mv http://hdl.handle.net/20.500.12008/3532
dc.language.iso.none.fl_str_mv in
dc.publisher.es.fl_str_mv UR. FI – INCO.
dc.relation.ispartof.es.fl_str_mv Reportes Técnicos 06-13
dc.rights.license.none.fl_str_mv Licencia Creative Commons Atribución – No Comercial – Sin Derivadas (CC BY-NC-ND 4.0)
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
dc.source.none.fl_str_mv reponame:COLIBRI
instname:Universidad de la República
instacron:Universidad de la República
dc.subject.es.fl_str_mv Data freshness
Data accuracy
Data quality evaluation
Data integration systems
dc.title.none.fl_str_mv Data freshness and data accuracy :a state of the art
dc.type.es.fl_str_mv Reporte técnico
dc.type.none.fl_str_mv info:eu-repo/semantics/report
dc.type.version.none.fl_str_mv info:eu-repo/semantics/publishedVersion
description In a context of Data Integration Systems (DIS) providing access to large amounts of data extracted and integrated from autonomous data sources, users are highly concerned about data quality. Traditionally, data quality is characterized via multiple quality factors. Among the quality dimensions that have been proposed in the literature, this report analyzes two main ones: data freshness and data accuracy. Concretely, we analyze the various definitions of both quality dimensions, their underlying metrics and the features of DIS that impact their evaluation. We present a taxonomy of existing works proposed for dealing with both quality dimensions in several kinds of DIS and we discuss open research problems.
eu_rights_str_mv openAccess
format report
id COLIBRI_d1c641bb82f081aa16c99a2f6de4d330
identifier_str_mv PERALTA COSTABEL, V. "Data freshness and data accuracy : a state of the art". Reportes Técnicos 06-13. UR. FI – INCO, 2006.
0797-6410
instacron_str Universidad de la República
institution Universidad de la República
instname_str Universidad de la República
language_invalid_str_mv in
network_acronym_str COLIBRI
network_name_str COLIBRI
oai_identifier_str oai:colibri.udelar.edu.uy:20.500.12008/3532
publishDate 2006
reponame_str COLIBRI
repository.mail.fl_str_mv mabel.seroubian@seciu.edu.uy
repository.name.fl_str_mv COLIBRI - Universidad de la República
repository_id_str 4771
rights_invalid_str_mv Licencia Creative Commons Atribución – No Comercial – Sin Derivadas (CC BY-NC-ND 4.0)
spelling 2014-12-02T16:07:39Z2014-12-02T16:07:39Z200620141202PERALTA COSTABEL, V. "Data freshness and data accuracy : a state of the art". Reportes Técnicos 06-13. UR. FI – INCO, 2006.0797-6410http://hdl.handle.net/20.500.12008/3532In a context of Data Integration Systems (DIS) providing access to large amounts of data extracted and integrated from autonomous data sources, users are highly concerned about data quality. Traditionally, data quality is characterized via multiple quality factors. Among the quality dimensions that have been proposed in the literature, this report analyzes two main ones: data freshness and data accuracy. Concretely, we analyze the various definitions of both quality dimensions, their underlying metrics and the features of DIS that impact their evaluation. We present a taxonomy of existing works proposed for dealing with both quality dimensions in several kinds of DIS and we discuss open research problems.Made available in DSpace on 2014-12-02T16:07:39Z (GMT). No. of bitstreams: 5 TR0613.pdf: 577230 bytes, checksum: 4363be81deb28ba419588a47e23a8d8b (MD5) license_text: 21936 bytes, checksum: 9833653f73f7853880c94a6fead477b1 (MD5) license_url: 49 bytes, checksum: 4afdbb8c545fd630ea7db775da747b2f (MD5) license_rdf: 23148 bytes, checksum: 9da0b6dfac957114c6a7714714b86306 (MD5) license.txt: 4244 bytes, checksum: 528b6a3c8c7d0c6e28129d576e989607 (MD5) Previous issue date: 200636 p.application/pdfinUR. FI – INCO.Reportes Técnicos 06-13Las obras depositadas en el Repositorio se rigen por la Ordenanza de los Derechos de la Propiedad Intelectual de la Universidad De La República. (Res. Nº 91 de C.D.C. de 8/III/1994 – D.O. 7/IV/1994) y por la Ordenanza del Repositorio Abierto de la Universidad de la República (Res. Nº 16 de C.D.C. de 07/10/2014)info:eu-repo/semantics/openAccessLicencia Creative Commons Atribución – No Comercial – Sin Derivadas (CC BY-NC-ND 4.0)Data freshnessData accuracyData quality evaluationData integration systemsData freshness and data accuracy :a state of the artReporte técnicoinfo:eu-repo/semantics/reportinfo:eu-repo/semantics/publishedVersionreponame:COLIBRIinstname:Universidad de la Repúblicainstacron:Universidad de la RepúblicaPeralta Costabel, Veronika del CarmenLICENSElicense.txttext/plain4244http://localhost:8080/xmlui/bitstream/20.500.12008/3532/5/license.txt528b6a3c8c7d0c6e28129d576e989607MD55CC-LICENSElicense_textapplication/octet-stream21936http://localhost:8080/xmlui/bitstream/20.500.12008/3532/2/license_text9833653f73f7853880c94a6fead477b1MD52license_urlapplication/octet-stream49http://localhost:8080/xmlui/bitstream/20.500.12008/3532/3/license_url4afdbb8c545fd630ea7db775da747b2fMD53license_rdfapplication/octet-stream23148http://localhost:8080/xmlui/bitstream/20.500.12008/3532/4/license_rdf9da0b6dfac957114c6a7714714b86306MD54ORIGINALTR0613.pdfapplication/pdf577230http://localhost:8080/xmlui/bitstream/20.500.12008/3532/1/TR0613.pdf4363be81deb28ba419588a47e23a8d8bMD5120.500.12008/35322016-05-03 13:27:25.664oai:colibri.udelar.edu.uy:20.500.12008/3532VGVybWlub3MgeSBjb25kaWNpb25lcyByZWxhdGl2YXMgYWwgZGVwb3NpdG8gZGUgb2JyYXMNCg0KDQpMYXMgb2JyYXMgZGVwb3NpdGFkYXMgZW4gZWwgUmVwb3NpdG9yaW8gc2UgcmlnZW4gcG9yIGxhIE9yZGVuYW56YSBkZSBsb3MgRGVyZWNob3MgZGUgbGEgUHJvcGllZGFkIEludGVsZWN0dWFsICBkZSBsYSBVbml2ZXJzaWRhZCBEZSBMYSBSZXDvv71ibGljYS4gKFJlcy4gTu+/vSA5MSBkZSBDLkQuQy4gZGUgOC9JSUkvMTk5NCDvv70gRC5PLiA3L0lWLzE5OTQpIHkgIHBvciBsYSBPcmRlbmFuemEgZGVsIFJlcG9zaXRvcmlvIEFiaWVydG8gZGUgbGEgVW5pdmVyc2lkYWQgZGUgbGEgUmVw77+9YmxpY2EgKFJlcy4gTu+/vSAxNiBkZSBDLkQuQy4gZGUgMDcvMTAvMjAxNCkuIA0KDQpBY2VwdGFuZG8gZWwgYXV0b3IgZXN0b3MgdO+/vXJtaW5vcyB5IGNvbmRpY2lvbmVzIGRlIGRlcO+/vXNpdG8gZW4gQ09MSUJSSSwgbGEgVW5pdmVyc2lkYWQgZGUgUmVw77+9YmxpY2EgcHJvY2VkZXLvv70gYTogIA0KDQphKSBhcmNoaXZhciBt77+9cyBkZSB1bmEgY29waWEgZGUgbGEgb2JyYSBlbiBsb3Mgc2Vydmlkb3JlcyBkZSBsYSBVbml2ZXJzaWRhZCBhIGxvcyBlZmVjdG9zIGRlIGdhcmFudGl6YXIgYWNjZXNvLCBzZWd1cmlkYWQgeSBwcmVzZXJ2YWNp77+9bg0KYikgY29udmVydGlyIGxhIG9icmEgYSBvdHJvcyBmb3JtYXRvcyBzaSBmdWVyYSBuZWNlc2FyaW8gIHBhcmEgZmFjaWxpdGFyIHN1IHByZXNlcnZhY2nvv71uIHkgYWNjZXNpYmlsaWRhZCBzaW4gYWx0ZXJhciBzdSBjb250ZW5pZG8uDQpjKSByZWFsaXphciBsYSBjb211bmljYWNp77+9biBw77+9YmxpY2EgeSBkaXNwb25lciBlbCBhY2Nlc28gbGlicmUgeSBncmF0dWl0byBhIHRyYXbvv71zIGRlIEludGVybmV0IG1lZGlhbnRlIGxhIHB1YmxpY2Fjae+/vW4gZGUgbGEgb2JyYSBiYWpvIGxhIGxpY2VuY2lhIENyZWF0aXZlIENvbW1vbnMgc2VsZWNjaW9uYWRhIHBvciBlbCBwcm9waW8gYXV0b3IuDQoNCg0KRW4gY2FzbyBxdWUgZWwgYXV0b3IgaGF5YSBkaWZ1bmRpZG8geSBkYWRvIGEgcHVibGljaWRhZCBhIGxhIG9icmEgZW4gZm9ybWEgcHJldmlhLCAgcG9kcu+/vSBzb2xpY2l0YXIgdW4gcGVy77+9b2RvIGRlIGVtYmFyZ28gc29icmUgbGEgZGlzcG9uaWJpbGlkYWQgcO+/vWJsaWNhIGRlIGxhIG1pc21hLCBlbCBjdWFsIGNvbWVuemFy77+9IGEgcGFydGlyIGRlIGxhIGFjZXB0YWNp77+9biBkZSBlc3RlIGRvY3VtZW50byB5IGhhc3RhIGxhIGZlY2hhIHF1ZSBpbmRpcXVlIC4NCg0KRWwgYXV0b3IgYXNlZ3VyYSBxdWUgbGEgb2JyYSBubyBpbmZyaWdlIG5pbmfvv71uIGRlcmVjaG8gc29icmUgdGVyY2Vyb3MsIHlhIHNlYSBkZSBwcm9waWVkYWQgaW50ZWxlY3R1YWwgbyBjdWFscXVpZXIgb3Ryby4NCg0KRWwgYXV0b3IgZ2FyYW50aXphIHF1ZSBzaSBlbCBkb2N1bWVudG8gY29udGllbmUgbWF0ZXJpYWxlcyBkZSBsb3MgY3VhbGVzIG5vIHRpZW5lIGxvcyBkZXJlY2hvcyBkZSBhdXRvciwgIGhhIG9idGVuaWRvIGVsIHBlcm1pc28gZGVsIHByb3BpZXRhcmlvIGRlIGxvcyBkZXJlY2hvcyBkZSBhdXRvciwgeSBxdWUgZXNlIG1hdGVyaWFsIGN1eW9zIGRlcmVjaG9zIHNvbiBkZSB0ZXJjZXJvcyBlc3Tvv70gY2xhcmFtZW50ZSBpZGVudGlmaWNhZG8geSByZWNvbm9jaWRvIGVuIGVsIHRleHRvIG8gY29udGVuaWRvIGRlbCBkb2N1bWVudG8gZGVwb3NpdGFkbyBlbiBlbCBSZXBvc2l0b3Jpby4NCg0KRW4gb2JyYXMgZGUgYXV0b3Lvv71hIG3vv71sdGlwbGUgL3NlIHByZXN1bWUvIHF1ZSBlbCBhdXRvciBkZXBvc2l0YW50ZSBkZWNsYXJhIHF1ZSBoYSByZWNhYmFkbyBlbCBjb25zZW50aW1pZW50byBkZSB0b2RvcyBsb3MgYXV0b3JlcyBwYXJhIHB1YmxpY2FybGEgZW4gZWwgUmVwb3NpdG9yaW8sIHNpZW5kbyDvv71zdGUgZWwg77+9bmljbyByZXNwb25zYWJsZSBmcmVudGUgYSBjdWFscXVpZXIgdGlwbyBkZSByZWNsYW1hY2nvv71uIGRlIGxvcyBvdHJvcyBjb2F1dG9yZXMuDQoNCkVsIGF1dG9yIHNlcu+/vSByZXNwb25zYWJsZSBkZWwgY29udGVuaWRvIGRlIGxvcyBkb2N1bWVudG9zIHF1ZSBkZXBvc2l0YS4gTGEgVURFTEFSIG5vIHNlcu+/vSByZXNwb25zYWJsZSBwb3IgbGFzIGV2ZW50dWFsZXMgdmlvbGFjaW9uZXMgYWwgZGVyZWNobyBkZSBwcm9waWVkYWQgaW50ZWxlY3R1YWwgZW4gcXVlIHB1ZWRhIGluY3VycmlyIGVsIGF1dG9yLg0KDQpBbnRlIGN1YWxxdWllciBkZW51bmNpYSBkZSB2aW9sYWNp77+9biBkZSBkZXJlY2hvcyBkZSBwcm9waWVkYWQgaW50ZWxlY3R1YWwsIGxhIFVERUxBUiAgYWRvcHRhcu+/vSB0b2RhcyBsYXMgbWVkaWRhcyBuZWNlc2FyaWFzIHBhcmEgZXZpdGFyIGxhIGNvbnRpbnVhY2nvv71uIGRlIGRpY2hhIGluZnJhY2Np77+9biwgbGFzIHF1ZSBwb2Ry77+9biBpbmNsdWlyIGVsIHJldGlybyBkZWwgYWNjZXNvIGEgbG9zIGNvbnRlbmlkb3MgeS9vIG1ldGFkYXRvcyBkZWwgZG9jdW1lbnRvIHJlc3BlY3Rpdm8uDQoNCkxhIG9icmEgc2UgcG9uZHLvv70gYSBkaXNwb3NpY2nvv71uIGRlbCBw77+9YmxpY28gYSB0cmF277+9cyBkZSBsYXMgbGljZW5jaWFzIENyZWF0aXZlIENvbW1vbnMsIGVsIGF1dG9yIHBvZHLvv70gc2VsZWNjaW9uYXIgdW5hIGRlIGxhcyA2IGxpY2VuY2lhcyBkaXNwb25pYmxlczoNCg0KDQpBdHJpYnVjae+/vW4gKENDIC0gQnkpOiBQZXJtaXRlIHVzYXIgbGEgb2JyYSB5IGdlbmVyYXIgb2JyYXMgZGVyaXZhZGFzLCBpbmNsdXNvIGNvbiBmaW5lcyBjb21lcmNpYWxlcywgc2llbXByZSBxdWUgc2UgcmVjb25vemNhIGFsIGF1dG9yLg0KDQpBdHJpYnVjae+/vW4g77+9IENvbXBhcnRpciBJZ3VhbCAoQ0MgLSBCeS1TQSk6IFBlcm1pdGUgdXNhciBsYSBvYnJhIHkgZ2VuZXJhciBvYnJhcyBkZXJpdmFkYXMsIGluY2x1c28gY29uIGZpbmVzIGNvbWVyY2lhbGVzLCBwZXJvIGxhIGRpc3RyaWJ1Y2nvv71uIGRlIGxhcyBvYnJhcyBkZXJpdmFkYXMgZGViZSBoYWNlcnNlIG1lZGlhbnRlIHVuYSBsaWNlbmNpYSBpZO+/vW50aWNhIGEgbGEgZGUgbGEgb2JyYSBvcmlnaW5hbCwgcmVjb25vY2llbmRvIGEgbG9zIGF1dG9yZXMuDQoNCkF0cmlidWNp77+9biDvv70gTm8gQ29tZXJjaWFsIChDQyAtIEJ5LU5DKTogUGVybWl0ZSB1c2FyIGxhIG9icmEgeSBnZW5lcmFyIG9icmFzIGRlcml2YWRhcywgc2llbXByZSB5IGN1YW5kbyBlc29zIHVzb3Mgbm8gdGVuZ2FuIGZpbmVzIGNvbWVyY2lhbGVzLCByZWNvbm9jaWVuZG8gYWwgYXV0b3IuDQoNCkF0cmlidWNp77+9biDvv70gU2luIERlcml2YWRhcyAoQ0MgLSBCeS1ORCk6IFBlcm1pdGUgZWwgdXNvIGRlIGxhIG9icmEsIGluY2x1c28gY29uIGZpbmVzIGNvbWVyY2lhbGVzLCBwZXJvIG5vIHNlIHBlcm1pdGUgZ2VuZXJhciBvYnJhcyBkZXJpdmFkYXMsIGRlYmllbmRvIHJlY29ub2NlciBhbCBhdXRvci4NCg0KQXRyaWJ1Y2nvv71uIO+/vSBObyBDb21lcmNpYWwg77+9IENvbXBhcnRpciBJZ3VhbCAoQ0Mg77+9IEJ5LU5DLVNBKTogUGVybWl0ZSB1c2FyIGxhIG9icmEgeSBnZW5lcmFyIG9icmFzIGRlcml2YWRhcywgc2llbXByZSB5IGN1YW5kbyBlc29zIHVzb3Mgbm8gdGVuZ2FuIGZpbmVzIGNvbWVyY2lhbGVzIHkgbGEgZGlzdHJpYnVjae+/vW4gZGUgbGFzIG9icmFzIGRlcml2YWRhcyBzZSBoYWdhIG1lZGlhbnRlIGxpY2VuY2lhIGlk77+9bnRpY2EgYSBsYSBkZSBsYSBvYnJhIG9yaWdpbmFsLCByZWNvbm9jaWVuZG8gYSBsb3MgYXV0b3Jlcy4NCg0KQXRyaWJ1Y2nvv71uIO+/vSBObyBDb21lcmNpYWwg77+9IFNpbiBEZXJpdmFkYXMgKENDIC0gQnktTkMtTkQpOiBQZXJtaXRlIHVzYXIgbGEgb2JyYSwgcGVybyBubyBzZSBwZXJtaXRlIGdlbmVyYXIgb2JyYXMgZGVyaXZhZGFzIHkgbm8gc2UgcGVybWl0ZSB1c28gY29uIGZpbmVzIGNvbWVyY2lhbGVzLCBkZWJpZW5kbyByZWNvbm9jZXIgYWwgYXV0b3IuDQoNCkxvcyB1c29zIHByZXZpc3RvcyBlbiBsYXMgbGljZW5jaWFzIGluY2x1eWVuIGxhIGVuYWplbmFjae+/vW4sIHJlcHJvZHVjY2nvv71uLCBjb211bmljYWNp77+9biwgcHVibGljYWNp77+9biwgZGlzdHJpYnVjae+/vW4geSBwdWVzdGEgYSBkaXNwb3NpY2nvv71uIGRlbCBw77+9YmxpY28uIExhIGNyZWFjae+/vW4gZGUgb2JyYXMgZGVyaXZhZGFzIGluY2x1eWUgbGEgYWRhcHRhY2nvv71uLCB0cmFkdWNjae+/vW4geSBlbCByZW1peC4NCg0KQ3VhbmRvIHNlIHNlbGVjY2lvbmUgdW5hIGxpY2VuY2lhIHF1ZSBoYWJpbGl0ZSB1c29zIGNvbWVyY2lhbGVzLCBlbCBkZXDvv71zaXRvIGRlYmVy77+9IHNlciBhY29tcGHvv71hZG8gZGVsIGF2YWwgZGVsIGplcmFyY2Egbe+/vXhpbW8gZGVsIFNlcnZpY2lvIGNvcnJlc3BvbmRpZW50ZS4NCg0KDQoNCg0KDQoNCg0KDQo=Universidadhttps://udelar.edu.uy/https://www.colibri.udelar.edu.uy/oai/requestmabel.seroubian@seciu.edu.uyUruguayopendoar:47712024-07-25T14:33:59.833659COLIBRI - Universidad de la Repúblicafalse
spellingShingle Data freshness and data accuracy :a state of the art
Peralta Costabel, Veronika del Carmen
Data freshness
Data accuracy
Data quality evaluation
Data integration systems
status_str publishedVersion
title Data freshness and data accuracy :a state of the art
title_full Data freshness and data accuracy :a state of the art
title_fullStr Data freshness and data accuracy :a state of the art
title_full_unstemmed Data freshness and data accuracy :a state of the art
title_short Data freshness and data accuracy :a state of the art
title_sort Data freshness and data accuracy :a state of the art
topic Data freshness
Data accuracy
Data quality evaluation
Data integration systems
url http://hdl.handle.net/20.500.12008/3532