Extensible framework for data cleaning

Helena Galhardas, Daniela Florescu, Dennis Shasha, Eric Simon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Data quality concerns arise when one wants to correct anomalies in a single data source, or when one wants to integrate data coming from multiple sources into a single new data source. The main quality problem is that the same real-world object is modeled by different data records. This is called the Object Identity Problem, and it may result from several factors. Software solutions called data cleaning tools are used to correct it. A new tool, called AJAX, is proposed whose main goal is to facilitate the specification and execution of data cleaning programs, either for a single source or for integrating multiple data sources.

Original language: English (US)
Title of host publication: Proceedings - International Conference on Data Engineering
Number of pages: 1
State: Published - 2000
Event: 2000 IEEE 16th International Conference on Data Engineering (ICDE'00) - San Diego, CA, USA
Duration: Feb 29 2000 to Mar 3 2000



ASJC Scopus subject areas

  • Software
  • General Engineering
  • Engineering (miscellaneous)

