Volume 5, Issue 10, 2015

August 11, 2018

Data Quality Based Data Integration Approach

Mohamed Samir Abdel-Moneim, College of Computing and Information Technology, Arab Academy for Science Technology & Maritime Transport, Egypt.
Ali Hamed El-Bastawissy, Faculty of Computer Science, MSA University, Egypt.
Mohamed Hamed Kholief, College of Computing and Information Technology, Arab Academy for Science Technology & Maritime Transport, Egypt.

Abstract—Data integration systems (DIS) answer queries by collecting results from a set of heterogeneous and autonomous data sources. A DIS can improve its results by assessing the quality of the data sources and retrieving answers only from the significant ones. Quality measures of the data in the sources not only help determine which sources are significant for a given query but also help the DIS produce results in a reasonable amount of time and with fewer errors. In this paper, we present an experiment demonstrating a mechanism that calculates and stores a set of quality measures for data sources. The quality measures are then used interactively to select the most significant candidate data sources for answering users’ queries. The approach is justified and evaluated using the Amalgam and THALIA benchmarks. We show that our approach dramatically improves query answers.

Keywords: data integration; quality measures; data sources; query answers; user preferences.
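
The idea summarized in the abstract, computing quality measures per data source ahead of time and then using them to pick the significant sources for a query, can be illustrated with a minimal sketch. The abstract does not say which quality measures the paper uses or how they are stored, so everything below is an assumption made for illustration: completeness (the fraction of non-missing values) stands in as the only measure, and the names `completeness`, `select_sources`, `threshold`, and the toy sources `src_a`, `src_b`, `src_c` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SourceQuality:
    """Stored quality score for one data source (hypothetical structure)."""
    name: str
    completeness: float  # fraction of non-missing cells, in [0, 1]


def completeness(rows, attributes):
    """Fraction of non-missing cells over the attributes a query needs."""
    total = len(rows) * len(attributes)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for row in rows
        for attr in attributes
        if row.get(attr) not in (None, "")
    )
    return filled / total


def select_sources(sources, attributes, threshold):
    """Return the names of sources meeting the threshold, best first.

    The threshold plays the role of a user preference (an assumption).
    """
    scored = [
        SourceQuality(name, completeness(rows, attributes))
        for name, rows in sources.items()
    ]
    kept = [s for s in scored if s.completeness >= threshold]
    kept.sort(key=lambda s: s.completeness, reverse=True)
    return [s.name for s in kept]


# Toy data sources, each a list of records (illustrative only).
sources = {
    "src_a": [{"title": "X", "year": 2001}, {"title": "Y", "year": None}],
    "src_b": [{"title": None, "year": None}, {"title": "Z", "year": None}],
    "src_c": [{"title": "X", "year": 2001}],
}

selected = select_sources(sources, ["title", "year"], threshold=0.5)
print(selected)
```

In this sketch the scores are recomputed on every call; a system along the lines the abstract describes would presumably calculate them once and store them, so that source selection at query time is a cheap lookup.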
