
A Review of Contemporary Data Quality Issues in Data Warehouse ETL Environment

Journal: Journal on Today's Ideas - Tomorrow's Technologies (Vol.2, No. 2)

Publication Date:

Authors : ; ;

Page : 153-160

Keywords : Data inconsistency; identification of errors; organization growth; ETL; data quality


Abstract

In today’s scenario, extraction-transformation-loading (ETL) tools have become important pieces of software responsible for integrating heterogeneous information from several sources. Carrying out the ETL process is potentially complex, hard, and time-consuming. Organisations nowadays are concerned about vast quantities of data. Data quality is concerned with technical issues in the data warehouse environment. Research in the last few decades has laid more stress on data quality issues in the data warehouse ETL process. Data quality can be ensured by cleaning the data prior to loading it into a warehouse. Since the data is collected from various sources, it comes in various formats. Standardizing formats and cleaning such data is therefore necessary for a clean data warehouse environment. Data quality attributes such as accuracy, correctness, consistency, and timeliness are required for a knowledge discovery process. The purpose of the present state-of-the-art research work is to address data quality issues at all stages of data warehousing, namely 1) data sources, 2) data integration, 3) data staging, and 4) data warehouse modelling and schematic design, and to formulate a descriptive classification of their causes. The discovered knowledge is used to repair the data deficiencies. This work proposes a framework for the quality of extraction, transformation, and loading of data into a warehouse.

Last modified: 2016-04-27 18:22:43