A SURVEY ON HADOOP TECHNOLOGY TO DEVELOP ETL FOR EFFICIENT DATAWAREHOUSE

Journal: International Journal of Information Technology and Management Information System (IJITMIS) (Vol. 6, No. 1)

Publication Date:

Authors : ; ;

Page : 11-17

Keywords : Datawarehouse; ETL; Hadoop; Hive; MapReduce;

Abstract

In recent years, data volumes have grown tremendously. In addition to traditional structured data, many other sources, e.g. log files, web data, stream data and sensor data, now need to be stored in the OLTP system. An organization cannot afford to neglect the valuable information these sources contain. ETL (extract-transform-load) tools extract meaningful information from various data sources, apply transformations to the data in the transformation phase, and then load the result into the data warehouse. Traditionally, commercial ETL tools, e.g. Informatica, MicroStrategy and Pentaho, have been used to transfer OLTP data to a separate database known as the data warehouse. MapReduce technology is becoming popular among ETL specialists, prompting organizations to look for ways to benefit from it. Hadoop, an open source MapReduce framework, can handle massive data volumes, provides cheap storage, processes structured as well as unstructured data, and scales massively. It can therefore be seen as a viable platform to which ETL jobs can be migrated. Although Hadoop is chiefly beneficial for large-scale industries, many small organizations with small amounts of data are also looking to leverage it for their business intelligence. In this paper, we explore the new opportunities for using Hadoop for business intelligence, specifically in the ETL phase of the data warehouse.
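
The extract, transform and load phases described in the abstract can be illustrated with a minimal sketch in plain Python that imitates the MapReduce style on a single machine. It is not taken from the paper and does not use Hadoop or Hive; the log format, the function names (`extract`, `map_phase`, `reduce_phase`, `load`, `run_etl`) and the sample data are all assumptions made for illustration, and a Python list stands in for the warehouse table.

```python
from collections import defaultdict

# Invented sample data: "timestamp status path" web log lines (assumed format).
SAMPLE_LOG = [
    "2015-01-01T10:00:00 200 /home",
    "2015-01-01T10:00:01 404 /missing",
    "malformed line",
    "2015-01-01T10:00:02 200 /home",
    "2015-01-01T10:00:03 200 /about",
]


def extract(lines):
    """Extract: split raw log lines into fields, skipping malformed ones."""
    for line in lines:
        parts = line.split()
        if len(parts) == 3:
            yield parts


def map_phase(records):
    """Transform (map step): emit a (path, 1) pair per successful request."""
    for _timestamp, status, path in records:
        if status == "200":
            yield path, 1


def reduce_phase(pairs):
    """Transform (reduce step): group pairs by key and sum their counts."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def load(totals, warehouse):
    """Load: append the aggregated rows to a list standing in for a table."""
    warehouse.extend(sorted(totals.items()))
    return warehouse


def run_etl(lines):
    """Run extract, map, reduce and load in sequence on an empty 'warehouse'."""
    return load(reduce_phase(map_phase(extract(lines))), [])


if __name__ == "__main__":
    print(run_etl(SAMPLE_LOG))
```

In a real Hadoop deployment the map and reduce steps would run in parallel across a cluster and the load step would write to a warehouse store; the sketch only shows how an ETL job decomposes into those steps.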

Last modified: 2015-06-17 16:29:39