
Hadoop Distributed File System and Map Reduce Processing on Multi-Node Cluster

Journal: International Journal of Science and Research (IJSR) (Vol.4, No. 8)

Publication Date:

Authors :

Page : 1424-1430

Keywords : Apache Hadoop; Hadoop Distributed File System; Map Reduce

Source : Download | Find it from : Google Scholar

Abstract

Big Data refers to large and growing volumes of data stored across multiple, autonomous sources. It is a collection of both structured and unstructured data that is too large, too fast, and too varied to be managed by traditional database management tools or traditional data processing applications. The most fundamental challenge for Big Data applications is to store these large volumes of data across multiple sources and to process them to extract useful information or knowledge for future actions. Apache Hadoop [1] is a framework that provides reliable shared storage and distributed processing of large data sets across clusters of commodity computers using a simple programming model. Storage is provided by the Hadoop Distributed File System (HDFS) [3], and processing by Map Reduce [2]. The main goal of the project is to implement the core components of Hadoop by designing a multi-node cluster, building a common HDFS base platform for storing huge volumes of data across multiple sources, and running the Map Reduce processing model on the data stored at these nodes.
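As an illustration of the Map Reduce model described in the abstract, the sketch below shows the classic word-count job written against the Hadoop Java API, reading input from and writing output to HDFS. This is a generic example for orientation, not code from the paper; the class names and the input/output paths are placeholders.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: runs on the nodes holding blocks of the input file,
  // emitting a (word, 1) pair for every token it sees.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reduce phase: receives all counts emitted for a given word
  // (shuffled across the cluster by key) and sums them.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation on each node
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

A job like this is typically packaged as a jar and submitted to the cluster with hadoop jar wordcount.jar WordCount /input /output, where both paths refer to directories in HDFS; the framework then schedules map tasks on the nodes that hold the input blocks.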
