MapReduce: Simplified Data Processing on Large Clusters
Journal: International Journal of Research and Engineering (Vol. 5, No. 5)
Publication Date: 2018-06-07
Authors: Muthu Dayalan
Pages: 399-403
Keywords: MapReduce; Large Cluster; Data Mining; Big Data; Reducers; Partitioners
Abstract
MapReduce is a data processing approach in which a single machine acts as the master, assigning map and reduce tasks to the other machines in the cluster. Technically, it can be considered a programming model for processing and generating large data sets. The key concept behind MapReduce is that the programmer states the problem in terms of two basic functions, map and reduce. Scalability is handled within the system rather than by the programmer. By placing restrictions on the programming style, MapReduce provides fault tolerance, locality optimization, load balancing, and massive parallelization. The map function generates intermediate key/value pairs, which are then fed to the reduce workers through the underlying file system. The reduce workers merge the values that share the same key and produce output files for the user (Dean & Ghemawat, 2008). The programmer is only required to understand and write code for these two functions.
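To make the two-function contract concrete, the following is a minimal single-process sketch of a word-count job in Python. The names (map_fn, reduce_fn, map_reduce) are illustrative and not taken from the paper; in a real deployment the master would partition the map and reduce phases across cluster workers rather than run them in one loop.

```python
from collections import defaultdict
from typing import Iterable, Iterator

# Hypothetical user-supplied functions for a word-count job, following the
# map/reduce contract described in the abstract.

def map_fn(document: str) -> Iterator[tuple[str, int]]:
    """Map: emit an intermediate (key, value) pair for each word occurrence."""
    for word in document.split():
        yield (word, 1)

def reduce_fn(key: str, values: Iterable[int]) -> int:
    """Reduce: merge all intermediate values that share the same key."""
    return sum(values)

def map_reduce(documents: list[str]) -> dict[str, int]:
    """Single-process sketch of the execution model (no real distribution)."""
    intermediate: dict[str, list[int]] = defaultdict(list)
    # Map phase: each input record produces intermediate key/value pairs.
    for doc in documents:
        for key, value in map_fn(doc):
            intermediate[key].append(value)
    # Shuffle/reduce phase: values are grouped by key and reduced.
    return {key: reduce_fn(key, values) for key, values in intermediate.items()}

if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "the fox"]
    print(map_reduce(docs))  # e.g. {'the': 3, 'quick': 1, 'fox': 2, ...}
```

In the distributed setting, the grouping step above corresponds to the shuffle performed by the file system between map workers and reduce workers; the user-visible interface remains just the two functions.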