Performance Analysis of Multi-Node Hadoop Clusters using Amazon EC2 Instances

Journal: International Journal of Science and Research (IJSR) (Vol.4, No. 10)

Publication Date:

Authors : ; ;

Page : 1646-1650

Keywords : Cloud Computing; Hadoop; MapReduce; Multi-node cluster;

Abstract

Hadoop, an open-source implementation of the MapReduce model, is an effective tool for handling, processing, and analyzing the unstructured data generated by cloud applications. Hadoop assumes that the nodes in a cluster are homogeneous in processing capability. In real-world deployments, however, nodes in a cluster are often heterogeneous in processing capability, and in such cases Hadoop does not yield effective performance levels. In this paper, we evaluate and analyze the performance of the WordCount MapReduce application using Hadoop on Amazon EC2 with different Ubuntu instances. Performance is evaluated on both single-node and multi-node clusters; the multi-node clusters include both homogeneous and heterogeneous configurations. Performance is measured as the execution time of the application on different file sizes.
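
For readers unfamiliar with the benchmark, the following is a minimal in-process sketch of the WordCount map and reduce phases, with a wall-clock timer standing in for the execution-time metric. It is not the paper's code and does not use Hadoop or EC2; the function names (map_phase, reduce_phase, timed_word_count) and the whitespace tokenization with lowercasing are assumptions made for illustration only.

```python
import time
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    for line in lines:
        for word in line.split():
            # Lowercasing is an illustrative choice, not taken from the paper.
            yield word.lower(), 1


def reduce_phase(pairs):
    """Sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)


def timed_word_count(lines):
    """Return (counts, elapsed_seconds) for one in-process run."""
    start = time.perf_counter()
    counts = reduce_phase(map_phase(lines))
    return counts, time.perf_counter() - start


if __name__ == "__main__":
    sample = ["Hadoop runs MapReduce jobs", "MapReduce jobs count words"]
    counts, elapsed = timed_word_count(sample)
    print(counts, f"{elapsed:.6f} s")
```

On a real cluster the map and reduce steps run as distributed tasks across nodes, so a single-process timing like this one only illustrates what is being measured and says nothing about cluster behavior.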

Last modified: 2021-07-01 14:25:16