
Survey on Schedulers Optimization to Handle Multiple Jobs in Hadoop Cluster

Journal: International Journal of Science and Research (IJSR) (Vol.4, No. 3)

Publication Date:

Authors : ; ;

Page : 1179-1184

Keywords : Big Data; Hadoop; HDFS; MapReduce; Schedulers optimization;

Source : Download | Find it from : Google Scholar

Abstract

The Apache Hadoop project is a platform that supports cost-effective deployment on commodity hardware, combining a scalable storage infrastructure, the Hadoop Distributed File System (HDFS), with a parallel processing mechanism called MapReduce. Hadoop is well known for Big Data analytics, which requires substantial resources for collecting, storing, and processing petabytes of data and therefore demands meaningful resource management. Hadoop handles jobs in batch mode, and allocating and scheduling resources for these jobs is an important issue in terms of network bandwidth, CPU time, and memory. Resources are handled by MapReduce schedulers, which assign them in the form of MapReduce tasks. These tasks are evaluated by setting benchmarks for the individual Map, Combine, and Reduce phases and for different block sizes, which ensures the schedulers are optimized to achieve maximum efficiency in storage capacity, time, and cost when handling multiple batch jobs across clusters, while guaranteeing the effectiveness of each individual scheduler in terms of job execution.
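To make the block-size and scheduling knobs mentioned above concrete, the following is a minimal sketch (not taken from the paper) of a Hadoop MapReduce job driver in Java. The class name BlockSizeJob and the queue name are hypothetical; the configuration keys (dfs.blocksize, mapreduce.job.queuename) are standard Hadoop properties. The number of map tasks a scheduler must place is derived from the input splits, which by default follow the HDFS block size, so varying the block size changes the task mix the scheduler sees.

// Minimal sketch, assuming a standard Hadoop 2.x/3.x MapReduce setup.
// Default (identity) Mapper and Reducer are used to keep the example small;
// the paper's benchmarks exercise the full Map-Combine-Reduce pipeline.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BlockSizeJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Larger blocks -> fewer, longer map tasks; smaller blocks -> more,
        // shorter tasks. This is one of the knobs varied in such benchmarks.
        conf.setLong("dfs.blocksize", 256L * 1024 * 1024);   // 256 MB blocks
        conf.set("mapreduce.job.queuename", "default");       // queue the scheduler draws from

        Job job = Job.getInstance(conf, "block-size-benchmark");
        job.setJarByClass(BlockSizeJob.class);
        // Identity Mapper/Reducer pass (LongWritable, Text) records through unchanged.
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Which scheduler (FIFO, Fair, or Capacity) places these tasks is a cluster-level YARN setting rather than a per-job one, which is why scheduler choice is benchmarked separately from block size.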
