Survey Paper on Load Rebalancing for Distributed File Systems in Clouds

Journal: International Journal of Science and Research (IJSR) (Vol.3, No. 11)

Publication Date:

Authors : ; ;

Page : 2993-2996

Keywords : Load balance; Distributed File Systems; Cloud; Distributed Hash Table; MapReduce;

Abstract

Distributed file systems (DFS) are key building blocks for cloud computing applications based on the MapReduce programming paradigm. In a DFS, nodes simultaneously serve computing and storage functions; a file is partitioned into a number of chunks allocated to distinct nodes so that MapReduce tasks can be performed in parallel over the nodes. In a cloud, however, failure is the norm, and nodes may be upgraded, replaced, and added to the system. Files can also be dynamically created, deleted, and appended. This results in load imbalance in the DFS; that is, the file chunks are not distributed as uniformly as possible among the nodes. Emerging distributed file systems in production strongly depend on a central node for chunk reallocation. This dependence is clearly inadequate in a large-scale, failure-prone environment because the central load balancer is put under a considerable workload that scales linearly with the system size, and it may thus become both the performance bottleneck and the single point of failure. In this paper, a fully distributed load rebalancing algorithm is presented to cope with the load imbalance problem. Our algorithm is compared against a centralized approach in a production system and a competing distributed solution presented in the literature. The simulation results indicate that our proposal is comparable with the existing centralized approach and considerably outperforms the prior distributed algorithm in terms of load imbalance factor, movement cost, and algorithmic overhead.
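
To make the terms in the abstract concrete, the following minimal Python sketch shows what "load imbalance" and "movement cost" can mean for chunk counts per node. It is not the algorithm from the paper: it is a naive greedy rebalancer with a global view, closer in spirit to what a central balancer might do. The imbalance measure (heaviest node divided by mean load), the node names, and the function names `imbalance` and `rebalance` are all assumptions made for illustration.

```python
from statistics import mean


def imbalance(loads):
    """Heaviest node's chunk count divided by the mean chunk count.

    A value of 1.0 means perfectly even. This measure is an assumption
    for illustration, not the paper's load imbalance factor.
    """
    avg = mean(loads.values())
    return max(loads.values()) / avg if avg else 1.0


def rebalance(loads):
    """Greedily move one chunk at a time from the heaviest to the lightest node.

    Returns the new loads and the number of chunks moved, which stands in
    for movement cost. Requires a global view of all node loads.
    """
    loads = dict(loads)  # work on a copy so the input is left untouched
    moved = 0
    while True:
        heavy = max(loads, key=loads.get)
        light = min(loads, key=loads.get)
        if loads[heavy] - loads[light] <= 1:
            break  # no single move can even things out further
        loads[heavy] -= 1
        loads[light] += 1
        moved += 1
    return loads, moved


# Hypothetical cluster: chunk counts per node after files were added and deleted.
before = {"node-a": 9, "node-b": 1, "node-c": 2}
after, moved = rebalance(before)
print(imbalance(before), imbalance(after), moved)  # prints 2.25 1.0 5
```

Because every decision here reads the load of every node, the work done by this balancer grows with the number of nodes, which is the scaling concern the abstract raises about centralized designs.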

Last modified: 2021-06-30 21:12:54