
Dynamic Data Deduplication and Replication with HDFS: Using Big Data Flow Test to Analyze the Fuel Filter Element of an Aircraft

Journal: International Journal for Modern Trends in Science and Technology (IJMTST) (Vol.3, No. 10)

Publication Date:

Authors :

Page : 133-137

Keywords : IJMTST;


Abstract

Apache Hadoop has become very popular, and applications built on it are growing rapidly owing to its active ecosystem and rich feature set. It is one of the best-known platforms for distributed storage and processing of big data across clusters of computers. Its storage component, the Hadoop Distributed File System (HDFS), is reliable and highly available, but it applies a static replication strategy by default: each file is stored with a replication factor of three, with the copies placed on separate nodes. Although this static approach provides high-performance access, reliability, scalability and availability, it requires a large amount of storage space, since every file is copied three times across different nodes, and a more sophisticated framework is therefore needed. In this paper we propose an efficient dynamic data replication management system that considers the popularity of the files stored in HDFS before replicating them. The strategy dynamically classifies files as hot or cold data based on their popularity, increases the number of replicas for hot data, and applies erasure coding to cold data. Experimental results show that the proposed method reduces storage utilization by up to 50% without affecting availability or fault tolerance in HDFS; this dynamic, popularity-specific replication approach combined with the erasure-code mechanism thus improves both availability and reliability.
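The hot/cold classification idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the access-count threshold, the `HdfsFile` structure, and the Reed-Solomon RS(6,3) overhead figure are all assumptions chosen for the example.

```python
# Hypothetical sketch of popularity-based replication vs. erasure coding.
# Threshold, file metadata, and RS(6,3) overhead are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class HdfsFile:
    name: str
    size_mb: int
    access_count: int  # proxy for the file's popularity

HOT_THRESHOLD = 100        # assumed popularity cutoff (accesses)
REPLICATION_FACTOR = 3     # HDFS default replication factor
RS_OVERHEAD = 1.5          # e.g. RS(6,3): 9 blocks stored per 6 data blocks

def storage_cost(f: HdfsFile) -> float:
    """Storage footprint: hot files keep full replication,
    cold files are erasure-coded at lower overhead."""
    if f.access_count >= HOT_THRESHOLD:
        return f.size_mb * REPLICATION_FACTOR  # hot: replicate
    return f.size_mb * RS_OVERHEAD             # cold: erasure-code

files = [HdfsFile("sensor_logs.csv", 100, 500),  # hot: accessed often
         HdfsFile("old_archive.tar", 100, 2)]    # cold: rarely accessed

static_cost = sum(f.size_mb * REPLICATION_FACTOR for f in files)  # 600 MB
dynamic_cost = sum(storage_cost(f) for f in files)                # 450 MB
print(static_cost, dynamic_cost)
```

With one hot and one cold file of equal size, the dynamic scheme stores 450 MB instead of 600 MB; savings grow toward the paper's reported 50% as the share of cold data increases, since erasure coding cuts per-file overhead from 3x to roughly 1.5x.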

Last modified: 2017-10-31 23:33:31