COMPARISON OF MICRO-BATCH AND STREAMING ENGINE ON REAL TIME DATA

Journal: International Journal of Engineering Sciences & Research Technology (IJESRT) (Vol.6, No. 4)

Page : 756-761

Abstract

Big Data analytics has recently gained popularity as a tool for processing large amounts of data on demand. Spark and Flink are two Apache-hosted data analytics frameworks that facilitate the development of multi-step data pipelines using directed acyclic graph patterns. Making the most of these frameworks is challenging, because efficient execution relies heavily on complex parameter configurations and on an in-depth understanding of the underlying architectural choices. Although extensive research has been devoted to improving and evaluating the performance of such analytics frameworks, most studies benchmark the platforms against Hadoop as a baseline, a rather unfair comparison considering the fundamentally different design principles. This paper aims to redress that imbalance by directly evaluating the performance of Spark and Flink against each other. Our goal is to identify and explain the impact of the different architectural choices and parameter configurations on the perceived end-to-end performance. To that end, we compare the performance of Flink and Spark Streaming using e-commerce data. Flink and Spark are both general-purpose data processing platforms and top-level projects of the Apache Software Foundation (ASF). They have a wide field of application and are usable for dozens of big data scenarios.

Last modified: 2017-05-01 21:16:21