
Execution of an Advanced Data Analytics by Integrating Spark with MongoDB

Proceeding: The International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA2016)

Publication Date:

Authors : ; ; ;

Page : 39-48

Keywords : Spark SQL; MongoDB; NoSQL Databases; MongoDB Connector for Apache Spark; Data Analytics with Spark SQL and MongoDB; Data Mining on NoSQL Data;


Abstract

Spark has several advantages over other big data and MapReduce technologies such as Hadoop and Storm. It provides a comprehensive, unified framework for big data processing across data sets that are diverse both in nature (text data, graph data, etc.) and in source (batch vs. real-time streaming data). Spark SQL is an easy-to-use and powerful API provided by Apache Spark that makes it much easier to read and write data for analysis. The MongoDB Connector for Apache Spark is a powerful integration that enables developers and data scientists to create new insights and drive real-time action on live, operational, and streaming data. This paper reports experiments with the MongoDB Connector for Apache Spark that show how the Spark SQL library can be used to store, retrieve, and query structured and semi-structured datasets, such as BSON documents, in MongoDB, an open-source, leading NoSQL (non-relational) database.
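The store/retrieve/query workflow the abstract describes might look roughly like the following PySpark sketch. It is not taken from the paper: the helper names (`read_collection`, `query_collection`, `write_collection`) are hypothetical, and it assumes a 2.x-era MongoDB Connector for Apache Spark, where the data source is addressed as `com.mongodb.spark.sql.DefaultSource` and the target collection is given through a `uri` option. Later connector releases use different format and option names.

```python
# Sketch only: assumes the 2.x-era MongoDB Connector for Apache Spark is on the
# Spark classpath and that a MongoDB instance is reachable at the given URI.
MONGO_FORMAT = "com.mongodb.spark.sql.DefaultSource"  # data source name in connector 2.x


def read_collection(spark, uri):
    """Load a MongoDB collection as a DataFrame through the connector."""
    return spark.read.format(MONGO_FORMAT).option("uri", uri).load()


def query_collection(spark, uri, view_name, sql):
    """Register the collection as a temporary view and run a Spark SQL query on it."""
    df = read_collection(spark, uri)
    df.createOrReplaceTempView(view_name)
    return spark.sql(sql)


def write_collection(df, uri, mode="append"):
    """Store a DataFrame's rows as documents in the collection named in the URI."""
    df.write.format(MONGO_FORMAT).mode(mode).option("uri", uri).save()
```

With a real `SparkSession` named `spark`, a call such as `query_collection(spark, "mongodb://127.0.0.1/test.people", "people", "SELECT name FROM people WHERE age > 30")` would return a DataFrame of matching documents. The URI, database, collection, and field names here are placeholders, not values from the paper.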

Last modified: 2016-09-21 00:18:55