A Review of Big Data Clustering Methods and Research Issues

Journal: International Journal of Science and Research (IJSR) (Vol.9, No. 5)

Publication Date: 2020-05-05

Authors : Nweso Emmanuel Nwogbaga;

Page : 253-264

Keywords : Big Data; Big Data Mining; Clustering; IoT Big Data Clustering; Distance/Similarity Measures; Unsupervised Learning;

Abstract

Data mining is a method for knowledge discovery from a dataset. The world today is becoming data-driven in every sphere, including education, health care, security, customer management, and smart cities. Clustering, an unsupervised learning method, is the most used big data mining technique for grouping large datasets when there is no prior information about the classes in the dataset. The use of the Internet of Things (wearables, sensors, RFID) and social networks has drastically increased the volume of data in the cyber-physical world, resulting in what is called Big Data. The increase in big data as a result of cloud computing has led to a proliferation of research on knowledge discovery from this avalanche of data. Clustering is used to extract valuable hidden information from massive, complex data. As an unsupervised learning method, clustering has an advantage over supervised learning for knowledge discovery in a huge dataset when there is no prior knowledge of the groups. In this review, we discussed big data mining techniques and narrowed the focus to clustering methods. We also discussed different clustering approaches and the similarity measures used in clustering algorithms. Finally, we discussed the strengths and weaknesses of clustering approaches and the research issues in clustering big data for information discovery.
