A Review of Big Data Clustering Methods and Research Issues

Journal: International Journal of Science and Research (IJSR) (Vol. 9, No. 5)

Page : 253-264

Keywords : Big Data; Big Data Mining; Clustering; IoT Big Data Clustering; Distance/Similarity Measures; Unsupervised Learning;

Abstract

Data mining is a method for discovering knowledge from a dataset. The world today is becoming data-driven in every domain, including education, health care, security, customer management, and smart cities. Unsupervised learning such as clustering is the most widely used big data mining technique for grouping large datasets when there is no prior information about the classes in the dataset. The use of the Internet of Things (wearables, sensors, RFID) and social networks has drastically increased the volume of data in the cyber-physical world, resulting in what is called Big Data. The growth of big data, driven in part by cloud computing, has spurred research on knowledge discovery from this avalanche of data. Clustering is used to extract valuable hidden information from massive, complex data. As an unsupervised method, clustering has an advantage over supervised learning for knowledge discovery in huge datasets where the groups are not known in advance. In this review, we discuss big data mining techniques and narrow the focus to clustering methods. We also discuss different clustering approaches and the similarity measures used in clustering algorithms. Finally, we discuss the strengths and weaknesses of clustering approaches and the research issues in clustering big data for information discovery.

Last modified: 2021-06-28 17:06:43