A Hybrid Algorithm for Document Clustering Using Concept Factorization

Journal: International Journal of Engineering Sciences & Research Technology (IJESRT) (Vol.3, No. 7)

Publication Date:

Authors : ; ;

Page : 269-275

Keywords : http://www.ijesrt.com/issues%20pdf%20file/Archives-2014/July-2014/40.pdf;

Abstract

A massive amount of assorted information is available on the web, and clustering is one of the techniques for dealing with it. Clustering partitions a data set into groups such that the objects within each group closely resemble one another. Objects with a high resemblance measure should be placed in the same cluster (intra-cluster similarity), while the resemblance between objects of different clusters should be low (inter-cluster similarity). K-means, the most commonly used partitioning-based clustering algorithm, is well suited to large datasets. It is simple, straightforward, easy to implement, and works well in many applications, but it has the limitation of converging to a locally optimal solution. The Harmony Search Method (HSM) is a new meta-heuristic optimization method that imitates the music improvisation process, and it has been applied successfully to a wide variety of optimization problems. Better results can be obtained by hybridizing K-means with HSM. In conventional clustering methods, the Term Frequency and Inverse Document Frequency (TF-IDF) weight of each feature is calculated and the documents are clustered on that basis. In the proposed work, an effort has been made to apply the concept factorization method to the document clustering problem in order to find optimal clusters in a reasonable amount of time.
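
The conventional baseline mentioned in the abstract (TF-IDF weighting followed by K-means) and its sensitivity to the starting centroids can be sketched as follows. This is a minimal illustration only, not the paper's hybrid HSM or concept factorization method. The function names (`tfidf_vectors`, `sq_dist`, `kmeans`), the toy documents, and the choice to seed centroids from explicit document indices are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return unit-length TF-IDF vectors for a list of tokenised documents."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = [counts[t] / len(doc) * math.log(n_docs / df[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init, max_iter=100):
    """Lloyd's K-means started from the documents at the indices in `init`.

    Returns (labels, sse); sse is the within-cluster sum of squared errors.
    """
    centroids = [list(vectors[i]) for i in init]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing: a (possibly local) optimum
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(sq_dist(v, centroids[lab]) for v, lab in zip(vectors, labels))
    return labels, sse


# Toy corpus (hypothetical): two documents per topic.
docs = [
    "harmony search music improvisation".split(),
    "music improvisation harmony memory".split(),
    "document clustering term frequency".split(),
    "term frequency document weighting".split(),
]
vectors = tfidf_vectors(docs)

# One starting centroid per topic versus both starting centroids in one topic.
good_labels, good_sse = kmeans(vectors, init=[0, 2])
bad_labels, bad_sse = kmeans(vectors, init=[0, 1])
print(good_labels, round(good_sse, 3))
print(bad_labels, round(bad_sse, 3))
```

On this toy corpus the second start settles on a partition with a higher within-cluster error than the first. This is the local-optimum behaviour that motivates pairing K-means with a global search heuristic such as HSM.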

Last modified: 2014-08-04 18:22:40