AUTOMATIC DOCUMENT CLUSTERING
Journal: International Journal of Computer Engineering and Technology (IJCET) (Vol. 6, No. 5)
Publication Date: 2015-06-17
Authors: Mona Pardeshi; Neha Puranik; Aishwarya Tiwari; P.Y.Pawar
Pages: 8-12
Keywords: Document Clustering; Stemmer; Stop words removal; TF*IDF; Tokenization
Abstract
Automatic document clustering plays an important role in the field of information retrieval. The aim of the developed system is to store documents in clusters and to make their retrieval more efficient. Clustering is a technique for grouping a set of objects into clusters. Document clustering is the task of grouping a set of documents so that similar documents are stored in the same cluster. We apply a non-overlapping method, so each document is stored in exactly one cluster. In this project, we write an algorithm that calculates the similarity of a document's keywords to the existing clusters; depending on its similarity score, the document is either placed in an existing cluster or stored in a newly created one. Keywords are extracted from each document using tokenization, stop word removal, stemming, and TF*IDF calculation.
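The pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the crude suffix-stripping `stem` function (a stand-in for a real stemmer such as Porter's), the smoothed IDF formula, the use of cosine similarity, the running-sum centroid, and the `threshold` value of 0.2 are all assumptions chosen for illustration. The abstract specifies only that a document joins an existing cluster or starts a new one based on keyword similarity.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not the authors' list.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in", "on", "for", "with"}


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    """Crude suffix stripping; a placeholder for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def tfidf_vectors(docs_tokens):
    """Return one sparse {term: weight} TF*IDF vector per document."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    vectors = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        total = len(tokens) or 1
        # Assumption: smoothed IDF, so no term gets a zero weight.
        vectors.append(
            {t: (c / total) * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster(vectors, threshold=0.2):
    """Single-pass, non-overlapping clustering.

    Each document joins the most similar existing cluster if the similarity
    reaches the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each entry: {"centroid": sparse vector, "members": [doc indices]}
    for i, vec in enumerate(vectors):
        best, best_sim = None, 0.0
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            # Running sum as centroid; cosine similarity ignores its scale.
            for t, w in vec.items():
                best["centroid"][t] = best["centroid"].get(t, 0.0) + w
        else:
            clusters.append({"centroid": dict(vec), "members": [i]})
    return [c["members"] for c in clusters]
```

As a usage example, calling `cluster(tfidf_vectors([preprocess(d) for d in docs]))` on a list of raw document strings `docs` returns lists of document indices, one list per cluster. Because assignment happens in a single pass, the result can depend on the order in which documents arrive, and the threshold would need tuning for a real collection.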