Document Clustering Using Concept Weight
Journal: International Journal of Computer Science and Mobile Computing - IJCSMC (Vol.3, No. 5)
Publication Date: 2014-05-30
Authors : Sapna Gupta; Vikrant Chole;
Page : 1207-1210
Keywords : Document Clustering; semantic similarity; Ontology; Concept weight;
Traditional document clustering techniques are mostly based on the presence of keywords and the number of times they occur. Similarly, phrase-based clustering techniques ignore the semantics behind the words and capture only the order in which the words appear in a sentence. Term-frequency-based clustering techniques treat documents as a bag of words and ignore the semantic relationships between the words. To address these drawbacks, this paper proposes a concept-based clustering technique. It uses the Medical Subject Headings (MeSH) ontology to extract concepts, and it computes each concept's weight from its identity and its relationship with its synonyms. The K-medoid algorithm is then used to cluster the documents by semantic similarity, and the results are analyzed.
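The pipeline outlined in the abstract (map terms to ontology concepts, weight each concept, cluster with K-medoid) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the tiny TOY_ONTOLOGY dictionary stands in for a real MeSH lookup, the 1.0 / 0.8 weights for a preferred label versus a synonym are arbitrary, cosine distance is one plausible choice of semantic distance, and the farthest-first initialization is just a deterministic convenience.

```python
import math
import re

# Hypothetical stand-in for a MeSH lookup: surface term -> concept label.
TOY_ONTOLOGY = {
    "myocardial infarction": "Myocardial Infarction",
    "heart attack": "Myocardial Infarction",
    "cardiac infarction": "Myocardial Infarction",
    "hypertension": "Hypertension",
    "high blood pressure": "Hypertension",
}
IDENTITY_WEIGHT = 1.0  # assumed weight when the concept's own label appears
SYNONYM_WEIGHT = 0.8   # assumed weight when a synonym of the concept appears


def concept_vector(text):
    """Map a document to {concept: weight} using the toy ontology."""
    text = text.lower()
    vec = {}
    for term, concept in TOY_ONTOLOGY.items():
        hits = len(re.findall(r"\b" + re.escape(term) + r"\b", text))
        if hits:
            w = IDENTITY_WEIGHT if term == concept.lower() else SYNONYM_WEIGHT
            vec[concept] = vec.get(concept, 0.0) + w * hits
    return vec


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse concept vectors."""
    dot = sum(w * b.get(c, 0.0) for c, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return max(0.0, 1.0 - dot / (na * nb))


def k_medoids(dist, k, max_iter=100):
    """Simple alternating K-medoid on a precomputed distance matrix."""
    n = len(dist)
    medoids = [0]  # farthest-first initialization (an assumption)
    while len(medoids) < k:
        rest = [i for i in range(n) if i not in medoids]
        medoids.append(max(rest, key=lambda i: min(dist[i][m] for m in medoids)))
    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each document joins its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        # Update step: each cluster's medoid is its most central member.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels, medoids


docs = [
    "Patient admitted after a heart attack; prior myocardial infarction noted.",
    "Cardiac infarction risk rises after a heart attack.",
    "Hypertension, or high blood pressure, is treated with diet.",
    "High blood pressure screening for hypertension.",
]
vectors = [concept_vector(d) for d in docs]
dist = [[cosine_distance(a, b) for b in vectors] for a in vectors]
labels, medoids = k_medoids(dist, k=2)
print(labels, medoids)
```

Because the documents are compared through concept weights rather than raw words, "heart attack" and "cardiac infarction" fall on the same axis even though they share no keywords, which is the limitation of term-frequency methods that the abstract points to.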
Last modified: 2014-06-01 00:17:27