
Enhancing MEDLINE Document Clustering using SSNCUT With MS and GC Constraints.

Journal: International Journal of Engineering Sciences & Research Technology (IJESRT) (Vol.3, No. 3)

Publication Date:

Authors : ; ; ; ;

Page : 1256-1262

Keywords : Biomedical text mining; document clustering; semi-supervised clustering


Abstract

Both global content information from the whole MEDLINE collection and MeSH semantic information are considered for clustering biomedical documents. Earlier approaches based on semi-supervised non-negative matrix factorization integrate additional information inefficiently, because their limited representation space makes it hard to combine different similarities. To overcome this limitation, semi-supervised normalized cut (SSNCut) and MPCKmeans algorithms are applied to these similarities under two types of constraints, must-link (ML) and cannot-link (CL). The performance of both algorithms is demonstrated on MEDLINE document clustering. One interesting finding is that ML constraints work more effectively than CL constraints. We evaluate the proposed method on benchmark datasets, and the results demonstrate consistent and substantial improvements over the current state of the art. Experimental results show that integrating the semantic and content similarities outperforms using only one of the two, and the difference is statistically significant. We further find a best parameter setting that is consistent across all experimental conditions. Finally, we show a typical example of the resulting clusters, confirming the effectiveness of our strategy in improving MEDLINE document clustering.
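The idea in the abstract of integrating content and semantic similarities under pairwise constraints can be illustrated with a small sketch. This is not the paper's SSNCut or MPCKmeans formulation. It is a simplified stand-in that assumes the two similarities are mixed by a weighted sum and that constraints are folded directly into the affinity matrix, and then scores a candidate partition with the standard normalized cut objective. The function names, the mixing weight `alpha`, the `reward` value, and the toy matrices are all hypothetical.

```python
def combine_similarities(content, semantic, alpha=0.5):
    """Weighted sum of two n-by-n similarity matrices (alpha is an assumed mixing weight)."""
    n = len(content)
    return [[alpha * content[i][j] + (1 - alpha) * semantic[i][j]
             for j in range(n)] for i in range(n)]


def apply_constraints(sim, must_link=(), cannot_link=(), reward=1.0):
    """Return a copy of sim with constraints folded in (an illustrative simplification).

    Must-link pairs are raised to at least `reward`; cannot-link pairs are set to zero.
    """
    w = [row[:] for row in sim]
    for i, j in must_link:
        w[i][j] = w[j][i] = max(w[i][j], reward)
    for i, j in cannot_link:
        w[i][j] = w[j][i] = 0.0
    return w


def normalized_cut(w, labels):
    """Sum over clusters A of cut(A, rest) / assoc(A, all nodes)."""
    n = len(w)
    total = 0.0
    for c in set(labels):
        cut = sum(w[i][j] for i in range(n) for j in range(n)
                  if labels[i] == c and labels[j] != c)
        assoc = sum(w[i][j] for i in range(n) for j in range(n)
                    if labels[i] == c)
        if assoc > 0:
            total += cut / assoc
    return total


# Toy data for four documents: made-up content and semantic similarities.
content = [[1.0, 0.8, 0.1, 0.2],
           [0.8, 1.0, 0.2, 0.1],
           [0.1, 0.2, 1.0, 0.7],
           [0.2, 0.1, 0.7, 1.0]]
semantic = [[1.0, 0.6, 0.3, 0.0],
            [0.6, 1.0, 0.0, 0.3],
            [0.3, 0.0, 1.0, 0.9],
            [0.0, 0.3, 0.9, 1.0]]

combined = combine_similarities(content, semantic, alpha=0.5)
w = apply_constraints(combined, must_link=[(0, 1)], cannot_link=[(0, 3)])

good = normalized_cut(w, [0, 0, 1, 1])  # respects both constraints
bad = normalized_cut(w, [0, 1, 0, 1])   # splits the must-link pair
print(good < bad)
```

A lower normalized cut value indicates a partition that separates weakly connected groups, so the partition that respects the must-link pair scores better here. A full method would search for the minimizing partition, typically through a spectral relaxation, rather than compare two hand-picked labelings.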

Last modified: 2014-05-22 18:06:49