Text Clustering and Classification on the Use of Side Information
Journal: International Journal of Science and Research (IJSR) (Vol.3, No. 10)
Publication Date: 2014-10-05
Authors : Shilpa S. Raut; V. B. Maral;
Page : 2135-2136
Keywords : clustering; classifiers information; text mining; text collection; clustering methods;
In many text mining applications, side information is available alongside the text documents. Examples include user-access behavior recorded in web logs, links in the document, document provenance information, and other non-textual attributes embedded in the document. These attributes can contain a vast amount of information that is useful for clustering. However, it is difficult to estimate their relative importance, particularly when some of the information is noisy. Incorporating side information into the mining process is therefore risky: it may improve the quality of the representation, or it may add noise to the process. A principled way of carrying out the mining process is needed, one that maximizes the advantages of using side information. This paper designs an effective clustering algorithm that combines classical partitioning algorithms with probabilistic models, and then shows how to extend the approach to the classification problem.
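The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a k-means-style partitioning step on term-frequency vectors (cosine similarity) and a simple smoothed estimate of P(cluster | side attribute) computed from the current assignment. The function names, the linear mixing parameter `weight`, and the toy data are all hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(docs):
    """Sum of member vectors (scale does not matter for cosine)."""
    c = defaultdict(float)
    for d in docs:
        for t, v in d.items():
            c[t] += v
    return dict(c)

def side_posterior(side, assign, k, alpha=1.0):
    """Laplace-smoothed P(cluster | attribute) from the current assignment."""
    counts = defaultdict(lambda: [alpha] * k)
    for attrs, c in zip(side, assign):
        for a in attrs:
            counts[a][c] += 1.0
    return {a: [x / sum(v) for x in v] for a, v in counts.items()}

def cluster(texts, side, seeds, iters=10, weight=0.5):
    """Alternate text-based partitioning with a probabilistic side-info score.

    weight is an assumed mixing factor: 0 ignores side information,
    1 ignores the text.
    """
    k = len(seeds)
    cents = [dict(texts[i]) for i in seeds]
    # Initial assignment uses text content only.
    assign = [max(range(k), key=lambda c: cosine(d, cents[c])) for d in texts]
    for _ in range(iters):
        post = side_posterior(side, assign, k)
        new = []
        for d, attrs in zip(texts, side):
            scores = []
            for c in range(k):
                s_text = cosine(d, cents[c])
                s_side = (sum(post[a][c] for a in attrs) / len(attrs)
                          if attrs else 1.0 / k)
                scores.append((1 - weight) * s_text + weight * s_side)
            new.append(max(range(k), key=scores.__getitem__))
        if new == assign:
            break
        assign = new
        for c in range(k):
            members = [d for d, a in zip(texts, assign) if a == c]
            if members:
                cents[c] = centroid(members)
    return assign

# Toy data: the last document shares no terms with either seed,
# so only its side attribute can place it.
texts = [
    {"goal": 2, "match": 1},
    {"match": 2, "team": 1},
    {"stock": 2, "market": 1},
    {"market": 2, "price": 1},
    {"report": 1},
]
side = [{"link:sports"}, {"link:sports"},
        {"link:finance"}, {"link:finance"}, {"link:finance"}]

with_side = cluster(texts, side, seeds=[0, 2])
text_only = cluster(texts, side, seeds=[0, 2], weight=0.0)
print(with_side, text_only)
```

In this toy run the two variants differ only in where they place the last document, which illustrates the trade-off described in the abstract: side information can resolve cases the text cannot, but a noisy attribute would pull documents toward the wrong cluster just as easily, which is why the mixing weight matters.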
Last modified: 2021-06-30 21:10:56