Text Clustering and Classification on the Use of Side Information
Journal: International Journal of Science and Research (IJSR) (Vol. 3, No. 10)
Publication Date: 2014-10-05
Authors : Shilpa S. Raut; V. B. Maral;
Page : 2135-2136
Keywords : clustering; classifiers information; text mining; text collection; clustering methods;
Abstract
In many text mining applications, side information is available along with the text documents. Examples include user-access behavior from web logs, links in the document, document provenance information, and other non-textual attributes embedded in the text document. These attributes can contain a large amount of information useful for clustering. However, their relative importance is difficult to estimate, particularly when some of the information is noisy. Incorporating side information into the mining process is therefore risky: it may improve the quality of the representation, or it may add noise to the process. A principled way of carrying out the mining process is needed, one that maximizes the advantages of using side information. This paper designs a clustering algorithm that combines classical partitioning algorithms with probabilistic models, and then shows how to extend the approach to the classification problem.
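The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a k-means-style partitioning step on the text content, followed by a probabilistic step that re-weights each content similarity by a smoothed estimate of P(cluster | side attribute) relative to the cluster prior. The function names, the seeding by the first k documents, the bag-of-words representation, and the `smooth` parameter are all hypothetical choices made for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Mean term-weight vector of a list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # adds term counts
    n = len(vectors)
    return {t: c / n for t, c in total.items()}


def cluster_with_side_info(texts, side_attrs, k, iters=10, smooth=1.0):
    """Hypothetical sketch: content-based partitioning refined by side attributes.

    texts      -- list of document strings
    side_attrs -- list of sets of side attributes, one set per document
    """
    docs = [Counter(t.lower().split()) for t in texts]
    cents = [dict(docs[i]) for i in range(k)]  # assumption: first k docs are seeds
    labels = None
    for _ in range(iters):
        # Content phase: similarity of every document to every centroid.
        content = [[cosine(d, c) for c in cents] for d in docs]
        content_labels = [max(range(k), key=row.__getitem__) for row in content]

        # Side-information phase: count how often each attribute co-occurs
        # with each content-based cluster.
        sizes = Counter(content_labels)
        attr_counts = {}
        for attrs, lab in zip(side_attrs, content_labels):
            for a in attrs:
                attr_counts.setdefault(a, Counter())[lab] += 1

        new_labels = []
        for row, attrs in zip(content, side_attrs):
            scores = []
            for c in range(k):
                prior = (sizes[c] + smooth) / (len(docs) + smooth * k)
                score = row[c] + 1e-9  # small offset so side info can break ties
                for a in attrs:
                    cnt = attr_counts[a]
                    post = (cnt[c] + smooth) / (sum(cnt.values()) + smooth * k)
                    score *= post / prior  # boost clusters the attribute favors
                scores.append(score)
            new_labels.append(max(range(k), key=scores.__getitem__))

        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        cents = [centroid([d for d, lab in zip(docs, labels) if lab == c])
                 for c in range(k)]
    return labels
```

In this sketch, a document whose text matches no cluster well can still be placed by its side attributes, while an attribute spread evenly across clusters yields a ratio near 1 and has little effect. That is one simple way to limit the influence of noisy side information.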