Enhanced Automatically Mining Facets for Queries and Clustering with Side Information Model

Journal: Bonfring International Journal of Software Engineering and Soft Computing (Vol.8, No. 2)

Publication Date:

Authors : ; ;

Page : 01-06

Keywords : Data Mining; Classification; TF-IDF; K-Means Clustering; Statistical Mean Validation

Abstract

This paper describes query facets, a specific type of summary that captures the main topics of a given text. Existing summarization algorithms are classified into different categories in terms of their summary construction methods (abstractive or extractive), the number of sources for the summary (single document or multiple documents), the types of information in the summary (indicative or informative), and the relationship between summary and query (generic or query-based). QD Miner aims to find the main points of multiple documents and thus save users the time of reading whole documents. The difference is that most existing summarization systems dedicate themselves to generating summaries using sentences extracted from documents. In addition, QD Miner returns multiple groups of semantically related items, whereas those systems return a flat list of sentences. In this paper, adding these lists may improve both the accuracy and recall of query facets. Part-of-speech information can be used to check the homogeneity of lists and improve the quality of query facets. Side information cannot simply be incorporated into the mining process, because it can either improve the quality of the representation for the mining process or add noise to it. Therefore, a principled way of performing the mining process is required, so as to maximize the advantages of using this side information. This paper proposes an algorithm which combines classical partitioning algorithms with probabilistic models in order to create an effective clustering approach.
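As a rough illustration of the classical partitioning step named in the abstract and keywords (TF-IDF weighting followed by k-means), the sketch below clusters a few toy documents. It is not the paper's algorithm: the probabilistic side-information component is not modelled, and the function names, the whitespace tokenizer, the `log(N/df)` weighting, and the choice of the first `k` vectors as initial centroids are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        length = max(len(toks), 1)  # avoid division by zero on empty documents
        vectors.append(
            [(counts[t] / length) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Plain k-means; returns one cluster label per input vector."""
    # Assumption: the first k vectors seed the centroids (deterministic).
    centroids = [list(v) for v in vectors[:k]]
    labels = [None] * len(vectors)
    for _ in range(max_iter):
        # Assignment step: attach each vector to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy documents (hypothetical data, not from the paper).
docs = [
    "apple banana fruit",
    "car engine wheel",
    "apple banana fruit juice",
    "car engine wheel road",
]
_, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
```

A side-information-aware variant would need some principled way of deciding how much the extra attributes influence each assignment, which is the gap the abstract says the proposed algorithm addresses.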

Last modified: 2018-10-27 15:46:41