Analysis of methods of determining the term weight at textual documents

Journal: Scientific review, Науковий огляд, Научное обозрение (Vol.3, No. 46)

Publication Date:

Authors : ;

Page : 112-123

Keywords : data mining; classification of textual information; content analysis; machine learning; classification algorithms;


This work is devoted to the development of methods for determining the weight of a document's terms during automatic classification of text information. The influence of reducing the dimensionality of a document's term set on the performance of a vector classifier is considered. The methods examined are TF-IDF, TF-SLF, pointwise mutual information, and conditional random fields. The purpose of this work is to improve the quality of text classification by selecting an appropriate method for determining term weights and combining it with the method used to train the classifier. A comparative analysis of the methods was performed on precision, recall, and F-measure. The considered methods are part of the solution to tasks such as determining the thematic category of texts, identifying the author of a document, determining the emotional tone of a document, spam filtering, etc.
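
As an illustration of the kind of term weighting and evaluation the abstract refers to, the following is a minimal Python sketch of TF-IDF weights and of precision, recall, and F-measure for one class. It is not the authors' implementation: the abstract does not state which TF-IDF variant was used, so the formula here (term count divided by document length, times log(N / df)) is an assumption, and the function names `tf_idf` and `precision_recall_f1` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    The paper may use a different normalization or smoothing.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def precision_recall_f1(true_labels, predicted_labels, positive):
    """Precision, recall, and F1 for the class `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Return 0.0 instead of dividing by zero when a class is never
    # predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Under this variant, a term that occurs in every document receives weight zero, which is one way low-information terms can be dropped to reduce the dimensionality of the term set before the weights are passed to a vector classifier.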

Last modified: 2018-07-03 21:19:10