Comparison of Text Classification Models and Methods

Journal: RUDN Journal of Engineering Researches (Vol.26, No. 3)

Page : 298-309

Keywords : natural language processing; NLP; text preprocessing; text representation; machine learning; neural networks;

Abstract

The study examines the process of automatic text classification and its components. The topic is relevant because of the rapid growth in the volume of data and the development of machine learning technologies. The purpose of the study is to determine the best methods and models for automatic text classification. Scientific articles published over the past four years and most relevant to the topic were selected as material for analysis. The analysis indicates that effective preprocessing of text data should consist of normalization, tokenization, stop-word removal, and stemming or lemmatization. The BERT model is recommended for text representation. However, the choice should be guided by the conditions of the specific task, for which alternative approaches may be preferable. The most effective methods for the classification step itself are logistic regression, convolutional neural networks, and RoBERTa. The choice of a particular model is determined by the intended application and the available technological capabilities.
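The preprocessing sequence named in the abstract (normalization, tokenization, stop-word removal, stemming) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the suffix list, and every function name below are assumptions made for illustration, and the suffix-stripping stemmer is a toy stand-in for an established stemmer or lemmatizer.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in", "for"}

# Suffixes stripped by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def normalize(text):
    """Lowercase the text, replace non-alphanumeric characters with spaces,
    and collapse runs of whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split normalized text into tokens on whitespace."""
    return text.split()


def remove_stop_words(tokens):
    """Drop tokens that appear in the illustrative stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def toy_stem(token):
    """Strip the first matching suffix, keeping a stem of at least 3 characters.
    This is a crude stand-in, not the Porter algorithm."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run the four steps in the order given in the abstract."""
    tokens = tokenize(normalize(text))
    tokens = remove_stop_words(tokens)
    return [toy_stem(t) for t in tokens]


if __name__ == "__main__":
    print(preprocess("The models are classifying texts, and the results improved!"))
    # ['model', 'classify', 'text', 'result', 'improv']
```

The output tokens would then be passed to a text representation step and a classifier; the truncated stem "improv" shows why lemmatization may be preferred when readable word forms matter.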

Last modified: 2025-11-12 06:00:54