ResearchBib Share Your Research, Maximize Your Social Impacts
Sign for Notice Everyday Sign up >> Login

T5 language models for text simplification

Journal: Software & Systems (Vol.36, No. 2)

Publication Date:

Authors : ; ;

Page : 228-236

Keywords : T5 model; deep learning; text simplification; natural language processing;

Source : Downloadexternal Find it from : Google Scholarexternal

Abstract

The problem of text readability in natural Russian is relevant for people with various cognitive impairments and for people with poor language skills, such as labor migrants or children. Texts constantly surround us in real life, such as various instructions, directions, and recommendations. Increasing the availability of these texts for these categories of citizens is possible by using an automated text simplification algorithm. This article used deep neural architecture transformers as an automated simplification algorithm. The following language models were applied: ruT5-base-absum, ruT5-base-paraphraser, ruT5_base_sum_gazeta, ruT5-base. Experimental studies used two data sets: a data set from the Institute of Philology and Language Communication and data from the open Github repository. The following set of metrics was used to evaluate the models: BLEU, Flesh Readability Index, Automatic Readability Index, and Sentence Length Difference. Further, using a test data set, statistical indicators were extracted from the listed metrics, which became the basis for comparing algorithms with different training parameters. The authors carried out several experiments with these models that used different values of the learning rate parameter for each dataset, batch sizes, and the exclusion of an additional dataset from training. Despite the different metrics, the models outputs did not differ much from each other during manual comparison. The results of experimental studies show the need to increase the data set for model training, as well as the change in the parameters of model training, or the use other algorithms. This study is the first step towards creating a decision support system for automatic text simplification and requires further development.

Last modified: 2023-08-11 17:21:51