Lexical enrichment of philological textbooks: corpus and statistical approaches
Journal: Russian Language Studies (Vol. 22, No. 4)
Publication Date: 2025-02-22
Authors : Khalida Galimova; Ekaterina Martynova; Svetlana Moskvitcheva;
Page : 579-597
Keywords : lemma; frequency dictionary; frequency lists; Academic corpus of the Russian language; term; Philology; lexical coverage; lexical enrichment;
Abstract
The relevance of the study is determined by the need for objective data on vocabulary frequency in Russian language textbooks and on vocabulary acquisition in teaching Russian as a native language at school. The article describes the experience of creating a frequency dictionary of philological textbooks based on a linguistic corpus of textbooks on the Russian language and literature for grades 5-7. Philological textbooks present an average model of the Russian language and literature, reflecting topics relevant to the student and gradually increasing in lexical complexity. The aim of the article is to assess lexical enrichment in philological textbooks for grades 5-7 and to improve the methodology for compiling frequency lists. The study was carried out on a corpus of 66 textbooks on the Russian language and literature with a total size of 1,553,224 tokens. Methods of corpus and computational linguistics, together with comparative-contrastive and statistical methods (the IKSWEB program, the Google Colab environment, and the Pandas, NLTK and Pymorphy libraries), revealed that the frequency list of the 5th grade comprises 8984 lemmas; the 6th grade, 7572 lemmas; the 7th grade, 7321 lemmas. Vocabulary “enrichment” in the 6th grade consists of 258 lexemes, and in the 7th grade, 150 lexemes. The lexical core of the three frequency lists consists of words from the thematic groups “Philological terms”, “Verbs denoting educational actions”, “Nature”, “Family and friendly relations”, “Art”, and “Time”. The 6th grade vocabulary “enrichment” includes archaisms and historicisms, terms denoting forms of the national language, and word-formation terms. The 7th grade “enrichment” comprises linguistic terms on the theme “Names of verb forms”, vocabulary on the theme “Religion”, and socio-political vocabulary.
The frequency lists confirmed the hypothesis that texts in modern textbooks on the Russian language and literature are thematically balanced and that linguistic terminology forms the core of these textbooks. The prospects of the study lie in conducting similar research on educational texts in philology and other subjects from senior school textbooks in order to define intra- and meta-subject links.
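The abstract does not state how “enrichment” is computed. One plausible reading is that it is the set of lemmas in a grade's frequency list that do not occur in the lists of earlier grades. The sketch below illustrates that reading only, using the Python standard library. The function names, the `min_freq` threshold, and the sample sentences are hypothetical, and the default `lemmatize` merely lowercases tokens as a stand-in for a real morphological analyzer such as the Pymorphy library named above.

```python
import re
from collections import Counter


def frequency_list(text, lemmatize=str.lower):
    """Return a Counter mapping lemma -> frequency for a text.

    `lemmatize` is a placeholder (assumption): it only lowercases tokens.
    A real pipeline would substitute a morphological analyzer here.
    """
    # Runs of Unicode letters, so Cyrillic words are matched as tokens.
    tokens = re.findall(r"[^\W\d_]+", text)
    return Counter(lemmatize(token) for token in tokens)


def enrichment(current, previous, min_freq=1):
    """Lemmas of `current` absent from every list in `previous`.

    `min_freq` is a hypothetical cut-off for ignoring rare lemmas.
    """
    seen = set().union(*previous)  # all lemmas met in earlier grades
    return {
        lemma
        for lemma, count in current.items()
        if count >= min_freq and lemma not in seen
    }


# Toy stand-ins for the grade sub-corpora (invented, not from the study).
grade5 = frequency_list("Слово и текст. Слово и предложение.")
grade6 = frequency_list("Слово, диалект и текст.")
grade7 = frequency_list("Причастие, диалект и слово.")

new_in_6 = enrichment(grade6, [grade5])
new_in_7 = enrichment(grade7, [grade5, grade6])
```

With these toy inputs, `new_in_6` contains only "диалект" and `new_in_7` only "причастие". Passing every earlier grade in `previous` makes enrichment cumulative; comparing against the immediately preceding grade alone would be a different design choice and could give different counts.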