Verb database: Structure, clusters and options
Journal: Russian Journal of Linguistics (Vol. 27, No. 4)
Publication Date: 2024-04-01
Authors : Nadezhda Buntman; Anna Borisova; Yulia Darovskikh;
Page : 981-1004
Keywords : supracorpora verb database; clusters; manual annotation; comparative analysis; translation variant;
Abstract
The content and volume of language corpora make it possible to obtain reliable information about the actual use of a particular linguistic unit. A large number of corpora now exist for different languages, and the technologies used to build them continue to improve. Nevertheless, some problems and limitations arise when these resources are used in comparative studies. Corpus users need to work with annotated data that has been tagged according to annotation protocols. The article presents the structure and functionality of the supracorpora verb database (SVD), developed on the basis of a parallel Russian-French subcorpus of the Russian National Corpus (RNC), and shows how the potential of the two resources differs. The database described is a pilot version of the final software, which is currently under development and testing. It consists of several clusters, each aimed at a linguistic task: studying the specifics of grammatical semantics and the distribution of verb forms in Russian and French, and identifying polysemantic structure in the two languages, which in turn verifies the understanding of the linguistic worldview of Russian and French speakers. The study finds that the way SVD clusters are formed makes it possible to examine both the individual characteristics of verbs and the semantics of verbal lexemes and collocations. Manual annotation enables users to identify systematic asymmetry of verb forms as well as cases of contextual and low-frequency asymmetry. SVD can thus be used in language pedagogy, in teaching and studying discursive grammar, and in analysing the variability of translation models.