Morphological analyzer (morfoAnalyse) Python package for Turkic language
Journal: Science and Education (Vol. 3, No. 9)
Publication Date: 2022-09-25
Authors : Nilufar Abdurakhmonova; Ismailov Alisher Shakirovich; Kimsanboev Nodirbek Shoirbek o‘g‘li;
Page : 146-156
Keywords : python package; morphological analyzer; lexicon; Uzbek language; Natural Language processing; machine learning;
Abstract
The Turkic languages are agglutinative: words are derived from stems (roots) by concatenating affixes to them. This property produces a large number of morpheme combinations and greatly increases the vocabulary size. Therefore, words are split into sub-word units before being used in text and speech processing applications. Well-chosen sub-word units not only provide high coverage with a smaller lexicon, but also carry the semantic and syntactic information that downstream applications need. This paper discusses a morphological analyzer package for natural language processing and machine learning purposes. The package, named morphAnalyse, can split the words of a text into a sequence of morphemes. Morphological analysis is one of the main components of natural language processing. The morphAnalyse package can be installed through pip for use in any Python application, and it can be used for Turkic languages such as Uzbek, Turkish, Kazakh, and Tatar.
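To make the idea of splitting an agglutinated word into a stem and its affixes concrete, here is a minimal sketch in Python. It is not the morphAnalyse package or its API: the function name `segment`, the three-word stem lexicon, and the short suffix list are all hypothetical and chosen only for illustration with Uzbek examples. The sketch peels known suffixes off the right end of a word until a known stem remains.

```python
# Toy illustration only. STEMS, SUFFIXES, and segment() are hypothetical;
# they are NOT the morphAnalyse package's data or API.
STEMS = {"kitob", "maktab", "bola"}            # book, school, child
SUFFIXES = ["lar", "imiz", "ning", "da", "dan", "ga", "ni"]


def segment(word, stems=STEMS, suffixes=SUFFIXES):
    """Return [stem, suffix, ...] if the word is a known stem followed by
    known suffixes; return None if no such split exists."""
    word = word.lower()
    if word in stems:
        return [word]
    # Try to peel each known suffix off the right end, then recurse on
    # the remainder until a known stem is reached.
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) > len(suffix):
            head = segment(word[:-len(suffix)], stems, suffixes)
            if head is not None:
                return head + [suffix]
    return None


print(segment("kitoblarimizda"))  # ['kitob', 'lar', 'imiz', 'da']
print(segment("maktabdan"))       # ['maktab', 'dan']
print(segment("xyz"))             # None
```

In this toy setting, three stems and seven suffixes already cover many surface forms, which is the lexicon-size benefit the abstract describes. A real analyzer also has to handle phonological alternations and ambiguous splits, which this sketch ignores.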
Last modified: 2022-09-30 01:05:14