Morphological analyzer (morfoAnalyse) Python package for Turkic language

Journal: Science and Education (Vol.3, No. 9)

Page : 146-156

Keywords : Python package; morphological analyzer; lexicon; Uzbek language; natural language processing; machine learning

Abstract

The Turkic languages are agglutinative: words are derived from stems (roots) by concatenating affixes. This property produces a very large number of morpheme combinations and greatly increases the vocabulary size. Words are therefore split into sub-word units for use in text and speech processing applications. Well-chosen sub-word units not only give high coverage with a smaller lexicon, but also carry semantic and syntactic information needed by downstream applications. This paper presents a morphological analyzer package for natural language processing and machine learning. The package, named morphAnalyse, can split the words of a text into a sequence of morphemes. Morphological analysis is one of the main components of natural language processing. The morphAnalyse package can be installed through pip for use in any Python application, and it can be used for Turkic languages such as Uzbek, Turkish, Kazakh, and Tatar.
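
To make the idea of splitting a word into a stem and a chain of affixes concrete, the following is a minimal sketch of lexicon-based segmentation. It is not the morphAnalyse API, which the abstract does not describe. The function names (`analyse_word`, `analyse_text`, `split_suffixes`) and the tiny Uzbek stem and suffix lists are hypothetical and chosen only for illustration. A real analyzer would also need a full lexicon and rules for phonological alternations.

```python
from typing import List, Optional

# Hypothetical toy lexicon, for illustration only (not taken from the package).
STEMS = {"kitob", "maktab", "bola"}
SUFFIXES = {"lar", "imiz", "da", "dan", "ga", "ni", "ning"}


def split_suffixes(rest: str) -> Optional[List[str]]:
    """Segment `rest` into known suffixes, or return None if impossible."""
    if not rest:
        return []
    # Try longer suffixes first so that "dan" is preferred over "da".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if rest.startswith(suffix):
            tail = split_suffixes(rest[len(suffix):])
            if tail is not None:
                return [suffix] + tail
    return None


def analyse_word(word: str) -> List[str]:
    """Return [stem, suffix, ...]; an unanalysable word is returned whole."""
    word = word.lower()
    # Prefer the longest stem that leaves a fully segmentable remainder.
    for end in range(len(word), 0, -1):
        stem = word[:end]
        if stem in STEMS:
            suffixes = split_suffixes(word[end:])
            if suffixes is not None:
                return [stem] + suffixes
    return [word]


def analyse_text(text: str) -> List[List[str]]:
    """Analyse each whitespace-separated token of `text`."""
    return [analyse_word(token) for token in text.split()]


if __name__ == "__main__":
    # "kitoblarimizda" = kitob (book) + lar (plural) + imiz (our) + da (in)
    print(analyse_word("kitoblarimizda"))  # ['kitob', 'lar', 'imiz', 'da']
```

With this toy lexicon, four stored morphemes are enough to cover "kitoblarimizda" and many related forms. This is the smaller lexicon and higher coverage that the abstract attributes to sub-word units.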

Last modified: 2022-09-30 01:05:14