Sindhi Stemmer using Affix Removal Method

Journal: International Journal of Advanced Trends in Computer Science and Engineering (IJATCSE) (Vol.10, No. 3)

Publication Date: 2021-06-11

Authors : Ambreen A. Sattar; Suhni Abbasi; Mutee U Rahman; Amber Baig; Masroor Nizamani

Page : 2447-2451

Keywords : Natural language processing; Information Retrieval; Sindhi; Lexicon; Yet another Suffix Stripper; Hidden Markov Model


Abstract

Stemming is the process of mapping the various inflections of a word to its base form. A stemmer is an essential component of Information Retrieval (IR) systems and of many Natural Language Processing pipelines. This research reports the development and evaluation of a stemmer for Sindhi, a resource-poor language. The stemmer uses a lexicon-based affix removal technique: the developed lexicon stores the base forms, and the algorithm consults this lexicon during the affix removal process. The overall accuracy is evaluated and the stemming error rate is calculated. The results show an overall accuracy of 89.57%.
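To make the idea of lexicon-based affix removal concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a simple strategy: strip a known prefix and/or suffix and accept the result only if it appears in the lexicon. The function names `stem` and `accuracy`, the toy English lexicon, and the affix lists are all hypothetical placeholders chosen for illustration; the paper's actual Sindhi lexicon and affix rules are not reproduced here.

```python
def stem(word, lexicon, prefixes, suffixes):
    """Return a base form of `word` found in `lexicon`, else `word` unchanged.

    Assumed strategy (illustrative only): try every combination of one
    known prefix and one known suffix, and keep results validated by the lexicon.
    """
    if word in lexicon:
        return word
    # "" stands for "strip nothing" on that side of the word.
    prefix_options = [""] + [p for p in prefixes if word.startswith(p)]
    suffix_options = [""] + [s for s in suffixes if word.endswith(s)]
    candidates = []
    for p in prefix_options:
        for s in suffix_options:
            if len(p) + len(s) >= len(word):
                continue  # stripping would leave nothing
            candidate = word[len(p):len(word) - len(s)]
            if candidate in lexicon:
                candidates.append(candidate)
    # Prefer the longest validated base form (least aggressive stripping).
    return max(candidates, key=len) if candidates else word


def accuracy(pairs, lexicon, prefixes, suffixes):
    """Percentage of (word, expected_stem) pairs that are stemmed correctly."""
    correct = sum(1 for word, expected in pairs
                  if stem(word, lexicon, prefixes, suffixes) == expected)
    return 100.0 * correct / len(pairs)


# Hypothetical toy data (English stand-ins, not taken from the paper).
LEXICON = {"kind", "play"}
PREFIXES = ["un", "re"]
SUFFIXES = ["ness", "ing", "ed", "s"]
PAIRS = [("plays", "play"), ("unkindness", "kind"),
         ("replayed", "play"), ("kindly", "kind")]

if __name__ == "__main__":
    acc = accuracy(PAIRS, LEXICON, PREFIXES, SUFFIXES)
    print(f"accuracy: {acc:.2f}%  error rate: {100.0 - acc:.2f}%")
```

Under this sketch the error rate is simply the complement of the accuracy on a test set of word and expected-stem pairs; the exact metric definitions used in the paper may differ.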
