Sindhi Stemmer using Affix Removal Method
Journal: International Journal of Advanced Trends in Computer Science and Engineering (IJATCSE) (Vol. 10, No. 3)
Publication Date: 2021-06-11
Authors: Ambreen A. Sattar; Suhni Abbasi; Mutee U Rahman; Amber Baig; Masroor Nizamani
Pages: 2447-2451
Keywords: Natural language processing; Information Retrieval; Sindhi; Lexicon; Yet another Suffix Stripper; Hidden Markov Model
Abstract
Stemming is the process of mapping the various inflected forms of a word to its base form. A stemmer is an essential component of Information Retrieval (IR) systems and of many Natural Language Processing pipelines. This research reports the development and evaluation of a stemmer for Sindhi, a resource-poor language. The stemmer uses a lexicon-based affix removal technique. The developed lexicon holds the base forms, and the algorithm consults this lexicon during the affix removal process. The overall accuracy is evaluated and the stemming error rate is calculated. The results show an overall accuracy of 89.57%.
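To make the idea of lexicon-based affix removal concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the actual algorithm, affix lists, or lexicon, so everything here is an assumption: the function names `stem` and `evaluate`, the rule that a stripped candidate is accepted only when it appears in the lexicon, and the toy English lexicon and affixes used in place of Sindhi data.

```python
# Illustrative sketch of lexicon-based affix removal.
# All data below is hypothetical toy English data, not the paper's Sindhi lexicon.

LEXICON = {"play", "happy", "kind", "read"}   # assumed base forms
PREFIXES = ("un", "re")                       # assumed prefix list
SUFFIXES = ("ness", "ing", "ed", "s")         # assumed suffix list, longest first


def stem(word, lexicon=LEXICON, prefixes=PREFIXES, suffixes=SUFFIXES):
    """Return a base form found in the lexicon, or the word unchanged."""
    if word in lexicon:
        return word

    candidates = []
    # Candidates obtained by stripping a suffix only.
    for s in suffixes:
        if word.endswith(s) and len(word) > len(s):
            candidates.append(word[:-len(s)])
    # Candidates obtained by stripping a prefix, then optionally a suffix.
    for p in prefixes:
        if word.startswith(p) and len(word) > len(p):
            base = word[len(p):]
            candidates.append(base)
            for s in suffixes:
                if base.endswith(s) and len(base) > len(s):
                    candidates.append(base[:-len(s)])

    # Accept the first candidate that the lexicon confirms as a base form.
    for candidate in candidates:
        if candidate in lexicon:
            return candidate
    return word  # no lexicon match: leave the word as it is


def evaluate(pairs):
    """Return (accuracy, error_rate) over (word, expected_stem) pairs."""
    correct = sum(1 for word, expected in pairs if stem(word) == expected)
    accuracy = correct / len(pairs)
    return accuracy, 1.0 - accuracy


if __name__ == "__main__":
    gold = [("playing", "play"), ("unkindness", "kind"),
            ("reads", "read"), ("tables", "table")]
    for word, _ in gold:
        print(word, "->", stem(word))
    print(evaluate(gold))
```

In this sketch the lexicon acts as a guard against over-stemming: "reads" begins with the prefix "re", but the stripped form "ads" is rejected because it is not a listed base form, while "read" is accepted. The word "tables" is left unchanged because "table" is missing from the toy lexicon, which is the kind of case that would count toward a stemming error rate.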
Last modified: 2021-08-05 14:21:36