Exploring the Potential of Schemes in Building NLP Tools for Arabic Language
Journal: The International Arab Journal of Information Technology (Vol. 12, No. 6)
Publication Date: 2015-11-01
Authors: Mohamed Achraf Ben Mohamed; Souheyl Mallat; Mohamed Amine Nahdi; Mounir Zrigui
Pages: 566-572
Keywords: Arabic language; schemes; roots; derivation; text classification; PCFG; parsing
Abstract
Arabic is known for its sparseness, which explains the difficulty of processing it automatically. The Arabic language is based on schemes: lemmas are derived from roots by applying schemes. This characteristic presents two major advantages. First, this “hidden side” of the Arabic language, composed of schemes, suffers much less from sparseness because it forms a finite set. Second, schemes retain a large number of the language's features in a much smaller vocabulary. Schemes therefore have great potential for building accurate natural language processing (NLP) tools for Arabic. In this work we explore this potential by building NLP tools that rely entirely on schemes. The work covers text classification and Probabilistic Context-Free Grammar (PCFG) parsing.
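
As a rough illustration of the root-and-scheme derivation mentioned in the abstract, the sketch below fills a scheme's numbered slots with the consonants of a root, and conversely abstracts a word back to its scheme. It is not the authors' implementation. The function names, the digit-slot notation (1, 2, 3) and the ad hoc Latin transliteration (capital letters for long vowels) are all assumptions made for this example.

```python
def apply_scheme(root, scheme):
    """Fill the digit slots 1..n of a scheme with the root's consonants.

    Hypothetical notation: "1A2i3" means consonant 1, long a, consonant 2,
    short i, consonant 3.
    """
    return "".join(root[int(ch) - 1] if ch.isdigit() else ch for ch in scheme)


def extract_scheme(word, root):
    """Replace the root consonants, matched left to right, with slots 1..n.

    Simplifying assumption: greedy matching, so this is only reliable when
    no affix letter before a root consonant is identical to that consonant.
    """
    out, i = [], 0
    for ch in word:
        if i < len(root) and ch == root[i]:
            i += 1
            out.append(str(i))
        else:
            out.append(ch)
    return "".join(out)


# Derivation: one root, several schemes.
writer = apply_scheme("ktb", "1A2i3")    # kAtib, "writer"
written = apply_scheme("ktb", "ma12U3")  # maktUb, "written"

# Abstraction: several word forms collapse onto a few schemes.
pairs = [("kAtib", "ktb"), ("dAris", "drs"), ("maktUb", "ktb"), ("madrUs", "drs")]
schemes = {extract_scheme(word, root) for word, root in pairs}
```

Here four distinct word forms map onto only two schemes, which is the kind of vocabulary reduction the abstract attributes to working at the scheme level.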
Last modified: 2019-11-17 17:13:50