Universal Dependencies for Urdu Noisy Text
Journal: International Journal of Advanced Trends in Computer Science and Engineering (IJATCSE) (Vol. 10, No. 3)
Publication Date: 2021-06-11
Authors : Amber Baig, Mutee U Rahman, Abdul Salam Shah, Suhni Abbasi
Page : 1751-1757
Abstract
In this paper, the process of creating a dependency treebank for tweets in Urdu, a morphologically rich and less-resourced language, is described. The treebank of 500 Urdu tweets is created by manually annotating it with lemmas, POS tags, and morphological and syntactic relations using the Universal Dependencies annotation scheme, adapted to the peculiarities of Urdu social media text. The annotation process is evaluated through inter-annotator agreement for dependency relations; a total agreement of 94.5% and a resultant weighted Kappa of 0.876 were observed. The treebank is evaluated through 10-fold cross-validation using MaltParser with various feature settings. Results show an average UAS of 74%, LAS of 62.9%, and LA of 69.8%.
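The three parser scores reported in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions: UAS counts tokens whose predicted head is correct, LAS counts tokens whose head and relation label are both correct, and LA counts tokens whose label is correct regardless of head. The function name `attachment_scores` and the token representation are hypothetical, chosen only for illustration; the paper's actual evaluation setup (for example, how punctuation is handled) may differ.

```python
from typing import List, Tuple

# Each token is (head_index, dependency_label); this layout is an
# illustrative assumption, not the treebank's actual file format.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float, float]:
    """Return (UAS, LAS, LA) as percentages over aligned tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must be the same length")
    if not gold:
        raise ValueError("no tokens to score")
    n = len(gold)
    # UAS: head matches, label ignored.
    head_ok = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LA: label matches, head ignored.
    label_ok = sum(g[1] == p[1] for g, p in zip(gold, pred))
    # LAS: both head and label match.
    both_ok = sum(g == p for g, p in zip(gold, pred))
    return 100.0 * head_ok / n, 100.0 * both_ok / n, 100.0 * label_ok / n


# Toy four-token sentence (made-up data, not from the treebank).
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "punct")]
uas, las, la = attachment_scores(gold, pred)
print(f"UAS={uas:.1f} LAS={las:.1f} LA={la:.1f}")
```

In a 10-fold cross-validation such as the one described, scores like these would be computed per fold and then averaged.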