Enhanced Clustering-Based Topic Identification of Transcribed Arabic Broadcast News
Journal: The International Arab Journal of Information Technology (Vol. 14, No. 5)
Publication Date: 2017-09-01
Authors : Ahmed Jafar; Mohamed Fakhr; Mohamed Farouk
Page : 721-728
Keywords : Arabic speech transcription; topic clustering
Abstract
This research presents enhanced topic identification of transcribed Arabic broadcast news using clustering techniques. The enhancement includes applying a new stemming technique, “rule-based light stemming”, to balance the negative effects of the stemming errors associated with light stemming and root-based stemming. A new possibilistic-based clustering technique is also applied to evaluate the degree of membership that every transcribed document has with regard to every predefined topic, thereby detecting documents that cause topic confusion and negatively affect the accuracy of the topic-clustering process. The evaluation showed that rule-based light stemming combined with the spectral clustering technique achieved the highest accuracy, and that this accuracy increased further after the confusing documents were excluded.
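The idea of using per-topic membership degrees to flag confusing documents can be illustrated with a minimal sketch. It is not the authors' method: it assumes the standard possibilistic c-means typicality formula, toy two-dimensional document vectors, and a hypothetical `margin` rule that marks a document as confusing when its two highest memberships are nearly equal. The names `possibilistic_memberships`, `is_confusing`, `etas`, and `margin` are illustrative only.

```python
def possibilistic_memberships(doc, centroids, etas, m=2.0):
    """Typicality of one document vector with respect to each topic centroid.

    Uses the usual possibilistic c-means form (an assumption here):
    u_j = 1 / (1 + (d_j^2 / eta_j)^(1 / (m - 1))).
    Unlike fuzzy memberships, these values need not sum to 1.
    """
    memberships = []
    for centroid, eta in zip(centroids, etas):
        # Squared Euclidean distance between the document and the centroid.
        d2 = sum((x - c) ** 2 for x, c in zip(doc, centroid))
        memberships.append(1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0))))
    return memberships


def is_confusing(memberships, margin=0.2):
    """Hypothetical rule: the top two memberships differ by less than margin."""
    top, second = sorted(memberships, reverse=True)[:2]
    return (top - second) < margin


# Toy data: two topic centroids and two documents (assumed values).
centroids = [[1.0, 0.0], [0.0, 1.0]]
etas = [1.0, 1.0]            # per-topic scale parameters, assumed equal
clear_doc = [0.9, 0.1]       # close to the first topic only
mixed_doc = [0.5, 0.5]       # equally close to both topics

clear_u = possibilistic_memberships(clear_doc, centroids, etas)
mixed_u = possibilistic_memberships(mixed_doc, centroids, etas)

# Keep only documents that are not flagged as confusing.
kept = [d for d, u in [(clear_doc, clear_u), (mixed_doc, mixed_u)]
        if not is_confusing(u)]
```

In this toy setup the document lying midway between the two centroids is dropped before the final clustering pass, which mirrors the exclusion step described in the abstract. The threshold and scale parameters would need tuning on real data.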
Last modified: 2019-05-09 17:04:07