Enhanced Clustering-Based Topic Identification of Transcribed Arabic Broadcast News
Journal: The International Arab Journal of Information Technology (Vol. 14, No. 5)
Publication Date: 2017-09-01
Authors: Ahmed Jafar; Mohamed Fakhr; Mohamed Farouk
Pages: 721-728
Keywords: Arabic speech transcription; topic clustering
This research presents an enhanced approach to topic identification of transcribed Arabic broadcast news using clustering techniques. The enhancement includes applying a new stemming technique, "rule-based light stemming", to balance the negative effects of the stemming errors associated with light stemming and root-based stemming. A new possibilistic-based clustering technique is also applied to evaluate the degree of membership that every transcribed document has with regard to every predefined topic, thereby detecting documents that cause topic confusion and negatively affect the accuracy of the topic-clustering process. The evaluation showed that rule-based light stemming combined with spectral clustering achieved the highest accuracy, and that this accuracy increased further after the confusing documents were excluded.
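The idea of scoring each document's degree of membership in every topic, and then flagging documents that belong strongly to more than one topic, can be illustrated with a minimal sketch. The abstract does not specify the membership formula, so the sketch assumes the standard possibilistic c-means typicality function; the function names, the toy 2-D document vectors, the `eta` scale values, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def typicality(dist_sq, eta, m=2.0):
    """Possibilistic c-means typicality: 1 / (1 + (d^2 / eta)^(1 / (m - 1))).

    Unlike fuzzy memberships, these values are not forced to sum to 1
    across topics, so a document can score high (or low) for several topics.
    """
    return 1.0 / (1.0 + (dist_sq / eta) ** (1.0 / (m - 1.0)))


def memberships(docs, centroids, etas, m=2.0):
    """Return one row per document with its typicality for each topic."""
    return [
        [typicality(squared_distance(d, c), eta, m)
         for c, eta in zip(centroids, etas)]
        for d in docs
    ]


def confusing_documents(membership_rows, threshold=0.5):
    """Indices of documents whose typicality reaches the threshold
    for two or more topics (an assumed definition of 'confusing')."""
    return [
        i for i, row in enumerate(membership_rows)
        if sum(1 for t in row if t >= threshold) >= 2
    ]


# Toy data (hypothetical): two topic centroids and three documents.
centroids = [(0.0, 0.0), (4.0, 0.0)]
etas = [4.0, 4.0]  # assumed per-topic scale parameters
docs = [(0.0, 0.0), (4.0, 0.0), (2.0, 0.0)]

rows = memberships(docs, centroids, etas)
flagged = confusing_documents(rows)
kept = [d for i, d in enumerate(docs) if i not in flagged]
```

In this toy setup the third document sits midway between both centroids, so it is the one flagged and dropped from `kept`; in practice the document vectors would come from the stemmed transcripts, and the threshold would need tuning.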
Last modified: 2019-05-09 17:04:07