Enhanced Clustering-Based Topic Identification of Transcribed Arabic Broadcast News

Journal: The International Arab Journal of Information Technology (Vol.14, No. 5)

Publication Date:

Authors : ; ; ;

Page : 721-728

Keywords : Arabic speech transcription; topic clustering

Abstract

This research presents enhanced topic identification of transcribed Arabic broadcast news using clustering techniques. The enhancement includes applying a new stemming technique, “rule-based light stemming”, to balance the negative effects of the stemming errors associated with light stemming and root-based stemming. A new possibilistic-based clustering technique is also applied to evaluate the degree of membership that every transcribed document has with regard to every predefined topic, thereby detecting documents that cause topic confusion and reduce the accuracy of the topic-clustering process. The evaluation showed that using rule-based light stemming in combination with a spectral clustering technique achieved the highest accuracy, and that this accuracy increased further after excluding the confusing documents.
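
The idea of flagging confusing documents through possibilistic membership degrees can be illustrated with a small sketch. The abstract does not give the paper's actual formulation, so everything below is an assumption: the membership uses the standard possibilistic c-means typicality formula, documents and topic centroids are plain term-weight vectors, and the names `typicality` and `confusing_documents` and the parameters `eta`, `m`, and `threshold` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def typicality(doc, centroid, eta=0.25, m=2.0):
    """Possibilistic membership of a document in one topic.

    Unlike fuzzy memberships, these degrees need not sum to 1 across
    topics, so a document can be highly typical of several topics.
    eta (scale) and m (fuzzifier) are illustrative values, not the paper's.
    """
    d2 = cosine_distance(doc, centroid) ** 2
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))


def confusing_documents(docs, centroids, threshold=0.5):
    """Return indices of documents typical of two or more topics."""
    flagged = []
    for idx, doc in enumerate(docs):
        degrees = [typicality(doc, c) for c in centroids]
        if sum(1 for t in degrees if t >= threshold) >= 2:
            flagged.append(idx)
    return flagged


# Two hypothetical topic centroids and three toy documents;
# the third document mixes the vocabulary of both topics.
centroids = [[1, 0], [0, 1]]
docs = [[1, 0], [0, 1], [1, 1]]
print(confusing_documents(docs, centroids))
```

In this toy setup only the mixed document is flagged; excluding such documents before clustering corresponds to the final step described in the abstract.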

Last modified: 2019-05-09 17:04:07