Keyword extraction from single documents using mean word intermediate distance

Journal: International Journal of Advanced Computer Research (IJACR) (Vol.6, No. 25)

Publication Date: 2016-07-31

Authors : Sifatullah Siddiqi; Aditi Sharan

Page : 138-145

Keywords : Keyword extraction; Word mean intermediate distance; Clustering; Standard deviation

Abstract

Keyword extraction is an important task in text mining. In this paper, a novel, unsupervised, domain-independent and language-independent approach for automatic keyword extraction from single documents is proposed. We use the word intermediate distance vector and its mean value to extract keywords. We compare our approach against the standard deviation of intermediate distances approach, taken as the baseline, and find heavy overlap between the results of the two approaches. The advantage of our approach is that it is faster, especially for long documents, because it removes the need to compute the standard deviation of the word intermediate distance vector. Two well-known works, “Origin of Species” and “A Brief History of Time”, are used to demonstrate the experimental results. Experiments show that the proposed approach performs almost as well as the standard deviation approach, and the overlap between the top 30 keywords extracted by the two approaches is more than 50%.
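
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the ranking rule, so ranking by descending score is an assumption here. The tokenizer and the names intermediate_distances, top_k and percent_overlap are also hypothetical. The sketch builds, for each word, the vector of gaps between its consecutive occurrences, takes the mean and the standard deviation of that vector, and measures the overlap between two top-k lists.

```python
import re
from collections import defaultdict
from statistics import mean, pstdev


def tokenize(text):
    # Assumption: a simple Unicode-aware word tokenizer; the paper's own
    # preprocessing is not described in the abstract.
    return re.findall(r"\w+", text.lower())


def intermediate_distances(tokens):
    """Map each repeated word to the gaps between its consecutive occurrences."""
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)
    # A word needs at least two occurrences to have any gap.
    return {
        word: [later - earlier for earlier, later in zip(pos, pos[1:])]
        for word, pos in positions.items()
        if len(pos) >= 2
    }


def top_k(scores, k):
    # Assumption: higher score ranks first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def percent_overlap(list_a, list_b):
    """Percentage of list_a's entries that also appear in list_b."""
    if not list_a:
        return 0.0
    return 100.0 * len(set(list_a) & set(list_b)) / len(list_a)


tokens = tokenize("a b a c a b")
gaps = intermediate_distances(tokens)
mean_scores = {word: mean(g) for word, g in gaps.items()}
std_scores = {word: pstdev(g) for word, g in gaps.items()}
```

Because the gaps telescope, the mean of a word's gap vector equals (last position - first position) / (number of occurrences - 1). It can therefore be computed without storing the vector, whereas the standard deviation needs every individual gap.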
