Effect of Near-orthogonality on Random Indexing Based Extractive Text Summarization
Journal: International Journal of Innovation and Applied Studies (Vol. 3, No. 3)
Publication Date: 2013-07-02
Authors: Chatterjee Niladri; Sahoo Pramod K.
Pages: 701-713
Keywords: Word Space; Random Indexing; Index vector; Context vector; Near-orthogonal; PageRank
Abstract
The application of Random Indexing (RI) to extractive text summarization has already been proposed in the literature. RI is an approximation technique for dealing with the high-dimensionality problem of Word Space Models (WSMs). What distinguishes RI from other WSMs, such as Latent Semantic Analysis (LSA), is the near-orthogonality of its word vectors (index vectors). This near-orthogonality property helps in reducing the dimension of the underlying Word Space. The present work studies in detail the near-orthogonality property of random index vectors and its effect on extractive text summarization. A probabilistic definition of near-orthogonality of an RI-based Word Space is presented, together with a thorough discussion of the subject. Our experiments on DUC 2002 data show that, while the quality of summaries produced by RI with the Euclidean distance measure is almost invariant to the near-orthogonality of the underlying Word Space, the quality of summaries produced by RI with the cosine dissimilarity measure is strongly affected by it. It is also found that RI with the Euclidean distance measure performs much better than many LSA-based summarization techniques. This improvement of the RI-based summarizer over LSA-based summarizers is significant because RI is computationally inexpensive compared to LSA, which uses Singular Value Decomposition (SVD), a computationally complex algebraic technique, for dimension reduction of the underlying Word Space.
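The near-orthogonality property mentioned in the abstract can be illustrated with a minimal sketch. It assumes the sparse ternary index vectors commonly used in Random Indexing (a few randomly placed +1/-1 entries, zeros elsewhere) and measures the fraction of vector pairs whose absolute cosine falls below a tolerance. The function names, the tolerance `eps`, and the chosen dimensions are illustrative assumptions; the paper's own probabilistic definition is not reproduced here and may differ.

```python
import math
import random


def make_index_vector(dim, nonzeros, rng):
    """Build a sparse ternary vector: `nonzeros` entries are +1 or -1, the rest 0.

    Assumption: this is the usual Random Indexing construction, not
    necessarily the exact one used in the paper.
    """
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1, 1))
    return vec


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def near_orthogonal_fraction(dim, nonzeros, n_vectors, eps, seed=0):
    """Fraction of vector pairs with |cosine| <= eps (hypothetical measure)."""
    rng = random.Random(seed)
    vectors = [make_index_vector(dim, nonzeros, rng) for _ in range(n_vectors)]
    pairs = 0
    near = 0
    for i in range(n_vectors):
        for j in range(i + 1, n_vectors):
            pairs += 1
            if abs(cosine(vectors[i], vectors[j])) <= eps:
                near += 1
    return near / pairs


if __name__ == "__main__":
    # Compare a roomy space with a cramped one, same number of non-zero entries.
    for dim in (1000, 20):
        frac = near_orthogonal_fraction(dim=dim, nonzeros=10, n_vectors=50, eps=0.2)
        print(f"dim={dim}: fraction of near-orthogonal pairs = {frac:.3f}")
```

With the number of non-zero entries held fixed, shrinking the dimension makes random overlaps between index vectors more likely, so fewer pairs stay close to orthogonal. This is the kind of trade-off between dimension reduction and near-orthogonality that the abstract refers to.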
Last modified: 2013-08-21 22:28:24