Using Lift as a Practical Measure of Surprise in a Document Stream
Proceeding: Third International Conference on E-Technologies and Business on the Web (EBW)
Publication Date: 2015-03-26
Authors : Sean Rooney;
Page : 7-12
Keywords : Information-Retrieval; Lift; Mutual-Information; Sketching; Surprisal;
Abstract
We describe how the concept of Lift can be generalized to order small documents in a corpus by their degree of surprise. This surprisal norm can be used in conjunction with other features to search over the corpus. From an information-theoretic point of view, surprisal is the combination of the mutual information of all word pairs in a document. We show how surprisals can be calculated efficiently on a document stream using sketching techniques.
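The abstract does not give the exact definitions, so the following is only a minimal sketch of the idea under stated assumptions: lift is taken at the document level as P(a, b) / (P(a) P(b)), the document score is assumed to be the mean negative log-lift over its word pairs, and a Count-Min sketch stands in for the unspecified sketching technique. The names `CountMinSketch`, `LiftStream`, `observe`, `lift`, and `surprisal`, as well as the `floor` parameter, are hypothetical and do not come from the paper.

```python
import hashlib
import math
from itertools import combinations


class CountMinSketch:
    """Approximate counter in fixed memory; it may overestimate, never underestimate."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, key):
        # One salted hash per row gives `depth` different column choices.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, key):
        for row, col in self._cells(key):
            self.table[row][col] += 1

    def estimate(self, key):
        return min(self.table[row][col] for row, col in self._cells(key))


class LiftStream:
    """Tracks word and word-pair document frequencies over a stream (assumed design)."""

    def __init__(self):
        self.words = CountMinSketch()
        self.pairs = CountMinSketch()
        self.n_docs = 0

    @staticmethod
    def _vocab(text):
        # Assumption: whitespace tokenisation, each word counted once per document.
        return sorted(set(text.lower().split()))

    def observe(self, text):
        vocab = self._vocab(text)
        self.n_docs += 1
        for word in vocab:
            self.words.add(word)
        for a, b in combinations(vocab, 2):
            self.pairs.add(a + "\x00" + b)

    def lift(self, a, b):
        a, b = sorted((a, b))
        n_ab = self.pairs.estimate(a + "\x00" + b)
        n_a = self.words.estimate(a)
        n_b = self.words.estimate(b)
        if n_ab == 0 or n_a == 0 or n_b == 0:
            return 0.0
        # P(a, b) / (P(a) * P(b)) with probabilities estimated as counts / n_docs.
        return n_ab * self.n_docs / (n_a * n_b)

    def surprisal(self, text, floor=1e-3):
        # Assumed score: mean of -log2(lift) over all word pairs in the document.
        # `floor` keeps the score finite for pairs never seen together.
        pairs = list(combinations(self._vocab(text), 2))
        if not pairs:
            return 0.0
        total = sum(-math.log2(max(self.lift(a, b), floor)) for a, b in pairs)
        return total / len(pairs)
```

Under these assumptions, a document whose word pairs rarely co-occur in the stream receives a higher score than one built from familiar pairs, and the memory used stays fixed regardless of how many distinct pairs the stream contains.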