Using Lift as a Practical Measure of Surprise in a Document Stream

Proceeding: Third International Conference on E-Technologies and Business on the Web (EBW)

Publication Date:

Authors : ;

Page : 7-12

Keywords : Information-Retrieval; Lift; Mutual-Information; Sketching; Surprisal;

Abstract

We describe how the concept of Lift can be generalized to order small documents in a corpus by their degree of similarity. This surprisal norm can be used in conjunction with other features to search over the corpus. From an information-theoretic point of view, surprisal is the combination of the mutual information of all word pairs in a document. We show how the calculation of surprisals can be performed efficiently on a document stream using sketching techniques.
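
The abstract does not give formulas, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It uses the standard definition lift(a, b) = P(a, b) / (P(a) P(b)), whose logarithm is the pointwise mutual information of the pair. It assumes that the "combination" over word pairs is the mean log-lift, that probabilities are document frequencies, and that the sketching technique is a Count-Min sketch. The names CountMinSketch and LiftScorer, the add-one smoothing, and the sketch dimensions are all hypothetical choices made for this example.

```python
import hashlib
import math
from itertools import combinations


class CountMinSketch:
    """Fixed-size approximate counter; estimates may overcount, never undercount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # One salted hash per row gives `depth` different hash functions.
        digest = hashlib.blake2b(
            key.encode("utf-8"), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key):
        for r in range(self.depth):
            self.rows[r][self._index(key, r)] += 1

    def estimate(self, key):
        return min(self.rows[r][self._index(key, r)] for r in range(self.depth))


class LiftScorer:
    """Streams documents and scores them by mean log-lift over word pairs.

    Assumption for this sketch: counts are per-document occurrences, and a
    lower score means the pairs co-occur less often than independence predicts.
    """

    def __init__(self):
        self.words = CountMinSketch()
        self.pairs = CountMinSketch()
        self.docs = 0

    @staticmethod
    def _pair_key(a, b):
        a, b = sorted((a, b))  # order-independent key for the pair
        return a + "\x00" + b

    def update(self, tokens):
        vocab = sorted(set(tokens))
        self.docs += 1
        for w in vocab:
            self.words.add(w)
        for a, b in combinations(vocab, 2):
            self.pairs.add(self._pair_key(a, b))

    def lift(self, a, b):
        n_ab = self.pairs.estimate(self._pair_key(a, b))
        n_a = self.words.estimate(a)
        n_b = self.words.estimate(b)
        # Ad hoc add-one smoothing (an assumption) keeps the ratio positive.
        return ((n_ab + 1) * self.docs) / ((n_a + 1) * (n_b + 1))

    def mean_log_lift(self, tokens):
        pairs = list(combinations(sorted(set(tokens)), 2))
        if not pairs or self.docs == 0:
            return 0.0
        return sum(math.log(self.lift(a, b)) for a, b in pairs) / len(pairs)


# Usage: learn from a toy stream, then compare a familiar and an unusual document.
scorer = LiftScorer()
for _ in range(50):
    scorer.update(["cat", "dog"])
    scorer.update(["tax", "law"])
familiar = scorer.mean_log_lift(["cat", "dog"])
unusual = scorer.mean_log_lift(["cat", "law"])
```

Because both sketches have a fixed size, memory stays constant however long the stream runs. The cost is that colliding keys can inflate counts, which under this scoring makes rare pairs look slightly less surprising than they are.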

Last modified: 2015-03-27 22:08:17