Mining Sequential Patterns from Probabilistic with Source Level Uncertainty

Journal: International Journal of Science and Research (IJSR) (Vol.4, No. 11)

Page : 241-244

Keywords : Uncertainty; SPM; Probabilistic database; optimization;

Abstract

Sequential pattern mining (SPM) is an important data mining problem. Classical SPM assumes that the data to be mined is deterministic, yet data obtained from many sources is inherently noisy or uncertain, for example sensor data or data collected from the web from different (potentially conflicting) sources. Probabilistic databases are a popular framework for modeling uncertainty, and several data mining and ranking problems have recently been studied in them. In this work we consider one of the uncertainty models for SPM, namely source-level uncertainty, which falls within the probabilistic database framework. We give a dynamic programming algorithm to compute the source support probability, and hence the expected support, of a sequence in a source-level uncertain database. We then propose optimizations to speed up support computation. Next, we propose probabilistic SPM algorithms based on the candidate generation and pattern growth frameworks for the source-level uncertainty model and the expected support measure. We implement these algorithms, evaluate them empirically, and show how they scale under different parameter settings on both real and synthetic datasets. Finally, we demonstrate the effectiveness of the probabilistic SPM framework at extracting meaningful patterns in the presence of noise.
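
The abstract does not spell out the dynamic program, so the following is only a minimal sketch of how a source support probability and the resulting expected support might be computed. It assumes a simplified representation that is not taken from the paper: each source is given as an ordered list of (itemset, probability) pairs, where the probability is the chance that the event belongs to that source, and events are treated as independent. The names source_support_probability and expected_support are hypothetical, and none of the optimizations mentioned in the abstract are included.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumed representation (not from the paper): an event is (itemset, probability
# that the event belongs to this source); events are listed in temporal order.
Event = Tuple[FrozenSet[str], float]


def source_support_probability(pattern: List[FrozenSet[str]],
                               events: List[Event]) -> float:
    """Probability that `pattern` is a subsequence of one source's events,
    assuming events are independent (an assumption of this sketch)."""
    q = len(pattern)
    # probs[k] = Pr(first k pattern elements are supported by the events seen so far)
    probs = [1.0] + [0.0] * q
    for itemset, c in events:
        # Go from high k to low k so probs[k - 1] still describes the
        # situation before the current event was considered.
        for k in range(q, 0, -1):
            if pattern[k - 1] <= itemset:  # pattern element is contained in the event
                probs[k] = (1.0 - c) * probs[k] + c * probs[k - 1]
    return probs[q]


def expected_support(pattern: List[FrozenSet[str]],
                     database: Dict[str, List[Event]]) -> float:
    """Expected support = sum of source support probabilities over all sources."""
    return sum(source_support_probability(pattern, events)
               for events in database.values())
```

As a usage illustration with made-up data: for the pattern ({a}, {b}), a source whose events are ({a}, 0.5) followed by ({b}, 0.5) contributes 0.5 * 0.5 = 0.25, and a second source that contains both events with certainty contributes 1, so the expected support over the two sources is 1.25. The one-dimensional array keeps memory linear in the pattern length; a full two-dimensional table would give the same values.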

Last modified: 2021-07-01 14:26:37