To Investigate the Problem of Similarity Search on Dimension Incomplete Data
Journal: International Journal of Science and Research (IJSR) (Vol. 4, No. 3)
Publication Date: 2015-03-05
Authors : Amol Patil; Ashwini Sagade;
Page : 214-216
Keywords : Missing Dimensions; Similarity search; Whole sequence query; Probability triangle inequality; Temporal data;
Abstract
Similarity queries over multidimensional databases are a fundamental research problem with numerous applications in databases, data mining, and information retrieval. Existing work on querying incomplete data addresses the case where the data values on certain dimensions are unknown. This paper investigates similarity search on dimension incomplete data, where the dimension information itself is missing, so it is not known which dimensions the observed values belong to. Missing dimension information poses a great computational challenge, since all possible combinations of missing dimensions must be examined when evaluating the similarity between the query and a data object. We develop lower and upper bounds on the probability that a data object is similar to the query. These bounds enable efficient filtering of irrelevant data objects without explicitly examining all missing dimension combinations. A probability triangle inequality is also employed to further prune the search space and speed up query processing. The proposed probabilistic framework and techniques can be applied to both whole sequence and subsequence queries. Extensive experiments on real-life data sets demonstrate the effectiveness and efficiency of the approach.
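To make the computational challenge concrete, the following is a minimal Python sketch of the brute-force baseline the abstract alludes to. It is not the authors' algorithm or their bounds. It assumes, for illustration only, that a data object keeps the relative order of its surviving values (as with temporal data), that every placement of those values onto the query's dimensions is equally likely, and that similarity means Euclidean distance within a radius. The names `observed_distance`, `match_fraction`, and `min_observed_distance` are hypothetical. Because the unobserved dimensions can only add to the distance, the fraction computed here is a crude upper estimate of the match probability under those assumptions, and the dynamic-programming minimum gives a safe pruning test without enumerating every combination.

```python
from itertools import combinations
from math import comb, sqrt


def observed_distance(query, obj, dims):
    """Euclidean distance over the observed values only, for one placement.

    dims[k] is the query dimension assumed to hold obj[k].
    """
    return sqrt(sum((query[d] - v) ** 2 for d, v in zip(dims, obj)))


def match_fraction(query, obj, radius):
    """Fraction of order-preserving placements whose observed distance is
    within radius. Enumerates all C(m, n) placements (exponential cost)."""
    m, n = len(query), len(obj)
    hits = sum(
        1
        for dims in combinations(range(m), n)
        if observed_distance(query, obj, dims) <= radius
    )
    return hits / comb(m, n)


def min_observed_distance(query, obj):
    """Smallest observed distance over all order-preserving placements,
    computed in O(m * n) time by dynamic programming.

    If this value exceeds the radius, no placement can match, so the
    object can be discarded without enumeration.
    """
    m, n = len(query), len(obj)
    inf = float("inf")
    # best[j]: minimal squared cost of placing the first j object values
    # on the query dimensions processed so far.
    best = [0.0] + [inf] * n
    for i in range(1, m + 1):
        # Iterate j downwards so best[j - 1] still refers to the first
        # i - 1 query dimensions.
        for j in range(min(i, n), 0, -1):
            cand = best[j - 1] + (query[i - 1] - obj[j - 1]) ** 2
            if cand < best[j]:
                best[j] = cand
    return sqrt(best[n])
```

For a 4-dimensional query and an object that retains 2 values, `match_fraction` inspects 6 placements. The count grows combinatorially with the number of dimensions, which is why bounds that avoid enumeration matter.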
Last modified: 2021-06-30 21:34:49