Selectivity Estimation of Range Queries in Data Streams using Micro-Clustering
Journal: The International Arab Journal of Information Technology (Vol. 13, No. 4)
Publication Date: 2016-07-01
Authors : Sudhanshu Gupta; Deepak Garg;
Page : 396-402
Keywords : Selectivity estimation; range query; data streams; micro-clustering;
Abstract
Selectivity estimation is an important task in query optimization. Common data mining techniques are not applicable to large, fast, continuous data streams, because such streams require one-pass processing of the data. This requirement makes Range Query Estimation (RQE) a challenging task. We propose a technique that performs RQE using micro-clustering. The technique maintains cluster statistics in the form of micro-clusters, which also record the distribution of values within each cluster as cosine coefficients. These coefficients are used to estimate range queries, including queries whose range of data values spans several clusters. The technique has been compared with the cosine series technique for selectivity estimation. Experiments on both synthetic and real datasets of varying sizes confirm that our technique offers substantial improvements in accuracy over other methods.
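The abstract does not give the update or estimation formulas, so the following is only a minimal one-dimensional sketch of how micro-clusters carrying cosine coefficients could answer a range query in one pass. It is not the authors' implementation. It assumes fixed, equal-width cluster intervals (the paper's cluster assignment rule is not stated here) and uses the standard cosine series density estimate on a normalized interval, f(u) ≈ 1 + 2 Σ a_k cos(kπu) with a_k the mean of cos(kπu). The names `MicroCluster`, `build_clusters`, `insert`, `estimate_selectivity`, and the parameter `num_coeffs` are hypothetical.

```python
import math


class MicroCluster:
    """Hypothetical 1-D micro-cluster over a fixed interval [lo, hi).

    Assumption of this sketch: the interval is fixed in advance. Only a
    count and running cosine sums are stored, so each stream value is
    processed once and then discarded.
    """

    def __init__(self, lo, hi, num_coeffs=8):
        self.lo, self.hi = lo, hi
        self.n = 0
        # cos_sums[k-1] holds the running sum of cos(k*pi*u), k = 1..num_coeffs
        self.cos_sums = [0.0] * num_coeffs

    def insert(self, x):
        u = (x - self.lo) / (self.hi - self.lo)  # normalize to [0, 1]
        self.n += 1
        for k in range(1, len(self.cos_sums) + 1):
            self.cos_sums[k - 1] += math.cos(k * math.pi * u)

    def estimate_count(self, a, b):
        """Estimated number of values in [a, b] that fall in this cluster."""
        a, b = max(a, self.lo), min(b, self.hi)
        if self.n == 0 or a >= b:
            return 0.0
        width = self.hi - self.lo
        ua, ub = (a - self.lo) / width, (b - self.lo) / width
        # Integrate 1 + 2 * sum_k a_k cos(k*pi*u) over [ua, ub].
        frac = ub - ua
        for k, s in enumerate(self.cos_sums, start=1):
            coeff = s / self.n
            frac += 2.0 * coeff * (
                math.sin(k * math.pi * ub) - math.sin(k * math.pi * ua)
            ) / (k * math.pi)
        # A truncated series can overshoot slightly, so clamp to [0, 1].
        return self.n * min(1.0, max(0.0, frac))


def build_clusters(lo, hi, num_clusters, num_coeffs=8):
    """Split [lo, hi) into equal-width micro-clusters (sketch assumption)."""
    width = (hi - lo) / num_clusters
    return [
        MicroCluster(lo + i * width, lo + (i + 1) * width, num_coeffs)
        for i in range(num_clusters)
    ]


def insert(clusters, x):
    """Route one stream value to its micro-cluster.

    Values outside the overall domain are clamped to the edge cluster.
    """
    lo, hi = clusters[0].lo, clusters[-1].hi
    idx = int((x - lo) / (hi - lo) * len(clusters))
    idx = min(max(idx, 0), len(clusters) - 1)
    clusters[idx].insert(x)


def estimate_selectivity(clusters, a, b):
    """Estimated fraction of all seen values that lie in [a, b]."""
    total = sum(c.n for c in clusters)
    if total == 0:
        return 0.0
    return sum(c.estimate_count(a, b) for c in clusters) / total
```

A query range that spans several clusters is handled by summing the per-cluster estimates, each restricted to the part of the range that overlaps that cluster. Raising `num_coeffs` captures finer structure in each cluster's distribution at the cost of more state per micro-cluster.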