
An Ensemble Framework for Web Content Extraction to User Query Obfuscations

Journal: International Journal of Science and Research (IJSR) (Vol.3, No. 2)

Publication Date:

Authors : ; ;

Page : 431-435

Keywords : Information Retrieval; Data clustering; Data Prediction; Web Data extraction;


Abstract

With the dynamic growth of digital data content generated on the Web, users of Web search engines are often forced to sift through the long ordered lists of results that the engines return for obfuscated queries. Data stream classification poses many challenges to the web mining community, including infinite length, concept drift, concept evolution, feature evolution, and data semantics. Since a data stream is theoretically infinite in length, it is impractical to store and use all the historical data for training. Most existing data stream classification techniques fail to classify the data with less entropy. The proposed framework includes two further components: 1) A multi-correlation extraction model performs query prediction based on annotation similarity; it also checks the similarity of data records and detects the correct data region with higher precision using the semantic properties of these data records. 2) User-specific preference modeling maps query relevance and user preference into the same user-specific cluster space. The advantages of this method are that it can extract any type of data record and that it provides options for aligning iterative and disjunctive data items. Experimental results show that the proposed system achieves high precision and outperforms existing state-of-the-art data extraction methods.
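
The idea of checking the similarity of data records to locate a data region can be illustrated with a minimal sketch. The abstract does not name the similarity measure or the region-detection rule, so everything here is an assumption made for illustration: Jaccard token overlap stands in for the similarity measure, and the function names (tokenize, jaccard, find_data_region) and the 0.3 threshold are hypothetical. It is not the authors' method.

```python
from __future__ import annotations

# Illustrative sketch only. Jaccard overlap and the "longest similar run"
# heuristic are assumptions; the abstract does not specify either.


def tokenize(record: str) -> set[str]:
    """Lowercase a record and split it into a set of whitespace tokens."""
    return set(record.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_data_region(records: list[str], threshold: float = 0.3) -> list[int]:
    """Return indices of the longest run of consecutive records in which
    each neighbouring pair has similarity >= threshold (hypothetical rule)."""
    tokens = [tokenize(r) for r in records]
    best: list[int] = []
    current: list[int] = []
    for i in range(len(records)):
        if current and jaccard(tokens[i - 1], tokens[i]) >= threshold:
            current.append(i)  # record i continues the current similar run
        else:
            current = [i]  # start a new candidate region at record i
        if len(current) > len(best):
            best = list(current)
    return best
```

Given a page split into text records, find_data_region returns the positions of the block of mutually similar records, which would typically be the repeated result entries rather than navigation or footer text. A semantic measure, as the abstract suggests, could replace jaccard without changing the rest of the sketch.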

Last modified: 2021-06-30 20:58:50