
VWDRE – A VISION-BASED APPROACH FOR MINING DATA FROM SEARCH ENGINE RESULT PAGES

Journal: International Journal of Civil Engineering and Technology (IJCIET) (Vol.8, No. 9)

Publication Date:

Authors : ;

Page : 973-982

Keywords : Vision-Based; Wrapper; Data Extraction; Search Engine Result Pages; DOM tree


Abstract

Extracting data from dynamically generated web pages is challenging because a search engine returns different results for every query submitted. Many techniques have been proposed to address this issue, but most of them share the problem of language dependency. To overcome the limitations of previous work, a few approaches analyze the visual features of the web page. In this paper, we propose a new vision-based approach that is independent of the code used. It makes broad use of the visual features on search engine result pages to locate the data region and mine the data records from it. We develop a clustering-by-similarity algorithm to check the similarity of data records. We also propose a technique to generate the wrapper for data record extraction by examining multiple result pages from the same search engine.
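The abstract does not say which visual features or similarity measure the clustering step uses, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes each rendered block is described by its bounding box, compares blocks by left offset, width, and height, and treats the largest group of similar blocks as the likely data records. The `Block` type, the `similarity` formula, and the 0.9 threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A rendered block on a result page (hypothetical representation)."""
    x: float       # left offset in pixels
    y: float       # top offset in pixels
    width: float
    height: float


def similarity(a: Block, b: Block) -> float:
    """Score in [0, 1]; 1.0 means same left offset, width, and height.

    Assumed measure: the mean of min/max ratios over three visual features.
    """
    def ratio(p: float, q: float) -> float:
        hi = max(p, q)
        return 1.0 if hi == 0 else min(p, q) / hi

    return (ratio(a.x, b.x)
            + ratio(a.width, b.width)
            + ratio(a.height, b.height)) / 3


def cluster_by_similarity(blocks, threshold=0.9):
    """Greedy clustering: a block joins the first cluster whose first
    member it resembles, otherwise it starts a new cluster."""
    clusters = []
    for block in blocks:
        for cluster in clusters:
            if similarity(cluster[0], block) >= threshold:
                cluster.append(block)
                break
        else:
            clusters.append([block])
    return clusters


def likely_data_records(blocks, threshold=0.9):
    """Assume the largest cluster of look-alike blocks holds the records."""
    clusters = cluster_by_similarity(blocks, threshold)
    return max(clusters, key=len) if clusters else []


if __name__ == "__main__":
    page = [
        Block(0, 0, 1000, 80),      # page header
        Block(100, 100, 600, 90),   # result 1
        Block(100, 200, 600, 95),   # result 2
        Block(750, 100, 200, 400),  # sidebar
        Block(100, 310, 600, 88),   # result 3
    ]
    for record in likely_data_records(page):
        print(record)
```

Because the comparison relies only on rendered geometry, a sketch like this does not depend on the markup that produced the page, which matches the motivation given in the abstract.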

Last modified: 2018-04-16 17:37:26