REVIEW PAPER ON THE DEEP WEB DATA EXTRACTION

Journal: International Journal of Engineering Sciences & Research Technology (IJESRT) (Vol.7, No. 4)

Publication Date:

Authors : ;

Page : 39-44

Keywords : Web data extraction; Visual features of deep web pages; Wrapper generation; Feature extracting; Webpage;

Abstract

Deep web data extraction is the process of extracting a set of data records, and the data items they contain, from a query result page. The resulting structured data can later be integrated with results from other data sources and presented to the user in a single, cohesive view. Domain identification selects, from the forms found during the search process, the query interfaces that belong to the target domain. The surface web contains a large amount of unfiltered information, whereas the deep web holds high-quality, managed and subject-specific information. The deep web grows faster than the surface web because the surface web is limited to what search engines can easily find. The deep web covers domains such as education, sports and the economy. Deep web contents are accessed by submitting queries to web databases, and the returned data records are wrapped in dynamically generated web pages (called deep web pages in this paper). Extracting structured data from deep web pages is a challenging problem because of the intricate underlying structure of such pages. Results on a large set of web databases show that the proposed vision-based approach is highly effective for deep web data extraction.
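
To make the notion of data records and data items concrete, the following is a minimal Python sketch of record extraction from a query result page. It is not the vision-based approach reviewed in the paper: it relies on tag structure instead of visual features, and the page markup, the class names (record, title, price) and the names RESULT_PAGE, RecordExtractor and extract_records are all hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser

# Hypothetical query result page; the markup and class names are assumptions.
RESULT_PAGE = """
<html><body>
  <div id="nav">Home | Search</div>
  <div class="record"><span class="title">Intro to Databases</span><span class="price">30.00</span></div>
  <div class="record"><span class="title">Web Mining</span><span class="price">45.50</span></div>
  <div id="footer">Page 1 of 1</div>
</body></html>
"""


class RecordExtractor(HTMLParser):
    """Collects data records (div.record) and their items (span text keyed by class)."""

    def __init__(self):
        super().__init__()
        self.records = []     # finished records, one dict per record
        self._current = None  # record being built, or None outside a record
        self._field = None    # name of the item being read, or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "record":
            self._current = {}                # a new data record starts
        elif tag == "span" and self._current is not None:
            self._field = attrs.get("class")  # a data item starts

    def handle_data(self, data):
        # Keep text only while inside an item of a record; navigation and
        # footer text is ignored because no record is open there.
        if self._current is not None and self._field:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current is not None:
            self.records.append(self._current)
            self._current = None


def extract_records(html):
    """Return the data records of a result page as a list of dicts."""
    parser = RecordExtractor()
    parser.feed(html)
    parser.close()
    return parser.records


records = extract_records(RESULT_PAGE)
for record in records:
    print(record)
```

Each dictionary corresponds to one data record and each key to one data item, which is the structured form that can then be merged with results from other sources. A fixed tag pattern like this breaks as soon as the page layout changes, which is one reason extraction from dynamically generated pages is considered difficult.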

Last modified: 2018-04-10 20:58:26