UNSUPERVISED APPROACH TO DEDUCE SCHEMA & EXTRACT DATA FROM TEMPLATE WEB PAGES

Journal: International Journal of Computer Engineering and Technology (IJCET) (Vol.5, No. 11)

Publication Date:

Authors : ; ;

Page : 57-64

Keywords : computer engineering; cloud computing; network security; wireless communication; iaeme journals; IJCET; journal article; research paper; open access journals; journal publication;

Abstract

Web data extraction is an important part of many Web data analysis applications. The aim of a web data extraction system is to extract relevant data from web pages. Many web pages are generated by embedding data into a fixed template. Extracting structured data from such template-generated web pages is therefore a challenging task, and one that is useful for Web information integration. In this paper, an unsupervised approach is presented that automatically detects the schema of web pages and performs page-level extraction. Here, visually similar web pages are first identified by comparing their visual cues.

Last modified: 2016-08-10 21:16:16