UNSUPERVISED APPROACH TO DEDUCE SCHEMA & EXTRACT DATA FROM TEMPLATE WEB PAGES
Journal: International Journal of Computer Engineering and Technology (IJCET) (Vol. 5, No. 11)
Publication Date: 2014-11-28
Authors : SHINDE SANTAJI KRISHNA; SHASHANK DATTATRAYA JOSHI;
Page : 57-64
Keywords : computer engineering; cloud computing; network security; wireless communication; iaeme journals; IJCET; journal article; research paper; open access journals; journal publication;
Abstract
Web data extraction is an important part of many Web data analysis applications. The aim of a web data extraction system is to extract relevant data from web pages. Many web pages are generated by embedding data into a fixed template. Extracting structured data from such template-generated web pages is a challenging task that is useful for Web information integration. In this paper, an unsupervised approach is presented that automatically detects the schema of web pages and performs the page-level extraction task. Here, visually similar web pages are first identified by comparing their visual clues.