An Overview of Web Data Extraction Techniques

Journal: International Journal of Scientific Engineering and Technology (IJSET) (Vol.2, No. 4)

Publication Date:

Authors : ;

Page : 278-287

Keywords : data extraction; wrapper induction; DOM tree; web crawler; data alignment; pattern mining

Abstract

Web pages are usually generated for visualization, not for data exchange. Each page may contain several groups of structured data, and pages are typically generated by plugging data values into predefined templates. Manually extracting data from such semi-structured web pages is a difficult task. This paper surveys automatic web data extraction techniques. These fall mainly into two types: those based on wrapper induction and those based on automatic extraction. Wrapper induction uses a set of extraction rules that are learned from multiple pages containing similar data records.
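
As a rough illustration of the wrapper-induction idea mentioned in the abstract, the sketch below learns one left/right delimiter pair from a few labeled pages and reuses it on an unseen page generated from the same template. It is not taken from the paper or from any particular system it surveys; the function names (`learn_lr_wrapper`, `extract`) and the sample pages are hypothetical, and the approach assumes that each labeled value occurs verbatim in its page and that a single delimiter pair is enough to locate the field.

```python
import os


def learn_lr_wrapper(examples):
    """Learn a (left, right) delimiter pair from (page, value) examples.

    left  = longest common suffix of the text preceding each labeled value
    right = longest common prefix of the text following each labeled value
    """
    before, after = [], []
    for page, value in examples:
        start = page.index(value)  # assumption: first occurrence is the labeled one
        before.append(page[:start])
        after.append(page[start + len(value):])
    # Common suffix = common prefix of the reversed strings, reversed back.
    left = os.path.commonprefix([s[::-1] for s in before])[::-1]
    right = os.path.commonprefix(after)
    if not left or not right:
        raise ValueError("examples share no usable delimiters")
    return left, right


def extract(page, left, right):
    """Return every string enclosed by the learned delimiters."""
    results, pos = [], 0
    while True:
        start = page.find(left, pos)
        if start == -1:
            break
        start += len(left)
        end = page.find(right, start)
        if end == -1:
            break
        results.append(page[start:end])
        pos = end + len(right)
    return results


# Hypothetical training pages built from the same template.
page1 = "<ul><li><b>Dune</b> <i>9.99</i></li><li><b>Ubik</b> <i>7.25</i></li></ul>"
page2 = "<ul><li><b>Emma</b> <i>4.50</i></li><li><b>Persuasion</b> <i>3.00</i></li></ul>"
left, right = learn_lr_wrapper([(page1, "Dune"), (page2, "Persuasion")])

# Apply the learned rule to an unseen page.
new_page = "<ul><li><b>Solaris</b> <i>6.00</i></li><li><b>Neuromancer</b> <i>8.10</i></li></ul>"
titles = extract(new_page, left, right)
print(left, right, titles)
```

Labeling values at different positions in the training pages (the first record in one, the second in the other) keeps the learned left delimiter from absorbing page-specific context such as the opening list tag. Practical wrapper-induction systems generally learn richer rules than a single delimiter pair.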

Last modified: 2013-04-03 18:18:56