An Overview of Web Data Extraction Techniques
Journal: International Journal of Scientific Engineering and Technology (IJSET) (Vol. 2, No. 4)
Publication Date: 2013-04-01
Authors: Devika K; Subu Surendran
Page : 278-287
Keywords : data extraction; wrapper induction; DOM tree; web crawler; data alignment; pattern mining
Abstract
Web pages are usually generated for visualization, not for data exchange. Each page may contain several groups of structured data. Web pages are generated by plugging data values into predefined templates. Manual data extraction from semi-structured web pages is a difficult task. This paper surveys various automatic web data extraction techniques. There are two main types of technique: one is based on wrapper induction, the other on automatic extraction. Wrapper induction uses a set of extraction rules that are learnt from multiple pages containing similar data records.
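To illustrate the idea of learning extraction rules from several template-generated pages, here is a minimal sketch. It assumes a simple delimiter-based rule (the text immediately to the left and right of a labeled value), which is one common form of wrapper and not necessarily the method surveyed in the paper. The function names `learn_rule` and `extract` and the sample pages are hypothetical.

```python
import os
import re


def common_prefix(strings):
    # Longest string that every input starts with (character-wise).
    return os.path.commonprefix(strings)


def common_suffix(strings):
    # Longest string that every input ends with.
    reversed_strings = [s[::-1] for s in strings]
    return os.path.commonprefix(reversed_strings)[::-1]


def learn_rule(examples):
    """Learn a (left, right) delimiter pair from labeled examples.

    examples: list of (page, value) pairs, where value occurs in page.
    The first occurrence of each value is taken as the labeled one.
    """
    lefts, rights = [], []
    for page, value in examples:
        start = page.index(value)
        lefts.append(page[:start])
        rights.append(page[start + len(value):])
    left = common_suffix(lefts)
    right = common_prefix(rights)
    if not left or not right:
        raise ValueError("no common delimiters found in the examples")
    return left, right


def extract(page, rule):
    # Apply the learned delimiters to a new page from the same template.
    left, right = rule
    pattern = re.escape(left) + r"(.*?)" + re.escape(right)
    return re.findall(pattern, page, flags=re.DOTALL)


# Hypothetical pages generated from one template (illustration only).
page_a = ("<html><body><h1>Books</h1>"
          "<li><b>Dune</b> <i>9.99</i></li></body></html>")
page_b = ("<html><body><h1>Books</h1>"
          "<li><b>Emma</b> <i>4.50</i></li>"
          "<li><b>Ulysses</b> <i>7.25</i></li></body></html>")
page_c = ("<html><body><h1>Books</h1>"
          "<li><b>Hamlet</b> <i>3.00</i></li>"
          "<li><b>Ivanhoe</b> <i>6.10</i></li></body></html>")

# Labeled titles from two pages serve as training data.
examples = [(page_a, "Dune"), (page_b, "Emma"), (page_b, "Ulysses")]
rule = learn_rule(examples)
titles = extract(page_c, rule)
print(rule)
print(titles)
```

Labeling a record that is not the first on its page (here "Ulysses") matters: without it, the learned left delimiter would include the page header and the rule would miss every record after the first.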
Last modified: 2013-04-03 18:18:56