A LITERATURE SURVEY ON WEB CRAWLERS
Journal: International Journal of Computer Science and Mobile Applications IJCSMA (Vol.2, No. 5)
Publication Date: 2014-05-30
Authors : V. Rajapriya;
Page : 36-44
Keywords : Web crawlers; Architecture; Challenges; Uses;
Abstract
The web holds a large volume of data spread across innumerable websites, which are monitored by a tool or program known as a crawler. Forum Crawler Under Supervision is a supervised, web-scale forum crawler whose goal is to crawl relevant forum content from the web with minimal overhead. Forums have different layouts or styles and are powered by different forum software packages, yet they share similar implicit navigation paths, connected by specific URL types, that lead users from entry pages to thread pages. This observation reduces the web forum crawling problem to a URL type recognition problem. The approach also shows how to learn accurate and effective regular expression patterns of implicit navigation paths from automatically created training sets, using aggregated results from weak page type classifiers. These classifiers can be trained once and applied to a large set of unseen forums. The approach is reported to be highly effective, addresses the scalability issue, and incorporates sentiment analysis. This paper describes web crawlers and their challenges and surveys four papers.
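The idea of treating forum crawling as URL type recognition can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the URL type names (`index`, `thread`, `page_flipping`), the hand-written patterns, and the `forum.example.com` URLs do not come from the abstract. The approach described above learns its patterns from automatically built training sets, whereas this sketch hard-codes them.

```python
import re

# Hypothetical, hand-written patterns for one imaginary forum package.
# The surveyed approach learns such patterns; these only illustrate
# how they could be applied once available.
URL_TYPE_PATTERNS = {
    "index": re.compile(r"/forumdisplay\.php\?f=\d+$"),
    "thread": re.compile(r"/showthread\.php\?t=\d+$"),
    "page_flipping": re.compile(
        r"/(?:forumdisplay|showthread)\.php\?[ft]=\d+&page=\d+$"
    ),
}


def classify_url(url):
    """Return the first URL type whose pattern matches, else 'other'."""
    for url_type, pattern in URL_TYPE_PATTERNS.items():
        if pattern.search(url):
            return url_type
    return "other"


def select_links(urls):
    """Keep only URLs that lie on a navigation path toward thread content."""
    return [u for u in urls if classify_url(u) != "other"]
```

A crawler built this way would follow only the links returned by `select_links`, skipping pages such as member profiles or login forms that match none of the patterns.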
Last modified: 2014-05-28 01:42:36