A Review of Focused Web Crawling Strategies
Journal: International Journal of Advanced Computer Research (IJACR) (Vol. 2, No. 6)
Publication Date: 2012-12-16
Authors : Bireshwar Ganguly; Rahila Sheikh
Page : 261-267
Keywords : Web crawling algorithms; search engine; focused crawling algorithm survey; page rank; information retrieval
Abstract
In a highly competitive modern world, search systems carry a responsibility to save users' valuable time when they look for information on the web. The volume of indexed data is huge, however, and because users' perspectives differ, standard exhaustive crawling takes a significant toll on search. A standard crawler starts well from a promising set of initial seed URLs, but its performance declines partway through the crawl. This is a major reason why researchers place heavy emphasis on the relevance and robustness of the data found. Users' perspectives also differ from time to time and from topic to topic; that is, what one user wants is unnecessary to another. This is where focused crawling comes into play. Focused crawlers aim to search and retrieve only the subset of the World Wide Web that pertains to a specific topic. The ideal focused crawler retrieves the maximal set of relevant pages while traversing the minimal number of irrelevant documents. In this paper we review research on several focused web crawling strategies and propose a new technique that assigns credits to web pages according to their semantic content. We also emphasize prioritizing the frontier queue so that URLs of higher-credit pages are crawled before those of lower-credit pages.
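The credit-prioritized frontier described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how credits are computed, so everything here is an assumption made for illustration: the `credit` function (fraction of topic terms found in the page text), the rule that out-links inherit the credit of the page that linked to them, and the in-memory `TOY_WEB` that stands in for real HTTP fetching. It is not the authors' method, only one way such a queue might behave.

```python
import heapq
import itertools

# Hypothetical in-memory "web": url -> (page text, out-links). Stands in for HTTP fetching.
TOY_WEB = {
    "a": ("focused crawler survey", ["b", "c"]),
    "b": ("cooking recipes", ["d"]),
    "c": ("focused crawler frontier", ["e"]),
    "d": ("more recipes", []),
    "e": ("crawler design", []),
}


def fetch(url):
    """Return (text, links) for a URL from the toy web."""
    return TOY_WEB[url]


class Frontier:
    """Priority queue of URLs; the URL with the highest credit is popped first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._counter = itertools.count()  # tie-breaker: FIFO among equal credits

    def push(self, url, credit_value):
        if url in self._seen:  # never enqueue the same URL twice
            return False
        self._seen.add(url)
        # heapq is a min-heap, so the credit is negated to pop the largest first.
        heapq.heappush(self._heap, (-credit_value, next(self._counter), url))
        return True

    def pop(self):
        neg_credit, _, url = heapq.heappop(self._heap)
        return url, -neg_credit

    def __len__(self):
        return len(self._heap)


def credit(text, topic_terms):
    """Assumed credit: fraction of (lowercase) topic terms present in the text."""
    if not topic_terms:
        return 0.0
    words = set(text.lower().split())
    return sum(1 for term in topic_terms if term in words) / len(topic_terms)


def crawl(seeds, fetch_page, topic_terms, max_pages=10):
    """Visit pages in credit order; returns a list of (url, credit) as visited."""
    frontier = Frontier()
    for seed in seeds:
        frontier.push(seed, 1.0)  # seeds are assumed fully relevant
    visited = []
    while len(frontier) and len(visited) < max_pages:
        url, _ = frontier.pop()
        text, links = fetch_page(url)
        page_credit = credit(text, topic_terms)
        visited.append((url, page_credit))
        for link in links:
            # Assumption: an out-link inherits the credit of its parent page.
            frontier.push(link, page_credit)
    return visited


if __name__ == "__main__":
    for url, score in crawl(["a"], fetch, ["focused", "crawler"]):
        print(url, score)
```

In this toy run, the link found on the relevant page "c" is crawled before the link found on the irrelevant page "b", even though "b" was fetched first. A real crawler would replace `fetch` with network access and `credit` with whatever semantic scoring is actually used.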
Last modified: 2013-01-26 20:03:35