Design of Improved Web Crawler By Analysing Irrelevant Result
Journal: International Journal of Computer Science and Mobile Computing - IJCSMC (Vol.2, No. 8)
Publication Date: 2013-08-30
Authors : Prashant Dahiwale; M.M. Raghuwanshi; Latesh Malik;
Page : 243-247
Keywords : URL; focused crawler; classifier; relevance prediction; links; search engine; ranking;
Abstract
A key issue in designing a focused Web crawler is how to determine whether an unvisited URL is relevant to the search topic. Effective relevance prediction helps the crawler avoid downloading and visiting many irrelevant pages. In this work, we propose a new learning-based approach to improve relevance prediction in focused Web crawlers. For this study, we chose a Naïve Bayesian classifier as the base prediction model, although it can easily be replaced by a different prediction model. The performance of a focused crawler depends mostly on the richness of links in the specific topic being searched, and focused crawling usually relies on a general Web search engine to provide starting points.
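The relevance-prediction idea can be illustrated with a minimal sketch. The abstract does not say which features or training data the authors use, so this sketch assumes each unvisited URL is described by a short piece of text (for example its anchor text) and scores it with a multinomial Naïve Bayes model with add-one smoothing. The names `tokenize`, `NaiveBayesRelevance`, `prob_relevant`, and `should_visit`, as well as the 0.5 threshold, are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


class NaiveBayesRelevance:
    """Two-class multinomial Naive Bayes with add-one (Laplace) smoothing.

    Illustrative only: the feature choice (tokens of link text) is an
    assumption, not the method described in the paper.
    """

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        self.vocab = set()

    def train(self, text, relevant):
        """Add one labelled example (relevant=True or False)."""
        tokens = tokenize(text)
        self.word_counts[relevant].update(tokens)
        self.doc_counts[relevant] += 1
        self.vocab.update(tokens)

    def _log_score(self, tokens, label):
        """Unnormalised log P(label) + sum of log P(token | label)."""
        total_docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        vocab_size = len(self.vocab)
        for token in tokens:
            if token not in self.vocab:
                continue  # tokens never seen in training carry no evidence
            count = self.word_counts[label][token]
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def prob_relevant(self, text):
        """Return the estimated probability that the text is on-topic."""
        tokens = tokenize(text)
        log_rel = self._log_score(tokens, True)
        log_irr = self._log_score(tokens, False)
        # Subtract the maximum before exponentiating to avoid underflow.
        top = max(log_rel, log_irr)
        rel = math.exp(log_rel - top)
        irr = math.exp(log_irr - top)
        return rel / (rel + irr)


def should_visit(classifier, link_text, threshold=0.5):
    """Hypothetical crawl filter: fetch only links predicted relevant."""
    return classifier.prob_relevant(link_text) >= threshold


if __name__ == "__main__":
    clf = NaiveBayesRelevance()
    clf.train("focused crawler topic classifier", True)
    clf.train("web crawler search engine ranking", True)
    clf.train("cheap flights hotel deals", False)
    clf.train("celebrity gossip news photos", False)
    for text in ("focused web crawler tutorial", "hotel deals and celebrity photos"):
        print(text, "->", should_visit(clf, text))
```

Because the classifier is only reached through `prob_relevant`, another prediction model could be substituted without changing `should_visit`, which mirrors the abstract's remark that the base model can be switched.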
Last modified: 2013-09-10 21:38:17