Focused Crawling System based on Improved LSI
Journal: International Journal of Science and Research (IJSR) (Vol. 2, No. 9)
Publication Date: 2013-09-15
Authors : Radhika Gupta; AP Nidhi;
Page : 61-64
Keywords : LSI; Breadth-first crawler; focused crawler;
Abstract
In this work we develop a semi-deterministic algorithm and a scoring system that builds on Latent Semantic Indexing (LSI) scoring to crawl web pages that belong to a particular domain or topic. In addition to the LSI score, the proposed algorithm calculates a preference factor to determine which web page the multithreaded crawler application should crawl next. This yields a retrieval system with high recall and precision, because it builds a crawl queue that is specific to the target domain or topic, which would not have been possible with breadth-first or LSI-only information retrieval systems.
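The abstract does not give the formula for the preference factor or say how it is combined with the LSI score, so the following is only a minimal sketch of the general idea: a crawl frontier that orders URLs by a combined score instead of by discovery order. Everything specific here is an assumption made for illustration: the names `preference_factor` and `Frontier`, the anchor-text overlap used as the preference factor, the linear mixing `weight`, and the example URLs. The LSI score is assumed to be computed elsewhere and passed in as a number between 0 and 1.

```python
import heapq
import itertools
import threading


def preference_factor(anchor_text, topic_terms):
    """Hypothetical preference factor: share of topic terms present in the anchor text."""
    terms = set(topic_terms)
    if not terms:
        return 0.0
    words = set(anchor_text.lower().split())
    return len(words & terms) / len(terms)


class Frontier:
    """Crawl queue ordered by a combined score (highest first)."""

    def __init__(self, weight=0.5):
        self.weight = weight               # assumed mixing weight, not from the paper
        self._heap = []
        self._seen = set()
        self._counter = itertools.count()  # tie-breaker: earlier pushes win ties
        self._lock = threading.Lock()      # lets several crawler threads share the queue

    def push(self, url, lsi_score, preference):
        """Queue a URL once; returns False if it was already seen."""
        combined = (1 - self.weight) * lsi_score + self.weight * preference
        with self._lock:
            if url in self._seen:
                return False
            self._seen.add(url)
            # heapq is a min-heap, so store the negated score.
            heapq.heappush(self._heap, (-combined, next(self._counter), url))
            return True

    def pop(self):
        """Return (url, combined_score) for the most promising queued URL."""
        with self._lock:
            neg_score, _, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        with self._lock:
            return len(self._heap)


# Example with made-up scores: the second link has the lower LSI score
# but an on-topic anchor text, so the combined score ranks it first.
topic = {"focused", "crawler"}
frontier = Frontier(weight=0.5)
frontier.push("http://example.com/contact", 0.60, preference_factor("contact us", topic))
frontier.push("http://example.com/survey", 0.40, preference_factor("focused crawler survey", topic))
first_url, first_score = frontier.pop()
```

In this sketch a breadth-first crawler would visit the two links in discovery order and an LSI-only ranking would take the first link, while the combined score selects the second. Whether the authors mix the two signals linearly is not stated in the abstract.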
Last modified: 2013-10-01 22:38:03