Survey On Discovering Deep Web Interfaces Using Data Mining
Journal: International Journal for Scientific Research and Development | IJSRD (Vol.3, No. 11)
Publication Date: 2016-02-01
Authors : Roshana R. Bangar; Kahate S.A.; Deokate G.D.;
Page : 439-445
Keywords : Deep Web; Two-Stage Crawler; Feature Selection; Ranking; Adaptive Learning;
The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a significant gap in search engine coverage. It represents a large portion of the structured data on the Web, and accessing Deep Web content has been a long-standing challenge for the database community. Deep web crawling is a fundamental problem for web crawlers and has a profound effect on search engine efficiency. A recent study suggests that nearly 96% of the data on the Internet is hidden, i.e., not indexed by search engines. The challenge for search engines is to retrieve this hidden web data at low cost. The system surveyed here uses a machine learning approach that is fully automatic, highly scalable, and efficient, improving data retrieval at reduced cost. It applies a focused crawling strategy to retrieve accurate results for a query, selecting only the links that are relevant according to their similarity to the query. Its algorithm efficiently selects only likely candidates for inclusion into our web search index, rather than searching the whole search space.
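The link-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how similarity is computed, so cosine similarity over term frequencies of the anchor text is assumed here, and the function names, the threshold value, and the example.com URLs are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_relevant_links(query, links, threshold=0.5):
    """Keep only links whose anchor text is similar enough to the query.

    `links` is a list of (url, anchor_text) pairs. The result is a list of
    (score, url) pairs at or above `threshold`, sorted best first.
    The threshold is an assumed tuning knob, not a value from the paper.
    """
    query_vec = Counter(tokenize(query))
    scored = []
    for url, anchor_text in links:
        score = cosine_similarity(query_vec, Counter(tokenize(anchor_text)))
        if score >= threshold:
            scored.append((score, url))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical candidate links found on a page.
links = [
    ("http://example.com/books/search", "Search used books by title"),
    ("http://example.com/about", "About our company"),
    ("http://example.com/flights", "Cheap flight search"),
]
selected = select_relevant_links("used books search", links)
```

Only the candidates that pass the threshold would be followed by the crawler, which is how a focused strategy avoids exploring the whole search space. A real system would likely score more than anchor text (for example the URL and surrounding page text) and learn its ranking from earlier crawl results.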
Last modified: 2016-01-28 15:07:31