Web Search Results Summarization Using Similarity Assessment
Journal: International Journal of Advanced Computer Research (IJACR) (Vol. 4, No. 15)
Publication Date: 2014-06-17
Authors: Sawant V.V.; Takale S.A.
Page: 568-574
Keywords: Web mining; Layout Similarity; Visual Similarity; EMD; Link Analysis; Summarization; Web Similarity; Text Similarity; WWW
Abstract
Nowadays the internet has become part of our lives, and the WWW is its most important service because it allows information such as documents and images to be presented. The WWW is growing rapidly and caters to diverse levels and categories of users. Web search results are extracted in response to user-specified queries. With millions of items of information pouring online, users have no time to browse the content completely. Moreover, much of the available information is repeated or duplicated. This issue has created the need to restructure search results so that they are presented in summarized form. The proposed approach extracts several different features from web pages. Web page visual similarity assessment has been employed to address problems in different fields, including phishing detection, web archiving, and web search engines. In this approach, the user first enters a query and a number of the search results are stored. The Earth Mover's Distance (EMD) is used to assess web page visual similarity: each web page is treated as a low-resolution image, a signature of that image is created from its color and coordinate features, and the distance between web pages is calculated by applying the EMD method. The layout similarity value is computed using a tag comparison algorithm and a template comparison algorithm. Textual similarity is computed using cosine similarity, and hyperlink analysis is performed to compute outward links. The final similarity value is calculated by fusing the layout, text, hyperlink, and EMD values. Once the similarity matrix has been obtained, clustering is performed by finding connected components. Finally, the groups of similar web pages, i.e. the summarized results, are displayed to the user. Experiments were conducted on different web pages and user queries to demonstrate the effectiveness of the four methods in generating summarized results.
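The last stages of the pipeline described in the abstract (cosine text similarity, fusion of the four scores, and connected-component clustering) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the equal fusion weights, the 0.5 clustering threshold, and the raw term-frequency text vectors are all assumptions made for illustration, and the layout, hyperlink, and EMD-based visual scores are assumed to have been computed already and scaled to similarities in [0, 1].

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts using raw term-frequency vectors
    (whitespace tokenization is an assumption, not the paper's method)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuse(layout, text, link, visual, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of four similarity scores in [0, 1].
    Equal weights are a hypothetical choice; the abstract does not state them."""
    return sum(w * s for w, s in zip(weights, (layout, text, link, visual)))


def connected_components(sim, threshold=0.5):
    """Group page indices into clusters. Pages i and j are linked when
    sim[i][j] >= threshold (the threshold value is an assumption)."""
    n = len(sim)
    seen = set()
    groups = []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first traversal collects every page reachable from `start`.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and sim[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(component))
    return groups


if __name__ == "__main__":
    # Hypothetical fused similarity matrix for four search results.
    fused = [
        [1.0, 0.8, 0.1, 0.0],
        [0.8, 1.0, 0.2, 0.1],
        [0.1, 0.2, 1.0, 0.9],
        [0.0, 0.1, 0.9, 1.0],
    ]
    print(connected_components(fused))
```

In this sketch each cluster returned by `connected_components` stands for one group of similar pages that would be shown to the user as a single summarized result. Raising the threshold yields more, smaller groups.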
Last modified: 2014-12-17 17:10:43