Comparison of Keyword-based and Semantic-based Web Page Clustering Systems

Journal: International Journal of Science and Research (IJSR) (Vol.8, No. 1)

Page : 1511-1516

Keywords : Semantic; Word Sense Disambiguation; Clustering;

Abstract

Today, web page clustering is useful for many applications such as categorization, cleaning, schema detection, and automatic extraction. Web page clustering approaches can be categorized as hierarchical or flat, online or offline, soft or hard, and document-based or keyword-based. Among them, keyword-based web page clustering uses the single words or compound words occurring in the web page set as the features for clustering. These words cannot precisely represent the content of a web page, however, because synonymy and polysemy introduce ambiguity. Semantic analysis helps resolve this ambiguity. This work therefore implements both a keyword-based and a semantic-based web page clustering system and compares their performance. In the semantic analysis, the words in each web page are first mapped to word senses using a supervised word sense disambiguation method. The semantic-based web page clustering system then uses both keywords and semantic features for clustering. After running each clustering process, the comparison indicates that the semantic-based web page clustering system is more precise and effective than the keyword-based one.
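
The contrast between the two feature sets can be illustrated with a small sketch. It is not the authors' implementation: the sense table, the context-cue rule (a stand-in for the supervised word sense disambiguation step), the greedy single-pass clustering, and the threshold are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter

# Hypothetical toy sense inventory (assumption, not from the paper).
# Synonyms share one sense label.
SYNONYM_SENSES = {"car": "vehicle", "automobile": "vehicle"}
# Polysemous words pick a sense from a cue word found on the same page.
CONTEXT_SENSES = {"bank": [("river", "bank.river"), ("loan", "bank.finance")]}


def keyword_features(text):
    """Keyword-based features: raw term counts."""
    return Counter(text.lower().split())


def disambiguate(word, context):
    """Rule-based stand-in for a trained supervised WSD model."""
    if word in SYNONYM_SENSES:
        return SYNONYM_SENSES[word]
    for cue, sense in CONTEXT_SENSES.get(word, []):
        if cue in context:
            return sense
    return None


def semantic_features(text):
    """Semantic-based features: keywords plus one feature per word sense."""
    tokens = text.lower().split()
    context = set(tokens)
    feats = Counter(tokens)
    for token in tokens:
        sense = disambiguate(token, context)
        if sense is not None:
            feats["sense:" + sense] += 1
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(pages, featurize, threshold=0.2):
    """Greedy single-pass clustering: join the first cluster whose first
    member is similar enough, otherwise start a new cluster."""
    vectors = [featurize(p) for p in pages]
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


pages = ["cheap car rental", "automobile hire deals", "river bank fishing"]
keyword_clusters = cluster(pages, keyword_features)
semantic_clusters = cluster(pages, semantic_features)
print(keyword_clusters)
print(semantic_clusters)
```

On these toy pages the first two share no keywords, so keyword-based clustering keeps them apart, while the shared sense feature lets the semantic variant group them. Real pages would need proper tokenization, term weighting, and a trained disambiguation model.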

Last modified: 2021-06-28 17:20:55