ENHANCED QUERY EXPANSION FOR WEB INFORMATION RETRIEVAL

Journal: International Journal of Civil Engineering and Technology (IJCIET) (Vol.8, No. 8)

Publication Date:

Authors : ; ;

Page : 910-915

Keywords : Retrieval Feedback; Query Expansion; optimization; pseudo relevant documents; Naïve Bayes classifier; Co-occurrence approach;

Abstract

With the growing demand for web documents, finding the required document has become an important problem for the user. Users often cannot reach the relevant documents because their queries are poorly formed or their knowledge of the topic is insufficient. To improve the efficiency of searching for relevant documents, the original user query needs to be reformulated. This paper proposes a novel Enhanced Query Expansion based Classifier (EQC) technique for web document retrieval. It uses feedback documents for query expansion, reformulation and optimization. First, the relevant documents for the given query are obtained by means of the NTCIR-6 algorithm. From these, the top k relevant documents are selected to represent the feature terms. The documents are then classified using a Naïve Bayes classifier, which yields good feedback documents. Next, the unique terms are extracted and ranked using a co-occurrence based approach. The Rocchio algorithm is used to reweight the unique terms and to expand the query. The Binary Group Search Optimizer (BGSO) algorithm then selects the optimum query for document retrieval. The reformulated query is fed back to the dataset to search for the relevant documents. The efficiency of the expanded query is measured in terms of Precision (P), Recall (R), F-measure and Mean Average Precision (MAP). The precision, recall and F-measure values increase by 1%, 3% and 3.5%, respectively. The performance measures show improved retrieval of relevant documents with reduced computational complexity.
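The Rocchio reweighting and expansion step can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the whitespace tokenizer, the raw term-frequency vectors, and the alpha, beta and gamma values are all assumptions chosen for illustration, and the Naïve Bayes filtering, co-occurrence ranking and BGSO selection stages are omitted.

```python
from collections import Counter

def term_vector(text):
    """Bag-of-words term-frequency vector (naive whitespace tokenizer)."""
    return Counter(text.lower().split())

def centroid(docs):
    """Average term-frequency vector over a list of documents."""
    if not docs:
        return {}
    total = Counter()
    for doc in docs:
        total.update(term_vector(doc))
    return {term: count / len(docs) for term, count in total.items()}

def rocchio_expand(query, relevant_docs, nonrelevant_docs=(),
                   alpha=1.0, beta=0.75, gamma=0.15, n_terms=3):
    """Reweight terms with the classic Rocchio formula and expand the query.

    q' = alpha * q + beta * centroid(relevant) - gamma * centroid(nonrelevant)
    The alpha/beta/gamma defaults are conventional textbook values,
    not values taken from the paper.
    """
    q = term_vector(query)
    rel = centroid(list(relevant_docs))
    non = centroid(list(nonrelevant_docs))

    weights = {}
    for term in set(q) | set(rel) | set(non):
        w = (alpha * q.get(term, 0)
             + beta * rel.get(term, 0.0)
             - gamma * non.get(term, 0.0))
        if w > 0:  # terms with non-positive weight are dropped
            weights[term] = w

    # Candidate expansion terms: highest-weighted terms not already in the query
    # (ties are broken alphabetically so the result is deterministic).
    candidates = sorted((t for t in weights if t not in q),
                        key=lambda t: (-weights[t], t))
    expanded = list(q) + candidates[:n_terms]
    return expanded, weights

# Hypothetical pseudo-relevant feedback documents for a toy query.
feedback_docs = [
    "query expansion improves retrieval",
    "retrieval feedback improves query results",
]
expanded, weights = rocchio_expand("query expansion", feedback_docs, n_terms=2)
print(expanded)
```

In a pseudo-relevance-feedback setting such as the one described, the top k retrieved documents would stand in for `relevant_docs`; the optional `nonrelevant_docs` argument is kept only to show the full Rocchio formula.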

Last modified: 2018-04-10 15:20:43