Fast Dictionary Construction using Data Structure and Numeration Methodology with Double Hashing

Journal: International Journal of Science and Research (IJSR) (Vol.6, No. 5)

Publication Date:

Authors : ; ;

Page : 2718-2725

Keywords : Fast String Operations; Binary Search; Double Hashing; Thesaurus Construction; Generation; Knowledge Dictionary; Extract Word Features; Data Structures Concepts


Abstract

Text retrieval continues to attract research attention, as it remains essential for analyzing text data efficiently. Unstructured text data is increasingly important in numerous fields, such as business analysis, customer retention and extension, social media, information retrieval, and legal applications. This article considers the importance of exploratory dictionary construction for finding concepts of interest, and it proposes a system for efficient dictionary construction and tuning. Reusing such dictionaries at large scale and across different datasets remains an unsolved problem. The paper employs different types of hash functions to conduct progressive multi-stage searches, reducing the time required to construct the dictionary as much as possible while maintaining the accuracy of the information it contains. Text-mining tools, hashing functions, data structure concepts, and numeration operations are combined in the proposed system to provide a dynamic word dictionary. Because the dictionary is small in comparison with the original dataset, it can be used in fast text retrieval systems. The proposed algorithm is designed to improve time complexity, so that accurate results can be retrieved in a short time. This is achieved by exploiting binary search, which changes the processing time from linear to logarithmic behavior. The results obtained are reported to be the best among the published works compared, especially those that treat a string as a sequence of characters. The proposed system extracts the important word information, which allows a text retrieval system to attain accurate and fast results.
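
The abstract names two techniques, double hashing and binary search, without showing how they might fit together. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the class DoubleHashTable, the functions build_dictionary and lookup, the two hash formulas, and the table capacity are all hypothetical choices made for illustration. Unique words are collected in an open-addressing table probed by double hashing, then sorted so that lookups can use binary search in logarithmic time.

```python
from bisect import bisect_left


class DoubleHashTable:
    """Open-addressing word table with double hashing (illustrative only)."""

    def __init__(self, capacity=101):
        # Assumption: capacity is prime, so any step in 1..capacity-1
        # eventually visits every slot.
        self.capacity = capacity
        self.slots = [None] * capacity

    def _h1(self, word):
        # Primary hash: polynomial hash of the character codes.
        h = 0
        for ch in word:
            h = (h * 31 + ord(ch)) % self.capacity
        return h

    def _h2(self, word):
        # Secondary hash: probe step in 1..capacity-1 (never zero).
        h = 0
        for ch in word:
            h = (h * 17 + ord(ch)) % (self.capacity - 1)
        return h + 1

    def _probe(self, word):
        # Probe sequence: h1, h1 + h2, h1 + 2*h2, ... modulo capacity.
        start, step = self._h1(word), self._h2(word)
        for i in range(self.capacity):
            yield (start + i * step) % self.capacity

    def add(self, word):
        """Insert word; return True if it was new, False if already present."""
        for idx in self._probe(word):
            if self.slots[idx] is None:
                self.slots[idx] = word
                return True
            if self.slots[idx] == word:
                return False
        raise RuntimeError("table full")

    def __contains__(self, word):
        for idx in self._probe(word):
            if self.slots[idx] is None:
                return False
            if self.slots[idx] == word:
                return True
        return False


def build_dictionary(text, capacity=101):
    """Collect unique words via the hash table, then return them sorted."""
    table = DoubleHashTable(capacity)
    for token in text.lower().split():
        word = token.strip(".,;:!?\"'()")
        if word:
            table.add(word)
    return sorted(w for w in table.slots if w is not None)


def lookup(sorted_words, word):
    """Binary search: index of word in sorted_words, or -1 if absent."""
    i = bisect_left(sorted_words, word)
    if i < len(sorted_words) and sorted_words[i] == word:
        return i
    return -1
```

For example, build_dictionary("The cat and the hat.") yields a sorted list of four unique words, and lookup on that list needs about log2(n) comparisons rather than a linear scan. The sketch omits deletion and resizing, which a production table would need.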

Last modified: 2021-06-30 18:55:25