Indexing Strategies of MapReduce for Information Retrieval in Big Data
Journal: International Journal of Advances in Computer Science and Technology (IJACST) (Vol. 5, No. 3)
Publication Date: 2016-04-19
Authors: Mazen Farid; Rohaya Latip; Masnida Hussin; Mohammed Abdulkarem
Page: 1-6
Keywords: Hadoop; Indexing; MapReduce; Sensei; Terrier
Abstract
In Information Retrieval (IR), efficiently indexing large, terabyte-scale datasets remains an open problem. The cause is information overload, which results from growth in knowledge, in the number of different media, in the number of platforms, and in the interoperability between platforms. MapReduce has been suggested as a suitable platform for distributing data-intensive operations across multiple processing machines. This project analyzes Sensei and per-posting list indexing (Terrier), two efficient MapReduce indexing strategies. Both strategies will be implemented in an existing IR framework, and an experiment will be run on Hadoop MapReduce using the same large dataset for each. In particular, this paper studies the effectiveness of the two indexing strategies and seeks to identify and verify the more efficient of the two. The experiment measures retrieval performance as the data size and the processing power increase, and examines how the indexing strategies scale with large datasets and with the number of distributed machines. Throughput is measured in MB/s (megabytes per second), and the experimental results are used to analyze the performance and efficiency of Sensei and per-posting list indexing (Terrier).
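The abstract does not describe how per-posting list indexing works internally. As a rough illustration only, the following single-process Python sketch assumes the usual reading of the idea: each map task builds partial posting lists for its input split and emits them keyed by term, and the reduce step merges the partial lists for each term. The function names (`map_phase`, `reduce_phase`, `build_index`, `throughput_mb_per_s`), the whitespace tokenizer, and the choice of 1 MB = 1024 * 1024 bytes are assumptions made for this example. They are not taken from the paper, Terrier, Sensei, or Hadoop.

```python
from collections import Counter, defaultdict


def map_phase(docs):
    """Build partial posting lists for one input split.

    docs is an iterable of (doc_id, text) pairs. Returns a list of
    (term, [(doc_id, term_frequency), ...]) pairs, one per distinct term.
    Tokenization by whitespace is a simplifying assumption.
    """
    partial = defaultdict(list)
    for doc_id, text in docs:
        for term, tf in Counter(text.lower().split()).items():
            partial[term].append((doc_id, tf))
    return list(partial.items())


def reduce_phase(emitted):
    """Merge the partial posting lists emitted by all map tasks, per term."""
    index = defaultdict(list)
    for term, postings in emitted:
        index[term].extend(postings)
    for postings in index.values():
        postings.sort()  # order postings by doc_id
    return dict(index)


def build_index(splits):
    """Run the map step over every split, then a single reduce step."""
    emitted = []
    for split in splits:
        emitted.extend(map_phase(split))
    return reduce_phase(emitted)


def throughput_mb_per_s(num_bytes, seconds):
    """Indexing throughput in MB/s, assuming 1 MB = 1024 * 1024 bytes."""
    return num_bytes / (1024 * 1024) / seconds


if __name__ == "__main__":
    # Two hypothetical input splits, standing in for two map tasks.
    splits = [
        [(1, "big data big"), (2, "data index")],
        [(3, "index big")],
    ]
    for term, postings in sorted(build_index(splits).items()):
        print(term, postings)
```

In a real Hadoop job the map tasks would run on separate machines and the framework would group the emitted pairs by term before the reduce step; the sketch collapses both into ordinary function calls so that the data flow is visible.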
Last modified: 2016-05-15 01:22:50