Enhancing N-List Structure and Performance for Efficient Large Dataset Analysis

Journal: International Journal of Computer Science and Mobile Computing - IJCSMC (Vol.13, No. 1)

Publication Date: 2024-01-30

Authors : Arkan A. Ghaib; Abdullah A. Nahi;

Page : 49-58

Keywords : data mining; frequent itemset mining; DiffNodesets; N-list; data structures; Node-list;


Abstract

Efficiently analyzing enormous datasets is one of the main challenges in data-intensive fields such as scientific research, data mining, and machine learning. The N-list is a popular data structure used in similarity search algorithms to speed up nearest-neighbor retrieval. This paper presents EN-list, a high-performance method for mining frequent itemsets. EN-list represents itemsets with N-lists and discovers frequent itemsets directly using a set-enumeration search tree. In particular, it drastically reduces the search space by applying a powerful pruning technique known as Children-Parent Equivalency pruning. We conducted extensive experiments comparing EN-list against three state-of-the-art algorithms (Fin, PrePost, and DiffNodesets) on four distinct real datasets. The experimental results show that EN-list is the fastest approach on all four datasets. EN-list is also memory-efficient: it requires less memory than DiffNodesets and PrePost, and only slightly more than Fin.
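The search strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' EN-list implementation: the abstract does not specify how N-lists are built, so plain transaction-id sets stand in for them here, and the function name `mine_frequent_itemsets` and its parameters are hypothetical. The sketch only shows how a set-enumeration search can skip branches when extending a parent itemset with an item leaves its support unchanged, which is the general idea behind child-parent equivalence pruning.

```python
from itertools import combinations


def mine_frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset.

    Assumption: transaction-id sets (tidsets) replace the N-lists of the
    paper; only the search and pruning strategy is illustrated.
    """
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tidsets.setdefault(item, set()).add(tid)

    # Frequent single items, ordered by ascending support (ties by item).
    frequent_items = sorted(
        (item for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda item: (len(tidsets[item]), item),
    )
    results = {}

    def expand(prefix, prefix_tids, candidates, inherited_equivalent):
        equivalent = list(inherited_equivalent)
        children = []
        for item in candidates:
            child_tids = prefix_tids & tidsets[item]
            if len(child_tids) == len(prefix_tids):
                # Same support as the parent: record the item instead of
                # opening a new branch of the search tree.
                equivalent.append(item)
            elif len(child_tids) >= min_support:
                children.append((item, child_tids))

        # The prefix plus any subset of its equivalent items has the
        # support of the prefix itself.
        support = len(prefix_tids)
        for size in range(len(equivalent) + 1):
            for extra in combinations(equivalent, size):
                results[frozenset(prefix) | frozenset(extra)] = support

        # Recurse only into the non-equivalent frequent extensions.
        child_items = [item for item, _ in children]
        for pos, (item, child_tids) in enumerate(children):
            expand(prefix + [item], child_tids, child_items[pos + 1:], equivalent)

    for pos, item in enumerate(frequent_items):
        expand([item], tidsets[item], frequent_items[pos + 1:], [])
    return results


# Hypothetical toy data: every transaction that contains "b" also contains "a".
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a", "b", "c"], ["c"]]
result = mine_frequent_itemsets(transactions, min_support=2)
```

In this toy dataset, extending {b} with a does not change its support, so a is recorded as an equivalent item and the search never creates a separate {b, a} branch; the itemsets {a, b} and {a, b, c} are still reported, with their supports taken from {b} and {b, c}.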
