Anomaly detection in smart contracts based on optimal relevance hybrid features analysis in the Ethereum blockchain employing ensemble learning

Journal: International Journal of Advanced Technology and Engineering Exploration (IJATEE) (Vol.10, No. 109)

Publication Date:

Authors : ; ;

Page : 1552-1579

Keywords : Ethereum; Blockchain; Smart contract; Features selection; Relevance features; Ensemble method; Anomaly detection

Abstract

Blockchain 2.0 transformed blockchain, previously known mainly for cryptocurrency, into a platform for developing decentralized applications (DApps). The resulting growth in DApp development has also provided cover for fraudulent activity within smart contracts, leading to substantial losses for investors. Machine learning (ML) approaches can significantly improve anomaly detection. However, many studies still struggle to select the most pertinent features for optimizing detection performance. The challenge intensifies with the high-dimensional raw data extracted directly from the Ethereum blockchain network, which qualifies as big data. Smart contracts, the core of blockchain that governs DApp logic, have increasingly become a haven for fraud. This study analyzes three primary characteristic components derived from contract source code, namely operation code (opcode), application binary interface (ABI) code, and contract transactions, to develop anomaly detection models for smart contracts using an ensemble hybrid feature strategy. The approach involves two key stages: first, reducing the initial feature size through constant, quasi-constant, and variant validation; and second, identifying the most relevant feature set using the searching for uncorrelated list of variables (SULOV) method, grounded in the minimum redundancy maximum relevance (MRMR) principle. The anomaly detection model employs a voting ensemble technique built on a dataset of the most pertinent features. The model's effectiveness is gauged by comparing its performance with individual models, including random forest (RF), k-nearest neighbor (KNN), decision tree (DT), linear discriminant analysis (LDA), and stochastic gradient descent (SGD).
The findings indicate that the proposed model achieves superior anomaly detection levels, with a determination value measurement rate of 92.99%, outperforming individual classifiers using the 44 most relevant features while minimizing classification time. The model's efficiency is further corroborated through comparative analysis with previous studies and alternative methodologies using the same contract dataset. The proposed ensemble-based model significantly improves anomaly detection in contract source code analysis, employing a minimal and relevant set of features refined through the SULOV method.
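
The two-stage feature reduction and the voting step described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. The function names, the thresholds (`min_var`, `max_corr`), and the feature names in the usage note are all hypothetical. SULOV is usually described as ranking features by mutual information with the target; this sketch substitutes absolute Pearson correlation as the relevance score purely to avoid external libraries, and it uses a simple hard majority vote for the ensemble.

```python
from collections import Counter
from math import sqrt


def variance(col):
    """Population variance of one feature column."""
    mean = sum(col) / len(col)
    return sum((v - mean) ** 2 for v in col) / len(col)


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either column is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    na = sqrt(sum((x - ma) ** 2 for x in a))
    nb = sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (na * nb) if na and nb else 0.0


def drop_low_variance(features, min_var=0.01):
    """Stage 1: discard constant and quasi-constant columns.

    min_var is an assumed threshold, not a value from the paper.
    """
    return {n: c for n, c in features.items() if variance(c) > min_var}


def drop_redundant(features, labels, max_corr=0.7):
    """Stage 2 (SULOV-like): for every highly correlated feature pair,
    drop the feature that is less relevant to the labels.

    Relevance here is |Pearson correlation| with the labels, an
    assumption standing in for a mutual information score.
    """
    relevance = {n: abs(pearson(c, labels)) for n, c in features.items()}
    names = list(features)
    removed = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(features[a], features[b])) > max_corr:
                removed.add(a if relevance[a] < relevance[b] else b)
    return {n: c for n, c in features.items() if n not in removed}


def hard_vote(predictions):
    """Majority vote over the per-model predictions for one sample."""
    return Counter(predictions).most_common(1)[0][0]
```

As a usage illustration, given a dictionary of named feature columns (say hypothetical `op_push`, `op_push_noisy`, `abi_const`, and `tx_count` columns) and a list of 0/1 labels, `drop_redundant(drop_low_variance(features), labels)` returns the reduced feature dictionary, and `hard_vote([1, 0, 1])` combines three classifiers' predictions for one contract into a single label.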

Last modified: 2024-01-04 15:19:19