Article

Detecting Rug-Pull: Analyzing Smart Contract Backdoor Codes in Ethereum

by Kwan Woo Yu 1 and Byung Mun Lee 2,*
1 Department of IT Convergence, Gachon University, Seongnam-si 13120, Republic of Korea
2 Department of Computer Engineering, Gachon University, Seongnam-si 13120, Republic of Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(1), 450; https://doi.org/10.3390/app15010450
Submission received: 26 October 2024 / Revised: 27 November 2024 / Accepted: 1 December 2024 / Published: 6 January 2025
(This article belongs to the Special Issue Advanced Blockchain Technology and Its Applications)

Abstract

Smart contracts enable autonomous execution between contracting parties without a centralized authority, thereby reducing contract management costs and enhancing the transparency and reliability of contracts. However, the absence of such a certification authority increases the risk of fraud. Rug-pull, a typical form of fraud, involves developers hiding backdoor codes in smart contracts to steal funds under certain conditions, causing significant damage to users. A Rug-pull list warns users of potential fraud, but it only identifies risks after damage has occurred. Additionally, existing backdoor code analysis tools are limited in their ability to detect backdoor codes hidden through modifications to existing patterns or suffer from low accuracy because they rely on comparisons with predefined backdoor codes. Therefore, this paper proposes a balance-tracking-based backdoor code detection model to identify backdoor codes in smart contracts. The proposed model detects backdoor codes by extracting functions from Ethereum bytecodes and inspecting the extracted functions to track balance changes. This approach allows for the detection of balance changes even when backdoor codes are concealed. Experimental results verifying the effectiveness of this model demonstrate 98% accuracy, 0.96 recall, and 0.98 precision. These results are expected to contribute significantly to effectively reducing fraud risks such as Rug-pull.

1. Introduction

Blockchain technology connects blocks in a chain using hash algorithms and stores them across multiple nodes in a distributed manner, making it practically impossible to falsify or alter the data within a block. This ensures safe transactions and data processing even in the absence of a dependable certification authority [1]. The Ethereum network, which combines blockchain technology with the Ethereum Virtual Machine (EVM), allows contracts between individuals, known as smart contracts, to be executed automatically without a certification authority [2]. The emergence of Decentralized Applications (DApps) that utilize these smart contracts enables users to use smart contracts easily and execute contracts with low fees [3].
However, the absence of such certification authorities increases the risk and potential damage from fraud. Specifically, “Rug-pull” is a form of fraud in which project developers hide backdoor codes in smart contracts and perform abnormal operations to steal funds under certain conditions, causing significant damage to users [4]. A typical example is the SQUID GAME TOKEN, which concealed backdoor codes allowing only the developer to sell the tokens, forcing users who invested in the tokens to lose all their investments [5]. Even when such frauds occur, it is difficult for a central authority to control or regulate them due to the freedom of programming and the decentralization of the blockchain on the Ethereum network.
Currently, to reduce the damages caused by Rug-pull, DApp developers are creating Rug-pull lists to notify users of the risks and discourage them from making investments [6,7]. However, these lists are provided only after damage has occurred, making it challenging to prevent damage in advance. Additionally, there is a limitation in that it takes time to evaluate a project before it can be added to the list [8]. Consequently, DApp developers are encouraging users to investigate projects themselves before making investments [9].
Users determine the risk of Rug-pull by reviewing the project’s content and the security of the smart contract’s source code. However, among the 68,268,300 smart contracts deployed on Ethereum, only 705,010 disclose their source code, which is approximately 1% [10]. Due to this low disclosure rate, it is difficult to analyze smart contracts to assess their risks. Therefore, users rely on commercial services such as GoPlus to test the security level of smart contracts and detect backdoor codes [11]. These commercial services detect backdoor codes based on previously known code patterns. However, they cannot detect backdoor codes in which an existing pattern has been modified to move the operator that performs the actual backdoor operation, nor backdoor codes hidden in atypical parts of the contract. Consequently, smart contracts equipped with such backdoor codes continue to create new victims on the Ethereum network [12].
To address these problems, this paper proposes a balance-tracking-based backdoor code detection model. The proposed model extracts functions from Ethereum bytecode and inspects the extracted functions to detect the presence of backdoor code. It does so by tracking the balance changes that occur within each function, which enables detection even when the backdoor code is concealed. The model identifies six types of backdoor codes based on these balance changes: token generation, token destruction, transfer limitation, funds manipulation, transfer fee, and proxy. These are the backdoor codes that can cause direct financial loss. This study analyzes 989 smart contracts for each attack type and measures accuracy, recall, and precision, thereby verifying the effectiveness of the model.
Section 2 explains the backdoor code types in Ethereum smart contracts and Rug-pull projects, as well as existing backdoor code detection models. Section 3 details the structure and operational methodology of the proposed backdoor code detection model, highlighting improvements over existing limitations. Section 4 describes the experiments conducted to classify backdoor codes in smart contracts, measuring the model’s accuracy and presenting the analysis results. Finally, Section 5 summarizes the entire content and provides concluding remarks.

2. Related Work

2.1. Ethereum Smart Contracts

On the Ethereum network, anyone can deploy contract code known as a smart contract and execute the deployed smart contract using transactions. A smart contract is code that automatically executes programmed contract conditions. When certain conditions are satisfied, the contract content is automatically executed, enhancing efficiency in the contract execution process. Additionally, it can be executed in a safe and transparent manner without the intervention of a central authority, as all nodes on the Ethereum network share and verify the same data [13].
The process of using a smart contract on the Ethereum network involves two main steps: the developer’s deployment of the smart contract and the user’s interaction with it, as illustrated in Figure 1. First, when a developer deploys a smart contract, an address that identifies it is generated on the Ethereum blockchain. This address, consisting of 40 hexadecimal digits (e.g., 0xdAC17F958D2ee523a2206206994597C13D831ec7), is associated with four items: the Ethereum Virtual Machine (EVM) code, which is the smart contract code; the nonce, which indicates the order of transactions; the Ethereum balance; and the storage, where data are stored [14]. Then, a user locates the address to use this smart contract, creates a transaction, and sends it to an Ethereum node. This process can be challenging for general users, serving as an entry barrier to the use of smart contracts.
A DApp is an application designed to make these smart contracts easier to use. It provides a user interface that allows users to access smart contracts through a web browser or mobile app, facilitating a more intuitive interaction with smart contracts [15,16]. A typical example of a DApp is Uniswap [17,18].
Uniswap is a decentralized exchange operating on the Ethereum blockchain, allowing developers to list tokens and users to purchase the tokens they desire, as shown in Figure 2. To list Token B on Uniswap, Token A, which will support exchange with Token B, is required. The developer connects their wallet to Uniswap and generates a Liquidity Pool to exchange Token A and Token B, thereby listing Token B. A user who wants to purchase the listed Token B first connects their wallet to Uniswap. After inputting the amount of Token A to be exchanged, the Liquidity Pool automatically processes the transaction and exchanges it for Token B [19,20]. Since Uniswap provides an open platform, anyone can become a developer who can list new tokens and a user who can purchase those tokens. This openness allows developers to introduce new projects and tokens to the market and enables users to invest more easily, generating mutual benefits.
However, the open nature of Uniswap also increases the likelihood of malicious projects emerging, as new tokens can be listed without thorough investigation and verification. For example, a malicious project developer can insert backdoor code into the smart contract, enabling only themselves to sell the tokens while users can only purchase them [21,22]. In such cases, users cannot sell the tokens they have purchased even if the price rises. The price of the listed token may increase as if it were an innovative project, enticing new users to purchase the token. Consequently, the project developer sells all the tokens after the price rises sufficiently to make a profit, causing users to suffer significant losses. These backdoor codes can be categorized into six types: arbitrary token generation, arbitrary token destruction, funds manipulation (arbitrary transfer), transfer fee, indirect control through a proxy contract, and transfer limitation, as illustrated in the example above.

2.2. Types of Backdoor Attacks

A smart contract backdoor attack refers to an attack technique that allows developers to exploit the system through malicious code or intentionally inserted vulnerabilities. These backdoor attacks can be categorized into the following types: token generation, token destruction, transfer limitation, funds manipulation, transfer fee, and proxy [23,24,25,26,27].
  • Token generation
Token generation backdoor code is a function that allows developers to issue additional new tokens after the initial issuance. If additional tokens are issued, it may devalue the tokens held by existing users. Examining the transfer function in Algorithm 1, it operates normally under standard conditions. However, if the specific condition in line 6 is triggered by a developer’s call, the code in line 7 executes to generate new tokens. By using this backdoor, developers can issue and sell additional tokens to gain unfair profits.
Algorithm 1. Solidity code of token generation in the transfer function.
1|function transfer(address to, uint256 amount) public {
2| require(to != address(0), "Invalid address");
3| require(balance[msg.sender] >= amount, "Insufficient balance");
4| balance[msg.sender] -= amount;
5| balance[to] += amount;
6| if(msg.sender==owner){
7|  balance[owner] = totalSupply;
8| }
9|}
  • Destroy token
Destroy token is a function that allows developers to arbitrarily destroy tokens from a specific balance. The developer can call the function to destroy tokens from any user’s balance. In the second line of Algorithm 2, the function checks if the caller is the owner, ensuring that only the developer has the authority to execute this action. Then, in the third line, it verifies that the target account has a sufficient balance to burn the specified amount of tokens. If these conditions are met, the function proceeds to the fourth line, where it deducts the amount from the user’s balance. This backdoor destroys users’ tokens by force, removes tokens from the market, and inflates the value of the developer’s own tokens, allowing the developer to gain unfair profits.
Algorithm 2. Solidity code of destroy token.
1|function burnFrom(address from, uint256 amount) public {
2| require(msg.sender==owner, "Permission denied(Owner only)");
3| require(balance[from] >= amount, "Insufficient balance");
4| balance[from] -= amount;
5|}
  • Transfer limitation
The transfer limitation backdoor is a function that allows developers to block users from transferring tokens under certain conditions. Users can still hold their tokens, but they cannot trade or transfer them, which means that users cannot use their tokens. First, the developer can set the frozen state of an account to true using the account freezing function, as in the third line of Algorithm 3. Then, when a transfer from that account is attempted, the transfer fails because of the frozen-state check in the eighth line. This backdoor, similar to the destroy token backdoor, removes tokens from the market and inflates the value of the developer’s own tokens, allowing the developer to gain unfair profits.
Algorithm 3. Solidity code of transfer limitation with the account freeze function.
1|function setFrozen(address account) public {
2| require(msg.sender==owner, "Permission denied(Owner only)");
3| frozen[account] = true;
4|}
5|function transfer(address to, uint256 amount) public {
6| require(to != address(0), "Invalid address");
7| require(balance[msg.sender] >= amount, "Insufficient balance");
8| require(!frozen[msg.sender], "Transfer blocked");
9| balance[msg.sender] -= amount;
10| balance[to] += amount;
11|}
  • Funds manipulation
The funds manipulation backdoor is a function that allows developers to arbitrarily transfer users’ tokens to other addresses. As shown in Algorithm 4, the developer can initiate the transfer of tokens from any user’s account by calling the function. In line 4, the function deducts the specified amount from the user’s balance, and in line 5, it adds the same amount to the recipient’s balance. In this case, users lose tokens without their own consent, and developers can steal users’ tokens, allowing the developer to gain unfair profits.
Algorithm 4. Solidity code of funds manipulation.
1|function ownerTransfer(address from, address to, uint256 amount) public {
2| require(msg.sender==owner, "Permission denied(Owner only)");
3| require(balance[from] >= amount, "Insufficient balance");
4| balance[from] -= amount;
5| balance[to] += amount;
6|}
  • Transfer fee
Transfer fee is a function that allows the developer to charge additional fees when transferring tokens. This leads to a situation in which users need to pay high fees that exceed the transaction processing cost for transferring tokens. Looking at the example of Algorithm 5, the fee is set through the setFee function, and the fee is charged in the transfer function. First, the developer can set the fee to be applied using the fee setting function as in line 3. Then, when the transfer is attempted, the fee in line 9 is deducted and the token is transferred. If the developer applies a high fee, it can virtually make transfer impossible, similar to transfer limitation, removing tokens from the market and inflating the value of the developer’s own tokens, allowing the developer to gain unfair profits.
Algorithm 5. Solidity code of transfer fee with the fee set function.
1|function setFee(uint256 _feeRate) public {
2| require(msg.sender==owner, "Permission denied(Owner only)");
3| feeRate = _feeRate; // fee rate as a percentage (0 to 100)
4|}
5|function transfer(address to, uint256 amount) public {
6| require(to != address(0), "Invalid address");
7| require(balance[msg.sender] >= amount, "Insufficient balance");
8| balance[msg.sender] -= amount;
9| balance[to] += amount - amount * feeRate / 100;
10| balance[owner] += amount * feeRate / 100;
11|}
  • Proxy
The proxy contract is a method of managing the interface contract separately from the logic contract that performs the actual logic, which makes it possible to change the logic contract at any time. This means that developers can insert backdoor code by changing the logic contract at a specific point in time. For example, in Algorithm 6, the delegateTransfer function delegates the entire transfer logic to the contract at the calculator address in line 2, while the staticTransfer function has the contract at the calculator address compute the balance decrease and increase in lines 8 and 10, then receives the results and applies them. A proxy-based contract may not appear problematic at first, but backdoor codes may be inserted over time, resulting in unexpected damages.
Algorithm 6. Solidity code of proxy call on transfer function.
1|function delegateTransfer(address to, uint256 amount) public {
2| (bool success,) = calculator.delegatecall(
  abi.encodeWithSignature("transfer(address,uint256)", to, amount));
3| require(success, "Delegatecall failed");
4|}
5|function staticTransfer(address to, uint256 amount) public {
6| uint256 _sender = balance[msg.sender];
7| uint256 _to = balance[to];
8| (bool successSub, bytes memory senderBalance) = calculator.staticcall(
  abi.encodeWithSignature("sub(uint256,uint256)", _sender, amount));
9| require(successSub, "Staticcall for sub failed");
10| (bool successAdd, bytes memory receiverBalance) = calculator.staticcall(
  abi.encodeWithSignature("add(uint256,uint256)", _to, amount));
11| require(successAdd, "Staticcall for add failed");
12| balance[msg.sender] = abi.decode(senderBalance, (uint256));
13| balance[to] = abi.decode(receiverBalance, (uint256));
14|}
To prevent damage from such backdoor codes, it is necessary to review the source code of the smart contract. Nevertheless, on the Ethereum network, the disclosure of source code is not mandatory and depends on the developer’s choice, making it difficult for users to directly inspect the source code. To overcome this limitation, several models are being introduced to detect backdoor code.
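As a rough illustration of why balance changes are a useful signal for these backdoor types, the following sketch models a token ledger in Python rather than Solidity and measures how a call changes the sum of all balances. The helper names (`normal_transfer`, `backdoor_transfer`, `net_change`) and the account labels are hypothetical and introduced only for this example; `backdoor_transfer` mirrors the owner-only branch of Algorithm 1.

```python
from typing import Callable, Dict

Balances = Dict[str, int]

OWNER = "owner"        # hypothetical account label
TOTAL_SUPPLY = 1_000   # hypothetical total supply


def normal_transfer(bal: Balances, sender: str, to: str, amount: int) -> None:
    """Plain transfer: what the sender loses, the recipient gains."""
    if bal.get(sender, 0) < amount:
        raise ValueError("Insufficient balance")
    bal[sender] = bal.get(sender, 0) - amount
    bal[to] = bal.get(to, 0) + amount


def backdoor_transfer(bal: Balances, sender: str, to: str, amount: int) -> None:
    """Same transfer, plus the owner-only branch from Algorithm 1."""
    normal_transfer(bal, sender, to, amount)
    if sender == OWNER:
        # Balance is overwritten without a matching decrease elsewhere.
        bal[OWNER] = TOTAL_SUPPLY


def net_change(fn: Callable[[Balances, str, str, int], None],
               bal: Balances, sender: str, to: str, amount: int) -> int:
    """Run fn on a copy of the ledger and return the change in the balance sum."""
    after = dict(bal)
    fn(after, sender, to, amount)
    return sum(after.values()) - sum(bal.values())


balances = {OWNER: 100, "alice": 50}
print(net_change(normal_transfer, balances, OWNER, "alice", 10))    # 0
print(net_change(backdoor_transfer, balances, OWNER, "alice", 10))  # 910
print(net_change(backdoor_transfer, balances, "alice", OWNER, 10))  # 0
```

A conserving transfer leaves the sum unchanged, while the owner-triggered branch creates 910 new units in this toy ledger. The backdoor stays invisible when an ordinary user calls the same function, which is why the condition guarding the balance change matters as much as the change itself.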

2.3. Backdoor Code Detection Models

Backdoor code detection models aim to identify and detect backdoor codes hidden in smart contracts. These detection models are divided into static detection models and dynamic detection models, depending on the backdoor code detection method [22,23].
  • Static detection model
Static detection models detect backdoor codes by analyzing the source code or EVM code without executing the smart contract. Since no execution is required, these models inspect only the code itself and need no input data. For example, DEFIDEFENDER tracks the execution path of the smart contract function and collects execution conditions, then compares the code patterns to detect backdoor codes [26]. First, an Abstract Syntax Tree (AST) is constructed from the Ethereum bytecode to trace the execution path of the function. Then, static execution is performed along the execution path of the function to collect the execution conditions of each branch. Finally, the backdoor code is detected by comparing the collected execution conditions with the pattern of the function. This detection model is effective when the execution conditions and the pattern of the code match known backdoor code patterns. However, it cannot detect backdoor codes that have been modified and hidden in atypical parts of the contract.
  • Dynamic detection model
Dynamic detection models detect backdoor codes while actually executing a smart contract. This model generates the input data required for execution and detects backdoor codes by analyzing transactions and state changes that occur through repeated executions. For example, Pied-Piper deploys EVM code to a local environment and detects backdoor codes by inducing the behavior of the backdoor code through repeated function calls [27]. First, it operates a local node and deploys EVM code. Then, it randomly generates input values for function calls. Finally, it repeatedly executes the function based on the generated input values and compares the changes in state to detect backdoor codes. However, this model has a limitation in that it takes a long time to detect backdoor codes since it repeatedly executes based on randomly entered input values and cannot detect backdoor codes if it fails to execute them. To solve this problem, this study proposes a balance-tracking-based backdoor code detection model. Since the proposed model operates based on a static detection model, not a dynamic detection model, it shortens the detection time and allows it to detect hidden backdoor codes by tracking the balance changes that occur within the function.

3. Balance-Tracking-Based Backdoor Detection Model

This study proposes a balance-tracking-based backdoor detection model to efficiently detect backdoor codes hidden in Ethereum smart contracts. Taking the EVM code of a smart contract as input, the proposed model analyzes the balance changes inside its functions, with the aim of detecting even modified backdoor codes that are difficult to identify using existing detection models. The model consists of three elements, as shown in Figure 3. First, the Function Extractor preprocesses the input EVM code to extract functions for backdoor detection. Then, the Balance Tracker tracks the balance changes that occur in the extracted functions to identify candidate functions that may contain backdoor attacks. Finally, the Backdoor Code Inspector detects conditional logic and analyzes patterns to identify the presence of backdoor codes.

3.1. Function Extractor

Panoramix Decompiler is an open-source decompiler that recovers high-level function structures from EVM code [28]. EVM code is composed of stack-based bytecodes, making it difficult for humans to interpret directly. Panoramix analyzes these bytecodes to extract the start and end of functions, control flow, operations, etc., and then decompiles them into a high-level language similar to Solidity. In this process, function signatures are used to identify each function. A function signature is a 4-byte identifier taken from the beginning of the Keccak-256 hash of a string that joins the function name and its input parameter types. For example, the transfer(address to, uint256 amount) function has a signature of 0xa9059cbb. However, because the hash function is one-way, it is difficult to recover the original function name or parameter types from the signature alone. This makes it challenging to accurately understand the meaning of the function in the decompiled code. To solve this problem, the Ethereum Signature Database is used [29]. This database, a community-based open-source project, stores various function signatures and their corresponding function names and parameter information. Developers register and share new function signatures to help others easily identify the functions. Therefore, Panoramix Decompiler compares the function signatures extracted from the EVM code with the Ethereum Signature Database, allowing it to clearly identify the original names and parameters of the restored functions. The Function Extractor identifies all public functions in the smart contract based on the matched information. Public functions can be called externally and perform the main functions of the smart contract, making them particularly important in security analysis. For each function, the Function Extractor collects information on its structure, parameters, return values, and the other functions or operations it calls internally, providing a basis for tracking balance changes and detecting backdoor codes in subsequent stages.
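The signature matching step can be sketched as follows. This is a simplified illustration and not how Panoramix itself is implemented: it assumes the dispatcher compares each selector with a `PUSH4` immediately followed by `EQ`, which real compilers do not always emit, and it uses a three-entry dictionary (`KNOWN_SIGNATURES`, a hypothetical local excerpt) in place of the Ethereum Signature Database. The sample bytecode is a hand-written dispatcher fragment.

```python
from typing import Dict, List

# EVM opcodes used by this sketch.
PUSH1, PUSH4, PUSH32, EQ = 0x60, 0x63, 0x7F, 0x14

# Tiny stand-in for a signature database (hypothetical local excerpt).
KNOWN_SIGNATURES: Dict[str, str] = {
    "a9059cbb": "transfer(address,uint256)",
    "70a08231": "balanceOf(address)",
    "18160ddd": "totalSupply()",
}


def extract_selectors(bytecode: bytes) -> List[str]:
    """Collect 4-byte PUSH4 immediates that are directly followed by EQ."""
    selectors: List[str] = []
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if PUSH1 <= op <= PUSH32:
            size = op - PUSH1 + 1                  # PUSHn carries n immediate bytes
            data = bytecode[pc + 1: pc + 1 + size]
            nxt = pc + 1 + size
            if (op == PUSH4 and len(data) == 4
                    and nxt < len(bytecode) and bytecode[nxt] == EQ):
                selectors.append(data.hex())
            pc = nxt                               # skip the immediate bytes
        else:
            pc += 1
    return selectors


def resolve(selectors: List[str]) -> Dict[str, str]:
    """Map each selector to a text signature, or 'unknown' if not registered."""
    return {s: KNOWN_SIGNATURES.get(s, "unknown") for s in selectors}


# DUP1 PUSH4 <selector> EQ PUSH2 <dest> JUMPI, repeated three times.
SAMPLE = bytes.fromhex(
    "8063a9059cbb1461001057"
    "806370a082311461002057"
    "8063deadbeef1461003057"
)
print(resolve(extract_selectors(SAMPLE)))
```

Selectors that are missing from the database, such as `deadbeef` here, remain unnamed after decompilation, so any later analysis has to work from their behavior rather than their names.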

3.2. Balance Tracker

The Balance Tracker analyzes the public functions generated by the Function Extractor to classify candidate functions that pose a risk of being backdoor functions. A candidate function is defined as a function that has balance changes within it or contains a proxy-type function. To this end, the proxy function patterns are inspected, and balance changes are tracked.
First, the commands that call another contract are inspected to examine the proxy function pattern. The methods for calling another contract on the Ethereum network can be divided into CALL, DELEGATECALL, and STATICCALL, as shown in Table 1 below. CALL and STATICCALL execute the code of the called contract and receive the result. For example, in the staticTransfer function in Algorithm 6, the add and sub functions that compute the increase and decrease of the balance are called at the calculator address, and the results are received and applied. If the developer modifies the calculator address, they can execute the backdoor code. In addition, DELEGATECALL executes the called code directly in the context of the caller, enabling direct backdoor code application. For example, in the delegateTransfer function in Algorithm 6, the transfer function is called at the calculator address, and the transfer is processed without the balance being calculated directly within the function. If the developer modifies the calculator address, they can execute the backdoor code. Therefore, functions are inspected for the presence of CALL, DELEGATECALL, and STATICCALL commands, and those that contain them are classified as candidate functions.
Then, the balance changes that may occur when each function is executed are tracked, and functions with such changes are classified as candidate functions. Except for proxy, all of the backdoor code types explained in Section 2.2 (token generation, destroy token, transfer limitation, funds manipulation, and transfer fee) cause balance changes. For example, in the case of the token generation backdoor, the balance increases when the function is executed, but the new balance is generated without a corresponding decrease or by a larger amount compared to the existing balance. To track such balance changes, it is necessary to first identify the balance to be tracked. In the smart contract, the balance is managed as a mapping structure that stores the user wallet address as a key and the balance as data, as shown in Algorithm 7, and can be queried using the balanceOf function, which is an ERC-20 standard function [30]. Therefore, the variables with a mapping structure are identified, and the variable returned by the balanceOf function is traced to determine which of them stores the balance.
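A minimal sketch of the call-command inspection is shown below. It is an assumption-laden simplification: it scans raw bytecode rather than the decompiled functions the model works on, and `find_external_calls` is a hypothetical helper name. The opcode values (0xF1 for CALL, 0xF4 for DELEGATECALL, 0xFA for STATICCALL) are the standard EVM encodings, and PUSH immediates are skipped so that data bytes are not mistaken for call instructions.

```python
from typing import List, Tuple

CALL_OPCODES = {0xF1: "CALL", 0xF4: "DELEGATECALL", 0xFA: "STATICCALL"}
PUSH1, PUSH32 = 0x60, 0x7F


def find_external_calls(bytecode: bytes) -> List[Tuple[int, str]]:
    """Return (offset, name) for every external-call opcode in the bytecode."""
    found: List[Tuple[int, str]] = []
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op in CALL_OPCODES:
            found.append((pc, CALL_OPCODES[op]))
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH1 + 1   # skip the immediate bytes of PUSHn
        pc += 1
    return found


# PUSH1 0xf1 (data, not a call), GAS, DELEGATECALL, PUSH2 0xfafa (data), STATICCALL
SAMPLE = bytes.fromhex("60f15af461fafafa")
print(find_external_calls(SAMPLE))
```

In this sample only the instructions at offsets 3 and 7 are reported; the `f1` and `fa` bytes that appear as PUSH operands are ignored.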
Algorithm 7. Solidity code of the balanceOf function.
1|mapping(address => uint256) private balance;
2|function balanceOf(address account) public view returns (uint256) {
3| return balance[account];
4|}
After that, each function is closely inspected for operations on the balance variable. Operations on the balance variable can be largely divided into two cases. The first case is when the balance variable is used directly in the operation within the function, as in line 4 of Algorithm 1. In this case, all locations where the balance variable is used are found, and the result of the operation and the location where the operation occurred are recorded. The second case is when the operation is performed using another variable and the result is stored in the balance variable, which means the value assigned to the balance variable is the result of an operation on other variables, as in line 12 of Algorithm 6. In this case, all variables that assign values to the balance variable must be traced back. To this end, a variable tracing process is performed. Variable tracing identifies the variables assigned to the balance variable and confirms the declaration location and initial value of each variable. Then, the operations on the identified variables are tracked, and the location of each operation and its result value are recorded. Finally, the candidate functions are classified by integrating the results of all functions in which balance changes occurred and the proxy function patterns. The candidate functions identified in this way are then finally classified as backdoor codes in the Backdoor Code Inspector.
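The variable tracing step for the second case can be sketched as a backward walk over a function's statements. The `Stmt` record, the `trace_back` and `balance_writes` helpers, and the statement list below are hypothetical: the list is a hand-written stand-in for the decompiled body of staticTransfer in Algorithm 6, not actual decompiler output.

```python
from typing import Dict, List, NamedTuple


class Stmt(NamedTuple):
    target: str          # variable or storage slot being written
    op: str              # informal label of the operation
    operands: List[str]  # names read by the statement


# Hand-written stand-in for the body of staticTransfer (Algorithm 6).
STATIC_TRANSFER = [
    Stmt("_sender", "load", ["balance[msg.sender]"]),
    Stmt("_to", "load", ["balance[to]"]),
    Stmt("senderBalance", "staticcall sub", ["_sender", "amount"]),
    Stmt("receiverBalance", "staticcall add", ["_to", "amount"]),
    Stmt("balance[msg.sender]", "assign", ["senderBalance"]),
    Stmt("balance[to]", "assign", ["receiverBalance"]),
]


def trace_back(stmts: List[Stmt], index: int) -> List[int]:
    """Indices of earlier statements that feed the value written at stmts[index]."""
    wanted = set(stmts[index].operands)
    chain: List[int] = []
    for i in range(index - 1, -1, -1):
        if stmts[i].target in wanted:
            chain.append(i)
            wanted.discard(stmts[i].target)     # most recent definition found
            wanted.update(stmts[i].operands)    # now trace what it was built from
    chain.reverse()
    return chain


def balance_writes(stmts: List[Stmt]) -> Dict[int, List[int]]:
    """Map each write to a balance slot to the statements that produced its value."""
    return {i: trace_back(stmts, i)
            for i, s in enumerate(stmts) if s.target.startswith("balance[")}


for idx, sources in balance_writes(STATIC_TRANSFER).items():
    print(STATIC_TRANSFER[idx].target, "<-", [STATIC_TRANSFER[j].op for j in sources])
```

For this input, the write to `balance[msg.sender]` traces back to the initial load and the `sub` static call, which is the point where an externally supplied result enters the balance.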

3.3. Backdoor Code Inspector

The Backdoor Code Inspector analyzes the balance changes and conditional statements of the candidate functions generated by the Balance Tracker and classifies them as backdoor codes. To this end, it analyzes the balance change pattern of each candidate function and identifies the cause and characteristics of the balance change through its conditional statements to determine whether a backdoor exists. Backdoor codes are classified into the six attack types outlined in Section 2.2: token generation, destroy token, transfer limitation, funds manipulation, transfer fee, and proxy.
Token generation results in an increase in the balance. To detect it, the balance tracking results of the candidate function are analyzed as shown in Figure 4. If a new balance is generated without any decrease in another balance, or if the balance is increased by a larger amount than the existing one, the function is classified as a token generation backdoor.
Destroy token results in a decrease in someone else’s balance. To detect it, the balance tracking result of the candidate function is analyzed as shown in Figure 5 to check whether the balance only decreases or decreases by more than the input value. Then, the address whose balance decreases is compared with the message sender; if it is not the message sender, the function is classified as a destroy token backdoor.
Transfer limitation restricts balance transfers through a conditional statement placed before the transfer. To detect it, candidate functions are analyzed as shown in Figure 6, and the conditional statements of functions that transfer balances are checked for an additional condition. An additional condition is not a general check on balance or overflow, but one that allows or prohibits the transfer depending on a variable or an account. If such a condition is found, the function is classified as a transfer limitation backdoor.
Funds manipulation transfers a balance following a call from an unauthorized user. To detect it, the candidate function is inspected, as shown in Figure 7, to check whether the message sender’s balance decreases and the balance of the input address increases. This reveals whether the balance of another account decreases instead, or whether the recipient’s balance increases normally. Then, the conditional statements that handle approval are inspected; if they do not match the approval process, the function is classified as a funds manipulation backdoor.
Transfer fee reduces the amount the recipient receives by the amount of the fee. To detect it, the fee calculation is analyzed with respect to the balance change in the candidate function, as shown in Figure 8. If a smaller value than the transferred amount is sent, or if a separate fee transfer occurs, the function is classified as a transfer fee backdoor.
A proxy can execute backdoor code by calling functions at other addresses. To detect it, the way other functions are called in the candidate function is examined, as shown in Figure 9. A DELEGATECALL is classified as a proxy backdoor, since it can execute backdoor code directly in the context of the present contract. A CALL or STATICCALL is classified as a proxy backdoor if a balance change or state change follows the call.
Whether backdoor code is detected for each function is determined based on the final classified results. Through this process, users can determine whether a specific function has a potential security risk.

3.4. Extending the Model to Other Platforms

The applicability of the proposed balance-tracking-based backdoor detection model can be broadened by extending it to blockchain environments beyond the Ethereum platform. The model operates by analyzing the EVM code of standard Ethereum smart contracts. Platforms that also use the EVM, such as Binance Smart Chain, do not present significant challenges for application. However, platforms that employ different smart contract architectures, such as Solana or Sui, pose challenges because the languages and formats of their smart contracts differ. Addressing this requires an interface between those platforms and the model.
The interface for inspecting smart contracts on other platforms can be accessed through the model’s balance tracker. However, if the contracts on these platforms do not comply with the ERC-20 standard, the model’s accuracy may decrease. In order to resolve this, the standards of each platform need to be converted to conform to the ERC-20 standard. Table 2 lists the ERC-20 standard functions necessary for this conversion, and functions corresponding to these descriptions should be mapped accordingly. The balanceOf function, which returns an account’s balance, is essential for accurate balance tracking in the model. The transfer function, responsible for token transfers, is crucial for analyzing transfer restrictions and fees. Similarly, the transferFrom, approve, and allowance functions play important roles in permission tracking, which is vital for detecting unauthorized access or manipulation.
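One way to picture the mapping step is a lookup from a platform's native function names to the five ERC-20 functions in Table 2, with a check for anything left unmapped. This is a sketch under stated assumptions: the native names below belong to an imaginary platform and are hypothetical, and the paper does not specify how the mapping is stored.

```python
# The five ERC-20 functions listed in Table 2.
ERC20_REQUIRED = ("balanceOf", "transfer", "transferFrom", "approve", "allowance")


def unmapped_functions(native_to_erc20: dict[str, str]) -> list[str]:
    """Return the ERC-20 functions that no native function has been mapped to."""
    mapped = set(native_to_erc20.values())
    return [name for name in ERC20_REQUIRED if name not in mapped]


# Hypothetical native function names for an imaginary non-EVM platform.
example_mapping = {
    "get_balance": "balanceOf",
    "send": "transfer",
    "send_from": "transferFrom",
    "grant": "approve",
}

missing = unmapped_functions(example_mapping)  # allowance has no counterpart here
```

In this example the mapping lacks a counterpart for allowance, so permission tracking would be incomplete until one is provided.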

4. Evaluation and Discussion

4.1. Generation of Evaluation Data

To evaluate the backdoor code detection performance of the balance-tracking-based backdoor detection model, an experiment was conducted in which the bytecode of a smart contract was given to the model as input, inspected for backdoor code, and the inspection results were compared. The bytecodes used in the experiment were generated directly from source codes randomly selected from the smart contracts registered in Uniswap. To this end, the names and addresses of projects registered in Uniswap were collected, as shown in Figure 10a, and the verified contracts in Etherscan were used to obtain the source codes of the projects, as shown in Figure 10b. The verified contracts in Etherscan are Solidity source codes that match the bytecodes registered on the Ethereum network, so the retrieved source codes are the same as those of the actual projects.
After backdoor code was added to the source code obtained through this process, as shown in Figure 11a, the source was compiled into EVM code using the Solidity compiler, as shown in Figure 11b. The project name, backdoor function name, and backdoor type were then recorded in the backdoor_data.csv file, as shown in Figure 11c. For the EVM code used in the experiment, 200 codes were prepared for each of the six attack types defined in Section 2.2, and 100 codes were configured so that different backdoor codes existed together, to confirm whether the proposed model could detect and classify backdoor codes accurately even when several were present.
In addition, to enhance the credibility of the experimental results, 189 actual backdoor EVM code samples collected from the Ethereum blockchain were used [27]. These codes were extracted directly from deployed contracts on the Ethereum network via Etherscan, which provides access to the bytecode of live contracts. They represent real-world instances of backdoor attacks and offer a practical evaluation of the model’s performance on genuine malicious contracts. By directly utilizing on-chain data registered in the Ethereum blockchain and accessed through Etherscan, this experiment ensures that the detection model is tested against authentic threats, reflecting its effectiveness in real-world scenarios.

4.2. Environments

For the experimental environment, Python 3.9.16 and Panoramix Decompiler 0.6.1 were used, and the Ethereum Signature Database as of 20 October 2024 was utilized, as shown in Table 3. The server specifications were Intel® Xeon® CPU E5-2630 v3 @ 2.40 GHz processor, 32 GB RAM, and Rocky Linux 9.2 OS. All experimental environments were set identically to maintain the consistency of results.
This experiment analyzed the previously prepared bytecode using the balance-tracking-based backdoor detection model, as shown in the operation log of Figure 12a, and the analysis results were recorded in the inspection_results.csv file, as shown in Figure 12b. In the analysis results file, the name of the analyzed bytecode was recorded in column A, and the name of the bytecode file was recorded in column B to identify which backdoor code the analysis result corresponds to. Additionally, the name of the backdoor function was recorded in column C, and the type of the backdoor function was recorded in column D to confirm whether the type of the function classified as the backdoor function matched the actual backdoor code type.
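The comparison between the two files can be sketched as follows. This is a minimal illustration rather than the authors' script: it assumes neither file has a header row, that backdoor_data.csv holds project name, function name, and type in that order, and that column A of inspection_results.csv can be matched against the project name. The sample rows are invented purely for illustration.

```python
import csv
import io


def load_rows(text: str) -> list[list[str]]:
    """Parse CSV text into rows, skipping blank lines."""
    return [row for row in csv.reader(io.StringIO(text)) if row]


def compare(truth_text: str, results_text: str) -> dict[str, set]:
    """Match (name, function, type) triples between the two files."""
    truth = {(name, func, kind) for name, func, kind in load_rows(truth_text)}
    # Columns A, C and D of the results file; column B (file name) is not needed here.
    found = {(row[0], row[2], row[3]) for row in load_rows(results_text)}
    return {
        "matched": truth & found,
        "missed": truth - found,
        "unexpected": found - truth,
    }


# Invented sample rows, not taken from the paper's data.
backdoor_data = "TokenA,drain,Funds manipulation\nTokenB,setFee,Fee\n"
inspection_results = (
    "TokenA,TokenA.bin,drain,Funds manipulation\n"
    "TokenB,TokenB.bin,mintTo,Token generation\n"
)

report = compare(backdoor_data, inspection_results)
```

With these sample rows, one entry matches, one known backdoor is missed, and one reported function has no counterpart in the ground truth; counts of this kind are what feed a confusion matrix.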

4.3. Evaluation of Backdoor Code Detection Accuracy

In this experiment, a total of 989 EVM codes were inspected for each of the six attack types defined in Section 2.2. This total includes 200 EVM codes prepared for the evaluation of each attack type and an additional 189 actual backdoor EVM code samples extracted directly from real-world data registered on the Ethereum blockchain. The accuracy of the model was evaluated by comparing the inspection results with the presence of actual backdoor codes, and the results are summarized as a confusion matrix in Table 4. The confusion matrix consists of four elements: true positives (TP), the number of cases in which the actual backdoor code type was correctly detected; true negatives (TN), the number of cases in which normal code without backdoor code was correctly identified as normal; false positives (FP), the number of cases in which normal code was incorrectly detected as backdoor code; and false negatives (FN), the number of cases in which backdoor code was incorrectly identified as normal code. Based on these values, accuracy, precision, and recall for each type were calculated using Equations (1)–(3).
Accuracy = (TP + TN)/(TP + FP + FN + TN) (1)
Precision = TP/(TP + FP) (2)
Recall = TP/(TP + FN) (3)
As shown in Table 4, the proposed model achieved high accuracy across all backdoor code types, ranging from 97.4% to 99.4%. The precision values varied between 0.95 and 0.98, while recall values ranged from 0.95 to 1.0. These metrics indicate that the backdoor codes were accurately detected with minimal false positives and false negatives. Notably, the model achieved a recall of 1.0 for the proxy backdoor code type, meaning it correctly identified all instances of this backdoor without missing any. This indicates that the proposed model can effectively identify backdoor codes in smart contracts. In addition, it accurately detected all backdoor codes in smart contracts where multiple backdoor codes existed simultaneously. These results demonstrate that the model can effectively respond even when an attacker performs a complex attack by combining multiple backdoors.
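Equations (1)–(3) can be checked against the counts in Table 4 with a few lines of Python. The sketch below uses the Fee row (TP = 200, FP = 4, FN = 2, TN = 783); the class and method names are illustrative and not part of the paper's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int
    fp: int
    fn: int
    tn: int

    def accuracy(self) -> float:
        # Equation (1)
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def precision(self) -> float:
        # Equation (2)
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        # Equation (3)
        return self.tp / (self.tp + self.fn)


# Fee row of Table 4.
fee = ConfusionCounts(tp=200, fp=4, fn=2, tn=783)
fee_accuracy = fee.accuracy()    # 983 / 989
fee_precision = fee.precision()  # 200 / 204
fee_recall = fee.recall()        # 200 / 202
```

For this row the computed accuracy is 983/989, about 99.39%, against the 99.3% printed in the table, which suggests the table may truncate rather than round; precision and recall come out at roughly 0.98 and 0.99, matching the reported values.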

5. Conclusions

This study proposed a balance-tracking-based backdoor code detection model to efficiently detect backdoor codes hidden in Ethereum smart contracts. To overcome the limitations of existing static and dynamic detection models, the model was designed to effectively detect even modified or hidden backdoor codes by tracking balance changes within functions and analyzing conditional statements.
The proposed model consists of three major components. First, public functions were extracted from EVM codes through the Function Extractor to identify the target functions for analysis. Then, balance changes that occurred in the extracted functions were tracked using the Balance Tracker, and proxy patterns were identified to select candidate functions that had the potential to be backdoor codes. Finally, in the Backdoor Code Inspector, the conditional statements and patterns of candidate functions were analyzed in detail to classify backdoor codes into six types.
As a result of the experiment, the proposed model achieved 98% accuracy in detecting backdoor codes. Specifically, the model’s accuracy ranged from 97.4% to 99.4% across different backdoor code types, with precision values between 0.95 and 0.98 and recall values from 0.95 to 1.0. This means that the proposed model can accurately detect various types of backdoor codes with minimal false positives and false negatives. In particular, the reliability and effectiveness of the model were proven since it accurately identified almost all backdoor codes in smart contracts with multiple backdoor codes. These results demonstrate the model’s effectiveness in responding to complex attacks that combine multiple backdoors.
This study presented a practically applicable backdoor code detection method that enhances the security level of smart contracts, helping users recognize the risks of smart contracts in advance and prevent damage. In addition, it has high practical applicability because it can perform analysis with only bytecode, regardless of whether the source code is disclosed.
Future studies will focus on developing a generalized backdoor code detection model that can be applied to other blockchain platforms beyond Ethereum, contributing to enhancing the security of the entire blockchain ecosystem.

Author Contributions

K.W.Y. and B.M.L. conceived and designed the experiments; K.W.Y. performed the experiments; K.W.Y. and B.M.L. analyzed the data; K.W.Y. wrote the paper. K.W.Y. and B.M.L. have read and approved the final manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Gachon University.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available at https://github.com/yukwanwoo/Smart-contract-detection-test (accessed on 26 November 2024).

Acknowledgments

This work was supported by the Gachon University research fund of 2024 (GCU-202400540001).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Nakamoto, S. Bitcoin: A peer-to-peer electronic cash system. Decentralized Bus. Rev. 2008, 21260.
  2. Buterin, V. Ethereum Whitepaper. Available online: https://ethereum.org/en/whitepaper/ (accessed on 26 November 2024).
  3. Khan, S.N.; Loukil, F.; Ghedira-Guegan, C.; Benkhelifa, E.; Bani-Hani, A. Blockchain smart contracts: Applications, challenges, and future trends. Peer-to-Peer Netw. Appl. 2021, 14, 2901–2925.
  4. Bartoletti, M.; Carta, S.; Cimoli, T.; Saia, R. Dissecting Ponzi schemes on Ethereum: Identification, analysis, and impact. Future Gener. Comput. Syst. 2020, 102, 259–277.
  5. CoinDesk. Squid Game Token Crashes. Available online: https://www.coindesk.com/markets/2021/11/01/squid-game-token-crashes-developers-say-theyve-left-the-project/ (accessed on 26 November 2024).
  6. De.Fi. Rekt Database. Available online: https://de.fi/rekt-database (accessed on 26 November 2024).
  7. Uniswap. Unsupported Tokens on Uniswap. Available online: https://unsupportedtokens.uniswap.org/ (accessed on 26 November 2024).
  8. Uniswap. What Are Token Warnings? Available online: https://support.uniswap.org/hc/en-us/articles/8723118437133-What-are-token-warnings (accessed on 26 November 2024).
  9. Finance, C. Risks, Security & Audits. Available online: https://resources.curve.fi/risks-security/risks/pool/ (accessed on 26 November 2024).
  10. Etherscan. Verified Contracts on Etherscan. Available online: https://etherscan.io/contractsVerified (accessed on 26 November 2024).
  11. GoPlus-Labs. GoPlus Network Whitepaper. Available online: https://whitepaper.gopluslabs.io/goplus-network (accessed on 26 November 2024).
  12. Chaliasos, S.; Charalambous, M.A.; Zhou, L.; Galanopoulou, R.; Gervais, A.; Mitropoulos, D.; Livshits, B. Smart Contract and DeFi Security Tools: Do They Meet the Needs of Practitioners? In Proceedings of the 46th IEEE/ACM International Conference on Software Engineering, Lisbon, Portugal, 14–20 April 2024; pp. 1–13.
  13. John, K.; Kogan, L.; Saleh, F. Smart contracts and decentralized finance. Annu. Rev. Financ. Econ. 2023, 15, 523–542.
  14. Wood, G. Ethereum: A secure decentralised generalised transaction ledger. Ethereum Yellow Pap. Shanghai Version 2024, 151, 1–42.
  15. Metcalfe, W. Ethereum, Smart Contracts, DApps. In Blockchain and Crypto Currency; Yano, M., Dai, C., Masuda, K., Kishimoto, Y., Eds.; Springer: Singapore, 2020; pp. 77–93. ISBN 978-981-15-3376-1.
  16. Ranganthan, V.P.; Dantu, R.; Paul, A.; Mears, P.; Morozov, K. A decentralized marketplace application on the ethereum blockchain. In Proceedings of the IEEE 4th International Conference on Collaboration and Internet Computing, Philadelphia, PA, USA, 18–20 October 2018; pp. 90–97.
  17. Kitzler, S.; Victor, F.; Saggese, P.; Haslhofer, B. Disentangling decentralized finance (DeFi) compositions. ACM Trans. Web 2023, 17, 1–26.
  18. Uniswap. Available online: https://app.uniswap.org/ (accessed on 26 November 2024).
  19. Xu, J.; Paruch, K.; Cousaert, S.; Feng, Y. Sok: Decentralized exchanges (DEX) with automated market maker (AMM) protocols. ACM Comput. Surv. 2023, 55, 1–50.
  20. Alamsyah, A.; Salsabila, N. Exploring the Mechanisms of Decentralized Finance (DeFi) Using Blockchain Technology. In Proceedings of the 2024 3rd International Conference on Creative Communication and Innovative Technology, Tangerang, Indonesia, 7–8 August 2024; pp. 1–8.
  21. Zhou, Y.; Sun, J.; Ma, F.; Chen, Y.; Yan, Z.; Jiang, Y. Stop pulling my rug: Exposing rug pull risks in crypto token to investors. In Proceedings of the 46th International Conference on Software Engineering: Software Engineering in Practice, New York, NY, USA, 14–20 April 2024; pp. 228–239.
  22. Sun, D.; Ma, W.; Nie, L.; Liu, Y. SoK: Comprehensive Analysis of Rug Pull Causes, Datasets, and Detection Tools in DeFi. arXiv 2024, arXiv:2403.16082.
  23. Qian, P.; Cao, R.; Liu, Z.; Li, W.; Li, M.; Zhang, L.; Xu, Y.; Chen, J.; He, Q. Empirical review of smart contract and defi security: Vulnerability detection and automated repair. arXiv 2023, arXiv:2309.02391.
  24. Cernera, F.; La Morgia, M.; Mei, A.; Sassi, F. Token Spammers, Rug Pulls, and Sniper Bots: An Analysis of the Ecosystem of Tokens in Ethereum and in the Binance Smart Chain (BNB). In Proceedings of the 32nd USENIX Security Symposium, Anaheim, CA, USA, 9–11 August 2023; pp. 3349–3366.
  25. Li, X.; Yang, J.; Chen, J.; Tang, Y.; Gao, X. Characterizing Ethereum Upgradable Smart Contracts and Their Security Implications. In Proceedings of the ACM on Web Conference 2024, New York, NY, USA, 13–17 May 2024; pp. 1847–1858.
  26. Chen, J.; Hu, J.; Xia, X.; Lo, D.; Grundy, J.; Gao, Z.; Chen, T. Angels or demons: Investigating and detecting decentralized financial traps on ethereum smart contracts. Autom. Softw. Eng. 2024, 31, 63.
  27. Ma, F.; Ren, M.; Ouyang, L.; Chen, Y.; Zhu, J.; Chen, T.; Zheng, Y.; Dai, X.; Jiang, Y.; Sun, J. Pied-piper: Revealing the backdoor threats in ethereum erc token contracts. ACM Trans. Softw. Eng. Methodol. 2023, 32, 1–24.
  28. Kolinko, T. Panoramix Decompiler. Available online: https://pypi.org/project/panoramix-decompiler (accessed on 26 November 2024).
  29. Bitfly. Ethereum Signature Database. Available online: https://www.4byte.directory (accessed on 26 November 2024).
  30. Vogelsteller, F.; Buterin, V. ERC-20: Token Standard. Available online: https://eips.ethereum.org/EIPS/eip-20 (accessed on 26 November 2024).
Figure 1. Example of Smart Contract Deployment and Use on the Ethereum Network.
Figure 2. Process of Uniswap on the Ethereum Network.
Figure 3. Balance-Tracking-Based Backdoor Detection Model.
Figure 4. Token Generation Classification Process.
Figure 5. Destroy Token Classification Process.
Figure 6. Transaction limitation Classification Process.
Figure 7. Funds Manipulation Classification Process.
Figure 8. Fee Classification Process.
Figure 9. Proxy Classification Process.
Figure 10. A portion of the data collected for the experiment. (a) Contract addresses collected from Uniswap. (b) Contract Source Code of the contract addresses.
Figure 11. Example of data used for the experiment. (a) Example of a backdoor function inserted into the source code. (b) A portion of the EVM code used to detect the backdoor code. (c) Contents of the backdoor_data.csv file.
Figure 12. The operation results of the Balance-Tracking-Based Backdoor Detection Model. (a) Operation log. (b) Contents of the inspection_results.csv file.
Table 1. Type of Contract Calls in the Ethereum Network.
Call Type | Description
CALL | Call a method in another contract.
DELEGATECALL | Call a method in another contract using the storage of the current contract.
STATICCALL | Call a method in another contract without state changes.
Table 2. ERC-20 Standard Functions and Descriptions for Smart Contract Transformation.
Function Name | Description
balanceOf(owner) | Returns the token balance of the specified owner address.
transfer(to, amount) | Transfers a specified number of tokens from the caller’s account to the to address.
transferFrom(from, to, amount) | Transfers a specified number of tokens from the from address to the to address using the allowance mechanism.
approve(spender, amount) | Allows the spender address to withdraw up to a specified amount from the caller’s account.
allowance(owner, spender) | Returns the remaining number of tokens that the spender is allowed to withdraw from the owner’s account.
Table 3. Server Specifications and Software Versions.
Item | Specification or Version
CPU | Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40 GHz (Santa Clara, CA, USA)
RAM | 32 GB
OS | Rocky Linux 9.2
Python | Python 3.9.16
Decompiler | Panoramix Decompiler 0.6.1
Ethereum Signature Database | Access date: 20 October 2024
Table 4. Confusion Matrix and Accuracy for Backdoor Code Detection.
Backdoor Code Type | TP | FP | FN | TN | Accuracy | Precision | Recall
Token generation | 228 | 5 | 6 | 750 | 98.8% | 0.97 | 0.97
Destroy token | 226 | 9 | 3 | 751 | 98.7% | 0.96 | 0.98
Transaction limitation | 309 | 12 | 13 | 655 | 97.4% | 0.96 | 0.95
Funds manipulation | 202 | 9 | 1 | 778 | 98.9% | 0.95 | 0.99
Fee | 200 | 4 | 2 | 783 | 99.3% | 0.98 | 0.99
Proxy | 200 | 5 | - | 784 | 99.4% | 0.97 | 1.0
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yu, K.W.; Lee, B.M. Detecting Rug-Pull: Analyzing Smart Contract Backdoor Codes in Ethereum. Appl. Sci. 2025, 15, 450. https://doi.org/10.3390/app15010450

