DeFiScope is a tool for detecting various DeFi price manipulations with LLM reasoning. This repo contains the source code of DeFiScope and datasets used in fine-tuning and evaluation.
Full details, including the hyperparameter settings, can be found in model_comparison_result.md.
Python 3.12.9
$ pip install -r fine-tuning/requirements.txt
Run:
deepspeed --num_gpus=<number of gpus> fine-tuning/fine-tuning.py --base_model_name <base model name> --model_name_or_path <model name or path> --tokenizer_name_or_path <tokenizer name or path> --version <checkpoint version> --num_epochs <number of epochs> --lr <learning rate> --save_per_x_epochs <save checkpoint period>
Where:
- <number of gpus> is the number of GPUs to use in fine-tuning.
- <base model name> and <checkpoint version> together form the path where the checkpoint is saved: checkpoints/{base_model_name}/{version}.
- <model name or path> can be either a local path or a Hugging Face model name, e.g. microsoft/Phi-3-medium-128k-instruct.
- <number of epochs> is the total number of epochs to run in the fine-tuning process.
- <learning rate> sets the learning rate.
- <save checkpoint period> is the interval, in epochs, at which checkpoints are saved.
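The command and the checkpoint path rule above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper names `build_finetune_command` and `checkpoint_dir` are hypothetical, and the example values (`phi3-medium`, `v1`, 3 epochs, learning rate `2e-5`, reusing the model name as the tokenizer name) are placeholders chosen only for illustration.

```python
import shlex
from pathlib import Path


def build_finetune_command(num_gpus, base_model_name, model_name_or_path,
                           tokenizer_name_or_path, version, num_epochs, lr,
                           save_per_x_epochs):
    """Return the argument list for the fine-tuning command shown above."""
    return [
        "deepspeed", f"--num_gpus={num_gpus}", "fine-tuning/fine-tuning.py",
        "--base_model_name", base_model_name,
        "--model_name_or_path", model_name_or_path,
        "--tokenizer_name_or_path", tokenizer_name_or_path,
        "--version", version,
        "--num_epochs", str(num_epochs),
        "--lr", str(lr),
        "--save_per_x_epochs", str(save_per_x_epochs),
    ]


def checkpoint_dir(base_model_name, version):
    """Checkpoint location as described: checkpoints/{base_model_name}/{version}."""
    return Path("checkpoints") / base_model_name / version


# Placeholder values for illustration only.
model = "microsoft/Phi-3-medium-128k-instruct"
cmd = build_finetune_command(
    num_gpus=2, base_model_name="phi3-medium", model_name_or_path=model,
    tokenizer_name_or_path=model, version="v1", num_epochs=3, lr=2e-5,
    save_per_x_epochs=1,
)
print(shlex.join(cmd))                      # command line to run from the repo root
print(checkpoint_dir("phi3-medium", "v1"))  # where checkpoints would be saved
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` from the repository root.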
Python 3.10.12
$ pip install -r requirements.txt
Follow the instructions from OpenAI to set the OpenAI API key.
Run the detector with the following command:
$ python main.py -tx <transaction hash> -bp <chain ID>
Where:
- <transaction hash> is the hash of the transaction to test.
- <chain ID> is the blockchain platform that the test transaction is on. The detector supports Ethereum (chain ID = ethereum) and BSC (chain ID = bsc).
Run the detector with the following command:
$ python main.py -tx <transaction hash> -bp <chain ID> --use_local_model --model_path <model path>
Where:
- <model path> can be either a local path or a Hugging Face model name, e.g. microsoft/Phi-3-medium-128k-instruct.
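When many transactions need to be checked, the two invocations above can be generated from a script. The sketch below is an illustration only: `build_detector_command` and `SUPPORTED_CHAINS` are hypothetical names, the chain ID check simply mirrors the two platforms listed in this README, and the transaction hash is an all-zero placeholder rather than a real transaction.

```python
import shlex

# Chain IDs listed as supported in this README.
SUPPORTED_CHAINS = {"ethereum", "bsc"}


def build_detector_command(tx_hash, chain_id, model_path=None):
    """Build the main.py argument list; add the local-model flags if a path is given."""
    if chain_id not in SUPPORTED_CHAINS:
        raise ValueError(f"unsupported chain ID: {chain_id!r}")
    cmd = ["python", "main.py", "-tx", tx_hash, "-bp", chain_id]
    if model_path is not None:
        cmd += ["--use_local_model", "--model_path", model_path]
    return cmd


placeholder_tx = "0x" + "0" * 64  # placeholder hash, not a real transaction

# Default mode (OpenAI API key configured as described above).
print(shlex.join(build_detector_command(placeholder_tx, "ethereum")))

# Local-model mode with a Hugging Face model name.
print(shlex.join(build_detector_command(
    placeholder_tx, "bsc", model_path="microsoft/Phi-3-medium-128k-instruct")))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to actually start the detector.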
The detection result is stored in detection_result.jsonl.
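The result file can be inspected with a few lines of Python. This is a minimal sketch that assumes the file follows the usual JSON Lines convention (one JSON value per line), as the .jsonl extension suggests. The record fields are not documented in this README, so the hypothetical helper `read_detection_results` makes no assumptions about them and only prints each record as parsed.

```python
import json
from pathlib import Path


def read_detection_results(path="detection_result.jsonl"):
    """Parse a JSON Lines file into a list, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


result_file = Path("detection_result.jsonl")
if result_file.exists():  # only present after the detector has been run
    for record in read_detection_results(result_file):
        print(record)
```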
Under the folder dataset/:
- D1.csv, D2.csv, and D3.csv refer to D1, D2, and D3 in §VII, respectively.
- training_set.csv, eval_set.csv, and test_set.csv are the datasets used in fine-tuning OpenAI's models and Phi-3-medium-128k-instruct, containing 800, 100, and 100 samples, respectively.
- 1000_tx.csv contains 1,000 transactions used to quantitatively compare the performance of recovering DeFi operations between the proposed Transfer Graph (TG) and the Cash Flow Tree (CFT) introduced by DeFiRanger. Full details of this dataset and the corresponding experiment can be found in the Supplementary Material (supplementary_material.md), Section K.
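As a quick sanity check after cloning, the split sizes stated above can be compared against the files on disk. The sketch below is an assumption-laden illustration: it supposes each CSV has one header row and one sample per record, which this README does not confirm, and the helper names `count_samples` and `check_split_sizes` are hypothetical. Pass `has_header=False` if the files turn out to have no header.

```python
import csv
from pathlib import Path

# Sample counts stated in this README.
EXPECTED_SAMPLES = {"training_set.csv": 800, "eval_set.csv": 100, "test_set.csv": 100}


def count_samples(csv_path, has_header=True):
    """Count CSV records, optionally excluding one header row (an assumption)."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        rows = sum(1 for _ in csv.reader(fh))
    return rows - 1 if has_header and rows else rows


def check_split_sizes(dataset_dir="dataset"):
    """Map each split file found on disk to (actual count, expected count)."""
    report = {}
    for name, expected in EXPECTED_SAMPLES.items():
        path = Path(dataset_dir) / name
        if path.exists():
            report[name] = (count_samples(path), expected)
    return report


for name, (actual, expected) in check_split_sizes().items():
    status = "ok" if actual == expected else "mismatch"
    print(f"{name}: {actual} samples (expected {expected}) {status}")
```

Using `csv.reader` rather than counting raw lines keeps the count correct if a quoted field spans several lines.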
This project is released under the MIT License.
Please cite the paper as follows if you use the data or code from DeFiScope:
@inproceedings{zhong2025defiscope,
title={{Detecting Various DeFi Price Manipulations with LLM Reasoning}},
author={Zhong, Juantao and Wu, Daoyuan and Liu, Ye and Xie, Maoyi and Liu, Yang and Li, Yi and Liu, Ning},
booktitle={Proc. IEEE/ACM Automated Software Engineering (ASE)},
year={2025}
}