Abstract
With the rapid development of smart contract technology and the continuous expansion of blockchain application scenarios, the security of smart contracts has attracted significant attention. Traditional fuzz testing, however, typically relies on randomly generated initial seed sets. This random generation cannot capture the semantics of smart contracts, resulting in insufficient seed coverage. Moreover, traditional fuzz testing often ignores the syntactic and semantic constraints within smart contracts, generating seeds that may violate the contracts' syntactic rules or even contradict their semantics, thereby reducing fuzzing efficiency. To address these challenges, we propose an adversarial generation method for smart contract fuzz testing seeds guided by a Chain-Based LLM, leveraging the deep semantic understanding of LLMs to assist seed set generation. First, we propose a Chain-Based prompting method that decomposes the LLM's task into multiple steps, gradually guiding the LLM toward generating high-coverage seed sets. Second, by assigning adversarial roles to the LLM, we guide it to autonomously generate and optimize seed sets, producing high-coverage initial seed sets for the program under test. To evaluate the effectiveness of the proposed method, 2308 smart contracts were crawled from Etherscan for experimental purposes. Results indicate that Chain-Based prompting improved instruction coverage by 2.94% compared with single-step requests, and that the adversarial-role method reduced the time to reach maximum instruction coverage from 60 s to approximately 30 s compared with single-role methods. Additionally, the seed sets generated by the proposed method can directly trigger simple vulnerability types (e.g., timestamp dependency and block number dependency vulnerabilities), with instruction coverage improvements of 3.8% and 4.1%, respectively.
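The two ideas summarized above, prompting the LLM through an ordered chain of steps and pitting a generator role against a critic role, can be illustrated with a minimal control-flow sketch. This is not the paper's implementation: `call_llm` is a hypothetical stand-in for a real LLM API (here it simply echoes the prompt so the loop runs offline), and the step wording and role prompts are illustrative assumptions.

```python
# Hedged sketch of (1) chain-based prompting that decomposes seed generation
# into steps, and (2) adversarial generator/critic roles refining a seed.
# `call_llm` is a placeholder, NOT a real API: it echoes the prompt prefix
# so the control flow can be exercised without network access.

from typing import Callable, List


def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would query an LLM service here.
    return f"[LLM response to: {prompt[:40]}...]"


def chain_based_seeds(contract_src: str, llm: Callable[[str], str]) -> List[str]:
    """Break seed generation into ordered prompt steps instead of one request."""
    steps = [
        "Step 1: Summarize the contract's functions and state variables:\n",
        "Step 2: List input constraints implied by require/assert statements:\n",
        "Step 3: Generate transaction sequences (seeds) meeting those constraints:\n",
    ]
    context = contract_src
    outputs = []
    for step in steps:
        context = llm(step + context)  # each step's answer feeds the next step
        outputs.append(context)
    return outputs


def adversarial_refine(seed: str, llm: Callable[[str], str], rounds: int = 2) -> str:
    """A critic role attacks the seed's coverage; a generator role revises it."""
    for _ in range(rounds):
        critique = llm("As a critic, identify uncovered branches for this seed:\n" + seed)
        seed = llm("As a generator, revise the seed using this critique:\n" + critique)
    return seed
```

In this sketch, each chain step's response becomes the context for the next step, and the adversarial loop alternates critic and generator prompts for a fixed number of rounds; the paper's actual stopping criterion and prompt content are not reproduced here.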
Funding
This work was supported in part by the Key R&D Project of Shaanxi Province (2023-YBGY-030), the Key Industrial Chain Core Technology Research Project of Xi'an (23ZDCYJSGG0028-2022), and the National Natural Science Foundation of China (62272387).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, J., Yin, Z., Zhang, H. et al. Adversarial generation method for smart contract fuzz testing seeds guided by chain-based LLM. Autom Softw Eng 32, 12 (2025). https://doi.org/10.1007/s10515-024-00483-4