Abstract
With the rapid development of smart contract technology and the continuous expansion of blockchain application scenarios, the security of smart contracts has attracted significant attention. Traditional fuzz testing, however, typically relies on randomly generated initial seed sets. This random generation cannot capture the semantics of smart contracts, resulting in insufficient seed coverage. Moreover, traditional fuzz testing often ignores the syntactic and semantic constraints within smart contracts, generating seeds that may violate the contracts' syntactic rules or even contradict their semantics, thereby reducing fuzzing efficiency. To address these challenges, we propose an adversarial generation method for smart contract fuzz testing seeds guided by a Chain-Based LLM, leveraging the deep semantic understanding of LLMs to assist seed set generation. First, we propose a Chain-Based prompting method that decomposes the LLM's task into multiple steps, gradually guiding the LLM toward generating high-coverage seed sets. Second, by assigning adversarial roles to the LLM, we guide it to autonomously generate and optimize seed sets, producing high-coverage initial seed sets for the program under test. To evaluate the effectiveness of the proposed method, 2308 smart contracts were crawled from Etherscan for experimental purposes. Results indicate that Chain-Based prompting improved instruction coverage by 2.94% compared with single-step requests, and that the adversarial-role method reduced the time to reach maximum instruction coverage from 60 s to approximately 30 s compared with single-role methods. Additionally, the seed sets generated by the proposed method can directly trigger simple vulnerability types (e.g., timestamp dependency and block number dependency vulnerabilities), with instruction coverage improvements of 3.8% and 4.1%, respectively.
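The two ideas summarized above, prompting the LLM through an ordered chain of steps and pitting a generator role against a critic role, can be illustrated with a minimal control-flow sketch. This is not the paper's implementation: `call_llm` is a hypothetical stand-in for a real LLM API (here it simply echoes the prompt so the loop runs offline), and the step wording and role prompts are illustrative assumptions.

```python
# Hedged sketch of (1) chain-based prompting that decomposes seed generation
# into steps, and (2) adversarial generator/critic roles refining a seed.
# `call_llm` is a placeholder, NOT a real API: it echoes the prompt prefix
# so the control flow can be exercised without network access.

from typing import Callable, List


def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would query an LLM service here.
    return f"[LLM response to: {prompt[:40]}...]"


def chain_based_seeds(contract_src: str, llm: Callable[[str], str]) -> List[str]:
    """Break seed generation into ordered prompt steps instead of one request."""
    steps = [
        "Step 1: Summarize the contract's functions and state variables:\n",
        "Step 2: List input constraints implied by require/assert statements:\n",
        "Step 3: Generate transaction sequences (seeds) meeting those constraints:\n",
    ]
    context = contract_src
    outputs = []
    for step in steps:
        context = llm(step + context)  # each step's answer feeds the next step
        outputs.append(context)
    return outputs


def adversarial_refine(seed: str, llm: Callable[[str], str], rounds: int = 2) -> str:
    """A critic role attacks the seed's coverage; a generator role revises it."""
    for _ in range(rounds):
        critique = llm("As a critic, identify uncovered branches for this seed:\n" + seed)
        seed = llm("As a generator, revise the seed using this critique:\n" + critique)
    return seed
```

In this sketch, each chain step's response becomes the context for the next step, and the adversarial loop alternates critic and generator prompts for a fixed number of rounds; the paper's actual stopping criterion and prompt content are not reproduced here.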
Funding
This work was supported in part by the Key R&D Project of Shaanxi Province (2023-YBGY-030), the Key Industrial Chain Core Technology Research Project of Xi'an (23ZDCYJSGG0028-2022), and the National Natural Science Foundation of China (62272387).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, J., Yin, Z., Zhang, H. et al. Adversarial generation method for smart contract fuzz testing seeds guided by chain-based LLM. Autom Softw Eng 32, 12 (2025). https://doi.org/10.1007/s10515-024-00483-4