Abstract
With the emergence and growing availability of large-scale unlabeled data, self-supervised learning has become a major development trend. This learning paradigm can acquire high-quality representations of complex data without relying on labels, saving labeling costs, avoiding human annotation errors, and rivaling supervised learning in expressing image features. However, research has revealed that neural network models trained under the self-supervised learning paradigm are insufficiently secure and susceptible to backdoor attacks. Such malicious behavior significantly undermines the security of artificial intelligence models and severely hinders the development of the intelligent Internet of Things. This paper proposes an invisible backdoor attack scheme on key regions based on target neurons. First, the key regions of image feature expression are determined through an attention mechanism. Within these regions, a set of target neurons is trained to obtain a trigger capable of inducing misclassification. Subsequently, a poisoned dataset is constructed to attack the self-supervised training model. The triggers generated by this scheme resemble random noise and are inconspicuous in visual space, achieving a high attack success rate while enhancing both the concealment and the effectiveness of the triggers. The self-supervised model with the implanted backdoor evades backdoor detection, further increasing the model's indistinguishability. Experimental results demonstrate that, while preserving the concealment of the triggers, this scheme achieves a high attack success rate with only 1% poisoned data, and the poisoned model can escape detection.
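To make the described pipeline more concrete, the following is a minimal PyTorch-style sketch of the trigger-generation step, assuming a hypothetical frozen self-supervised encoder `encoder`, an attention-derived key-region mask `mask`, and target-neuron indices `target_idx`. It illustrates the general idea of optimizing a noise-like, region-restricted trigger that excites a chosen set of neurons; it is not the authors' exact implementation.

```python
# Hedged sketch: optimize a small, noise-like trigger confined to an
# attention-derived key region so that selected "target neurons" of a
# frozen self-supervised encoder produce large activations.
# All names (encoder, images, mask, target_idx) are illustrative assumptions.
import torch

def optimize_trigger(encoder, images, mask, target_idx,
                     steps=200, lr=0.01, eps=8 / 255):
    """images: (N, 3, H, W) clean samples in [0, 1]
    mask:   (1, 1, H, W) binary key-region mask from an attention map
    target_idx: indices of target neurons in the encoder's feature vector"""
    encoder.eval()
    trigger = torch.zeros(1, 3, images.shape[-2], images.shape[-1],
                          requires_grad=True)
    opt = torch.optim.Adam([trigger], lr=lr)
    for _ in range(steps):
        # Apply the trigger only inside the key region and bound its
        # magnitude so it stays visually close to random noise.
        delta = trigger.clamp(-eps, eps) * mask
        poisoned = (images + delta).clamp(0, 1)
        feats = encoder(poisoned)                  # (N, D) representations
        loss = -feats[:, target_idx].mean()        # drive target neurons up
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (trigger.detach().clamp(-eps, eps)) * mask
```

A poisoned dataset can then be built by adding the returned trigger to a small fraction of the training images (on the order of 1% in the paper's setting) before self-supervised training.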
X. Qian and Y. He contributed equally to this work.
Acknowledgments
This research is supported by the National Natural Science Foundation of China (NSFC) under grant number 62172377, the Taishan Scholars Program of Shandong province under grant number tsqn202312102, and the Startup Research Foundation for Distinguished Scholars under grant number 202112016.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Qian, X., He, Y., Zhang, R., Kang, Z., Sheng, Y., Xia, H. (2024). Invisible Backdoor Attacks on Key Regions Based on Target Neurons in Self-Supervised Learning. In: Cao, C., Chen, H., Zhao, L., Arshad, J., Asyhari, T., Wang, Y. (eds) Knowledge Science, Engineering and Management. KSEM 2024. Lecture Notes in Computer Science, vol 14886. Springer, Singapore. https://doi.org/10.1007/978-981-97-5498-4_10
DOI: https://doi.org/10.1007/978-981-97-5498-4_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5497-7
Online ISBN: 978-981-97-5498-4
eBook Packages: Computer Science (R0)