FFA: Foreground Feature Approximation Digitally against Remote Sensing Object Detection
Figure 1. Illustration of FFA realizing the targetless attack and the targeted attack. The purple box indicates the part that receives the perturbation and is attacked successfully, and the green box indicates the part that receives no perturbation and is recognized correctly. The orange dashed box indicates the targetless attack, and the blue dashed box indicates the targeted attack.
Figure 2. Visualization of the feature extraction process. The left column shows the input images, the middle column shows the foreground portions of the images, and the right column shows the corresponding shallow features of the foreground images. Blue arrows indicate the foreground extraction process, and purple arrows indicate the feature extraction process. The orange dashed box indicates a targetless attack, and the blue dashed box indicates a targeted attack.
Figure 3. Flowchart of FFA for generating AEs. The bottommost arrow in the figure represents the adversarial example generation process. First, a hybrid image $\tilde{\boldsymbol{x}}$ is generated. The input image $\boldsymbol{x}$ is passed through the detector to obtain the predicted foreground boxes $B_{pre}$ and the predicted labels $l_{pre}$. $B_{pre}$ is used to crop the input image $\boldsymbol{x}$ and the hybrid image $\tilde{\boldsymbol{x}}$, giving the input foreground $\boldsymbol{x}_{\boldsymbol{f}}$ and the hybrid foreground $\widetilde{\boldsymbol{x}_{\boldsymbol{f}}}$. After features are extracted through the backbone network, the KL divergence between the two sets of features is calculated as the feature loss. The Smooth L1 loss between $B_{pre}$ and the real object box $B_{truth}$ is calculated as the object box loss, and the cross-entropy loss between $l_{pre}$ and the background label $l_{bg}$ is calculated as the classification loss; together these constitute the detector prediction loss. The L2 distance between the $(i-1)$-th generated AE $\boldsymbol{x}_{adv}^{i-1}$ and $\boldsymbol{x}$ is calculated as the perception loss. The weighted combination of the feature loss, prediction loss, and perception loss yields the total loss, which is optimized iteratively to generate the adversarial perturbation $\delta_i$; adding $\delta_i$ to $\boldsymbol{x}$ gives the adversarial example $\boldsymbol{x}_{adv}^{i}$ after the iteration.
Figure 4. Visualization of FFA targetless attack detection results. There are four rows in total. The top two rows show the detection results on the original images and the AEs for small objects, and the bottom two rows show the detection results on the original images and the AEs for large objects. The orange dashed section shows the detection results of the two-stage ODs, and the blue dashed section shows the detection results of the single-stage ODs.
Figure 5. Visualization of object detection results for targeted attacks against planes. The part indicated by the red arrow is the target category, orange marks the detection results of the two-stage OD, and blue marks the detection results of the single-stage OD.
Figure 6. Visualization of object detection results for targeted attacks against vehicles. The part indicated by the red arrow is the target category, orange marks the detection results of the two-stage OD, and blue marks the detection results of the single-stage OD.
Figure 7. The impact of the iteration number on the attack. For both the targetless attack and the targeted attack, the trained and attacked OD is OR, and the backbone network is ResNet50.
Figure 8. Visualization of the perturbations in the ablation experiments. The upper row is the targetless attack, and the lower row is the targeted attack. (a,e) are the original images; (b,f) are the perturbations with feature loss only; (c,g) are the perturbations with prediction loss and feature loss; and (d,h) are the perturbations with feature loss, prediction loss, and perception loss.
Abstract
1. Introduction
- We propose a universal digital attack framework, foreground feature approximation (FFA), that uses the output information of multiple modules of object detectors, aiming to find vulnerabilities shared by different detectors and to improve the attack strength and transferability of adversarial examples;
- We use a relatively complete evaluation system, covering attack effectiveness, attack speed, the quality of the adversarial examples, and other relevant measures, to evaluate the proposed algorithm quantitatively and in detail, which makes the results more convincing;
- The results of attacking seven rotated object detectors on the remote sensing object detection datasets DOTA and UCAS-AOD show that our method quickly produces adversarial examples with strong attack ability, high transferability, and high imperceptibility.
2. Related Work
2.1. Adversarial Attacks in Image Classification
2.2. Adversarial Attacks in Object Detection
2.3. Adversarial Attacks in Remote Sensing
3. Methodology
3.1. Problem Analysis
3.1.1. Location of the Perturbations
3.1.2. Magnitude of the Perturbations
3.2. Overview of the Methodology
3.3. Foreground Feature Approximation (FFA) Attack
3.3.1. Targetless Attack
Location of the Perturbation
The Size of the Perturbation
Algorithm 1: Foreground feature approximation (FFA).
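The loss composition described in the Figure 3 caption can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the softmax normalization of features before the KL divergence, the direction of the KL divergence, the sign of each term, and the weights `w_feat`, `w_pred`, and `w_perc` are all hypothetical, and plain Python lists stand in for tensors.

```python
import math


def softmax(values):
    """Turn raw scores into a probability distribution (numerically stable)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth L1 loss between two coordinate lists."""
    total = 0.0
    for a, b in zip(pred, target):
        diff = abs(a - b)
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)


def cross_entropy(logits, label):
    """Cross-entropy of one prediction against an integer class label."""
    return -math.log(softmax(logits)[label])


def l2_distance(a, b):
    """Euclidean distance between two flattened images."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def total_loss(feat_fg, feat_hybrid_fg, box_pre, box_truth, logits_pre,
               bg_label, x_adv_prev, x, w_feat=1.0, w_pred=1.0, w_perc=1.0):
    """Weighted sum of feature, prediction, and perception losses.

    Assumption: all three terms are added with positive weights; the paper's
    actual signs and weights are not given in this section.
    """
    # Feature loss: KL divergence between foreground features of x and x-tilde.
    feature = kl_divergence(softmax(feat_fg), softmax(feat_hybrid_fg))
    # Prediction loss: box loss (B_pre vs. B_truth) plus classification loss
    # (l_pre vs. the background label l_bg).
    prediction = smooth_l1(box_pre, box_truth) + cross_entropy(logits_pre, bg_label)
    # Perception loss: L2 distance between the previous AE and the clean image.
    perception = l2_distance(x_adv_prev, x)
    return w_feat * feature + w_pred * prediction + w_perc * perception
```

In a real attack these quantities would come from the detector and its backbone, and the total loss would be differentiated with respect to the perturbation at each iteration; the sketch only shows how the three terms named in the caption fit together.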
3.3.2. Targeted Attack
Location of the Perturbation
The Size of the Perturbation
4. Experiments
4.1. Experimental Preparation
4.1.1. Datasets
4.1.2. Detectors
4.1.3. Evaluation Metrics
4.1.4. Parameter Setting
4.2. Targetless Attack
4.3. Targeted Attack
4.3.1. White Box Attack Performance
4.3.2. Transferability Experiments
4.3.3. Imperceptibility and Attack Speed Test
4.4. Effect of Iteration Number
4.5. Ablation Experiments
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Dou, P.; Huang, C.; Han, W.; Hou, J.; Zhang, Y.; Gu, J. Remote sensing image classification using an ensemble framework without multiple classifiers. ISPRS J. Photogramm. Remote Sens. 2024, 208, 190–209.
2. Zhu, R.; Ma, S.; Lian, J.; He, L.; Mei, S. Generating Adversarial Examples Against Remote Sensing Scene Classification via Feature Approximation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 10174–10187.
3. Xu, Y.; Ghamisi, P. Universal adversarial examples in remote sensing: Methodology and benchmark. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15.
4. Gu, H.; Gu, G.; Liu, Y.; Lin, H.; Xu, Y. Multi-Branch Attention Fusion Network for Cloud and Cloud Shadow Segmentation. Remote Sens. 2024, 16, 2308.
5. Xiao, T.; Liu, Y.; Huang, Y.; Li, M.; Yang, G. Enhancing multiscale representations with transformer for remote sensing image semantic segmentation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–16.
6. Yi, H.; Liu, B.; Zhao, B.; Liu, E. Small Object Detection Algorithm Based on Improved YOLOv8 for Remote Sensing. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 1734–1747.
7. Wang, W.; Cai, Y.; Luo, Z.; Liu, W.; Wang, T.; Li, Z. SA3Det: Detecting Rotated Objects via Pixel-Level Attention and Adaptive Labels Assignment. Remote Sens. 2024, 16, 2496.
8. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
9. Chen, Y.; Tang, Y.; Xiao, Y.; Yuan, Q.; Zhang, Y.; Liu, F.; He, J.; Zhang, L. Satellite video single object tracking: A systematic review and an oriented object tracking benchmark. ISPRS J. Photogramm. Remote Sens. 2024, 210, 212–240.
10. Zhang, Y.; Pu, C.; Qi, Y.; Yang, J.; Wu, X.; Niu, M.; Wei, M. CDTracker: Coarse-to-Fine Feature Matching and Point Densification for 3D Single-Object Tracking. Remote Sens. 2024, 16, 2322.
11. Xie, Y.; Zhan, N.; Zhu, J.; Xu, B.; Chen, H.; Mao, W.; Luo, X.; Hu, Y. Landslide extraction from aerial imagery considering context association characteristics. Int. J. Appl. Earth Obs. Geoinf. 2024, 131, 103950.
12. Zhu, J.; Zhang, J.; Chen, H.; Xie, Y.; Gu, H.; Lian, H. A cross-view intelligent person search method based on multi-feature constraints. Int. J. Digit. Earth 2024, 17, 2346259.
13. Xu, W.; Feng, Z.; Wan, Q.; Xie, Y.; Feng, D.; Zhu, J.; Liu, Y. Building Height Extraction From High-Resolution Single-View Remote Sensing Images Using Shadow and Side Information. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024.
14. Mei, S.; Lian, J.; Wang, X.; Su, Y.; Ma, M.; Chau, L.P. A comprehensive study on the robustness of image classification and object detection in remote sensing: Surveying and benchmarking. arXiv 2023, arXiv:2306.12111.
15. Baniecki, H.; Biecek, P. Adversarial attacks and defenses in explainable artificial intelligence: A survey. Inf. Fusion 2024, 107, 102303.
16. Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing properties of neural networks. arXiv 2013, arXiv:1312.6199.
17. Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. Proc. IEEE 2023, 111, 257–276.
18. Cui, C.; Ma, Y.; Cao, X.; Ye, W.; Zhou, Y.; Liang, K.; Chen, J.; Lu, J.; Yang, Z.; Liao, K.D.; et al. A survey on multimodal large language models for autonomous driving. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 4–8 January 2024; pp. 958–979.
19. Zhao, J.; Zhao, W.; Deng, B.; Wang, Z.; Zhang, F.; Zheng, W.; Cao, W.; Nan, J.; Lian, Y.; Burke, A.F. Autonomous driving system: A comprehensive survey. Expert Syst. Appl. 2023, 242, 122836.
20. Cai, X.; Tang, X.; Pan, S.; Wang, Y.; Yan, H.; Ren, Y.; Chen, N.; Hou, Y. Intelligent recognition of defects in high-speed railway slab track with limited dataset. Comput.-Aided Civ. Infrastruct. Eng. 2024, 39, 911–928.
21. Niu, H.; Yin, F.; Kim, E.S.; Wang, W.; Yoon, D.Y.; Wang, C.; Liang, J.; Li, Y.; Kim, N.Y. Advances in flexible sensors for intelligent perception system enhanced by artificial intelligence. InfoMat 2023, 5, e12412.
22. Li, G.; Xu, Y.; Ding, J.; Xia, G.S. Towards generic and controllable attacks against object detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–12.
23. Lu, J.; Sibai, H.; Fabry, E. Adversarial examples that fool detectors. arXiv 2017, arXiv:1712.02494.
24. Xie, C.; Wang, J.; Zhang, Z.; Zhou, Y.; Xie, L.; Yuille, A. Adversarial examples for semantic segmentation and object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1369–1378.
25. Kurakin, A.; Goodfellow, I.J.; Bengio, S. Adversarial examples in the physical world. In Artificial Intelligence Safety and Security; Chapman and Hall/CRC: Boca Raton, FL, USA, 2018; pp. 99–112.
26. Thys, S.; Van Ranst, W.; Goedemé, T. Fooling automated surveillance cameras: Adversarial patches to attack person detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 15–20 June 2019; pp. 49–55.
27. Czaja, W.; Fendley, N.; Pekala, M.; Ratto, C.; Wang, I.J. Adversarial examples in remote sensing. In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Seattle, WA, USA, 6–9 November 2018; pp. 408–411.
28. Lu, M.; Li, Q.; Chen, L.; Li, H. Scale-adaptive adversarial patch attack for remote sensing image aircraft detection. Remote Sens. 2021, 13, 4078.
29. Du, A.; Chen, B.; Chin, T.J.; Law, Y.W.; Sasdelli, M.; Rajasegaran, R.; Campbell, D. Physical adversarial attacks on an aerial imagery object detector. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2022; pp. 1796–1806.
30. Lian, J.; Wang, X.; Su, Y.; Ma, M.; Mei, S. CBA: Contextual background attack against optical aerial detection in the physical world. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–16.
31. Rawat, W.; Wang, Z. Deep convolutional neural networks for image classification: A comprehensive review. Neural Comput. 2017, 29, 2352–2449.
32. Agnihotri, S.; Jung, S.; Keuper, M. Cospgd: A unified white-box adversarial attack for pixel-wise prediction tasks. arXiv 2023, arXiv:2302.02213.
33. Liu, H.; Ge, Z.; Zhou, Z.; Shang, F.; Liu, Y.; Jiao, L. Gradient correction for white-box adversarial attacks. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–12.
34. Lin, G.; Pan, Z.; Zhou, X.; Duan, Y.; Bai, W.; Zhan, D.; Zhu, L.; Zhao, G.; Li, T. Boosting adversarial transferability with shallow-feature attack on SAR images. Remote Sens. 2023, 15, 2699.
35. Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and Harnessing Adversarial Examples. In Proceedings of the International Conference on Learning Representations, Banff, AB, Canada, 14–16 April 2014.
36. Shi, Y.; Wang, S.; Han, Y. Curls & whey: Boosting black-box adversarial attacks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–17 June 2019; pp. 6519–6527.
37. Chen, P.Y.; Zhang, H.; Sharma, Y.; Yi, J.; Hsieh, C.J. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, Dallas, TX, USA, 3 November 2017; pp. 15–26.
38. Yin, F.; Zhang, Y.; Wu, B.; Feng, Y.; Zhang, J.; Fan, Y.; Yang, Y. Generalizable black-box adversarial attack with meta learning. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 46, 1804–1818.
39. Brendel, W.; Rauber, J.; Bethge, M. Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. arXiv 2017, arXiv:1712.04248.
40. Reza, M.F.; Rahmati, A.; Wu, T.; Dai, H. Cgba: Curvature-aware geometric black-box attack. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 124–133.
41. Boutros, F.; Struc, V.; Fierrez, J.; Damer, N. Synthetic data for face recognition: Current state and future prospects. Image Vis. Comput. 2023, 135, 104688.
42. Sun, Y.; Zhang, Y.; Wang, H.; Guo, J.; Zheng, J.; Ning, H. SES-YOLOv8n: Automatic driving object detection algorithm based on improved YOLOv8. Signal Image Video Process. 2024, 18, 3983–3992.
43. Wenqi, Y.; Gong, C.; Meijun, W.; Yanqing, Y.; Xingxing, X.; Xiwen, Y.; Junwei, H. MAR20: A benchmark for military aircraft recognition in remote sensing images. Natl. Remote Sens. Bull. 2024, 27, 2688–2696.
44. Yang, J.; Xu, J.; Lv, Y.; Zhou, C.; Zhu, Y.; Cheng, W. Deep learning-based automated terrain classification using high-resolution DEM data. Int. J. Appl. Earth Obs. Geoinf. 2023, 118, 103249.
45. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
46. Wu, S.; Tan, Y.a.; Wang, Y.; Ma, R.; Ma, W.; Li, Y. Towards transferable adversarial attacks with centralized perturbation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 6109–6116.
47. Wang, Z.; Zhang, C. Attacking object detector by simultaneously learning perturbations and locations. Neural Process. Lett. 2023, 55, 2761–2776.
48. Liu, X.; Yang, H.; Liu, Z.; Song, L.; Li, H.; Chen, Y. Dpatch: An adversarial patch attack on object detectors. arXiv 2018, arXiv:1806.02299.
49. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
50. Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A large-scale dataset for object detection in aerial images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3974–3983.
51. Zhu, H.; Chen, X.; Dai, W.; Fu, K.; Ye, Q.; Jiao, J. Orientation robust object detection in aerial images using deep convolutional neural network. In Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada, 27–30 September 2015; pp. 3735–3739.
52. Xie, X.; Cheng, G.; Wang, J.; Yao, X.; Han, J. Oriented R-CNN for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 3520–3529.
53. Xu, Y.; Fu, M.; Wang, Q.; Wang, Y.; Chen, K.; Xia, G.S.; Bai, X. Gliding vertex on the horizontal bounding box for multi-oriented object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 1452–1459.
54. Ding, J.; Xue, N.; Long, Y.; Xia, G.S.; Lu, Q. Learning RoI transformer for oriented object detection in aerial images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2849–2858.
55. Han, J.; Ding, J.; Xue, N.; Xia, G.S. Redet: A rotation-equivariant detector for aerial object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 2786–2795.
56. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
57. Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully convolutional one-stage object detection. arXiv 2019, arXiv:1904.01355.
58. Han, J.; Ding, J.; Li, J.; Xia, G.S. Align deep features for oriented object detection. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–11.
59. Zhou, Y.; Yang, X.; Zhang, G.; Wang, J.; Liu, Y.; Hou, L.; Jiang, X.; Liu, X.; Yan, J.; Lyu, C.; et al. Mmrotate: A rotated object detection benchmark using pytorch. In Proceedings of the 30th ACM International Conference on Multimedia, Lisbon, Portugal, 10–14 October 2022; pp. 7331–7334.
60. Wang, Z.; Li, Q. Information content weighting for perceptual image quality assessment. IEEE Trans. Image Process. 2010, 20, 1185–1198.
61. Chow, K.H.; Liu, L.; Loper, M.; Bae, J.; Gursoy, M.E.; Truex, S.; Wei, W.; Wu, Y. Adversarial objectness gradient attacks in real-time object detection systems. In Proceedings of the 2020 Second IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA), Atlanta, GA, USA, 28–31 October 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 263–272.
62. Chen, P.C.; Kung, B.H.; Chen, J.C. Class-aware robust adversarial training for object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 10420–10429.

Trained OD/Backbone | Attack Method | OR (mAP50 ↓) | GV (mAP50 ↓) | RT (mAP50 ↓) | RD (mAP50 ↓) | S2A (mAP50 ↓) | RR (mAP50 ↓) | RF (mAP50 ↓) | IW-SSIM ↓ | Time ↓ (s/Image) |
---|---|---|---|---|---|---|---|---|---|---|
Clean | - | 84.1 | 80.9 | 87.4 | 83.5 | 81.0 | 78.9 | 77.6 | - | - |
OR [52] R50 | TOG [61] | 11.8 | 26.8 | 30.7 | 40.9 | 26.7 | 28.5 | 26.3 | 0.49 | 4.05 |
OR [52] R50 | CWA [62] | 9.2 | 25.7 | 28.5 | 38.5 | 25.6 | 26.3 | 25.5 | 1.31 | 4.23 |
OR [52] R50 | LGP [22] | 4.1 | 19.3 | 21.6 | 35.3 | 22.0 | 20.4 | 20.7 | 0.22 | 6.12 |
OR [52] R50 | FFA (ours) | 3.3 | 14.2 | 19.0 | 30.2 | 20.5 | 19.1 | 19.8 | 0.85 | 6.68 |
GV [53] R50 | TOG [61] | 40.5 | 29.4 | 28.1 | 60.7 | 34.7 | 36.7 | 37.1 | 0.74 | 6.73 |
GV [53] R50 | CWA [62] | 38.1 | 27.6 | 35.9 | 59.4 | 35.4 | 34.2 | 35.3 | 0.86 | 5.81 |
GV [53] R50 | LGP [22] | 32.7 | 23.1 | 33.7 | 56.3 | 32.0 | 30.6 | 32.2 | 0.38 | 7.85 |
GV [53] R50 | FFA (ours) | 27.2 | 21.6 | 30.5 | 52.8 | 29.5 | 25.7 | 30.6 | 0.61 | 9.03 |
RT [54] R50 | TOG [61] | 40.8 | 36.3 | 30.4 | 62.7 | 37.1 | 35.4 | 37.2 | 0.66 | 7.71 |
RT [54] R50 | CWA [62] | 39.3 | 35.1 | 28.6 | 63.1 | 35.8 | 33.6 | 36.4 | 1.07 | 8.34 |
RT [54] R50 | LGP [22] | 35.4 | 29.8 | 20.8 | 60.2 | 32.3 | 30.1 | 32.4 | 0.52 | 10.37 |
RT [54] R50 | FFA (ours) | 31.9 | 26.2 | 18.0 | 55.0 | 28.6 | 27.5 | 29.6 | 0.62 | 8.47 |
RD [55] R50 | TOG [61] | 58.3 | 61.2 | 68.4 | 27.9 | 62.8 | 64.7 | 60.3 | 0.51 | 6.74 |
RD [55] R50 | CWA [62] | 55.7 | 60.1 | 65.5 | 25.1 | 60.6 | 65.4 | 59.5 | 0.87 | 8.80 |
RD [55] R50 | LGP [22] | 53.6 | 55.8 | 60.3 | 22.0 | 58.2 | 61.1 | 57.9 | 0.18 | 9.65 |
RD [55] R50 | FFA (ours) | 50.2 | 50.6 | 53.0 | 19.6 | 53.4 | 59.7 | 56.2 | 0.36 | 10.35 |
S2A [58] R50 | TOG [61] | 47.5 | 49.8 | 54.1 | 62.7 | 20.6 | 53.1 | 57.4 | 0.98 | 14.26 |
S2A [58] R50 | CWA [62] | 45.1 | 45.6 | 52.4 | 60.3 | 11.9 | 50.4 | 55.3 | 1.25 | 19.54 |
S2A [58] R50 | LGP [22] | 42.8 | 43.3 | 49.6 | 57.0 | 5.2 | 47.4 | 51.7 | 0.64 | 25.32 |
S2A [58] R50 | FFA (ours) | 39.3 | 38.9 | 47.8 | 55.4 | 4.5 | 43.6 | 48.2 | 0.74 | 26.47 |
RR [56] R50 | TOG [61] | 55.7 | 56.4 | 52.1 | 52.8 | 60.3 | 17.3 | 50.1 | 0.83 | 13.67 |
RR [56] R50 | CWA [62] | 56.9 | 55.3 | 50.7 | 50.9 | 58.4 | 14.6 | 48.6 | 1.07 | 15.82 |
RR [56] R50 | LGP [22] | 52.6 | 50.8 | 47.2 | 48.0 | 54.6 | 10.2 | 46.4 | 0.67 | 27.62 |
RR [56] R50 | FFA (ours) | 49.3 | 47.1 | 45.8 | 44.9 | 51.0 | 8.5 | 41.5 | 0.75 | 28.46 |
RF [57] R50 | TOG [61] | 46.5 | 47.9 | 47.4 | 50.5 | 50.4 | 55.7 | 18.3 | 0.71 | 15.65 |
RF [57] R50 | CWA [62] | 45.8 | 45.3 | 45.6 | 47.3 | 47.6 | 53.4 | 15.7 | 0.93 | 19.54 |
RF [57] R50 | LGP [22] | 40.6 | 43.7 | 42.1 | 44.7 | 42.1 | 49.8 | 10.3 | 0.55 | 28.31 |
RF [57] R50 | FFA (ours) | 35.1 | 40.5 | 38.6 | 41.7 | 38.2 | 46.3 | 9.2 | 0.69 | 30.66 |
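
As a reading aid for the table above, the sketch below computes the relative mAP50 drop for the rows where both the trained and the attacked OD are OR. The helper `relative_drop` is a hypothetical convenience and not a metric defined in the paper; the numbers are copied from the Clean row and the OR [52] R50 rows.

```python
def relative_drop(clean_map, attacked_map):
    """Fraction of the clean mAP50 removed by the attack (hypothetical helper)."""
    return (clean_map - attacked_map) / clean_map


# mAP50 on OR: clean value and values after each attack (trained OD: OR, R50).
clean_or = 84.1
attacked_or = {"TOG": 11.8, "CWA": 9.2, "LGP": 4.1, "FFA": 3.3}

drops = {name: relative_drop(clean_or, value) for name, value in attacked_or.items()}
for name, drop in sorted(drops.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {drop:.1%} relative mAP50 drop")
```

A lower attacked mAP50 means a stronger attack, so the method with the largest relative drop is the most effective one in this setting.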

Origin | Target | Attack Method | OR (mAP50 ↓) | GV (mAP50 ↓) | RT (mAP50 ↓) | RD (mAP50 ↓) | S2A (mAP50 ↓) | RF (mAP50 ↓) | RR (mAP50 ↓) | OR (↑) | GV (↑) | RT (↑) | RD (↑) | S2A (↑) | RF (↑) | RR (↑) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Plane | - | Clean | 90.1 | 90.5 | 89.7 | 91 | 89.3 | 90.2 | 89.1 | 3505 | 3241 | 3359 | 3418 | 3525 | 3684 | 3457 |
Plane | Ground track field | TOG [61] | 25.7 | 22.3 | 21.2 | 25.5 | 17.1 | 15.7 | 10.3 | 2215 | 2339 | 2237 | 2189 | 2681 | 2706 | 2892 |
Plane | Ground track field | CWA [62] | 23.4 | 19.6 | 20.1 | 25.2 | 15.3 | 14.2 | 9.8 | 2367 | 2551 | 2342 | 2135 | 2764 | 2833 | 3085 |
Plane | Ground track field | LGP [22] | 20.2 | 16.7 | 16.1 | 23.1 | 11.9 | 10.2 | 3.7 | 2553 | 2780 | 2718 | 2554 | 2937 | 2823 | 3142 |
Plane | Ground track field | FFA (ours) | 18.7 | 13.1 | 12.5 | 19.0 | 7.8 | 6.3 | 1.2 | 2876 | 3108 | 3005 | 2735 | 3121 | 3027 | 3331 |
Plane | Basketball court | TOG [61] | 24.7 | 19.4 | 22.3 | 25.7 | 16.7 | 17.3 | 8.4 | 2147 | 2371 | 2287 | 2108 | 2576 | 2478 | 3059 |
Plane | Basketball court | CWA [62] | 21.3 | 17.3 | 20.5 | 24.4 | 15.2 | 15.2 | 8.9 | 2213 | 2564 | 2349 | 2235 | 2635 | 2593 | 2974 |
Plane | Basketball court | LGP [22] | 19.2 | 14.6 | 17.8 | 21.2 | 10.4 | 10.7 | 4.1 | 2632 | 2744 | 2605 | 2573 | 2854 | 2849 | 3213 |
Plane | Basketball court | FFA (ours) | 17.5 | 12.1 | 15.6 | 17.3 | 6.9 | 7.3 | 2.2 | 2719 | 2893 | 2819 | 2746 | 3122 | 3085 | 3397 |
Plane | Roundabout | TOG [61] | 20.6 | 20.1 | 22.3 | 25.3 | 12.4 | 13.5 | 8.5 | 2368 | 2271 | 2263 | 2182 | 2403 | 2513 | 2889 |
Plane | Roundabout | CWA [62] | 19.4 | 18.7 | 20.5 | 22.7 | 10.9 | 11.6 | 5.2 | 2507 | 2584 | 2416 | 2237 | 2648 | 2678 | 3011 |
Plane | Roundabout | LGP [22] | 17.3 | 16.1 | 17.1 | 20.6 | 9.6 | 9.2 | 2.6 | 2640 | 2747 | 2661 | 2519 | 2931 | 2832 | 3234 |
Plane | Roundabout | FFA (ours) | 15.8 | 12.7 | 17.8 | 18.1 | 7.7 | 6.1 | 0.9 | 2841 | 2908 | 2773 | 2795 | 3086 | 3225 | 3411 |
Vehicle | - | Clean | 81.5 | 82.3 | 79.8 | 85.2 | 83.1 | 81.6 | 82.7 | 3565 | 3483 | 3217 | 3238 | 3804 | 3561 | 3615 |
Vehicle | Ground track field | TOG [61] | 27.1 | 30.3 | 24.6 | 25.8 | 14.2 | 9.6 | 15.2 | 2306 | 2241 | 2048 | 2246 | 2763 | 3047 | 2971 |
Vehicle | Ground track field | CWA [62] | 25.3 | 26.7 | 22.3 | 22.6 | 12.4 | 8.3 | 12.5 | 2253 | 2437 | 2369 | 2517 | 2845 | 3112 | 3102 |
Vehicle | Ground track field | LGP [22] | 22.7 | 21.6 | 19.1 | 20.3 | 9.2 | 4.9 | 6.7 | 2537 | 2614 | 2715 | 2658 | 3183 | 3474 | 3368 |
Vehicle | Ground track field | FFA (ours) | 20.2 | 18.3 | 16.5 | 19.7 | 8.6 | 3.7 | 5.1 | 2618 | 2736 | 2823 | 2709 | 3341 | 3507 | 3472 |
Vehicle | Basketball court | TOG [61] | 25.5 | 27.6 | 27.0 | 28.9 | 12.4 | 9.8 | 10.7 | 2201 | 2356 | 2207 | 2087 | 2655 | 2736 | 2867 |
Vehicle | Basketball court | CWA [62] | 22.4 | 25.4 | 25.8 | 26.3 | 10.7 | 10.5 | 9.8 | 2351 | 2409 | 2365 | 2139 | 2813 | 2912 | 2983 |
Vehicle | Basketball court | LGP [22] | 19.8 | 20.5 | 19.3 | 22.5 | 8.4 | 7.6 | 5.8 | 2689 | 2654 | 2677 | 2483 | 3017 | 3202 | 3391 |
Vehicle | Basketball court | FFA (ours) | 19.6 | 18.7 | 17.9 | 20.7 | 8.0 | 7.6 | 4.4 | 2605 | 2736 | 2782 | 2625 | 3224 | 3285 | 3457 |
Vehicle | Roundabout | TOG [61] | 28.9 | 29.1 | 22.2 | 26.7 | 16.4 | 12.6 | 11.3 | 2168 | 2145 | 2516 | 2253 | 2806 | 3013 | 2975 |
Vehicle | Roundabout | CWA [62] | 25.7 | 27.7 | 20.3 | 25.4 | 15.7 | 10.4 | 9.2 | 2253 | 2208 | 2453 | 2310 | 2849 | 3121 | 3010 |
Vehicle | Roundabout | LGP [22] | 22.4 | 22.3 | 17.8 | 21.6 | 9.6 | 5.8 | 7.1 | 2537 | 2580 | 2769 | 2544 | 3142 | 3348 | 3249 |
Vehicle | Roundabout | FFA (ours) | 21.3 | 19.6 | 16.1 | 21.3 | 10.1 | 5.1 | 6.5 | 2591 | 2674 | 2848 | 2569 | 3077 | 3403 | 3386 |

Origin | Target | Attack Method | Trained OD | GV (mAP50 ↓) | RT (mAP50 ↓) | RD (mAP50 ↓) | RF (mAP50 ↓) | RR (mAP50 ↓) | GV (↑) | RT (↑) | RD (↑) | RF (↑) | RR (↑) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Plane | Basketball court | TOG [61] | OR [52] | 58.6 | 54.5 | 53.5 | 49.1 | 48.5 | 953 | 1024 | 986 | 1055 | 1284 |
Plane | Basketball court | CWA [62] | OR [52] | 55.1 | 52.7 | 47.9 | 45.4 | 42.3 | 899 | 980 | 1443 | 1274 | 1596 |
Plane | Basketball court | LGP [22] | OR [52] | 48.7 | 46.8 | 44.6 | 39.6 | 35.9 | 1258 | 1386 | 1549 | 1595 | 1876 |
Plane | Basketball court | FFA (ours) | OR [52] | 45.8 | 43.2 | 42.6 | 37.1 | 34.3 | 1431 | 1503 | 1661 | 1752 | 1919 |
Plane | Basketball court | TOG [61] | S2A [58] | 73.8 | 70.6 | 61.5 | 62.2 | 50.5 | 536 | 683 | 974 | 961 | 1017 |
Plane | Basketball court | CWA [62] | S2A [58] | 69.1 | 63.2 | 58.7 | 60.1 | 49.8 | 848 | 871 | 1037 | 1083 | 1385 |
Plane | Basketball court | LGP [22] | S2A [58] | 58.5 | 62.7 | 58.3 | 54.4 | 46.1 | 1032 | 896 | 1096 | 1264 | 1568 |
Plane | Basketball court | FFA (ours) | S2A [58] | 59.1 | 60.1 | 57.6 | 53.8 | 44.7 | 1075 | 935 | 1209 | 1348 | 1792 |
Vehicle | Ground track field | TOG [61] | OR [52] | 70.2 | 69.8 | 69.5 | 65.5 | 68.1 | 753 | 726 | 774 | 873 | 754 |
Vehicle | Ground track field | CWA [62] | OR [52] | 68.5 | 70.4 | 68.7 | 65.2 | 66.3 | 796 | 698 | 886 | 912 | 817 |
Vehicle | Ground track field | LGP [22] | OR [52] | 65.7 | 67.5 | 65.8 | 63.1 | 64.0 | 905 | 903 | 1027 | 1108 | 1283 |
Vehicle | Ground track field | FFA (ours) | OR [52] | 65.1 | 66.3 | 64.4 | 60.9 | 61.7 | 937 | 971 | 1136 | 1395 | 1407 |
Vehicle | Ground track field | TOG [61] | S2A [58] | 73.2 | 70.7 | 72.3 | 67.3 | 65.1 | 685 | 751 | 634 | 890 | 1018 |
Vehicle | Ground track field | CWA [62] | S2A [58] | 70.3 | 68.8 | 70.9 | 65.5 | 63.4 | 758 | 891 | 776 | 1068 | 1039 |
Vehicle | Ground track field | LGP [22] | S2A [58] | 67.7 | 63.4 | 68.2 | 60.3 | 59.6 | 979 | 1008 | 872 | 1321 | 1237 |
Vehicle | Ground track field | FFA (ours) | S2A [58] | 67.1 | 62.8 | 65.4 | 59.1 | 57.8 | 1021 | 1085 | 958 | 1377 | 1414 |

Metric | Attack Method | OR | GV | RT | RD | S2A | RF | RR |
---|---|---|---|---|---|---|---|---|
IW-SSIM ↓ | TOG [61] | 1.96 | 2.09 | 1.77 | 2.41 | 2.56 | 1.94 | 2.09 |
IW-SSIM ↓ | CWA [62] | 2.53 | 2.74 | 1.98 | 2.37 | 2.91 | 2.04 | 3.62 |
IW-SSIM ↓ | LGP [22] | 1.56 | 1.47 | 1.53 | 1.94 | 2.13 | 1.06 | 2.76 |
IW-SSIM ↓ | FFA (ours) | 1.78 | 1.95 | 1.72 | 2.01 | 2.35 | 1.25 | 3.12 |
Time (s/Image) ↓ | TOG [61] | 4.49 | 5.58 | 6.53 | 7.43 | 6.14 | 7.07 | 5.86 |
Time (s/Image) ↓ | CWA [62] | 5.36 | 4.07 | 7.74 | 8.65 | 13.55 | 11.68 | 11.23 |
Time (s/Image) ↓ | LGP [22] | 6.83 | 6.15 | 10.62 | 10.31 | 17.54 | 15.47 | 16.21 |
Time (s/Image) ↓ | FFA (ours) | 6.95 | 7.81 | 11.83 | 10.47 | 18.75 | 16.29 | 18.58 |

Attack Method | OD/Backbone | Metric | I = 1 | I = 10 | I = 20 | I = 50 | I = 100 |
---|---|---|---|---|---|---|---|
Targetless FFA | OR/R50 | mAP50 ↓ | 34.2 | 6.7 | 4.1 | 3.3 | 6.8 |
Targetless FFA | OR/R50 | IW-SSIM ↓ | 0.21 | 0.20 | 0.32 | 0.85 | 0.12 |
Targetless FFA | OR/R50 | Time ↓ | 0.85 | 2.41 | 4.36 | 6.68 | 13.75 |
Targeted FFA | OR/R50 | mAP50 ↓ | 55.6 | 39.6 | 25.4 | 18.7 | 22.1 |
Targeted FFA | OR/R50 | ↑ | 1283 | 1583 | 2283 | 2876 | 2679 |
Targeted FFA | OR/R50 | IW-SSIM ↓ | 0.51 | 0.67 | 1.05 | 1.78 | 1.03 |
Targeted FFA | OR/R50 | Time ↓ | 1.03 | 2.29 | 4.71 | 6.95 | 15.33 |
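
Attack (OD) | Setting | Feature loss | Prediction loss | Perception loss | IW-SSIM ↓ | mAP50 ↓ | Time ↓ |
---|---|---|---|---|---|---|---|
Targetless FFA (OR) | Clean | - | - | - | - | 83.3 | - |
Targetless FFA (OR) | 1 | √ | | | 3.81 | 11.2 | 5.38 |
Targetless FFA (OR) | 2 | √ | √ | | 2.51 | 8.9 | 7.45 |
Targetless FFA (OR) | 3 | √ | √ | √ | 0.85 | 3.3 | 9.51 |
Targeted FFA (OR/Plane) | Clean | - | - | - | - | 90.1 | - |
Targeted FFA (OR/Plane) | 1 | √ | | | 5.85 | 25.7 | 2.32 |
Targeted FFA (OR/Plane) | 2 | √ | √ | | 3.64 | 23.9 | 2.96 |
Targeted FFA (OR/Plane) | 3 | √ | √ | √ | 1.50 | 18.7 | 3.55 |
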
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhu, R.; Ma, S.; He, L.; Ge, W. FFA: Foreground Feature Approximation Digitally against Remote Sensing Object Detection. Remote Sens. 2024, 16, 3194. https://doi.org/10.3390/rs16173194