FCOSR: A Simple Anchor-Free Rotated Detector for Aerial Object Detection
<p>Figure 1. FCOSR architecture. The output of the backbone with the feature pyramid network (FPN) [<a href="#B40-remotesensing-15-05499" class="html-bibr">40</a>] is a set of multi-level feature maps, P3–P7. The head is shared across all levels. The predictions on the left of the head form the inference path, while the other components are active only during training. The label assignment module (LAM) allocates labels to each feature map. <span class="html-italic">H</span> and <span class="html-italic">W</span> are the height and width of the feature map, respectively. Stride is the downsampling ratio of each level. <span class="html-italic">C</span> is the number of categories, and the regression branch directly predicts the center point, width, height, and angle of the target.</p>
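Since each head location predicts class scores plus a center offset, size, and angle, mapping a feature-map prediction back to image coordinates only needs the level's stride. A minimal decoding sketch; the offset/size parameterization here is an assumption for illustration, not necessarily the paper's exact one.

```python
def decode_obb(i, j, pred, stride):
    """Decode one location's regression output into an image-space OBB.

    (i, j): row/column index on the feature map; pred = (dx, dy, w, h, theta)
    with the offset and sizes in feature-map units (assumed parameterization).
    Returns (cx, cy, w, h, theta) in image pixels.
    """
    dx, dy, w, h, theta = pred
    cx = (j + dx) * stride
    cy = (i + dy) * stride
    return (cx, cy, w * stride, h * stride, theta)
```

For example, a prediction at row 2, column 3 of a stride-8 map with a half-cell offset decodes to a box centered at (28, 20) in the image.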
<p>Figure 2. Ellipse center area of an OBB. The oriented rectangle represents the OBB of the target, and the shaded area represents the sampling region: (<b>a</b>) general sampling region, (<b>b</b>) horizontal center sampling region, (<b>c</b>) original elliptical region, and (<b>d</b>) shrunken elliptical region.</p>
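The elliptical sampling regions described above can be tested directly: a location is a positive-sample candidate if it falls inside the (optionally shrunken) inscribed ellipse of the OBB. A minimal sketch, assuming the OBB is given as (cx, cy, w, h, theta) with theta in radians; the `shrink` factor is an illustrative parameter.

```python
import math

def in_ellipse_region(x, y, obb, shrink=1.0):
    """Return True if point (x, y) lies inside the inscribed ellipse
    of the oriented box obb = (cx, cy, w, h, theta).

    shrink = 1 gives the original elliptical region; shrink < 1
    shrinks the semi-axes to produce the smaller sampling region.
    """
    cx, cy, w, h, theta = obb
    # Rotate the point into the box's local frame.
    dx, dy = x - cx, y - cy
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    u = dx * cos_t + dy * sin_t   # along the box's width axis
    v = -dx * sin_t + dy * cos_t  # along the box's height axis
    # Semi-axes of the (possibly shrunken) inscribed ellipse.
    a, b = shrink * w / 2.0, shrink * h / 2.0
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0
```

The rotation into the box's local frame makes the test identical for any orientation, which is what lets the elliptical region follow the OBB exactly, unlike the horizontal center sampling region.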
<p>Figure 3. A fuzzy sample label assignment demo: (<b>a</b>) is a 2D label assignment area diagram, and (<b>b</b>) is a 3D visualization of <math display="inline"><semantics> <mrow> <mi>J</mi> <mo>(</mo> <mi>X</mi> <mo>)</mo> </mrow> </semantics></math> for two objects. The red OBB and area represent the court object, and the blue ones represent the ground track field. After computing <math display="inline"><semantics> <mrow> <mi>J</mi> <mo>(</mo> <mi>X</mi> <mo>)</mo> </mrow> </semantics></math>, the smaller area inside the red ellipse is allocated to the court, and the remaining blue area is allocated to the ground track field.</p>
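When a location falls inside the sampling regions of two overlapping objects (the court and the ground track field above), the ambiguity can be resolved by scoring the location under a 2D Gaussian fitted to each OBB and giving the label to the object with the higher score. A minimal sketch of that idea; the exact form of J(X) in the paper may differ, and the choice of semi-axes as standard deviations is an illustrative assumption.

```python
import numpy as np

def obb_gaussian_score(points, obb):
    """Unnormalized 2D Gaussian score of each point under an OBB.

    obb = (cx, cy, w, h, theta). The Gaussian mean is the box center;
    the covariance is axis-aligned with the box, using the semi-axes
    as standard deviations (an illustrative choice).
    """
    cx, cy, w, h, theta = obb
    r = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    cov = r @ np.diag([(w / 2) ** 2, (h / 2) ** 2]) @ r.T
    d = np.asarray(points, dtype=float) - np.array([cx, cy])
    # Squared Mahalanobis distance, then an unnormalized Gaussian in (0, 1].
    m = np.einsum('ni,ij,nj->n', d, np.linalg.inv(cov), d)
    return np.exp(-0.5 * m)

def assign_fuzzy(points, obbs):
    """Assign each point to the OBB with the highest Gaussian score."""
    scores = np.stack([obb_gaussian_score(points, b) for b in obbs])
    return scores.argmax(axis=0)
```

A point near the center of the small red object scores higher under its Gaussian than under the large blue one, so the small object keeps the contested samples, exactly the behavior the figure shows.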
<p>Figure 4. Multi-level sampling: (<b>a</b>) insufficient sampling, where the green points are sampling points; the ship is so narrow that no sampling point falls inside it. (<b>b</b>) A multi-level sampling demo. The red line indicates that the FCOS guidelines assign the target to P6, where it is too narrow to sample effectively. The blue lines indicate that the MLS guidelines additionally assign the target to lower feature levels, so the target is sampled at three different scales, which resolves the insufficient-sampling problem.</p>
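The multi-level sampling idea above can be sketched as a small assignment rule: pick the base level with the FCOS scale rule, then also sample from a few finer levels so narrow objects still capture enough points. The regression ranges and the number of extra levels below are assumptions for illustration, not the paper's exact values.

```python
# FCOS-style regression ranges per FPN level P3-P7 (assumed values).
RANGES = {3: (0, 64), 4: (64, 128), 5: (128, 256),
          6: (256, 512), 7: (512, float('inf'))}

def assign_levels(box_scale, extra_levels=2):
    """Return the FPN levels a target samples from.

    box_scale: characteristic size of the target (e.g. the longer side
    of its OBB). The FCOS rule picks a single base level; multi-level
    sampling additionally uses up to `extra_levels` finer levels,
    never going below P3.
    """
    base = next(lvl for lvl, (lo, hi) in sorted(RANGES.items())
                if lo <= box_scale < hi)
    return list(range(max(3, base - extra_levels), base + 1))
```

Under these assumed ranges, a 300-pixel target lands on P6 by the FCOS rule but also samples from P4 and P5, matching the three-scale sampling the figure depicts.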
<p>Figure 5. Physical picture of the embedded object detection system based on the Nvidia Jetson platform.</p>
<p>Figure 6. Detection result for an entire aerial image on the Nvidia Jetson platform. We detected the P2043 image from the DOTA-v1.0 test set in 1.4 s on a Jetson AGX Xavier device and visualized the results. The size of this large image is 4165 × 3438.</p>
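Since the detector runs on fixed-size crops, whole-image detection on a 4165 × 3438 aerial image presumably tiles the image into overlapping patches, runs the detector on each, shifts the per-patch boxes by the patch offset, and merges them with rotated NMS. A sketch of the tiling step under an assumed patch size and overlap; the merging step is omitted.

```python
def tile_offsets(width, height, patch=1024, overlap=200):
    """Top-left offsets of overlapping patches covering the image.

    patch and overlap are assumed values; the overlap keeps objects
    cut by one patch boundary whole in a neighboring patch.
    """
    stride = patch - overlap
    xs = list(range(0, max(width - patch, 0) + 1, stride))
    ys = list(range(0, max(height - patch, 0) + 1, stride))
    # Ensure the right and bottom borders are fully covered.
    if xs[-1] + patch < width:
        xs.append(width - patch)
    if ys[-1] + patch < height:
        ys.append(height - patch)
    return [(x, y) for y in ys for x in xs]
```

Each detection from the patch at offset (x, y) is then translated by (x, y) into full-image coordinates before the merge.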
<p>Figure 7. FCOSR-M detection results on the DOTA-v1.0 test set. The confidence threshold is set to 0.3 when showing these results.</p>
<p>Figure 8. FCOSR-L detection results on HRSC2016. The confidence threshold is set to 0.3 when visualizing these results.</p>
<p>Figure 9. Speed versus accuracy on the DOTA-v1.0 single-scale test set. X indicates the ResNext backbone, R the ResNet backbone, RR the ReResNet (ReDet) backbone, and Mobile the Mobilenet v2 backbone. We tested ReDet [<a href="#B20-remotesensing-15-05499" class="html-bibr">20</a>], S<math display="inline"><semantics> <msup> <mrow/> <mn>2</mn> </msup> </semantics></math>ANet [<a href="#B16-remotesensing-15-05499" class="html-bibr">16</a>], and R<math display="inline"><semantics> <msup> <mrow/> <mn>3</mn> </msup> </semantics></math>Det [<a href="#B28-remotesensing-15-05499" class="html-bibr">28</a>] on a single RTX 2080-Ti device using their source code. Faster-RCNN-O (FR-O) [<a href="#B8-remotesensing-15-05499" class="html-bibr">8</a>], RetinaNet-O (RN-O) [<a href="#B10-remotesensing-15-05499" class="html-bibr">10</a>], and Oriented RCNN (O-RCNN) [<a href="#B27-remotesensing-15-05499" class="html-bibr">27</a>] test results are from the OBBDetection repository.</p>
Abstract
1. Introduction
- We propose a one-stage anchor-free aerial oriented object detector, which is simple, fast, and easy to deploy.
- We design a set of label assignment strategies based on 2D Gaussian distribution and aerial image characteristics. These strategies assign more appropriate labels to training samples.
- We convert the lightweight FCOSR to the TensorRT format and successfully migrate it to a Jetson Xavier NX, which draws only 15 W of power. The TensorRT model achieves an mAP of 73.93 at 10.68 FPS on the DOTA-v1.0 test set.
- Our method achieves an mAP of 79.25, 75.41, and 90.15 on the DOTA-v1.0, DOTA-v1.5, and HRSC2016 datasets, respectively. FCOSR outperforms other anchor-free methods and surpasses many two-stage methods in single-scale performance. Our model greatly narrows the speed and accuracy gap between anchor-free and anchor-based methods, and in the speed-accuracy trade-off it surpasses current mainstream models.
2. Related Works
2.1. Anchor-Based Methods
2.2. Anchor-Free Methods
3. Method
3.1. Network Outputs
3.2. Ellipse Center Sampling
3.3. Fuzzy Sample Label Assignment
3.4. Multi-Level Sampling
3.5. Target Loss
4. Experiments
4.1. Datasets
4.2. Implementation Details
4.3. Lightweight and Embedded System
Methods | Parameters | Model Size | Input Size | GFLOPs | FPS (Xavier NX/AGX Xavier) | mAP |
---|---|---|---|---|---|---|
FCOSR-lite | 6.9 M | 51.63 MB | 1024 × 1024 | 101.25 | 7.64/12.59 | 74.30 |
FCOSR-tiny | 3.52 M | 23.2 MB | 1024 × 1024 | 35.89 | 10.68/17.76 | 73.93 |
4.4. Comparison with State-of-the-Art Methods
Method | Backbone | Parameters | Input Size | GFLOPs | FPS | mAP |
---|---|---|---|---|---|---|
FCOSR-S | Mobilenet v2 | 7.32 M | 1024 × 1024 | 101.42 | 23.7 | 74.05 |
FCOSR-M | ResNext50 | 31.4 M | 1024 × 1024 | 210.01 | 14.6 | 77.15 |
FCOSR-L | ResNext101 | 89.64 M | 1024 × 1024 | 445.75 | 7.9 | 77.39 |
Method | Backbone | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Anchor-based, two-stage | |||||||||||||||||
ROI-Trans. * [15] | R101 | 88.64 | 78.52 | 43.44 | 75.92 | 68.81 | 73.68 | 83.59 | 90.74 | 77.27 | 81.46 | 58.39 | 53.54 | 62.83 | 58.93 | 47.67 | 69.56 |
CenterMap * [46] | R101 | 89.83 | 84.41 | 54.60 | 70.25 | 77.66 | 78.32 | 87.19 | 90.66 | 84.89 | 85.27 | 56.46 | 69.23 | 74.13 | 71.56 | 66.06 | 76.03 |
SCRDet++ * [47] | R101 | 90.05 | 84.39 | 55.44 | 73.99 | 77.54 | 71.11 | 86.05 | 90.67 | 87.32 | 87.08 | 69.62 | 68.90 | 73.74 | 71.29 | 65.08 | 76.81 |
ReDet [20] | ReR50 | 88.79 | 82.64 | 53.97 | 74.00 | 78.13 | 84.06 | 88.04 | 90.89 | 87.78 | 85.75 | 61.76 | 60.39 | 75.96 | 68.07 | 63.59 | 76.25 |
ReDet * [20] | ReR50 | 88.81 | 82.48 | 60.83 | 80.82 | 78.34 | 86.06 | 88.31 | 90.87 | 88.77 | 87.03 | 68.65 | 66.90 | 79.26 | 79.71 | 74.67 | 80.10 |
Anchor-based, one-stage | |||||||||||||||||
R3Det * [28] | R152 | 89.80 | 83.77 | 48.11 | 66.77 | 78.76 | 83.27 | 87.84 | 90.82 | 85.38 | 85.51 | 65.67 | 62.68 | 67.53 | 78.56 | 72.62 | 76.47 |
CSL * [29] | R152 | 90.13 | 84.43 | 54.57 | 68.13 | 77.32 | 72.98 | 85.94 | 90.74 | 85.95 | 86.36 | 63.42 | 65.82 | 74.06 | 73.67 | 70.08 | 76.24 |
S2ANet [16] | R50 | 89.11 | 82.84 | 48.37 | 71.11 | 78.11 | 78.39 | 87.25 | 90.83 | 84.90 | 85.64 | 60.36 | 62.60 | 65.26 | 69.13 | 57.94 | 74.12 |
S2ANet * [16] | R50 | 88.89 | 83.60 | 57.74 | 81.95 | 79.94 | 83.19 | 89.11 | 90.78 | 84.87 | 87.81 | 70.30 | 68.25 | 78.30 | 77.01 | 69.58 | 79.42 |
Anchor-free, one-stage | |||||||||||||||||
BBAVec. * [17] | R101 | 88.63 | 84.06 | 52.13 | 69.56 | 78.26 | 80.40 | 88.06 | 90.87 | 87.23 | 86.39 | 56.11 | 65.62 | 67.10 | 72.08 | 63.96 | 75.36 |
DRN * [48] | H104 | 89.45 | 83.16 | 48.98 | 62.24 | 70.63 | 74.25 | 83.99 | 90.73 | 84.60 | 85.35 | 55.76 | 60.79 | 71.56 | 68.82 | 63.92 | 72.95 |
CFA [49] | R101 | 89.26 | 81.72 | 51.81 | 67.17 | 79.99 | 78.25 | 84.46 | 90.77 | 83.40 | 85.54 | 54.86 | 67.75 | 73.04 | 70.24 | 64.96 | 75.05 |
PolarDet [18] | R50 | 89.73 | 87.05 | 45.30 | 63.32 | 78.44 | 76.65 | 87.13 | 90.79 | 80.58 | 85.89 | 60.97 | 67.94 | 68.20 | 74.63 | 68.67 | 75.02 |
PolarDet * [18] | R101 | 89.65 | 87.07 | 48.14 | 70.97 | 78.53 | 80.34 | 87.45 | 90.76 | 85.63 | 86.87 | 61.64 | 70.32 | 71.92 | 73.09 | 67.15 | 76.64 |
FCOSR-S | Mobile | 89.09 | 80.58 | 44.04 | 73.33 | 79.07 | 76.54 | 87.28 | 90.88 | 84.89 | 85.37 | 55.95 | 64.56 | 66.92 | 76.96 | 55.32 | 74.05 |
FCOSR-S * | Mobile | 88.60 | 84.13 | 46.85 | 78.22 | 79.51 | 77.00 | 87.74 | 90.85 | 86.84 | 86.71 | 64.51 | 68.17 | 67.87 | 72.08 | 62.52 | 76.11 |
FCOSR-M | RX50 | 88.88 | 82.68 | 50.10 | 71.34 | 81.09 | 77.40 | 88.32 | 90.80 | 86.03 | 85.23 | 61.32 | 68.07 | 75.19 | 80.37 | 70.48 | 77.15 |
FCOSR-M * | RX50 | 89.06 | 84.93 | 52.81 | 76.32 | 81.54 | 81.81 | 88.27 | 90.86 | 85.20 | 87.58 | 68.63 | 70.38 | 75.95 | 79.73 | 75.67 | 79.25 |
FCOSR-L | RX101 | 89.50 | 84.42 | 52.58 | 71.81 | 80.49 | 77.72 | 88.23 | 90.84 | 84.23 | 86.48 | 61.21 | 67.77 | 76.34 | 74.39 | 74.86 | 77.39 |
FCOSR-L * | RX101 | 88.78 | 85.38 | 54.29 | 76.81 | 81.52 | 82.76 | 88.38 | 90.80 | 86.61 | 87.25 | 67.58 | 67.03 | 76.86 | 73.22 | 74.68 | 78.80 |
Method | Backbone | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | CC | mAP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RN-O. † [10] | R50 | 71.43 | 77.64 | 42.12 | 64.65 | 44.53 | 56.79 | 73.31 | 90.84 | 76.02 | 59.96 | 46.95 | 69.24 | 59.65 | 64.52 | 48.06 | 0.83 | 59.16 |
FR-O. ‡ [8] | R50 | 71.89 | 74.47 | 44.45 | 59.87 | 51.28 | 68.98 | 79.37 | 90.78 | 77.38 | 67.50 | 47.75 | 69.72 | 61.22 | 65.28 | 60.47 | 1.54 | 62.00 |
MR. ‡ [9] | R50 | 76.84 | 73.51 | 49.90 | 57.80 | 51.31 | 71.34 | 79.75 | 90.46 | 74.21 | 66.07 | 46.21 | 70.61 | 63.07 | 64.46 | 57.81 | 9.42 | 62.67 |
DAFNe * [50] | R101 | 80.69 | 86.38 | 52.14 | 62.88 | 67.03 | 76.71 | 88.99 | 90.84 | 77.29 | 83.41 | 51.74 | 74.60 | 75.98 | 75.78 | 72.46 | 34.84 | 71.99 |
FCOS [11] | R50 | 78.67 | 72.50 | 44.31 | 59.57 | 56.25 | 64.03 | 78.06 | 89.40 | 71.45 | 73.32 | 49.51 | 66.47 | 55.78 | 63.26 | 44.76 | 9.44 | 61.05 |
ReDet ‡ [20] | ReR50 | 79.20 | 82.81 | 51.92 | 71.41 | 52.38 | 75.73 | 80.92 | 90.83 | 75.81 | 68.64 | 49.29 | 72.03 | 73.36 | 70.55 | 63.33 | 11.53 | 66.86 |
ReDet [20] | ReR50 | 88.51 | 86.45 | 61.23 | 81.20 | 67.60 | 83.65 | 90.00 | 90.86 | 84.30 | 75.33 | 71.49 | 72.06 | 78.32 | 74.73 | 76.10 | 46.98 | 76.80 |
FCOSR-S | Mobile | 80.05 | 76.98 | 44.49 | 74.17 | 51.09 | 74.07 | 80.60 | 90.87 | 78.40 | 75.01 | 53.38 | 69.35 | 66.33 | 74.43 | 59.22 | 13.50 | 66.37 |
FCOSR-S * | Mobile | 87.84 | 84.60 | 53.35 | 75.67 | 65.79 | 80.71 | 89.30 | 90.89 | 84.18 | 84.23 | 63.53 | 73.07 | 73.29 | 76.15 | 72.64 | 14.72 | 73.12 |
FCOSR-M | RX50 | 80.48 | 81.90 | 50.02 | 72.32 | 56.82 | 76.37 | 81.06 | 90.86 | 78.62 | 77.32 | 53.63 | 66.92 | 73.78 | 74.20 | 69.80 | 15.73 | 68.74 |
FCOSR-M * | RX50 | 80.85 | 83.89 | 53.36 | 76.24 | 66.85 | 82.54 | 89.61 | 90.87 | 80.11 | 84.27 | 61.72 | 72.90 | 76.23 | 75.28 | 70.01 | 35.87 | 73.79 |
FCOSR-L | RX101 | 80.58 | 85.25 | 51.05 | 70.83 | 57.77 | 76.72 | 81.09 | 90.87 | 78.07 | 77.60 | 51.91 | 68.72 | 75.87 | 72.61 | 69.30 | 31.06 | 69.96 |
FCOSR-L * | RX101 | 87.12 | 83.90 | 53.41 | 70.99 | 66.79 | 82.84 | 89.66 | 90.85 | 81.84 | 84.52 | 67.78 | 74.52 | 77.25 | 74.97 | 75.31 | 44.81 | 75.41 |
Method | Backbone | mAP (07) | mAP (12) |
---|---|---|---|
PIoU [19] | DLA-34 | 89.20 | - |
S2ANet [16] | ResNet101 | 90.17 | 95.01 |
ProbIoU [34] | ResNet50 | 87.09 | - |
DRN [48] | Hourglass104 | - | 92.70 |
CenterMap [46] | ResNet50 | - | 92.80 |
BBAVectors [17] | ResNet101 | 88.60 | - |
PolarDet [18] | ResNet50 | 90.13 | - |
FCOSR-S(ours) | Mobilenet v2 | 90.05 (±0.042) | 92.59 (±0.054) |
FCOSR-M(ours) | ResNext50 | 90.12 (±0.034) | 94.81 (±0.030) |
FCOSR-L(ours) | ResNext101 | 90.13 (±0.028) | 95.70 (±0.026) |
4.5. Ablation Experiments
Method | Rotate Aug. | QFL | ECS | FLA | MLS | mAP |
---|---|---|---|---|---|---|
FCOSR-M | | | | | | 70.40 |
FCOSR-M | ✓ | | | | | 74.43 |
FCOSR-M | ✓ | ✓ | | | | 75.34 |
FCOSR-M | ✓ | ✓ | ✓ | | | 76.37 |
FCOSR-M | ✓ | ✓ | | ✓ | | 75.92 |
FCOSR-M | ✓ | ✓ | | | ✓ | 75.77 |
FCOSR-M | ✓ | ✓ | ✓ | ✓ | | 76.80 |
FCOSR-M | ✓ | ✓ | ✓ | ✓ | ✓ | 77.15 |
Method | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GS | 89.40 | 83.73 | 50.97 | 71.42 | 80.87 | 77.81 | 88.49 | 90.76 | 85.36 | 85.45 | 60.12 | 62.98 | 76.22 | 75.64 | 65.93 | 76.34 |
HCS | 88.09 | 79.57 | 55.31 | 63.63 | 81.13 | 77.67 | 88.11 | 90.80 | 84.85 | 84.11 | 58.75 | 62.29 | 74.29 | 80.51 | 63.11 | 75.48 |
OES | 89.07 | 81.15 | 50.96 | 70.44 | 80.53 | 77.64 | 88.31 | 90.85 | 85.37 | 86.60 | 59.05 | 61.22 | 76.00 | 80.58 | 72.73 | 76.70 |
SES | 88.88 | 82.68 | 50.10 | 71.34 | 81.09 | 77.40 | 88.32 | 90.80 | 86.03 | 85.23 | 61.32 | 68.07 | 75.19 | 80.37 | 70.48 | 77.15 |
Method | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ATSS [51] | 88.91 | 81.79 | 53.93 | 72.42 | 80.75 | 80.77 | 88.33 | 90.79 | 86.27 | 85.54 | 56.99 | 63.19 | 75.90 | 74.61 | 68.87 | 76.60 |
simOTA [3] | 81.31 | 72.89 | 52.85 | 69.79 | 79.89 | 77.17 | 86.87 | 90.11 | 83.07 | 82.38 | 58.96 | 58.31 | 74.37 | 68.75 | 52.74 | 72.63 |
Ours | 88.88 | 82.68 | 50.10 | 71.34 | 81.09 | 77.40 | 88.32 | 90.80 | 86.03 | 85.23 | 61.32 | 68.07 | 75.19 | 80.37 | 70.48 | 77.15 |
4.6. Speed versus Accuracy
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788. [Google Scholar]
- Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
- Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. Yolox: Exceeding yolo series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar]
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
- Chen, Q.; Wang, Y.; Yang, T.; Zhang, X.; Cheng, J.; Sun, J. You Only Look One-level Feature. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Online, 19–25 June 2021. [Google Scholar]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
- Girshick, R. Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 91–99. [Google Scholar] [CrossRef] [PubMed]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969. [Google Scholar]
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
- Tian, Z.; Shen, C.; Chen, H.; He, T. Fcos: Fully convolutional one-stage object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9627–9636. [Google Scholar]
- Tian, Z.; Shen, C.; Chen, H.; He, T. Fcos: A simple and strong anchor-free object detector. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 1922–1933. [Google Scholar] [CrossRef] [PubMed]
- Zhou, X.; Wang, D.; Krähenbühl, P. Objects as points. arXiv 2019, arXiv:1904.07850. [Google Scholar]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37. [Google Scholar]
- Ding, J.; Xue, N.; Long, Y.; Xia, G.S.; Lu, Q. Learning roi transformer for oriented object detection in aerial images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 2849–2858. [Google Scholar]
- Han, J.; Ding, J.; Li, J.; Xia, G.S. Align deep features for oriented object detection. IEEE Trans. Geosci. Remote. Sens. 2021, 60, 1–11. [Google Scholar] [CrossRef]
- Yi, J.; Wu, P.; Liu, B.; Huang, Q.; Qu, H.; Metaxas, D. Oriented object detection in aerial images with box boundary-aware vectors. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 11–17 October 2021; pp. 2150–2159. [Google Scholar]
- Zhao, P.; Qu, Z.; Bu, Y.; Tan, W.; Guan, Q. PolarDet: A fast, more precise detector for rotated target in aerial images. Int. J. Remote Sens. 2021, 42, 5831–5861. [Google Scholar] [CrossRef]
- Chen, Z.; Chen, K.; Lin, W.; See, J.; Yu, H.; Ke, Y.; Yang, C. Piou loss: Towards accurate oriented object detection in complex environments. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 195–211. [Google Scholar]
- Han, J.; Ding, J.; Xue, N.; Xia, G.S. Redet: A rotation-equivariant detector for aerial object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 2786–2795. [Google Scholar]
- Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A large-scale dataset for object detection in aerial images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3974–3983. [Google Scholar]
- Liu, Z.; Yuan, L.; Weng, L.; Yang, Y. A high resolution optical satellite image dataset for ship recognition and some new baselines. In Proceedings of the International Conference on Pattern Recognition Applications and Methods, Porto, Portugal, 24–26 February 2017; SciTePress: Setubal, Portugal, 2017; Volume 2, pp. 324–331. [Google Scholar]
- Ma, J.; Shao, W.; Ye, H.; Wang, L.; Wang, H.; Zheng, Y.; Xue, X. Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans. Multimed. 2018, 20, 3111–3122. [Google Scholar] [CrossRef]
- Liu, L.; Pan, Z.; Lei, B. Learning a rotation invariant detector with rotatable bounding box. arXiv 2017, arXiv:1711.09405. [Google Scholar]
- An, Q.; Pan, Z.; Liu, L.; You, H. DRBox-v2: An improved detector with rotatable boxes for target detection in SAR images. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8333–8349. [Google Scholar] [CrossRef]
- Weiler, M.; Cesa, G. General e (2)-equivariant steerable cnns. Adv. Neural Inf. Process. Syst. 2019, 32, 14334–14345. [Google Scholar]
- Xie, X.; Cheng, G.; Wang, J.; Yao, X.; Han, J. Oriented r-cnn for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021; pp. 3520–3529. [Google Scholar]
- Yang, X.; Liu, Q.; Yan, J.; Li, A.; Zhang, Z.; Yu, G. R3det: Refined single-stage detector with feature refinement for rotating object. arXiv 2019, arXiv:1908.05612. [Google Scholar] [CrossRef]
- Yang, X.; Yan, J.; He, T. On the arbitrary-oriented object detection: Classification based approaches revisited. arXiv 2020, arXiv:2003.05597. [Google Scholar]
- Yang, X.; Hou, L.; Zhou, Y.; Wang, W.; Yan, J. Dense label encoding for boundary discontinuity free rotation detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 15819–15829. [Google Scholar]
- Lin, Y.; Feng, P.; Guan, J.; Wang, W.; Chambers, J. IENet: Interacting embranchment one stage anchor free detector for orientation aerial object detection. arXiv 2019, arXiv:1912.00969. [Google Scholar]
- Yang, X.; Yan, J.; Ming, Q.; Wang, W.; Zhang, X.; Tian, Q. Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021; ACM: New York, NY, USA; Volume 139, pp. 11830–11841. [Google Scholar]
- Yang, X.; Yang, X.; Yang, J.; Ming, Q.; Wang, W.; Tian, Q.; Yan, J. Learning high-precision bounding box for rotated object detection via kullback-leibler divergence. Adv. Neural Inf. Process. Syst. 2021, 34, 18381–18394. [Google Scholar]
- Llerena, J.M.; Zeni, L.F.; Kristen, L.N.; Jung, C. Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection. arXiv 2021, arXiv:2106.06072. [Google Scholar]
- Li, Z.; Hou, B.; Wu, Z.; Guo, Z.; Ren, B.; Guo, X.; Jiao, L. Complete Rotated Localization Loss Based on Super-Gaussian Distribution for Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5618614. [Google Scholar] [CrossRef]
- Li, Z.; Hou, B.; Wu, Z.; Ren, B.; Ren, Z.; Jiao, L. Gaussian synthesis for high-precision location in oriented object detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5619612. [Google Scholar] [CrossRef]
- Wang, J.; Yang, L.; Li, F. Predicting Arbitrary-Oriented Objects as Points in Remote Sensing Images. Remote Sens. 2021, 13, 3731. [Google Scholar] [CrossRef]
- Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 764–773. [Google Scholar]
- He, X.; Ma, S.; He, L.; Zhang, F.; Liu, X.; Ru, L. AROA: Attention Refinement One-Stage Anchor-Free Detector for Objects in Remote Sensing Imagery. In Proceedings of the International Conference on Image and Graphics, Haikou, China, 26–28 December 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 269–279. [Google Scholar]
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
- Clevert, D.A.; Unterthiner, T.; Hochreiter, S. Fast and accurate deep network learning by exponential linear units (elus). arXiv 2015, arXiv:1511.07289. [Google Scholar]
- Li, X.; Wang, W.; Wu, L.; Chen, S.; Hu, X.; Li, J.; Tang, J.; Yang, J. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. Adv. Neural Inf. Process. Syst. 2020, 33, 21002–21012. [Google Scholar]
- Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1492–1500. [Google Scholar]
- Chen, K.; Wang, J.; Pang, J.; Cao, Y.; Xiong, Y.; Li, X.; Sun, S.; Feng, W.; Liu, Z.; Xu, J.; et al. MMDetection: Open mmlab detection toolbox and benchmark. arXiv 2019, arXiv:1906.07155. [Google Scholar]
- Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
- Wang, J.; Yang, W.; Li, H.C.; Zhang, H.; Xia, G.S. Learning center probability map for detecting objects in aerial images. IEEE Trans. Geosci. Remote Sens. 2020, 59, 4307–4323. [Google Scholar] [CrossRef]
- Yang, X.; Yan, J.; Yang, X.; Tang, J.; Liao, W.; He, T. Scrdet++: Detecting small, cluttered and rotated objects via instance-level feature denoising and rotation loss smoothing. arXiv 2020, arXiv:2004.13316. [Google Scholar] [CrossRef]
- Pan, X.; Ren, Y.; Sheng, K.; Dong, W.; Yuan, H.; Guo, X.; Ma, C.; Xu, C. Dynamic refinement network for oriented and densely packed object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020; pp. 11207–11216. [Google Scholar]
- Guo, Z.; Liu, C.; Zhang, X.; Jiao, J.; Ji, X.; Ye, Q. Beyond Bounding-Box: Convex-hull Feature Adaptation for Oriented and Densely Packed Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 8792–8801. [Google Scholar]
- Lang, S.; Ventola, F.; Kersting, K. DAFNe: A One-Stage Anchor-Free Deep Model for Oriented Object Detection. arXiv 2021, arXiv:2109.06148. [Google Scholar]
- Zhang, S.; Chi, C.; Yao, Y.; Lei, Z.; Li, S.Z. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020; pp. 9759–9768. [Google Scholar]
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, Z.; Hou, B.; Wu, Z.; Ren, B.; Yang, C. FCOSR: A Simple Anchor-Free Rotated Detector for Aerial Object Detection. Remote Sens. 2023, 15, 5499. https://doi.org/10.3390/rs15235499