HARNU-Net: Hierarchical Attention Residual Nested U-Net for Change Detection in Remote Sensing Images
Figure 1. Architecture of the proposed HARNU-Net. (a) The backbone of HARNU-Net, used for feature extraction. (b) An improved convolutional block, proposed to enhance backbone performance. AFFM is used for feature fusion; HARM is used for feature filtering and enhancement. The detailed structures of AFFM and HARM are shown in Figure 2 and Figure 3.
Figure 2. Architecture of the Adjacent Feature Fusion Module (AFFM).
Figure 3. Architecture of the Hierarchical Attention Residual Module (HARM).
Figure 4. Architecture of CAM.
Figure 5. Architecture of SAM.
Figure 6. Architecture of CBAM.
Figure 7. Samples from the three datasets, from left to right: CDD, BCDD, and LEVIR-CD. T1 and T2 denote the bi-temporal image pair; GT denotes the ground truth.
Figure 8. Visualization results on the CDD dataset. (a) T1 images. (b) T2 images. (c) Ground truth. (d) FC-EF. (e) FC-Siam-conc. (f) FC-Siam-diff. (g) CDNet. (h) STANet. (i) BIT. (j) SNUNet. (k) Ours. White: correctly detected changed areas; black: correctly detected unchanged areas; red: unchanged areas incorrectly detected as changed (false positives); green: missed changed areas (false negatives).
Figure 9. Visualization results on the BCDD dataset. Panels (a–k) and color coding as in Figure 8.
Figure 10. Visualization results on the LEVIR-CD dataset. Panels (a–k) and color coding as in Figure 8.
Figure 11. Visualization results of the ablation experiments on the CDD dataset. (a) T1 images. (b) T2 images. (c) Ground truth. (d) Baseline. (e) Baseline + A-R. (f) Baseline + AFFM. (g) Baseline + HARM. (h) Baseline + AFFM + HARM. (i) Baseline + A-R + AFFM + HARM. White indicates the predicted changed area; black indicates the predicted unchanged area.
Figure 12. Comparison of attention heat maps before and after improving the baseline model. (a) T1 image. (b) T2 image. (c) Ground truth. (d) Attention heat map generated by the baseline UNet++. (e) Attention heat map generated by HARNU-Net.
Figure 13. Validation F1 curves of the ablation experiments on the CDD dataset. The smaller rectangular box shows a zoomed-in view.
Figure 14. Validation F1 curves for different numbers of HARM branches on the BCDD dataset. The smaller rectangular box shows a zoomed-in view.
Figure 15. Ablation experiments on the LEVIR-CD dataset applying HARM to other models. All scores are expressed as percentages (%).
Abstract
1. Introduction
1. We propose a novel and powerful network for remote sensing image CD, called HARNU-Net. Compared with the baseline network UNet++, our network significantly reduces the miss-detection rate for small changed regions and shows strong robustness to pseudo-changes.
2. We propose HARM, which effectively enhances features at a finer granularity and uses the feature transferability of its hierarchy to filter out redundant information, providing powerful feature representation and analysis capabilities for the model. As a plug-and-play module, HARM can easily be transplanted into other models (a sketch follows this list).
3. The proposed AFFM effectively integrates multi-level features and contextual information, which reduces the learning difficulty of the model during training and makes the boundaries of the output change map more regular.
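Since HARM is plug-and-play, a minimal PyTorch sketch of the idea may help fix terms. This is an illustrative reconstruction, not the authors' exact design: the CBAM sub-module follows Woo et al. [44], while the hierarchical grouping and the `branches` parameter are assumptions inferred from the branch ablation in Section 4.5.2; all class and argument names are hypothetical.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module [44]: channel attention
    followed by spatial attention."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.mlp = nn.Sequential(  # shared MLP for channel attention
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)

    def forward(self, x):
        # Channel attention over global average- and max-pooled descriptors.
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention from channel-wise average and max maps.
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

class HARM(nn.Module):
    """Hierarchical Attention Residual Module (sketch): split the channels
    into `branches` groups, attend to each group with its own CBAM, and feed
    each refined group into the next one (a Res2Net-style hierarchy [48]) so
    information transfers across branches; finish with a residual add."""
    def __init__(self, channels: int, branches: int = 3):
        super().__init__()
        assert channels % branches == 0
        self.branches = branches
        self.attns = nn.ModuleList(CBAM(channels // branches)
                                   for _ in range(branches))

    def forward(self, x):
        groups = torch.chunk(x, self.branches, dim=1)
        outs, prev = [], 0
        for g, attn in zip(groups, self.attns):
            prev = attn(g + prev)      # hierarchy: pass refined features on
            outs.append(prev)
        return x + torch.cat(outs, dim=1)  # residual connection
```

With `branches=3` (the best setting in Section 4.5.2), `HARM(48, 3)(torch.randn(1, 48, 256, 256))` returns a tensor of the same (1, 48, 256, 256) shape, so the module can be dropped onto any feature map whose channel count it divides.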
2. Related Work
3. Methodology
3.1. Network Architecture
- (1) Feature extraction: an improved UNet++ is used as the backbone network, with its encoder adjusted to a Siamese structure to meet the bi-temporal image input requirement of the CD task.
- (2) Feature fusion: unlike most previous approaches, which use only the output features of the last decoder layer of UNet++, we use the output features of all four stages of the network to inform the final result. Because adjacent features from different layers are both similar and complementary, we apply AFFM, based on an adjacency strategy, for complementary feature fusion.
- (3) Feature reinforcement: the four groups of features are reinforced separately by HARM, which is built from the Convolutional Block Attention Module (CBAM) [44] in a hierarchical structure, and the change map is output after this final processing (a data-flow sketch follows this list).
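To make the three-part split concrete, here is a hedged skeleton of the forward pass. The four submodules (`backbone`, `affm`, `harms`, `head`) are hypothetical stand-ins; only the data flow, not any module's internals, is taken from the paper.

```python
import torch
import torch.nn as nn

class HARNUNetSketch(nn.Module):
    """Wiring of the three parts described above."""
    def __init__(self, backbone: nn.Module, affm: nn.Module,
                 harms: list[nn.Module], head: nn.Module):
        super().__init__()
        self.backbone = backbone           # Siamese UNet++: 4 stage outputs
        self.affm = affm                   # Adjacent Feature Fusion Module
        self.harms = nn.ModuleList(harms)  # one HARM per fused feature group
        self.head = head                   # e.g., 1x1 conv -> change map

    def forward(self, t1, t2):
        feats = self.backbone(t1, t2)              # (1) feature extraction
        fused = self.affm(feats)                   # (2) adjacent fusion
        refined = [m(f) for m, f in zip(self.harms, fused)]  # (3) HARM
        return self.head(torch.cat(refined, dim=1))          # change map
```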
3.2. Improved Backbone Network
3.3. Adjacent Feature Fusion Module
3.4. Hierarchical Attention Residual Module
3.5. Loss Function
4. Experiment and Analysis
4.1. Datasets and Pre-Processing
4.2. Evaluation Metrics and Implementation Details
4.3. Analysis of Experimental Results
4.3.1. Comparison Methods
4.3.2. Analysis Experiments on CDD
4.3.3. Analysis Experiments on BCDD
4.3.4. Analysis Experiments on LEVIR-CD
4.4. Ablation Study
4.5. Analysis of the Role of HARM
4.5.1. Attention Module Comparison
4.5.2. Rationality of the Hierarchical Structure
4.5.3. Validity of HARM
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Hussain, M.; Chen, D.; Cheng, A.; Wei, H.; Stanley, D. Change detection from remotely sensed images: From pixel-based to object-based approaches. ISPRS J. Photogramm. Remote Sens. 2013, 80, 91–106.
2. de Alwis Pitts, D.A.; So, E. Enhanced change detection index for disaster response, recovery assessment and monitoring of buildings and critical facilities—A case study for Muzzaffarabad, Pakistan. Int. J. Appl. Earth Obs. Geoinf. 2017, 63, 167–177.
3. He, C.; Wei, A.; Shi, P.; Zhang, Q.; Zhao, Y. Detecting land-use/land-cover change in rural–urban fringe areas using extended change-vector analysis. Int. J. Appl. Earth Obs. Geoinf. 2011, 13, 572–585.
4. Coppin, P.; Lambin, E.; Jonckheere, I.; Muys, B. Digital change detection methods in natural ecosystem monitoring: A review. Anal. Multi-Temporal Remote Sens. Images 2002, 3–36.
5. Celik, T. Unsupervised change detection in satellite images using principal component analysis and k-means clustering. IEEE Geosci. Remote Sens. Lett. 2009, 6, 772–776.
6. Im, J.; Jensen, J.; Tullis, J. Object-based change detection using correlation image analysis and image segmentation. Int. J. Remote Sens. 2008, 29, 399–423.
7. Li, Z.; Shi, W.; Lu, P.; Yan, L.; Wang, Q.; Miao, Z. Landslide mapping from aerial photographs using change detection-based Markov random field. Remote Sens. Environ. 2016, 187, 76–90.
8. Lu, P.; Qin, Y.; Li, Z.; Mondini, A.C.; Casagli, N. Landslide mapping from multi-sensor data through improved change detection-based Markov random field. Remote Sens. Environ. 2019, 231, 111235.
9. Yang, J.; Zhang, D.; Frangi, A.F.; Yang, J.Y. Two-dimensional PCA: A new approach to appearance-based face representation and recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2004, 26, 131–137.
10. MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Oakland, CA, USA, 1 January 1967; Volume 1, pp. 281–297.
11. Melgani, F.; Moser, G.; Serpico, S.B. Unsupervised change-detection methods for remote-sensing images. Opt. Eng. 2002, 41, 3288–3297.
12. Ghosh, A.; Mishra, N.S.; Ghosh, S. Fuzzy clustering algorithms for unsupervised change detection in remote sensing images. Inf. Sci. 2011, 181, 699–715.
13. Christoulas, G.; Tsagaris, V.; Anastassopoulos, V. Textural characterization from various representations of MERIS data. Int. J. Remote Sens. 2007, 28, 675–692.
14. Karachristos, K.; Koukiou, G.; Anastassopoulos, V. Fully polarimetric land cover classification based on hidden Markov models trained with multiple observations. Adv. Remote Sens. 2021, 10, 102–114.
15. Koukiou, G.; Anastassopoulos, V. Fully polarimetric land cover classification based on Markov chains. Adv. Remote Sens. 2021, 10, 47–65.
16. Dong, R.; Xu, D.; Zhao, J.; Jiao, L.; An, J. Sig-NMS-based faster R-CNN combining transfer learning for small target detection in VHR optical remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8534–8545.
17. Mahmoudi, N.; Ahadi, S.M.; Rahmati, M. Multi-target tracking using CNN-based features: CNNMTT. Multimed. Tools Appl. 2019, 78, 7077–7096.
18. Liu, B.; Tang, J.; Huang, H.; Lu, X.Y. Deep learning methods for super-resolution reconstruction of turbulent flows. Phys. Fluids 2020, 32, 025105.
19. Romera, E.; Alvarez, J.M.; Bergasa, L.M.; Arroyo, R. ERFNet: Efficient residual factorized ConvNet for real-time semantic segmentation. IEEE Trans. Intell. Transp. Syst. 2017, 19, 263–272.
20. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377.
21. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
22. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
23. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
24. Chopra, S.; Hadsell, R.; LeCun, Y. Learning a similarity metric discriminatively, with application to face verification. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, 20–25 June 2005; IEEE: Piscataway, NJ, USA, 2005; Volume 1, pp. 539–546.
25. Zheng, Z.; Wan, Y.; Zhang, Y.; Xiang, S.; Peng, D.; Zhang, B. CLNet: Cross-layer convolutional neural network for change detection in optical remote sensing imagery. ISPRS J. Photogramm. Remote Sens. 2021, 175, 247–267.
26. Peng, D.; Zhang, Y.; Guan, H. End-to-end change detection for high resolution satellite images using improved UNet++. Remote Sens. 2019, 11, 1382.
27. Lei, T.; Zhang, Q.; Xue, D.; Chen, T.; Meng, H.; Nandi, A.K. End-to-end change detection using a symmetric fully convolutional network for landslide mapping. In Proceedings of the ICASSP 2019—2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 3027–3031.
28. Raza, A.; Liu, Y.; Huo, H.; Fang, T. EUNet-CD: Efficient UNet++ for change detection of very high-resolution remote sensing images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
29. Chen, H.; Shi, Z. A spatial-temporal attention-based method and a new dataset for remote sensing image change detection. Remote Sens. 2020, 12, 1662.
30. Zhang, C.; Yue, P.; Tapete, D.; Jiang, L.; Shangguan, B.; Huang, L.; Liu, G. A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images. ISPRS J. Photogramm. Remote Sens. 2020, 166, 183–200.
31. Wu, J.; Li, B.; Qin, Y.; Ni, W.; Zhang, H.; Fu, R.; Sun, Y. A multiscale graph convolutional network for change detection in homogeneous and heterogeneous remote sensing images. Int. J. Appl. Earth Obs. Geoinf. 2021, 105, 102615.
32. Wei, H.; Chen, R.; Yu, C.; Yang, H.; An, S. BASNet: A boundary-aware Siamese network for accurate remote-sensing change detection. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5.
33. Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. UNet++: A nested U-Net architecture for medical image segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support; Springer: Berlin/Heidelberg, Germany, 2018; pp. 3–11.
34. Zhan, Y.; Fu, K.; Yan, M.; Sun, X.; Wang, H.; Qiu, X. Change detection based on deep Siamese convolutional network for optical aerial images. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1845–1849.
35. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
36. Daudt, R.C.; Le Saux, B.; Boulch, A. Fully convolutional Siamese networks for change detection. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 4063–4067.
37. Li, X.; He, M.; Li, H.; Shen, H. A combined loss-based multiscale fully convolutional network for high-resolution remote sensing image change detection. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5.
38. Guo, M.H.; Xu, T.X.; Liu, J.J.; Liu, Z.N.; Jiang, P.T.; Mu, T.J.; Zhang, S.H.; Martin, R.R.; Cheng, M.M.; Hu, S.M. Attention mechanisms in computer vision: A survey. arXiv 2021, arXiv:2111.07624.
39. Wang, Q.; Liu, S.; Chanussot, J.; Li, X. Scene classification with recurrent attention of VHR remote sensing images. IEEE Trans. Geosci. Remote Sens. 2018, 57, 1155–1167.
40. Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual attention network for scene segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3146–3154.
41. Li, W.; Zhu, X.; Gong, S. Harmonious attention network for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2285–2294.
42. Wang, D.; Chen, X.; Jiang, M.; Du, S.; Xu, B.; Wang, J. ADS-Net: An attention-based deeply supervised network for remote sensing image change detection. Int. J. Appl. Earth Obs. Geoinf. 2021, 101, 102348.
43. Chen, J.; Yuan, Z.; Peng, J.; Chen, L.; Huang, H.; Zhu, J.; Liu, Y.; Li, H. DASNet: Dual attentive fully convolutional Siamese networks for change detection in high-resolution satellite images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 14, 1194–1206.
44. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
45. Ding, K.; Huo, C.; Xu, Y.; Zhong, Z.; Pan, C. Sparse hierarchical clustering for VHR image change detection. IEEE Geosci. Remote Sens. Lett. 2014, 12, 577–581.
46. Ma, N.; Zhang, X.; Liu, M.; Sun, J. Activate or not: Learning customized activation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 8032–8042.
47. Lei, Y.; Peng, D.; Zhang, P.; Ke, Q.; Li, H. Hierarchical paired channel fusion network for street scene change detection. IEEE Trans. Image Process. 2020, 30, 55–67.
48. Gao, S.H.; Cheng, M.M.; Zhao, K.; Zhang, X.Y.; Yang, M.H.; Torr, P. Res2Net: A new multi-scale backbone architecture. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 652–662.
49. Saha, S.; Bovolo, F.; Bruzzone, L. Unsupervised deep change vector analysis for multiple-change detection in VHR images. IEEE Trans. Geosci. Remote Sens. 2019, 57, 3677–3693.
50. Oksuz, K.; Cam, B.C.; Kalkan, S.; Akbas, E. Imbalance problems in object detection: A review. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3388–3415.
51. Zhang, M.; Shi, W. A feature difference convolutional neural network-based change detection method. IEEE Trans. Geosci. Remote Sens. 2020, 58, 7232–7246.
52. Wang, H.; Cao, P.; Wang, J.; Zaiane, O.R. UCTransNet: Rethinking the skip connections in U-Net from a channel-wise perspective with transformer. arXiv 2021, arXiv:2109.04335.
53. Zhao, Z.; Zeng, Z.; Xu, K.; Chen, C.; Guan, C. DSAL: Deeply supervised active learning from strong and weak labelers for biomedical image segmentation. IEEE J. Biomed. Health Inform. 2021, 25, 3744–3751.
54. Fang, S.; Li, K.; Shao, J.; Li, Z. SNUNet-CD: A densely connected Siamese network for change detection of VHR images. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5.
55. Lebedev, M.; Vizilter, Y.V.; Vygolov, O.; Knyaz, V.; Rubis, A.Y. Change detection in remote sensing images using conditional adversarial networks. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2018, XLII-2, 565–571.
56. Ji, S.; Wei, S.; Lu, M. Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set. IEEE Trans. Geosci. Remote Sens. 2018, 57, 574–586.
57. Peng, X.; Zhong, R.; Li, Z.; Li, Q. Optical remote sensing image change detection based on attention mechanism and image difference. IEEE Trans. Geosci. Remote Sens. 2020, 59, 7296–7307.
58. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
59. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1026–1034.
60. Alcantarilla, P.F.; Stent, S.; Ros, G.; Arroyo, R.; Gherardi, R. Street-view change detection with deconvolutional networks. Auton. Robot. 2018, 42, 1301–1322.
61. Chen, H.; Qi, Z.; Shi, Z. Remote sensing image change detection with transformers. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5607514.
62. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
Input and output sizes of the A-R blocks (Channel × Height × Width):

| Input | Output |
|---|---|
| 3 × 256 × 256 | 48 × 256 × 256 |
| 3 × 256 × 256 | 48 × 256 × 256 |
| 192 × 256 × 256 | 48 × 256 × 256 |
| 240 × 256 × 256 | 48 × 256 × 256 |
| 288 × 256 × 256 | 48 × 256 × 256 |
| 336 × 256 × 256 | 48 × 256 × 256 |
| 48 × 128 × 128 | 96 × 128 × 128 |
| 48 × 128 × 128 | 96 × 128 × 128 |
| 384 × 128 × 128 | 96 × 128 × 128 |
| 480 × 128 × 128 | 96 × 128 × 128 |
| 576 × 128 × 128 | 96 × 128 × 128 |
| 96 × 64 × 64 | 192 × 64 × 64 |
| 96 × 64 × 64 | 192 × 64 × 64 |
| 768 × 64 × 64 | 192 × 64 × 64 |
| 960 × 64 × 64 | 192 × 64 × 64 |
| 192 × 32 × 32 | 384 × 32 × 32 |
| 192 × 32 × 32 | 384 × 32 × 32 |
| 1536 × 32 × 32 | 384 × 32 × 32 |
| 384 × 16 × 16 | 768 × 16 × 16 |
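The table fixes only the tensor shapes. As a shape-compatible placeholder (the actual A-R design is given in Section 3.2 and is not reproduced in this excerpt), the sketch below assumes a residual two-convolution block; `ARBlock` and its internals are hypothetical.

```python
import torch
import torch.nn as nn

class ARBlock(nn.Module):
    """Hypothetical residual two-convolution block matching the table's
    input/output shapes; only the channel arithmetic comes from the table."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        self.skip = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        return torch.relu(self.body(x) + self.skip(x))  # residual form

# Shape check against the first table row: 3 x 256 x 256 -> 48 x 256 x 256.
assert ARBlock(3, 48)(torch.randn(1, 3, 256, 256)).shape == (1, 48, 256, 256)
```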
| Dataset | Resolution | Size | Changed Pixels | Unchanged Pixels | Ratio | Change Objects | Train | Validation | Test |
|---|---|---|---|---|---|---|---|---|---|
| CDD | 0.03–1 m/pixel | 256 × 256 | 134,068,750 | 914,376,178 | 1:6.82 | Buildings, roads, cars, etc. | 10,000 | 3000 | 3000 |
| BCDD | 0.3 m/pixel | 256 × 256 | 21,352,815 | 477,759,663 | 1:22.37 | Sparse large buildings | 5948 | 743 | 743 |
| LEVIR-CD | 0.5 m/pixel | 256 × 256 | 30,913,975 | 637,028,937 | 1:20.61 | Dense small buildings | 7120 | 1024 | 2048 |
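The Ratio column is simply the unchanged-to-changed pixel count, which quantifies the class imbalance each dataset poses; a quick check reproduces it:

```python
# Ratio = unchanged / changed pixels, rounded to two decimals.
datasets = {
    "CDD": (134_068_750, 914_376_178),
    "BCDD": (21_352_815, 477_759_663),
    "LEVIR-CD": (30_913_975, 637_028_937),
}
for name, (changed, unchanged) in datasets.items():
    print(f"{name}: 1:{unchanged / changed:.2f}")
# CDD: 1:6.82, BCDD: 1:22.37, LEVIR-CD: 1:20.61
```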
| Model | Year | Architecture | Main Strategy | Loss Function |
|---|---|---|---|---|
| FC-EF | 2018 | Single-stream, FCN | Skip connections, multi-level fusion | WSCE loss |
| FC-Siam-conc | 2018 | Siamese, FCN | Siamese concatenation, skip connections, multi-level fusion | WSCE loss |
| FC-Siam-diff | 2018 | Siamese, FCN | Siamese difference, skip connections, multi-level fusion | WSCE loss |
| CDNet | 2018 | Single-stream, FCN | Stacked contraction and expansion blocks | Weighted cross-entropy loss |
| STANet | 2020 | Siamese, ResNet | BAM, PAM | Batch-balanced contrastive loss |
| BIT | 2021 | Siamese, ResNet | Bi-temporal image transformer | Cross-entropy loss |
| SNUNet | 2021 | Siamese, UNet++ | Dense connections, ECAM | WSCE loss, dice loss |
Comparison results on the three datasets, reported as Pre/Recall/F1/IoU/OA (%):

| Model | CDD | BCDD | LEVIR-CD |
|---|---|---|---|
| FC-EF | 76.56/43.49/55.47/38.38/91.76 | 82.28/70.66/76.03/61.33/97.92 | 82.27/66.28/73.41/58.00/97.55 |
| FC-Siam-conc | 88.00/53.58/66.61/49.93/93.66 | 40.09/73.84/51.97/35.11/93.63 | 86.81/67.66/76.05/61.36/97.83 |
| FC-Siam-diff | 88.49/51.53/65.14/48.30/93.49 | 38.82/71.80/50.40/33.69/93.40 | 86.55/74.38/80.00/66.68/98.11 |
| CDNet | 91.93/84.87/88.26/78.98/97.34 | 92.16/83.18/87.44/77.68/98.88 | 88.38/85.08/86.70/76.52/98.67 |
| STANet | 88.97/94.31/91.56/84.44/97.95 | 91.25/86.18/88.64/79.61/98.97 | 80.99/91.21/85.79/75.12/98.46 |
| BIT | 95.86/94.59/95.22/90.88/98.88 | 86.07/85.61/85.84/75.19/98.68 | 91.95/88.57/90.23/82.19/99.02 |
| SNUNet | 96.82/96.72/96.77/93.74/99.24 | 88.35/87.80/88.07/78.69/98.89 | 91.66/88.48/90.04/81.89/99.00 |
| Ours | 97.10/97.30/97.20/94.56/99.34 | 92.70/88.72/90.67/82.93/99.15 | 91.23/89.37/90.29/82.30/99.02 |
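For reference, the five reported metrics follow the standard pixel-level confusion-matrix definitions (the paper's own formulas are not restated in this excerpt), where TP, FP, TN, and FN denote true positives, false positives, true negatives, and false negatives:

```latex
\mathrm{Pre} = \frac{TP}{TP+FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP+FN}, \qquad
F1 = \frac{2\,\mathrm{Pre}\cdot\mathrm{Recall}}{\mathrm{Pre}+\mathrm{Recall}}, \qquad
\mathrm{IoU} = \frac{TP}{TP+FP+FN}, \qquad
\mathrm{OA} = \frac{TP+TN}{TP+TN+FP+FN}
```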
Ablation results on the CDD dataset (scores in %; the last column reports inference throughput in images per second):

| Baseline | A-R | AFFM | HARM | Pre | Recall | F1 | IoU | OA | Images/s |
|---|---|---|---|---|---|---|---|---|---|
| ✓ | | | | 94.99 | 90.93 | 92.91 | 86.77 | 98.36 | 41.10 |
| ✓ | ✓ | | | 94.77 | 92.00 | 93.36 | 87.55 | 98.46 | 21.28 |
| ✓ | | ✓ | | 95.32 | 93.31 | 94.31 | 89.22 | 98.67 | 38.96 |
| ✓ | | | ✓ | 96.70 | 96.66 | 96.68 | 93.57 | 99.22 | 29.70 |
| ✓ | | ✓ | ✓ | 96.94 | 97.06 | 97.00 | 94.17 | 99.29 | 26.41 |
| ✓ | ✓ | ✓ | ✓ | 97.10 | 97.30 | 97.20 | 94.56 | 99.34 | 18.40 |
Comparison of attention modules on the CDD dataset (scores in %):

| Module | Pre | Recall | F1 | IoU | OA |
|---|---|---|---|---|---|
| CAM | 95.17 | 93.58 | 94.37 | 89.35 | 98.68 |
| SAM | 95.50 | 94.01 | 94.75 | 90.02 | 98.77 |
| CBAM | 96.18 | 94.60 | 95.38 | 91.17 | 98.92 |
| SE | 96.71 | 95.84 | 96.27 | 92.82 | 99.13 |
| HARM | 97.10 | 97.30 | 97.20 | 94.56 | 99.34 |
Results for different numbers of HARM branches on the BCDD dataset (scores in %):

| HARM Branches | Pre | Recall | F1 | IoU | OA |
|---|---|---|---|---|---|
| 1 | 88.87 | 84.94 | 86.86 | 76.77 | 98.80 |
| 2 | 90.64 | 86.08 | 88.30 | 79.05 | 98.94 |
| 3 | 92.70 | 88.72 | 90.67 | 82.93 | 99.15 |
| 4 | 86.22 | 88.77 | 87.47 | 77.75 | 98.81 |
| 6 | 89.92 | 86.10 | 87.97 | 78.53 | 98.90 |