Automated Pixel-Level Deep Crack Segmentation on Historical Surfaces Using U-Net Models
Figure 1. Historical building surface examples.
Figure 2. Structure of the proposed deep crack segmentation approach.
Figure 3. Samples of cracks in the primary dataset.
Figure 4. Illustration of semi-supervised GrabCut pixel-level map generation.
Figure 5. Architecture of Deep Residual U-Net (ResU-Net).
Figure 6. Architecture of ResU-Net++.
Figure 7. Architecture of U²-Net.
Figure 8. General structure of the RSU block.
Figure 9. Training dataset samples.
Figure 10. Semi-supervised GrabCut pixel-level map generation.
Figure 11. U²-Net result samples.
Figure 12. Results on several samples of the used crack detection dataset, with different patterns (group 1). In each column: (a) blurred image with a crack; (b) wide crack with a marker; (c) branched crack with a flower; (d) branched crack with a flower (a different point of view); (e) branched crack.
Figure 13. Results on several samples of the used crack detection dataset, with different patterns (group 2). In each column: (a) blurred image with a crack; (b) wide crack with a marker; (c) blurred image with a crack and texture; (d) blurred image with a crack; (e) image with a crack inside carvings.
Figure 14. Results on several samples of the used crack detection dataset, with different patterns (group 3). In each column: (a) image with a short wide crack on the edge; (b) wide crack on the edge; (c) image with a crack and corrosion; (d) blurred image with a crack; (e) blurred image with a crack inside ornaments.
Figure 15. Results on several samples of the used crack detection dataset, with different patterns (group 4). In each column: (a) image with a branched crack (a multi-depth view); (b) image with a crack inside carvings; (c) image with a crack and marker; (d) image with a crack inside ornaments; (e) image with a branched crack.
Abstract
1. Introduction
- Developing an automated pixel-level detection approach by assessing various U-Net deep learning architectures for deep crack segmentation on historical surfaces.
- Investigating two loss functions, namely Dice and cross-entropy (CE), in addition to a third hybrid loss function, for training and enhancing the performance of the proposed approach.
- Constructing an expert-annotated primary dataset of crack images on historical surfaces, collected over two years from a historical location in Historic Cairo, Egypt.
- Applying a contrast stretching method for handling the impacts of different environmental conditions on images of historical surfaces.
- Building an extra semi-supervised pixel-level map generation module for annotating historical surface images, to avoid the cost of pixel-by-pixel manual annotation.
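The Dice, cross-entropy, and hybrid loss functions listed above can be sketched in NumPy. The equal 0.5/0.5 weighting of the hybrid combination and the smoothing constant `eps` are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

def dice_loss(y_true, y_pred, eps=1e-7):
    # Soft Dice loss over a predicted probability map (1 = crack, 0 = intact).
    inter = np.sum(y_true * y_pred)
    return 1.0 - (2.0 * inter + eps) / (np.sum(y_true) + np.sum(y_pred) + eps)

def bce_loss(y_true, y_pred, eps=1e-7):
    # Pixel-averaged binary cross-entropy; clipping avoids log(0).
    p = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

def hybrid_loss(y_true, y_pred, alpha=0.5):
    # Weighted sum of the two terms; alpha = 0.5 is an assumed weighting.
    return alpha * dice_loss(y_true, y_pred) + (1.0 - alpha) * bce_loss(y_true, y_pred)
```

A perfect prediction drives both terms toward zero, while the cross-entropy term penalizes confident wrong pixels far harder than the Dice term does.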
2. State-of-the-Art Studies for Crack Segmentation
3. The Proposed Deep Crack Segmentation Approach
3.1. Data Acquisition Phase
3.2. Data Preparation Phase
1. Image bank generation: raw images are divided into ( pixel resolution) sub-images;
2. Filtering: only sub-images with cracks are kept, while intact ones are ignored; and
3. Augmentation: several spatial transformations are applied systematically, as follows [21]:
   1. Flipping images vertically;
   2. Flipping images horizontally;
   3. Flipping images vertically, then horizontally;
   4. Rotating images by 90°, then by −90°, individually;
   5. Combining the output images of the previous steps with the original images to establish a new dataset ().
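The augmentation steps above admit a direct NumPy sketch; each input yields five transformed copies plus the original:

```python
import numpy as np

def augment(img):
    # Spatial transformations: flips, combined flip, and +/-90 degree rotations.
    v = np.flipud(img)                  # 1. vertical flip
    h = np.fliplr(img)                  # 2. horizontal flip
    vh = np.fliplr(np.flipud(img))      # 3. vertical, then horizontal flip
    r_pos = np.rot90(img, k=1)          # 4a. rotation by 90 degrees
    r_neg = np.rot90(img, k=-1)         # 4b. rotation by -90 degrees
    return [img, v, h, vh, r_pos, r_neg]  # 5. combine with the original
```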
3.3. Semi-Supervised Pixel-Level Map Generation
Algorithm 1: GrabCut pixel-level map generation.
3.4. Segmentation-Based Variant U-Net Models
3.4.1. Deep Residual U-Net (ResU-Net)
3.4.2. ResU-Net++
3.4.3. U²-Net
1. The input convolution layer transforms the input feature map into an intermediate feature map;
2. A U-Net of height L learns to extract and encode multi-scale contextual information, using the intermediate feature map as input;
3. The residual connection fuses the local features and the multi-scale features via a summation operator.
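A purely structural sketch of this three-part composition, with toy stand-in functions replacing the learned convolutions and the embedded U-Net (only the residual wiring mirrors the real RSU block):

```python
import numpy as np

def input_conv(x):
    # Stand-in for the input convolution producing the intermediate map F1(x).
    return 2.0 * x

def u_net_like(f):
    # Stand-in for the height-L U-Net extracting multi-scale context U(F1(x)).
    return f - f.mean()

def rsu_block(x):
    f = input_conv(x)
    return f + u_net_like(f)  # residual fusion: local + multi-scale features
```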
3.4.4. Utilized U-Net-Based Models
- Deep ResU-Net model: In this paper, a nine-level architecture of deep ResU-Net is utilized along with two different loss functions for pixel-by-pixel crack detection. All levels are built with residual blocks comprising two convolution blocks and an identity mapping connecting both the input and output of the block.
- ResU-Net++ model: The original architecture of ResU-Net++ is utilized along with two different loss functions for pixel-by-pixel crack detection, with the filter numbers [16, 32, 64, 128, 256]. The filter numbers were selected based on experiments.
- U²-Net model: Moreover, the original architecture of U²-Net is utilized along with two different loss functions for pixel-by-pixel crack detection.
4. Experimental Results
- True Positive (TP): the pixel is a crack and is classified as a crack;
- False Positive (FP): the pixel is intact and is classified as a crack;
- True Negative (TN): the pixel is intact and is classified as intact;
- False Negative (FN): the pixel is a crack and is classified as intact.
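From these four pixel counts, the scores reported in the result tables (accuracy, precision, recall, mIoU/Jaccard, and Dice) follow directly; a small NumPy sketch, where the `eps` guard against empty denominators is an added assumption:

```python
import numpy as np

def pixel_metrics(y_true, y_pred, eps=1e-12):
    # Per-pixel confusion counts for binary crack (1) vs. intact (0) maps.
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn + eps),
        "precision": tp / (tp + fp + eps),
        "recall": tp / (tp + fn + eps),
        "iou": tp / (tp + fp + fn + eps),          # Jaccard index, crack class
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
    }
```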
4.1. Dataset Description
1. The open crack detection dataset [7] is a benchmark dataset consisting of a total of 537 images with manual annotation maps. It is divided into 300 training and 237 testing images.
2. The CrackForest dataset [31] is another benchmark crack detection dataset, consisting of a total of 118 images.
3. The primary dataset consists of a total of 263 crack images of historical surfaces with ornaments, carvings, wood patterns, separators, and corrosion on walls.
4.2. Results and Discussion
4.3. Comparative Analysis
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Kim, H.; Ahn, E.; Shin, M.; Sim, S.-H. Crack and noncrack classification from concrete surface images using machine learning. Struct. Health Monit. 2019, 18, 725–738. [Google Scholar] [CrossRef]
- Atha, D.J.; Jahanshahi, M.R. Evaluation of deep learning approaches based on convolutional neural networks for corrosion detection. Struct. Health Monit. 2018, 17, 1110–1128. [Google Scholar] [CrossRef]
- Cavalagli, N.; Kita, A.; Falco, S.; Trillo, F.; Costantini, M.; Ubertini, F. Satellite radar interferometry and in-situ measurements for static monitoring of historical monuments: The case of Gubbio, Italy. Remote Sens. Environ. 2019, 235, 111453. [Google Scholar] [CrossRef]
- Zhang, W.; Zhang, Z.; Qi, D.; Liu, Y. Automatic Crack Detection and Classification Method for Subway Tunnel Safety Monitoring. Sensors 2014, 14, 19307–19328. [Google Scholar] [CrossRef] [PubMed]
- Munawar, H.S.; Hammad, A.W.; Haddad, A.; Soares, C.A.P.; Waller, S.T. Image-Based Crack Detection Methods: A Review. Infrastructures 2021, 6, 115. [Google Scholar] [CrossRef]
- Palevičius, P.; Pal, M.; Landauskas, M.; Orinaitė, U.; Timofejeva, I.; Ragulskis, M. Automatic Detection of Cracks on Concrete Surfaces in the Presence of Shadows. Sensors 2022, 22, 3662. [Google Scholar] [CrossRef]
- Liu, Y.; Jian, Y.; Xiaohu, L.; Renping, X.; Li, L. DeepCrack: A deep hierarchical feature learning architecture for crack segmentation. Neurocomputing 2019, 338, 139–153. [Google Scholar] [CrossRef]
- Guan, H.; Li, J.; Yu, Y.; Chapman, M.; Wang, H.; Wang, C.; Zhai, R. Iterative Tensor Voting for Pavement Crack Extraction Using Mobile Laser Scanning Data. IEEE Trans. Geosci. Remote Sens. 2015, 53, 1527–1537. [Google Scholar] [CrossRef]
- Weng, X.; Huang, Y.; Wang, W. Segment-based pavement crack quantification. Autom. Constr. 2019, 105, 102819. [Google Scholar] [CrossRef]
- Chen, T.; Cai, Z.; Zhao, X.; Chen, C.; Liang, X.; Zou, T.; Wang, P. Pavement crack detection and recognition using the architecture of SegNet. J. Ind. Inf. Integr. 2020, 18, 100144. [Google Scholar] [CrossRef]
- Dais, D.; Bal, I.E.; Smyrou, E.; Sarhosis, V. Automatic crack classification and segmentation on masonry surfaces using convolutional neural networks and transfer learning. Autom. Constr. 2021, 125, 103606. [Google Scholar] [CrossRef]
- Song, W.; Jia, G.; Zhu, H.; Jia, D.; Gao, L. Automated pavement Crack damage detection using deep multiscale convolutional features. J. Adv. Transp. 2020, 2020, 1–11. [Google Scholar] [CrossRef]
- Dung, C.V. Autonomous concrete crack detection using deep fully convolutional neural network. Autom. Constr. 2019, 99, 52–58. [Google Scholar] [CrossRef]
- Zhang, X.; Rajan, D.; Story, B. Concrete crack detection using context-aware deep semantic segmentation network. Comput. Aided Civ. Infrastruct. Eng. 2019, 34, 951–971. [Google Scholar] [CrossRef]
- Kang, D.; Benipal, S.S.; Gopal, D.L.; Cha, Y.J. Hybrid pixel-level concrete crack segmentation and quantification across complex backgrounds using deep learning. Autom. Constr. 2020, 118, 103291. [Google Scholar] [CrossRef]
- Liu, J.; Yang, X.; Lau, S.; Wang, X.; Luo, S.; Lee, V.C.-S.; Ding, L. Automated pavement crack detection and segmentation based on two-step convolutional neural network. Comput. Aided Civ. Infrastruct. Eng. 2020, 35, 1291–1305. [Google Scholar] [CrossRef]
- Ghorbanzadeh, O.; Shahabi, H.; Crivellari, A.; Homayouni, S.; Blaschke, T.; Ghamisi, P. Landslide detection using deep learning and object-based image analysis. Landslides 2022, 19, 929–939. [Google Scholar] [CrossRef]
- Jia, A.; Xue, X.; Wang, Y.; Luo, X.; Xue, W. An integrated approach to automatic pixel-level crack detection and quantification of asphalt pavement. Autom. Constr. 2020, 114, 103176. [Google Scholar] [CrossRef]
- Li, Y.; Li, H.; Wang, H. Pixel-Wise Crack Detection Using Deep Local Pattern Predictor for Robot Application. Sensors 2018, 18, 3042. [Google Scholar] [CrossRef]
- Guzmán-Torres, J.A.; Naser, M.Z.; Domínguez-Mota, F.J. Effective medium crack classification on laboratory concrete specimens via competitive machine learning. Structures 2022, 37, 858–870. [Google Scholar] [CrossRef]
- Sajjad, M.; Khan, S.; Muhammad, K.; Wu, W.; Ullah, A.; Baik, S.W. Multi-grade brain tumor classification using deep CNN with extensive data augmentation. J. Comput. Sci. 2019, 30, 174–182. [Google Scholar] [CrossRef]
- Rother, C.; Kolmogorov, V.; Blake, A. “GrabCut”: Interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 2004, 23, 309–314. [Google Scholar] [CrossRef]
- Khattab, D.; Theobalt, C.; Hussein, A.S.; Tolba, M.F. Modified GrabCut for human face segmentation. Ain Shams Eng. J. 2014, 5, 1083–1091. [Google Scholar] [CrossRef]
- Vuola, A.O.; Akram, S.U.; Kannala, J. Mask-RCNN and U-Net ensembled for nuclei segmentation. In Proceedings of the IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Venice, Italy, 8–11 April 2019; pp. 208–212. [Google Scholar] [CrossRef]
- Zhang, Z.; Liu, Q.; Wang, Y. Road extraction by deep residual U-Net. IEEE Geosci. Remote Sens. Lett. 2018, 15, 749–753. [Google Scholar] [CrossRef]
- Jetley, S.; Lord, N.A.; Lee, N.; Torr, P.H.S. Learn To Pay Attention. In Proceedings of the 6th International Conference on Learning Representations, (ICLR 2018), Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
- Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. Unet++: A nested U-Net architecture for medical image segmentation. In Proceedings of the 4th International Workshop of Deep Learning in Medical Image Analysis (DLMIA 2018) & 8th International Workshop of Multimodal Learning for Clinical Decision Support (ML-CDS 2018), Granada, Spain, 20 September 2018; pp. 3–11. [Google Scholar] [CrossRef]
- Xiao, X.; Lian, S.; Luo, Z.; Li, Z.S. Weighted Res-UNet for high-quality retina vessel segmentation. In Proceedings of the 9th IEEE International Conference on Information Technology in Medicine and Education (ITME 2018), Hangzhou, China, 19–21 October 2018; pp. 327–331. [Google Scholar] [CrossRef]
- Qin, X.; Zhang, Z.; Huang, C.; Dehghan, M.; Zaiane, O.R.; Jagersand, M. U2-Net: Going deeper with nested U-structure for salient object detection. Pattern Recognit. 2020, 106, 107404. [Google Scholar] [CrossRef]
- Shi, Y.; Cui, L.; Qi, Z.; Meng, F.; Chen, Z. Automatic road crack detection using random structured forests. IEEE Trans. Intell. Transp. Syst. 2016, 17, 3434–3445. [Google Scholar] [CrossRef]
- Labatut, V.; Cherifi, H. Accuracy measures for the comparison of classifiers. In Proceedings of the 5th International Conference on Information Technology, Amman, Jordan, 11–13 May 2011; pp. 1–5. [Google Scholar] [CrossRef]
Study | Method | Feature | Post Processing | Datasets | Performance |
---|---|---|---|---|---|
[11] | U-Net and FPN with different CNN as encoder | CNN learned features | N/A | Dataset of 351 images containing cracks and 118 intact images | F1 score = 79.6% |
[7] | An end-to-end deep hierarchical CNN with a special loss function | Multi-scale and multi-level CNN features | CRFs and GF methods | Dataset of 537 images with manually annotated maps | mIoU = 85.9%, Best F1 score = 86.5% |
[12] | An end-to-end trainable deep CNN with a multi-scale dilated convolution module | Multi-scale dilated convolution CNN features | N/A | 4736, 1036, and 2416 crack images used as training, validation, and testing sets; CFD and AigleRN datasets used for testing | Recall = 97.85%, F-measure = 97.92%, Precision = 98.00%, mIoU = 73.53% |
[13] | Context-aware deep CNN | CNN learned features | N/A | CrackForest Dataset (CFD), Tomorrows Road Infrastructure Monitoring Management Dataset (TRIMMD), Customized Field Test Dataset (CFTD) | F1 score = 0.8234, 0.7937, 0.8252 |
[15] | A U-Net with a pre-trained ResNet-34 as an encoder, followed by SCSE modules in the decoder | CNN learned features | N/A | CFD dataset, Crack500 dataset | F1 score = 96%, 73% |
[16] | An end-to-end deep FCN with pre-trained VGG16 as the encoder part | CNN learned features | N/A | Crack dataset, containing 40,000 images | Average precision = 90% |
[18] | DeepLabv3+ | Multi-scale CNN features | Skeletonizing and FPT methods | Dataset of 300 crack images captured using a smartphone, another dataset of 80 pavement crack images | mIoU = 0.8342, 0.7331 using validation and testing datasets |
[10] | An encoder-decoder based on a modified SegNet architecture | CNN learned features | N/A | Datasets of 2000 bridge deck images and 1000 pavement crack images | Accuracy = 90% mAP = 83% |
[19] | A crack segmentation method based on DLPP | DLPP features | N/A | 326,000 images collected from 45 various bridges | The proposed method was effective and robust |
[14] | A hybrid method based on a Faster R-CNN, modified TuFF and DTM algorithms | Level set function | N/A | A dataset of 100 images captured from various places | mIoU = 83% |
[20] | Crack detection approach based on an improved VGG-16 transfer-learned model | CNN learned features | N/A | SDNET2018 dataset | Accuracy = 99.5% and F1 score = 100% |
Loss Function | Contrast Stretching | Accuracy % | Precision % | Recall % | mIoU % | Dice Score % | Jaccard %
---|---|---|---|---|---|---|---
Dice Loss | No | 98.189 | 46.312 | 40.651 | 65.226 | 40.231 | 25.181
Dice Loss | Yes | 98.135 | 63.760 | 52.642 | 69.77 | 52.489 | 35.583
Binary Cross-Entropy | No | 98.256 | 58.618 | 41.761 | 67.069 | 40.712 | 25.559
Binary Cross-Entropy | Yes | 98.279 | 79.968 | 61.607 | 74.959 | 57.923 | 40.769
Hybrid | No | 98.022 | 47.336 | 32.855 | 62.567 | 34.495 | 20.842
Hybrid | Yes | 98.155 | 70.195 | 44.622 | 67.876 | 47.678 | 31.301
Loss Function | Contrast Stretching | Accuracy % | Precision % | Recall % | mIoU % | Dice Score % | Jaccard %
---|---|---|---|---|---|---|---
Dice Loss | No | 98.285 | 70.360 | 60.621 | 73.359 | 58.934 | 41.778
Dice Loss | Yes | 97.96 | 66.648 | 74.66 | 76.236 | 68.073 | 51.599
Binary Cross-Entropy | No | 98.318 | 79.161 | 52.466 | 72.124 | 52.447 | 35.545
Binary Cross-Entropy | Yes | 97.99 | 79.12 | 69.205 | 77.572 | 62.27 | 45.212
Hybrid | No | 98.448 | 71.78 | 54.60 | 71.88 | 55.31 | 38.227
Hybrid | Yes | 98.21 | 70.69 | 74.29 | 77.32 | 69.43 | 53.175
Individual Test Dataset | Accuracy % | Precision % | Recall % | mIoU % | Dice Score % | Jaccard %
---|---|---|---|---|---|---
Dataset [7] | 98.67 | 78.10 | 81.81 | 82.01 | 77.80 | 63.67 |
Historical Dataset | 98.19 | 67.03 | 33.17 | 63.92 | 37.98 | 23.44 |
Individual Test Dataset | Accuracy % | Precision % | Recall % | mIoU % | Dice Score % | Jaccard %
---|---|---|---|---|---|---
Dataset [7] | 98.34 | 73.92 | 87.94 | 82.05 | 79.58 | 66.09 |
Historical Dataset | 97.34 | 62.37 | 69.20 | 73.11 | 61.42 | 44.32 |
Loss Function | Contrast Stretching | Accuracy % | Precision % | Recall % | mIoU % | Dice Score % | Jaccard %
---|---|---|---|---|---|---|---
Dice Loss | No | 98.507 | 73.169 | 84.347 | 80.922 | 75.523 | 60.672
Dice Loss | Yes | 98.473 | 72.397 | 84.995 | 80.778 | 75.294 | 60.377
Binary Cross-Entropy | No | 98.239 | 75.248 | 75.795 | 78.834 | 71.678 | 55.858
Binary Cross-Entropy | Yes | 98.189 | 74.267 | 77.406 | 78.944 | 71.929 | 56.153
Hybrid | No | 98.208 | 69.396 | 85.365 | 79.502 | 73.065 | 57.561
Hybrid | Yes | 98.175 | 68.742 | 85.773 | 79.318 | 72.746 | 57.166
Individual Test Dataset | Accuracy % | Precision % | Recall % | mIoU % | Dice Score % | Jaccard %
---|---|---|---|---|---|---
Dataset [7] | 98.87 | 79.71 | 85.39 | 83.78 | 80.52 | 67.392 |
Historical Dataset | 98.32 | 67.73 | 83.03 | 78.38 | 71.09 | 55.147 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Elhariri, E.; El-Bendary, N.; Taie, S.A. Automated Pixel-Level Deep Crack Segmentation on Historical Surfaces Using U-Net Models. Algorithms 2022, 15, 281. https://doi.org/10.3390/a15080281