Search Results (55)

Search Parameters:
Keywords = PA-UNet

21 pages, 10271 KiB  
Article
HSP-UNet: An Accuracy and Efficient Segmentation Method for Carbon Traces of Surface Discharge in the Oil-Immersed Transformer
by Hongxin Ji, Xinghua Liu, Peilin Han, Liqing Liu and Chun He
Sensors 2024, 24(19), 6498; https://doi.org/10.3390/s24196498 - 9 Oct 2024
Viewed by 347
Abstract
Restricted by a metal-enclosed structure, the internal defects of large transformers are difficult to visually detect. In this paper, a micro-robot is used to visually inspect the interior of a transformer. For the micro-robot to successfully detect the discharge level and insulation degradation trend in the transformer, it is essential to segment the carbon trace accurately and rapidly from the complex background. However, the complex edge features and significant size differences of carbon traces pose a serious challenge for accurate segmentation. To this end, we propose the Hadamard product-Spatial coordinate attention-PixelShuffle UNet (HSP-UNet), an architecture specifically designed for carbon trace segmentation. To address the pixel over-concentration and weak contrast of carbon trace images, the Adaptive Histogram Equalization (AHE) algorithm is used for image enhancement. To fuse carbon trace features at different scales effectively and reduce model complexity, a novel grouped Hadamard Product Attention (HPA) module is designed to replace the original convolution module of the UNet. Meanwhile, to improve the activation intensity and segmentation completeness of carbon traces, the Spatial Coordinate Attention (SCA) mechanism is designed to replace the original skip connection. Furthermore, the PixelShuffle up-sampling module is used to improve the parsing of complex boundaries. Compared with UNet, UNet++, UNeXt, MALUNet, and EGE-UNet, HSP-UNet outperformed all of these state-of-the-art methods on both carbon trace datasets. For dendritic carbon traces, HSP-UNet improved the Mean Intersection over Union (MIoU), Pixel Accuracy (PA), and Class Pixel Accuracy (CPA) of the benchmark UNet by 2.13, 1.24, and 4.68 percentage points, respectively. For clustered carbon traces, HSP-UNet improved MIoU, PA, and CPA by 0.98, 0.65, and 0.83 percentage points, respectively. The validation results also showed that HSP-UNet is highly lightweight, with only 0.061 M parameters and 0.066 GFLOPs. This study could contribute to the accurate segmentation of discharge carbon traces and the assessment of the insulation condition of oil-immersed transformers.
(This article belongs to the Section Sensors and Robotics)
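The PixelShuffle up-sampling credited above with better parsing of complex boundaries rearranges channels into spatial resolution rather than interpolating. A minimal PyTorch sketch of such an up-sampling block follows; the module name and channel sizes are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class PixelShuffleUp(nn.Module):
    """Up-sample by rearranging channels into space (sub-pixel convolution)."""
    def __init__(self, in_channels: int, out_channels: int, scale: int = 2):
        super().__init__()
        # 1x1 conv expands channels so PixelShuffle can trade them for resolution.
        self.expand = nn.Conv2d(in_channels, out_channels * scale ** 2, kernel_size=1)
        self.shuffle = nn.PixelShuffle(scale)  # (C*r^2, H, W) -> (C, H*r, W*r)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.shuffle(self.expand(x))

# Example: double the spatial resolution of a 64-channel feature map.
feat = torch.randn(1, 64, 32, 32)
up = PixelShuffleUp(in_channels=64, out_channels=32, scale=2)
print(up(feat).shape)  # torch.Size([1, 32, 64, 64])
```

Because each output pixel is produced from learned channel weights rather than fixed interpolation, sub-pixel up-sampling tends to preserve sharper boundaries than bilinear resizing.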
Figure 1. Surface discharge and carbon traces of different parts inside the transformer.
Figure 2. Significant contrast, size differences, and complex edges of the samples.
Figure 3. The micro-robot for transformer internal inspection.
Figure 4. Test platform for carbon trace image acquisition.
Figure 5. Examples of two kinds of discharge carbon traces.
Figure 6. Comparison of carbon trace image with and without the AHE.
Figure 7. Network structure of the proposed HSP-UNet.
Figure 8. Structure of the grouped HPA module.
Figure 9. Structure of the CA module.
Figure 10. Structure of the SCA.
Figure 11. Segmentation comparison of the dendritic carbon traces.
Figure 12. Segmentation comparison of the clustered carbon traces.
Figure 13. Segmentation performance with samples in different light conditions.
Figure 14. Segmentation performance with samples of different sizes.
Figure 15. Grad-CAM comparison of the HSP-UNet ablation test.
19 pages, 7665 KiB  
Article
Chestnut Burr Segmentation for Yield Estimation Using UAV-Based Imagery and Deep Learning
by Gabriel A. Carneiro, Joaquim Santos, Joaquim J. Sousa, António Cunha and Luís Pádua
Drones 2024, 8(10), 541; https://doi.org/10.3390/drones8100541 - 1 Oct 2024
Viewed by 587
Abstract
Precision agriculture (PA) has advanced agricultural practices, offering new opportunities for crop management and yield optimization. The use of unmanned aerial vehicles (UAVs) in PA enables high-resolution data acquisition, which has been adopted across different agricultural sectors. However, its application for decision support in chestnut plantations remains under-represented. This study presents the initial development of a methodology for segmenting chestnut burrs from UAV-based imagery to estimate productivity from point cloud data. Deep learning (DL) architectures, including U-Net, LinkNet, and PSPNet, were employed for chestnut burr segmentation in UAV images captured at a 30 m flight height, with YOLOv8m trained for comparison. Two datasets were used to train and evaluate the models: one newly introduced in this study and an existing dataset. U-Net demonstrated the best performance, achieving an F1-score of 0.56 and a counting accuracy of 0.71 on the proposed dataset when trained on a combination of both datasets. The primary challenge encountered was that burrs often grow in clusters, leading to unified regions in segmentation, making object detection potentially more suitable for counting. Nevertheless, the results show that DL architectures can generate masks for point cloud segmentation, supporting precise chestnut tree production estimation in future studies.
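The reported F1-score can be computed directly from predicted and reference binary burr masks. A minimal NumPy sketch of the standard definition, assuming same-shaped 0/1 masks (the paper's exact evaluation protocol may differ):

```python
import numpy as np

def f1_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """F1 (Dice) score between two binary masks of identical shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return float(2 * precision * recall / (precision + recall + eps))

# Toy example: a prediction that misses part of the annotated burr region.
target = np.zeros((8, 8), dtype=np.uint8); target[2:6, 2:6] = 1
pred = np.zeros_like(target); pred[2:6, 2:4] = 1
print(round(f1_score(pred, target), 3))  # 0.667
```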
Figure 1. Methodological pipeline for chestnut yield estimation from UAV imagery, including data acquisition and processing, imagery segmentation, and point cloud processing.
Figure 2. Example of a UAV image on the chestnut grove, and the image split into 48 patches.
Figure 3. Examples of masks obtained using the threshold approach. Each row represents a sample: the original image (a), the resulting mask (b), and the overlapping visualization (c). Threshold method applied to chestnut trees with phytosanitary issues (first row), and to healthy chestnut trees (second row).
Figure 4. Examples of the transformation applied to the proposed dataset to make it suitable for training object detection models. Red bounding boxes represent areas of chestnut burrs.
Figure 5. General overview of the architectures of the selected segmentation models (LinkNet, U-Net, and PSPNet).
Figure 6. Segmentation examples on Dataset 1 for each segmentation model trained on Dataset 1 and by merging both datasets.
Figure 7. Segmentation examples on Dataset 2 for each segmentation model trained on Dataset 2 and by merging both datasets.
Figure 8. Example of occluded chestnut burr (highlighted in the red box) that was not annotated in Dataset 2 and the segmentation results in the different models.
19 pages, 14422 KiB  
Article
YOLO-SegNet: A Method for Individual Street Tree Segmentation Based on the Improved YOLOv8 and the SegFormer Network
by Tingting Yang, Suyin Zhou, Aijun Xu, Junhua Ye and Jianxin Yin
Agriculture 2024, 14(9), 1620; https://doi.org/10.3390/agriculture14091620 - 15 Sep 2024
Viewed by 687
Abstract
In urban forest management, individual street tree segmentation is a fundamental and especially critical step for obtaining tree phenotypes. Most existing tree image segmentation models have been evaluated on smaller datasets and lack experimental verification on larger, publicly available datasets. Therefore, this paper, based on a large, publicly available urban street tree dataset, proposes YOLO-SegNet for individual street tree segmentation. In the first-stage street tree object detection task, the BiFormer attention mechanism was introduced into the YOLOv8 network to increase contextual information extraction and improve the ability of the network to detect multiscale and multishaped targets. In the second-stage street tree segmentation task, the SegFormer network was used to obtain street tree edge information more efficiently. The experimental results indicate that the proposed YOLO-SegNet method, which combines YOLOv8+BiFormer and SegFormer, achieved a 92.0% mean intersection over union (mIoU), 95.9% mean pixel accuracy (mPA), and 97.4% accuracy on a large, publicly available urban street tree dataset. Compared with the fully convolutional neural network (FCN), lite-reduced atrous spatial pyramid pooling (LR-ASPP), pyramid scene parsing network (PSPNet), UNet, DeepLabv3+, and HRNet, the mIoU of YOLO-SegNet increased by 10.5, 9.7, 5.0, 6.8, 4.5, and 2.7 percentage points, respectively. The proposed method can effectively support smart agroforestry development.
(This article belongs to the Section Digital Agriculture)
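The mIoU and mPA values quoted above follow from the per-class confusion matrix. A short NumPy sketch of the standard definitions (the averaging details in the paper may differ):

```python
import numpy as np

def confusion_matrix(pred: np.ndarray, target: np.ndarray, num_classes: int) -> np.ndarray:
    """Rows index the ground-truth class, columns the predicted class."""
    idx = target.astype(int) * num_classes + pred.astype(int)
    return np.bincount(idx.ravel(), minlength=num_classes ** 2).reshape(num_classes, num_classes)

def miou_mpa(cm: np.ndarray):
    tp = np.diag(cm).astype(float)
    iou = tp / (cm.sum(axis=0) + cm.sum(axis=1) - tp)   # per-class IoU
    pa = tp / cm.sum(axis=1)                            # per-class pixel accuracy
    return iou.mean(), pa.mean()

# Toy two-class example (background = 0, tree = 1).
target = np.array([[0, 0, 1, 1], [0, 1, 1, 1]])
pred   = np.array([[0, 1, 1, 1], [0, 1, 1, 0]])
cm = confusion_matrix(pred, target, num_classes=2)
print(miou_mpa(cm))  # (~0.583, ~0.733)
```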
Figure 1. (A) is the number distribution of street tree images; (B) is the street tree image annotation.
Figure 2. Examples of street tree object detection and instance segmentation annotated images for different tree species.
Figure 3. YOLO-SegNet model. The CBS is the basic module, including the Conv2d layer, BatchNorm2d layer, and Sigmoid Linear Unit (SiLU) layer. The function of the CBS module is to introduce a cross-stage partial connection to improve the feature expression ability and information transfer efficiency. The role of the Spatial Pyramid Pooling Fast (SPPF) module is to fuse larger-scale global information to improve the performance of object detection. The bottleneck block can reduce the computational complexity and the number of parameters.
Figure 4. (A) The overall architecture of BiFormer; (B) details of a BiFormer block.
Figure 5. (a) Vanilla attention. (b–d) Local window [40,42], axial stripe [39], and dilated window [41,42]. (e) Deformable attention [43]. (f) Bilevel routing attention, BRA [6].
Figure 6. Gathering key–value pairs in the top k related windows.
Figure 7. (A,B) are the loss function curves of the object detection network on the train and validation sets, respectively; (C,D) are the loss function curves of tree classification on the train and validation sets, respectively; (E–H) are the change curves of the four segmentation indicator values on the validation set, respectively.
Figure 8. (A) Thermal map examples of YOLOv8 series models and YOLOv8m+BiFormer in the training process; (B) example results of the different object detection models on the test set.
Figure 9. (A) The training loss function curves of the segmentation models without the object detection module. (B) The training loss function curves of the segmentation models with the object detection module.
Figure 10. Performance of different segmentation models on the validation and test sets: (A1,A2) the segmentation results on the validation set; (B1,B2) the segmentation results on the test set.
Figure 11. Results of the different segmentation models on the test set.
14 pages, 5108 KiB  
Article
Soldering Defect Segmentation Method for PCB on Improved UNet
by Zhongke Li and Xiaofang Liu
Appl. Sci. 2024, 14(16), 7370; https://doi.org/10.3390/app14167370 - 21 Aug 2024
Viewed by 408
Abstract
Despite being indispensable devices in the electronic manufacturing industry, printed circuit boards (PCBs) may develop various soldering defects in the production process, which seriously affect the product’s quality. Due to the substantial background interference in the soldering defect image and the small and irregular shapes of the defects, the accurate segmentation of soldering defects is a challenging task. To address this issue, a method to improve the encoder–decoder network structure of UNet is proposed for PCB soldering defect segmentation. To enhance the feature extraction capabilities of the encoder and focus more on deeper features, VGG16 is employed as the network encoder. Moreover, a hybrid attention module called the DHAM, which combines channel attention and dynamic spatial attention, is proposed to reduce the background interference in images and direct the model’s focus more toward defect areas. Additionally, based on GSConv, the RGSM is introduced and applied in the decoder to enhance the model’s feature fusion capabilities and improve the segmentation accuracy. The experiments demonstrate that the proposed method can effectively improve the segmentation accuracy for PCB soldering defects, achieving an mIoU of 81.74% and mPA of 87.33%, while maintaining a relatively low number of model parameters at only 22.13 M and achieving an FPS of 30.16, thus meeting the real-time detection speed requirements.
(This article belongs to the Special Issue Advances in Computer Vision and Semantic Segmentation, 2nd Edition)
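The hybrid channel-plus-spatial attention idea described above can be sketched with standard building blocks. The PyTorch module below is an illustrative CBAM-style stand-in, not the paper's DHAM:

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Channel attention followed by spatial attention (CBAM-style sketch)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # 7x7 conv over channel-pooled maps gives a per-pixel spatial weight.
        self.spatial = nn.Sequential(nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_mlp(x)                       # reweight channels
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.max(dim=1, keepdim=True).values], dim=1)
        return x * self.spatial(pooled)                   # reweight spatial positions

x = torch.randn(2, 64, 32, 32)
print(HybridAttention(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```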
Figure 1. The improved UNet network architecture.
Figure 2. The DHAM structure.
Figure 3. The GSConv structure.
Figure 4. The RGSM structure.
Figure 5. Schematic of soldering defects. The first row is the images without defects; the second row is the images with defects. (a) MS. (b) MP. (c) SS. (d) OD. (e) SB.
Figure 6. Segmentation results for each method. (a) Original figure. (b) Ground truth. (c) DeepLab3plus. (d) SegFormer-B1. (e) PSPNet-ResNet50. (f) UNet. (g) HRNet-W32. (h) Our method.
23 pages, 25042 KiB  
Article
Segmentation Network for Multi-Shape Tea Bud Leaves Based on Attention and Path Feature Aggregation
by Tianci Chen, Haoxin Li, Jinhong Lv, Jiazheng Chen and Weibin Wu
Agriculture 2024, 14(8), 1388; https://doi.org/10.3390/agriculture14081388 - 17 Aug 2024
Viewed by 457
Abstract
Accurately detecting tea bud leaves is crucial for the automation of tea picking robots. However, challenges arise due to tea stem occlusion and overlapping of buds and leaves, which present one bud–one leaf targets of varied shapes in the field of view and make precise segmentation of tea bud leaves difficult. To improve the segmentation accuracy of one bud–one leaf targets with different shapes and fine granularity, this study proposes a novel semantic segmentation model for tea bud leaves. The method designs a hierarchical Transformer block based on a self-attention mechanism in the encoding network, which is beneficial for capturing long-range dependencies between features and enhancing the representation of common features. Then, a multi-path feature aggregation module is designed to effectively merge the feature outputs of encoder blocks with decoder outputs, thereby alleviating the loss of fine-grained features caused by downsampling. Furthermore, a refined polarized attention mechanism is employed after the aggregation module to perform polarized filtering on features in the channel and spatial dimensions, enhancing the output of fine-grained features. The experimental results demonstrate that the proposed Unet-Enhanced model performs well on one bud–one leaf targets with different shapes, with a mean intersection over union (mIoU) of 91.18% and a mean pixel accuracy (mPA) of 95.10%. The semantic segmentation network can accurately segment tea bud leaves, providing a decision-making basis for the spatial positioning of tea picking robots.
(This article belongs to the Section Digital Agriculture)
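The hierarchical Transformer block in the encoder rests on self-attention over flattened feature maps. A minimal PyTorch sketch of that core operation, with illustrative sizes rather than the paper's exact block:

```python
import torch
import torch.nn as nn

class TokenSelfAttention(nn.Module):
    """Self-attention over a feature map flattened to a token sequence."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)          # (B, H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        tokens = self.norm(tokens + attn_out)          # residual connection + norm
        return tokens.transpose(1, 2).reshape(b, c, h, w)

feat = torch.randn(1, 64, 16, 16)
print(TokenSelfAttention(64)(feat).shape)  # torch.Size([1, 64, 16, 16])
```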
Figure 1. Tea garden environment and label schematics.
Figure 2. Segmentation network architecture. Note: Conv1 × 1: convolution operation with kernel 1 × 1; Conv3 × 3: convolution operation with kernel 3 × 3; BN: BatchNormalization; Upsampling2D: feature upsample; ReLU: activation function; Linear: linear transformation; Interpolate: bilinear interpolation.
Figure 3. Computation of Transformer block.
Figure 4. Path feature aggregation module.
Figure 5. Polarized attention mechanism module.
Figure 6. Results of training and testing: (a) Loss curve for the training dataset; (b) Statistics of the mIoU for the testing dataset.
Figure 7. Comparison of segmentation performance of different models.
Figure 8. Comparison of segmentation results for large-size tea bud leaves.
Figure 9. Comparison of segmentation results for multi-target and fine-grained tea bud leaves.
Figure 10. Comparison of segmentation results of different network models. (A) Original image. (B) Ground truth. (C) DeepLabv3+. (D) PSPNet. (E) Hrnet. (F) Segformer. (G) Unet-Enhanced.
Figure 11. Segmentation effect of tea bud leaves with different shapes and fine-grained features. (a) mainly “tea_I”. (b) mainly “tea_V”. (c) mainly “tea_Y”.
Figure 12. Failure cases of Unet-Enhanced.
Figure 13. Shallow feature visualization.
Figure 14. Deep feature visualization.
Figure 15. Unet heat map. (a–f) are the test results of different samples.
Figure 16. Unet-Enhanced heat map. (a–f) are the test results of different samples.
18 pages, 7039 KiB  
Article
Two-Stage Detection Algorithm for Plum Leaf Disease and Severity Assessment Based on Deep Learning
by Caihua Yao, Ziqi Yang, Peifeng Li, Yuxia Liang, Yamin Fan, Jinwen Luo, Chengmei Jiang and Jiong Mu
Agronomy 2024, 14(7), 1589; https://doi.org/10.3390/agronomy14071589 - 21 Jul 2024
Cited by 2 | Viewed by 879
Abstract
Crop diseases significantly impact crop yields, and promoting specialized control of crop diseases is crucial for ensuring agricultural production stability. Disease identification primarily relies on human visual inspection, which is inefficient, inaccurate, and subjective. This study focused on plum red spot (Polystigma rubrum), proposing a two-stage detection algorithm based on deep learning and assessing the severity of the disease through the lesion coverage rate. The specific contributions are as follows: We utilized the object detection model YOLOv8 to isolate leaves and eliminate the influence of complex backgrounds. We used an improved U-Net network to segment leaves and lesions. We combined Dice Loss with Focal Loss to address the poor training performance caused by the pixel ratio imbalance between leaves and disease spots. For inconsistencies in the size and shape of leaves and lesions, we utilized ODConv and MSCA so that the model could focus on features at different scales. After verification, the accuracy rate of leaf recognition is 95.3%, and the mIoU, mPA, mPrecision, and mRecall of the leaf disease segmentation model are 90.93%, 95.21%, 95.17%, and 95.21%, respectively. This research provides an effective solution for the detection and severity assessment of plum leaf red spot disease under complex backgrounds.
(This article belongs to the Special Issue The Applications of Deep Learning in Smart Agriculture)
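Combining Dice Loss with Focal Loss for the leaf/lesion pixel imbalance can be expressed as a weighted sum of the two terms. A hedged PyTorch sketch for binary segmentation; the weighting and exact formulation used in the paper may differ:

```python
import torch
import torch.nn.functional as F

def dice_focal_loss(logits: torch.Tensor, target: torch.Tensor,
                    alpha: float = 0.5, gamma: float = 2.0, eps: float = 1e-6) -> torch.Tensor:
    """Weighted sum of soft Dice loss and binary focal loss (illustrative weights)."""
    prob = torch.sigmoid(logits)
    # Soft Dice computed over the whole batch.
    inter = (prob * target).sum()
    dice = 1 - (2 * inter + eps) / (prob.sum() + target.sum() + eps)
    # Focal loss: down-weight easy pixels by (1 - p_t)^gamma.
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    p_t = prob * target + (1 - prob) * (1 - target)
    focal = ((1 - p_t) ** gamma * bce).mean()
    return alpha * dice + (1 - alpha) * focal

logits = torch.randn(2, 1, 64, 64)
target = (torch.rand(2, 1, 64, 64) > 0.9).float()   # sparse lesion pixels
print(dice_focal_loss(logits, target).item())
```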
Figure 1. Plum leaf dataset under natural conditions.
Figure 2. Comparison of plum leaf dataset before and after data augmentation.
Figure 3. Red spot-diseased leaves after detection.
Figure 4. Flow chart of disease detection.
Figure 5. YOLOV8 structure diagram.
Figure 6. MOC_UNet Network Architecture.
Figure 7. Schematic diagram of ODConv.
Figure 8. MSCA structure diagram.
Figure 9. YOLOv8 results: (a) Precision; (b) Recall; (c) mAP@0.5.
Figure 10. Effectiveness of YOLOv8 on the detection of plum leaves.
Figure 11. Comparison of the prediction effect of different segmentation models: (a) Original images; (b) Label images; (c) PSPNet; (d) DeepLabV3+; (e) Segformer; (f) HRNetv2; (g) U-Net; (h) MOC_UNet.
Figure 12. Regression of predicted lesion coverage with true values for different models: (a) PSPNet; (b) DeepLabV3+; (c) HRNetv2; (d) Segformer; (e) U-Net; (f) MOC_UNet.
24 pages, 10938 KiB  
Article
Segmentation and Coverage Measurement of Maize Canopy Images for Variable-Rate Fertilization Using the MCAC-Unet Model
by Hailiang Gong, Litong Xiao and Xi Wang
Agronomy 2024, 14(7), 1565; https://doi.org/10.3390/agronomy14071565 - 18 Jul 2024
Viewed by 498
Abstract
Excessive fertilizer use has led to environmental pollution and reduced crop yields, underscoring the importance of research into variable-rate fertilization (VRF) based on digital image technology in precision agriculture. Current methods, which rely on spectral sensors for monitoring and prescription mapping, face significant technical challenges, high costs, and operational complexities, limiting their widespread adoption. This study presents an automated, intelligent, and precise approach to maize canopy image segmentation using a multi-scale attention Unet model to enhance VRF decision making, reduce fertilization costs, and improve accuracy. A dataset of maize canopy images under various lighting and growth conditions was collected and subjected to data augmentation and normalization preprocessing. The MCAC-Unet model, built upon the MobilenetV3 backbone network and integrating the convolutional block attention module (CBAM), atrous spatial pyramid pooling (ASPP) multi-scale feature fusion, and content-aware reassembly of features (CARAFE) adaptive upsampling modules, achieved a mean intersection over union (mIOU) of 87.51% and a mean pixel accuracy (mPA) of 93.85% in maize canopy image segmentation. Coverage measurements at a height of 1.1 m showed a relative error ranging from 3.12% to 6.82%, averaging 4.43%, with a determination coefficient of 0.911, meeting practical requirements. The proposed model and measurement system effectively address the challenges in maize canopy segmentation and coverage assessment, providing robust support for crop monitoring and VRF decision making in complex environments.
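Coverage here is the fraction of canopy pixels in the segmented image, and the quoted relative error compares that fraction with a reference measurement. A minimal NumPy sketch with illustrative values:

```python
import numpy as np

def canopy_coverage(mask: np.ndarray) -> float:
    """Fraction of pixels labelled as maize canopy in a binary mask."""
    return float(mask.astype(bool).mean())

def relative_error(measured: float, reference: float) -> float:
    return abs(measured - reference) / reference * 100.0

# Toy example: predicted mask covers 30% of the image, reference coverage is 28.5%.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[:30, :] = 1
print(canopy_coverage(mask))                                   # 0.3
print(round(relative_error(canopy_coverage(mask), 0.285), 2))  # 5.26 (%)
```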
Figure 1. Image acquisition schematic.
Figure 2. Low-light conditions.
Figure 3. High-light conditions.
Figure 4. Image annotation.
Figure 5. Original and processed images.
Figure 6. The structure of the MCAC-Unet network model.
Figure 7. The structure of the Unet network model.
Figure 8. The structure of the MobileNetV3 network.
Figure 9. Depthwise separable convolutions.
Figure 10. The inverted residual structure with linear bottleneck.
Figure 11. The structure of CBAM.
Figure 12. The structure of the atrous spatial pyramid pooling.
Figure 13. The structure of the CARAFE module.
Figure 14. Model training curves of different backbone networks.
Figure 15. Segmentation results of the improved backbone network. (a) Weak light with abundant crop residues and weeds; (b) Overlapping crop leaves; (c) Unobstructed leaves with minimal crop residues and weeds under normal conditions; (d) Overlapping crop leaves with the presence of weeds; (e) Strong light with abundant weeds.
Figure 16. Model training curves.
Figure 17. Segmentation results of the improved network. (a) Weak light with abundant crop residues and weeds; (b) Overlapping crop leaves; (c) Unobstructed leaves with minimal crop residues and weeds under normal conditions; (d) Overlapping crop leaves with the presence of weeds; (e) Strong light with abundant weeds.
Figure 18. Measurement results at different heights.
17 pages, 4157 KiB  
Article
Segmentation of Apparent Multi-Defect Images of Concrete Bridges Based on PID Encoder and Multi-Feature Fusion
by Yanna Liao, Chaoyang Huang and Yafang Yin
Buildings 2024, 14(5), 1463; https://doi.org/10.3390/buildings14051463 - 17 May 2024
Cited by 1 | Viewed by 713
Abstract
To address the issue of insufficient deep contextual information mining in the semantic segmentation of multiple defects in concrete bridges, caused by the diversity in texture, shape, and scale of the defects as well as significant differences in the background, we propose the Concrete Bridge Apparent Multi-Defect Segmentation Network (PID-MHENet) based on a PID encoder and multi-feature fusion. PID-MHENet consists of a PID encoder, skip connections, and a decoder. The PID encoder adopts a multi-branch structure, including an integral branch and a proportional branch with a “thick and long” design principle and a differential branch with a “thin and short” design principle. The PID Aggregation Enhancement (PAE) combines the detail information of the proportional branch and the semantic information of the differential branch to enhance the fusion of contextual information and, at the same time, introduces self-learning parameters, which can effectively extract information on defect boundary details, texture, and background differences. The Multi-Feature Fusion Enhancement Decoding Block (MFEDB) in the decoding stage enhances and globally fuses the different feature maps introduced by the three-channel skip connection, which improves the segmentation accuracy of the network for backgrounds with similar appearance and for micro-defects. The experimental results show that the mean pixel accuracy (mPA) and mean intersection over union (mIoU) of PID-MHENet on the concrete bridge multi-defect semantic segmentation dataset improved by 5.17% and 5.46%, respectively, compared to the UNet network.
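The PAE step fuses detail features from one branch with semantic features from another and introduces self-learning parameters. The PyTorch sketch below shows one simple way to mix two same-shaped feature maps with a learnable weight; it is an illustrative stand-in, not the published PAE module:

```python
import torch
import torch.nn as nn

class LearnableFusion(nn.Module):
    """Fuse two feature maps with a learnable mixing coefficient."""
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(0.5))  # self-learning fusion weight

    def forward(self, detail: torch.Tensor, semantic: torch.Tensor) -> torch.Tensor:
        w = torch.sigmoid(self.alpha)                 # keep the weight in (0, 1)
        return w * detail + (1 - w) * semantic

detail = torch.randn(1, 64, 32, 32)
semantic = torch.randn(1, 64, 32, 32)
print(LearnableFusion()(detail, semantic).shape)  # torch.Size([1, 64, 32, 32])
```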
Figure 1. Design conceptualization.
Figure 2. PID-MHENet network structure. The PID-MHENet network consists of a PID encoder and a decoder. The PID encoder consists of a proportional branch (light green part in the figure), an integral branch (light purple part in the figure), and a differential branch (light blue part in the figure).
Figure 3. PAE module structure. The notation within the figure includes “⊕” for element-wise addition, and “⊗” for element-wise multiplication.
Figure 4. MFEDB module structure. The notation within the figure includes “⊕” for element-wise addition, “⊗” for element-wise multiplication, and “©” for feature map concatenation. EMA and CAE represent the corresponding modules.
Figure 5. Upsampling decoder block structure. Within the figure, the notation “©” indicates feature map concatenation.
Figure 6. CAE module structure. The notation within the figure includes “⊕” for element-wise addition, “⊗” for element-wise multiplication, and “⊙” for matrix multiplication.
Figure 7. Defect pictures and labels.
Figure 8. Comparison of confusion matrix visualizations. (a) Experiment 1 confusion matrix; (b) Experiment 2 confusion matrix; (c) Experiment 3 confusion matrix; (d) Experiment 4 confusion matrix.
Figure 9. mIoU curve and loss curve. (a) mIoU comparison curve; (b) loss comparison curve.
Figure 10. Comparison of experimental visualization results.
20 pages, 4630 KiB  
Article
U-Net with Coordinate Attention and VGGNet: A Grape Image Segmentation Algorithm Based on Fusion Pyramid Pooling and the Dual-Attention Mechanism
by Xiaomei Yi, Yue Zhou, Peng Wu, Guoying Wang, Lufeng Mo, Musenge Chola, Xinyun Fu and Pengxiang Qian
Agronomy 2024, 14(5), 925; https://doi.org/10.3390/agronomy14050925 - 28 Apr 2024
Viewed by 950
Abstract
Currently, the classification of grapevine black rot disease relies on assessing the percentage of affected spots in the total area, with a primary focus on accurately segmenting these spots in images. Particularly challenging are cases in which lesion areas are small and boundaries are ill-defined, hampering precise segmentation. In our study, we introduce an enhanced U-Net network tailored for segmenting black rot spots on grape leaves. Leveraging VGG as the U-Net’s backbone, we strategically position the atrous spatial pyramid pooling (ASPP) module at the base of the U-Net to serve as a link between the encoder and decoder. Additionally, channel and spatial dual-attention modules are integrated into the decoder, alongside a feature pyramid network aimed at fusing diverse levels of feature maps to enhance the segmentation of diseased regions. Our model outperforms traditional plant disease semantic segmentation approaches like DeeplabV3+, U-Net, and PSPNet, achieving impressive pixel accuracy (PA) and mean intersection over union (MIoU) scores of 94.33% and 91.09%, respectively. Demonstrating strong performance across various levels of spot segmentation, our method showcases its efficacy in enhancing the segmentation accuracy of black rot spots on grapevines.
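The ASPP module placed at the base of the U-Net gathers context at several dilation rates in parallel. A compact PyTorch sketch of a standard ASPP block, with illustrative rates and channel sizes:

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Atrous spatial pyramid pooling: parallel dilated convolutions, then fusion."""
    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

feat = torch.randn(1, 512, 32, 32)
print(ASPP(512, 256)(feat).shape)  # torch.Size([1, 256, 32, 32])
```

Matching the padding to each dilation rate keeps all branch outputs at the input resolution, so they can be concatenated and projected back to a single feature map.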
Figure 1. Image annotation status: (a) original image; (b) image marking results. Black represents the background, and red represents the lesions.
Figure 2. Data augmentation: (a) original image; (b) flipped and added noise; (c) flipped and reduced brightness; (d) added noise and reduced brightness; (e) flipped and shifted and reduced brightness; (f) flipped and shifted.
Figure 3. U-Net structure.
Figure 4. CVU-Net network structure. The orange color block represents the location in which the ASPP module is added, and the yellow color block represents the location in which the attention mechanism is added.
Figure 5. Backbone feature extraction network structure: (a) backbone network feature extraction model; (b) backbone feature extraction partial implementation approach.
Figure 6. SENet structure.
Figure 7. CA structure.
Figure 8. Enhancement of the structure of the part of the feature extraction network. (a) Enhanced feature extraction partial model; (b) enhancement of the feature extraction component implementation approach.
Figure 9. ASPP structure.
Figure 10. Model average intersection ratio versus learning rate and number of iterations.
Figure 11. Segmentation effect of different algorithms: (a) original image; (b) ground truth; (c) U-Net; (d) PSPNet; (e) DeeplabV3+; (f) CVU-Net. The green boxes represent areas where there is a large difference between the different methods.
Figure 12. Comparison of segmentation accuracy of each model for graded lesions.
18 pages, 6470 KiB  
Article
Enhanced Tropical Cyclone Precipitation Prediction in the Northwest Pacific Using Deep Learning Models and Ensemble Techniques
by Lunkai He, Qinglan Li, Jiali Zhang, Xiaowei Deng, Zhijian Wu, Yaoming Wang, Pak-Wai Chan and Na Li
Water 2024, 16(5), 671; https://doi.org/10.3390/w16050671 - 25 Feb 2024
Viewed by 1468
Abstract
This study focuses on optimizing precipitation forecast induced by tropical cyclones (TCs) in the Northwest Pacific region, with lead times ranging from 6 to 72 h. The research employs deep learning models, such as U-Net, UNet3+, SE-UNet, and SE-UNet3+, which utilize precipitation forecast data from the Global Forecast System (GFS) and real-time GFS environmental background data using a U-Net structure. To comprehensively make use of the precipitation forecasts from these models, we additionally use probabilistic matching (PM) and simple averaging (AVR) in rainfall prediction. The precipitation data from the Global Precipitation Measurement (GPM) Mission serves as the rainfall observation. The results demonstrate that the root mean squared errors (RMSEs) of U-Net, UNet3+, SE-UNet, SE-UNet3+, AVR, and PM are lowered by 8.7%, 10.1%, 9.7%, 10.0%, 11.4%, and 11.5%, respectively, when compared with the RMSE of the GFS TC precipitation forecasts, while the mean absolute errors are reduced by 9.6%, 11.3%, 9.0%, 12.0%, 12.8%, and 13.0%, respectively. Furthermore, the neural network models improve the precipitation threat scores (TSs). On average, the TSs of U-Net, UNet3+, SE-UNet, SE-UNet3+, AVR, and PM are raised by 12.8%, 21.3%, 19.3%, 20.7%, 22.5%, and 22.9%, respectively, compared with the GFS model. Notably, AVR and PM outperform all other individual models, with PM’s performance slightly better than AVR’s. The most important feature variables in optimizing TC precipitation forecast in the Northwest Pacific region based on the UNet-based neural network include GFS precipitation forecast data, land and sea masks, latitudinal winds at 500 hPa, and vertical winds at 500 hPa.
(This article belongs to the Section Hydrology)
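The threat score (TS) counts hits against hits plus misses plus false alarms at a given rainfall threshold, and the AVR ensemble is a plain mean of the member forecasts. A NumPy sketch of both, with illustrative arrays and threshold:

```python
import numpy as np

def threat_score(forecast: np.ndarray, observed: np.ndarray, threshold: float) -> float:
    """TS = hits / (hits + misses + false alarms) at a rainfall threshold (mm/day)."""
    f = forecast >= threshold
    o = observed >= threshold
    hits = np.logical_and(f, o).sum()
    misses = np.logical_and(~f, o).sum()
    false_alarms = np.logical_and(f, ~o).sum()
    return hits / (hits + misses + false_alarms)

# Simple-averaging (AVR) ensemble of several model forecasts on the same grid.
members = [np.random.gamma(2.0, 10.0, size=(50, 50)) for _ in range(4)]
avr = np.mean(members, axis=0)
obs = np.random.gamma(2.0, 10.0, size=(50, 50))
print(round(threat_score(avr, obs, threshold=25.0), 3))
```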
Figure 1. TC point sample positions are denoted by “×” for model training and “+” for model testing. The colors of symbols “×” and “+”, ranging from light to dark, represent the TC intensity from weak to strong. The red line refers to the track of TC Ma-on.
Figure 2. Structures of (a) U-Net and SE-UNet and (b) UNet3+ and SE-UNet3+.
Figure 3. Boxplots of RMSEs for different models with various lead times. The box plot shows the median (line inside the box) and the upper and lower quartiles (top and bottom of the box), while the whiskers extend to the minimum and maximum non-outlier values. Outliers are indicated by dots beyond the whiskers.
Figure 4. RMSEs of different models for various TC levels: (a) all TC points, (b) TD, (c) TS, (d) STS, (e) TY, and (f) SSTY.
Figure 5. The spatial distribution of precipitation prediction RMSE (mm) by PM and GFS models with different lead times: (a) PM with 24 h, (b) PM with 48 h, (c) PM with 72 h, (d) GFS with 24 h, (e) GFS with 48 h, (f) GFS with 72 h. The spatial distribution of the RMSE (mm) difference in precipitation prediction between GFS model and PM model with different lead times: (g) 24 h, (h) 48 h, and (i) 72 h.
Figure 6. Boxplots of TSs for precipitation prediction at different precipitation thresholds: (a) 10 mm/day, (b) 25 mm/day, (c) 50 mm/day, and (d) 100 mm/day.
Figure 7. TSs in precipitation prediction by different models with different lead times for various precipitation thresholds: (a) 10 mm/day, (b) 25 mm/day, (c) 50 mm/day, and (d) 100 mm/day.
Figure 8. (a) RMSE and (b) MAE for precipitation prediction for TC Ma-on by different models.
Figure 9. Comparison between the accumulated precipitation forecasts (mm) by PM and GFS models with different lead times, and the precipitation observations by GPM for TC Ma-on at 113° E, 20.5° N: precipitation forecasts by PM with different lead times of (a) 24 h, (d) 48 h, (g) 72 h; precipitation forecasts by GFS with different lead times of (b) 24 h, (e) 48 h, (h) 72 h; accumulated precipitation observation by GPM within different periods of (c) 24 h, (f) 48 h, (i) 72 h. The red star denotes the TC Ma-on location.
Figure 10. The first 10 significant features for 24 h accumulated precipitation prediction by the models of (a) U-Net, (b) SE-UNet, (c) UNet3+, and (d) SE-UNet3+.
15 pages, 2064 KiB  
Article
Portrait Semantic Segmentation Method Based on Dual Modal Information Complementarity
by Guang Feng and Chong Tang
Appl. Sci. 2024, 14(4), 1439; https://doi.org/10.3390/app14041439 - 9 Feb 2024
Viewed by 779
Abstract
Semantic segmentation of human images is a research hotspot in the field of computer vision. At present, semantic segmentation models based on U-net generally lack the ability to capture the spatial information of images. At the same time, semantic incompatibility exists because the feature maps of the encoder and decoder are directly connected in the skip connection stage. In addition, in low-light scenes such as at night, false segmentation and reduced segmentation accuracy are likely to occur. To solve the above problems, a portrait semantic segmentation method based on dual-modal information complementarity is proposed. The encoder adopts a double-branch structure and uses an SK-ASSP module that can adaptively adjust the convolution weights of different receptive fields to extract features in the RGB and grayscale image modes, respectively, and carries out cross-modal information complementarity and feature fusion. A hybrid attention mechanism is used in the skip connection phase to capture both the channel and coordinate context information of the image. Experiments on a human matting dataset show that the PA and MIoU coefficients of this model reach 96.58% and 94.48%, respectively, which is better than the U-net benchmark model and other mainstream semantic segmentation models.
(This article belongs to the Section Computing and Artificial Intelligence)
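The dual-branch encoder takes the RGB image and its grayscale counterpart as complementary modalities. The sketch below prepares the two inputs and fuses their stem features by concatenation; the stems and fusion are illustrative, not the paper's SK-ASSP design:

```python
import torch
import torch.nn as nn

class DualModalStem(nn.Module):
    """Separate stems for RGB and grayscale inputs, fused by concatenation."""
    def __init__(self, out_ch: int = 32):
        super().__init__()
        self.rgb_stem = nn.Conv2d(3, out_ch, kernel_size=3, padding=1)
        self.gray_stem = nn.Conv2d(1, out_ch, kernel_size=3, padding=1)
        self.fuse = nn.Conv2d(2 * out_ch, out_ch, kernel_size=1)

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        # Luminance conversion (ITU-R BT.601 weights) gives the grayscale modality.
        gray = 0.299 * rgb[:, 0:1] + 0.587 * rgb[:, 1:2] + 0.114 * rgb[:, 2:3]
        return self.fuse(torch.cat([self.rgb_stem(rgb), self.gray_stem(gray)], dim=1))

img = torch.rand(1, 3, 128, 128)
print(DualModalStem()(img).shape)  # torch.Size([1, 32, 128, 128])
```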
Figure 1. The network structure of U-net.
Figure 2. The network structure of the method in this article.
Figure 3. The structure of the feature SK-ASSP module.
Figure 4. Structure of Cross-modal complementarity module.
Figure 5. Hierarchical attention mechanism module.
Figure 6. Samples of data enhancement in this paper. (a) original images, (b) color adjustment images, (c) background replacement images.
Figure 7. Segmentation results of different models. (a) is the original image; (b–f) are the segmentation effects of U-net, LinkNet, PortraitNet, Deeplab v3+, and Trans UNet; and (g) is the segmentation effect of this paper’s segmentation model.
16 pages, 3202 KiB  
Article
Machine Learning-Based Estimation of Tropical Cyclone Intensity from Advanced Technology Microwave Sounder Using a U-Net Algorithm
by Zichao Liang, Yong-Keun Lee, Christopher Grassotti, Lin Lin and Quanhua Liu
Remote Sens. 2024, 16(1), 77; https://doi.org/10.3390/rs16010077 - 24 Dec 2023
Viewed by 1399
Abstract
A U-Net algorithm was used to retrieve surface pressure and wind speed over the ocean within tropical cyclones (TCs) and their neighboring areas using NOAA-20 Advanced Technology Microwave Sounder (ATMS) reprocessed Sensor Data Record (SDR) brightness temperatures (TBs) and geolocation information. For TC locations, International Best Track Archive for Climate Stewardship (IBTrACS) data have been used over the North Atlantic Ocean and West Pacific Ocean between 2018 and 2021. The European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5) surface pressure and wind speed were employed as reference labels. Preliminary results demonstrated that the visualizations for wind speed and pressure matched the prediction and ERA5 location. The residual biases and standard deviations between the predicted and reference labels were about 0.15 m/s and 1.95 m/s, respectively, for wind speed and 0.48 hPa and 2.67 hPa, respectively, for surface pressure, after applying cloud screening for each ATMS pixel. This indicates that the U-Net model is effective for surface wind speed and surface pressure estimates over general ocean conditions.
(This article belongs to the Special Issue Advances in Remote Sensing and Atmospheric Optics)
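The quoted residual bias and standard deviation are simply the mean and spread of prediction-minus-reference differences over screened pixels. A short NumPy sketch with illustrative arrays:

```python
import numpy as np

def residual_stats(prediction: np.ndarray, reference: np.ndarray, valid: np.ndarray):
    """Bias (mean residual) and standard deviation over cloud-screened pixels."""
    residual = (prediction - reference)[valid]
    return residual.mean(), residual.std()

# Toy example: predicted vs. reference wind speed (m/s) with a screening mask.
rng = np.random.default_rng(0)
reference = rng.uniform(0, 40, size=(96, 96))
prediction = reference + rng.normal(0.15, 1.95, size=reference.shape)
valid = rng.random(reference.shape) > 0.2          # True where the pixel passes screening
bias, sd = residual_stats(prediction, reference, valid)
print(round(bias, 2), round(sd, 2))                # close to 0.15 and 1.95
```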
Figure 1. U-Net Architecture in this study. Detailed information is described in Section 2 Methodology.
Figure 2. Flowchart of the data preprocessing.
Figure 3. Loss curve for U-Net training and validation loss converging over 500 epochs.
Figure 4. Single sample residual histograms (U-Net prediction minus ERA5) for (a,b) surface wind speed and surface pressure residuals, respectively, for the sample valid on 10 October 2018 at 06 UTC, while (c,d) contain similar residuals but for the sample valid on 14 September 2018 at 06 UTC.
Figure 5. U-Net prediction and ERA5 surface wind speed maps. (a,b) represent ERA5 and U-Net predicted wind speed, respectively, of the sample valid on 10 October 2018 at 06 UTC (Leslie), while (c,d) represent ERA5 and U-Net predicted wind speed, respectively, of the sample valid on 14 September 2018 at 06 UTC (Joyce in the middle and Helene on the right-side).
Figure 6. U-Net prediction and ERA5 surface pressure maps. (a,b) represent ERA5 and U-Net predicted surface pressure, respectively, of the sample valid on 10 October 2018 at 06 UTC, while (c,d) represent ERA5 and U-Net predicted surface pressure, respectively, of the sample valid on 14 September 2018 at 06 UTC.
Figure 7. Scatterplots of U-Net prediction vs. ERA5 for (a) wind speed (m/s) and (b) surface pressure (hPa) across all 27 test samples. The pixels included in this analysis were selected from within a 350 km radius circle centered on the TC. The data distribution changes from dense to sparse as the color shifts from yellow to blue. (R: Pearson correlation coefficients; SD: standard deviation; N: number of selected pixels).
18 pages, 4806 KiB  
Article
Extracting Citrus in Southern China (Guangxi Region) Based on the Improved DeepLabV3+ Network
by Hao Li, Jia Zhang, Jia Wang, Zhongke Feng, Boyi Liang, Nina Xiong, Junping Zhang, Xiaoting Sun, Yibing Li and Shuqi Lin
Remote Sens. 2023, 15(23), 5614; https://doi.org/10.3390/rs15235614 - 3 Dec 2023
Cited by 1 | Viewed by 1691
Abstract
China is one of the countries with the largest citrus cultivation areas, and its citrus industry has received significant attention due to its substantial economic benefits. Traditional manual forestry surveys and remote sensing image classification tasks are labor-intensive and time-consuming, resulting in low efficiency. Remote sensing technology holds great potential for obtaining spatial information on citrus orchards on a large scale. This study proposes a lightweight model for citrus plantation extraction that combines the DeepLabV3+ model with the convolutional block attention module (CBAM) attention mechanism, with a focus on the phenological growth characteristics of citrus in the Guangxi region. The objective is to address issues such as inaccurate extraction of citrus edges in high-resolution images, misclassification and omissions caused by intra-class differences, as well as the large number of network parameters and long training time found in classical semantic segmentation models. To reduce parameter count and improve training speed, the MobileNetV2 lightweight network is used as a replacement for the Xception backbone network in DeepLabV3+. Additionally, the CBAM is introduced to extract citrus features more accurately and efficiently. Moreover, in consideration of the growth characteristics of citrus, this study augments the feature input with additional channels to better capture and utilize key phenological features of citrus, thereby enhancing the accuracy of citrus recognition. The results demonstrate that the improved DeepLabV3+ model exhibits high reliability in citrus recognition and extraction, achieving an overall accuracy (OA) of 96.23%, a mean pixel accuracy (mPA) of 83.79%, and a mean intersection over union (mIoU) of 85.40%. These metrics represent an improvement of 11.16%, 14.88%, and 14.98%, respectively, compared to the original DeepLabV3+ model. Furthermore, when compared to classical semantic segmentation models, such as UNet and PSPNet, the proposed model achieves higher recognition accuracy. Additionally, the improved DeepLabV3+ model demonstrates a significant reduction in both parameters and training time. Generalization experiments conducted in Nanning, Guangxi Province, further validate the model’s strong generalization capabilities. Overall, this study emphasizes extraction accuracy, reduction in parameter count, adherence to timeliness requirements, and facilitation of rapid and accurate extraction of citrus plantation areas, presenting promising application prospects.
(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)
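Augmenting the feature input with additional channels amounts to widening the first convolution of the backbone. A hedged PyTorch sketch of that adjustment; the backbone and the extra bands are illustrative assumptions:

```python
import torch
import torch.nn as nn

def widen_first_conv(conv: nn.Conv2d, extra_channels: int) -> nn.Conv2d:
    """Return a copy of `conv` that accepts extra input channels, keeping existing weights."""
    new_conv = nn.Conv2d(conv.in_channels + extra_channels, conv.out_channels,
                         kernel_size=conv.kernel_size, stride=conv.stride,
                         padding=conv.padding, bias=conv.bias is not None)
    with torch.no_grad():
        new_conv.weight[:, :conv.in_channels] = conv.weight       # reuse the RGB weights
        new_conv.weight[:, conv.in_channels:] = 0.0               # start extra bands at zero
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv

first = nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1)
wider = widen_first_conv(first, extra_channels=2)      # e.g. hypothetical NIR + phenology index bands
print(wider(torch.randn(1, 5, 64, 64)).shape)          # torch.Size([1, 32, 32, 32])
```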
Figure 1. Study area. (a) Geographic location of the study area. (b) Main study area, i.e., Yangshuo County, Guangxi Province. (c,d) show the labeled areas of citrus samples (marked by yellow and green blocks). The images used are GF-2 images with pseudo-color components (R = near-infrared, G = red, B = green).
Figure 2. Structure of improved DeepLabV3+ model.
Figure 3. Structure of CBAM: (a) Channel attention module; (b) Spatial attention module; (c) CBAM.
Figure 4. Comparison of extraction accuracy of various models for citrus.
Figure 5. Citrus extraction results using four different models, where the black area is the background area, the gray is the citrus sample labeled area, and the white is the citrus area extracted by the models. Among the three special plots selected, plot (a) contains roads and water, plot (b) contains complex and fragmentary citrus planting areas, and plot (c) contains concentrated citrus planting areas.
Figure 6. Results of model testing in Nanning City.
19 pages, 6565 KiB  
Article
Recurrent Residual Deformable Conv Unit and Multi-Head with Channel Self-Attention Based on U-Net for Building Extraction from Remote Sensing Images
by Wenling Yu, Bo Liu, Hua Liu and Guohua Gou
Remote Sens. 2023, 15(20), 5048; https://doi.org/10.3390/rs15205048 - 20 Oct 2023
Cited by 4 | Viewed by 1243
Abstract
Considering the challenges associated with accurately identifying building shape features and distinguishing between building and non-building features during the extraction of buildings from remote sensing images using deep learning, we propose a novel method for building extraction based on U-Net, incorporating a recurrent residual deformable convolution unit (RDCU) module and augmented multi-head self-attention (AMSA). By replacing conventional convolution modules with an RDCU, which adopts a deformable convolutional neural network within a residual network structure, the proposed method enhances the module’s capacity to learn intricate details such as building shapes. Furthermore, AMSA is introduced into the skip connection function to enhance feature expression and positions through content–position enhancement operations and content–content enhancement operations. Moreover, AMSA integrates an additional fusion channel attention mechanism to aid in identifying cross-channel feature expression differences. For the Massachusetts dataset, the proposed method achieves an Intersection over Union (IoU) score of 89.99%, a Pixel Accuracy (PA) score of 93.62%, and a Recall score of 89.22%. For the WHU Satellite dataset I, the proposed method achieves an IoU score of 86.47%, a PA score of 92.45%, and a Recall score of 91.62%. For the INRIA dataset, the proposed method achieves an IoU score of 80.47%, a PA score of 90.15%, and a Recall score of 85.42%.
(This article belongs to the Section Remote Sensing Image Processing)
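The RDCU wraps deformable convolution inside a recurrent residual structure; the sketch below shows only the deformable-convolution core, using torchvision's DeformConv2d with illustrative sizes rather than the authors' full module:

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableBlock(nn.Module):
    """3x3 deformable convolution whose sampling offsets are predicted from the input."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Two offsets (dx, dy) per kernel position: 2 * 3 * 3 = 18 channels.
        self.offset_pred = nn.Conv2d(in_ch, 18, kernel_size=3, padding=1)
        self.deform = DeformConv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.deform(x, self.offset_pred(x))

feat = torch.randn(1, 64, 32, 32)
print(DeformableBlock(64, 128)(feat).shape)  # torch.Size([1, 128, 32, 32])
```

Letting the offsets deform the sampling grid is what allows the kernel to follow irregular building outlines instead of a fixed square neighborhood.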
Show Figures

Figure 1

Figure 1
<p>The framework of the proposed method. The encoder–decoder uses RDCU module for feature extraction. The skip connection part is fused after passing through AMSA module, and outputs building segmentation mask after the classifier.</p>
Full article ">Figure 2
<p>Different variants of the convolutional units including (<b>a</b>) the forward convolutional unit, (<b>b</b>) the ResNet block, and (<b>c</b>) the RDCU.</p>
Full article ">Figure 3
<p>The submodule of RDCU. The DCnov in the figure is denoted as deformable convolution.</p>
Full article ">Figure 4
<p>The structure of AMSA. Attentional weights are calculated at the channel and spatial dimensions.</p>
Full article ">Figure 5
<p>An example of the Massachusetts building dataset. (<b>a</b>) is the example of the original image, and (<b>b</b>) is the label of (<b>a</b>).</p>
Full article ">Figure 6
<p>Building images from four different regions in the Massachusetts building dataset: (<b>a</b>) a building image from Venice, (<b>b</b>) a building image from New York, (<b>c</b>) the building image from Los Angeles, and (<b>d</b>) a building image from Cairo.</p>
Full article ">Figure 7
<p>Building images from five different regions in the Massachusetts building dataset: (<b>a</b>) a building image from Austin, (<b>b</b>) a building image from Chicago, (<b>c</b>) a building image from Los Kisap, (<b>d</b>) a building image from Tyrol, and (<b>e</b>) a building image from Vienna.</p>
Full article ">Figure 8
<p>The local results of building detection using different methods in the Massachusetts dataset.</p>
Full article ">Figure 8 Cont.
<p>The local results of building detection using different methods in the Massachusetts dataset.</p>
Full article ">Figure 9
<p>The local results of building detection using different methods in the WHU Satellite dataset I.</p>
Figure 9 Cont.">
Figure 10">
Figure 10
<p>The local results of building detection using different methods in the INRIA dataset.</p>
Figure 10 Cont.">
Figure 11">
Figure 11
<p>WHU Aerial imagery dataset: ① is the training area, ② is the validation area, and ③ and ④ are the test areas.</p>
Figure 12">
Figure 12
<p>Samples of building extraction results using different models with the WHU Aerial imagery dataset (ablation study).</p>
">
27 pages, 13192 KiB  
Article
An Efficient and Automated Image Preprocessing Using Semantic Segmentation for Improving the 3D Reconstruction of Soybean Plants at the Vegetative Stage
by Yongzhe Sun, Linxiao Miao, Ziming Zhao, Tong Pan, Xueying Wang, Yixin Guo, Dawei Xin, Qingshan Chen and Rongsheng Zhu
Agronomy 2023, 13(9), 2388; https://doi.org/10.3390/agronomy13092388 - 14 Sep 2023
Cited by 2 | Viewed by 1546
Abstract
The investigation of plant phenotypes through 3D modeling has emerged as a significant field in the study of automated plant phenotype acquisition. In 3D model construction, conventional image preprocessing methods exhibit low efficiency, which increases the difficulty of model construction. [...] Read more.
The investigation of plant phenotypes through 3D modeling has emerged as a significant field in the study of automated plant phenotype acquisition. In 3D model construction, conventional image preprocessing methods exhibit low efficiency, which increases the difficulty of model construction. In order to ensure the accuracy of the 3D model while reducing the difficulty of image preprocessing and improving the speed of 3D reconstruction, deep learning semantic segmentation technology was used in the present study to preprocess original images of soybean plants. Additionally, control experiments involving soybean plants of different varieties and different growth periods were conducted. Models based on manual image preprocessing and models based on image segmentation were established, and point cloud matching, distance calculation and model matching degree calculation were carried out. In this study, the DeepLabv3+, Unet, PSPnet and HRnet networks were used to conduct semantic segmentation of the original images of soybean plants in the vegetative stage (V), and the Unet network achieved the best test performance, with mIoU, mPA, mPrecision and mRecall reaching 0.9919, 0.9953, 0.9965 and 0.9953, respectively. At the same time, by comparing the distance results and matching accuracy between the models and the reference models, it can be concluded that semantic segmentation effectively addresses the challenges of image preprocessing and long reconstruction times, greatly improves robustness to noisy input and ensures the accuracy of the model. Semantic segmentation thus plays a crucial role as a fundamental component in enabling efficient and automated image preprocessing for the 3D reconstruction of soybean plants during the vegetative stage. In the future, semantic segmentation will provide a solution for the preprocessing of 3D reconstruction for other crops. Full article
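The preprocessing step can be pictured as masking each multi-view image with the network's predicted plant mask before it enters the 3D reconstruction pipeline. The sketch below assumes OpenCV and a binary mask image; the file paths, the `mask_plant` helper, and the white-background choice are illustrative, not taken from the paper.

```python
import cv2
import numpy as np

def mask_plant(image_path: str, mask_path: str, out_path: str) -> None:
    """Keep only plant pixels in a multi-view image, using a predicted binary mask."""
    image = cv2.imread(image_path)                      # BGR image of the potted plant
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)  # predicted mask (plant pixels > 0)
    plant = mask > 0
    cleaned = np.full_like(image, 255)                  # uniform white background
    cleaned[plant] = image[plant]                       # copy plant pixels only
    cv2.imwrite(out_path, cleaned)

# Hypothetical batch usage over one plant's image sequence:
# for i, (img, msk) in enumerate(zip(sorted(image_files), sorted(mask_files))):
#     mask_plant(img, msk, f"view_{i:03d}.png")
```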
(This article belongs to the Section Precision and Digital Agriculture)
Show Figures

Figure 1

Figure 1
<p>An overview of the proposed method.</p>
Figure 2">
Figure 2
<p>Soybean 3D reconstruction image acquisition. (<b>a</b>) Soybean image acquisition platform; (<b>b</b>) image acquisition method flowchart.</p>
Figure 3">
Figure 3
<p>Semantic segmentation model architecture. (<b>a</b>) DeepLabv3+; (<b>b</b>) Unet; (<b>c</b>) PSPnet; (<b>d</b>) HRNet.</p>
Figure 3 Cont.">
Figure 4">
Figure 4
<p>The process of 3D reconstruction of soybean plants.</p>
Figure 5">
Figure 5
<p>The process of model comparison.</p>
Figure 6">
Figure 6
<p>The training loss and train mIoU variation curves during the training process of the four models. (<b>a</b>) DeepLabv3+; (<b>b</b>) Unet; (<b>c</b>) PSPnet; (<b>d</b>) HRNet.</p>
Figure 6 Cont.">
Figure 7">
Figure 7
<p>Histograms of the approximate distances between the comparison model and the reference model of the DN251 soybean plant at different stages. (<b>a</b>) V1 stage; (<b>b</b>) V2 stage; (<b>c</b>) V3 stage; (<b>d</b>) V4 stage; (<b>e</b>) V5 stage. (Left: Cloud/Cloud distance. Right: Cloud/Mesh distance.)</p>
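The Cloud/Cloud distances summarized in these histograms are, in essence, nearest-neighbour distances from each point of the compared cloud to the reference cloud. A minimal SciPy sketch of that idea is given below; the authors' actual tooling and the Cloud/Mesh variant are not reproduced here.

```python
import numpy as np
from scipy.spatial import cKDTree

def cloud_to_cloud_distances(compared: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Nearest-neighbour distance from every point of `compared` to the `reference` cloud.

    Both inputs are (N, 3) arrays of XYZ coordinates.
    """
    tree = cKDTree(reference)                 # spatial index over the reference cloud
    distances, _ = tree.query(compared, k=1)  # Euclidean distance to the closest point
    return distances

# d = cloud_to_cloud_distances(np.random.rand(1000, 3), np.random.rand(2000, 3))
# print(d.mean(), np.percentile(d, 95))       # summary statistics behind a histogram
```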
Figure 7 Cont.">
Figure 8">
Figure 8
<p>The schematic of the point cloud model of HN51 soybean plants at different stages using both methods. (<b>a</b>) V1 stage; (<b>b</b>) V2 stage; (<b>c</b>) V3 stage; (<b>d</b>) V4 stage; (<b>e</b>) V5 stage. (Left is soybean 3D point cloud models based on segmentation image. Right is soybean 3D point cloud models based on manually preprocessed image).</p>
Figure 8 Cont.">
Figure 9">
Figure 9
<p>Local schematic diagram of HN 51 soybean plant point cloud model in different stages using the two methods. (<b>a</b>) V1 stage; (<b>b</b>) V2 stage; (<b>c</b>) V3 stage; (<b>d</b>) V4 stage; (<b>e</b>) V5 stage. (Left is soybean 3D point cloud models based on segmentation image. Right is soybean 3D point cloud models based on manually preprocessed image).</p>
Figure 9 Cont.">
Figure 10">
Figure 10
<p>Point cloud diagrams of DN 252 and HN 48 soybean plants at the V5 stage, obtained using the two methods. (<b>a</b>) DN 252 soybean plant at the V5 stage; (<b>b</b>) HN 48 soybean plant at the V5 stage.</p>
Figure 10 Cont.">
Figure A1">
Figure A1
<p>The confusion matrix diagrams of true versus predicted values on the training set. (<b>a</b>) DeepLabv3+; (<b>b</b>) Unet; (<b>c</b>) PSPnet; (<b>d</b>) HRNet.</p>
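From a confusion matrix like the ones shown here, the reported mIoU, mPA, mPrecision, and mRecall can be derived with the usual class-averaged formulas. The sketch below assumes NumPy and the common convention in which mean pixel accuracy equals mean per-class recall; the authors' exact evaluation code is not given.

```python
import numpy as np

def mean_segmentation_metrics(cm: np.ndarray):
    """Class-averaged metrics from a confusion matrix.

    cm[i, j] counts pixels whose true class is i and predicted class is j.
    """
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp          # predicted as the class but actually another class
    fn = cm.sum(axis=1) - tp          # belonging to the class but predicted as another class
    eps = 1e-10

    iou = tp / (tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)     # per-class accuracy, i.e. the usual per-class PA

    miou = iou.mean()
    mpa = recall.mean()               # mean pixel accuracy (mean per-class recall)
    mprecision = precision.mean()
    mrecall = recall.mean()
    return miou, mpa, mprecision, mrecall

# cm = np.array([[980, 20], [15, 985]])   # hypothetical 2-class (background, plant) counts
# print(mean_segmentation_metrics(cm))
```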
Figure A2">
Figure A2
<p>Test results of the four models DeepLabv3+, Unet, PSPnet and HRnet. (<b>a</b>) DeepLabv3+; (<b>b</b>) Unet; (<b>c</b>) PSPnet; (<b>d</b>) HRNet.</p>
Figure A2 Cont.">
">