Search Results (17)

Search Parameters:
Keywords = SimConv

20 pages, 15120 KiB  
Article
Violence-YOLO: Enhanced GELAN Algorithm for Violence Detection
by Wenbin Xu, Dingju Zhu, Renfeng Deng, KaiLeung Yung and Andrew W. H. Ip
Appl. Sci. 2024, 14(15), 6712; https://doi.org/10.3390/app14156712 - 1 Aug 2024
Viewed by 371
Abstract
Violence is a serious threat to societal health; preventing violence in airports, airplanes, and spacecraft is crucial. This study proposes the Violence-YOLO model to detect violence accurately and in real time in complex environments, enhancing public safety. The model is based on YOLOv9’s Generalized Efficient Layer Aggregation Network (GELAN-C). A multilayer SimAM is incorporated into GELAN’s neck to identify attention regions in the scene. YOLOv9 modules are combined with RepGhostNet and GhostNet, and two new modules, RepNCSPELAN4_GB and RepNCSPELAN4_RGB, are introduced. The shallow convolution in the backbone is replaced with GhostConv, reducing computational complexity. Additionally, an ultra-lightweight upsampler, DySample, is introduced to enhance performance and reduce overhead. Finally, Focaler-IoU addresses the neglect of simple and difficult samples, improving training accuracy. The datasets are derived from RWF-2000 and Hockey. Experimental results show that Violence-YOLO outperforms GELAN-C: mAP@0.5 increases by 0.9%, computational load decreases by 12.3%, and model size is reduced by 12.4%, which is significant for embedded hardware such as the Raspberry Pi. Violence-YOLO can be deployed to monitor public places such as airports, effectively handling complex backgrounds and ensuring accurate and fast detection of violent behavior. In addition, we achieved 84.4% mAP on the Pascal VOC dataset with significantly fewer model parameters than previously refined detectors. This study offers insights for real-time detection of violent behaviors in public environments. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
Show Figures
Figure 1. Network structure of GELAN-C.
Figure 2. Network structure of Violence-YOLO.
Figure 3. The processes of Conv (a) and GhostConv (b).
Figure 4. Ghost bottleneck. Left: Ghost bottleneck with stride = 1; right: Ghost bottleneck with stride = 2.
Figure 5. The structure of the RepGhost bottleneck in training (a) and in inference (b).
Figure 6. The structure of RepNCSPELAN4_GB and RepNCSPELAN4_RGB.
Figure 7. DySample network structure: (a) sampling-based dynamic upsampling; (b) sampling point generator in DySample. The input feature, upsampled feature, generated offset, and original grid are denoted by χ, χ′, G, and O, respectively; σ denotes the sigmoid function, sh the sampled height, sw the sampled width, and gs² the number of channels after the feature map passes through the linear layer.
Figure 8. The connection pattern diagram of the SimAM network.
Figure 9. Violence detection dataset: first row taken from the Hockey dataset, second row from the RWF-2000 dataset, and third row from MixUp data after enhancement.
Figure 10. Confusion matrices on the Violence dataset: GELAN-C (left) and Violence-YOLO (right).
Figure 11. PR curves on the Violence dataset: GELAN-C (left) and Violence-YOLO (right).
Figure 12. Comparison of detection performance on the Violence dataset: GELAN-C (upper) and Violence-YOLO (lower).
Figure 13. Detection results of the proposed Violence-YOLO model on the RWF-2000 dataset. The first two rows are key frames from violent videos where the model correctly predicts the presence of violence. The third row, taken from key frames in nonviolent videos, suffers from mispredictions, where large crowds and low-quality surveillance footage may lead to incorrect predictions. Panels 1–3, 4–6, and 7–9 show three keyframes from three different videos, respectively.
26 pages, 18629 KiB  
Article
Advanced UAV Material Transportation and Precision Delivery Utilizing the Whale-Swarm Hybrid Algorithm (WSHA) and APCR-YOLOv8 Model
by Yuchen Wu, Zhijian Wei, Huilin Liu, Jiawei Qi, Xu Su, Jiqiang Yang and Qinglin Wu
Appl. Sci. 2024, 14(15), 6621; https://doi.org/10.3390/app14156621 - 29 Jul 2024
Viewed by 506
Abstract
This paper proposes an effective material delivery algorithm to address the challenges associated with Unmanned Aerial Vehicle (UAV) material transportation and delivery, which include complex route planning, low detection precision, and hardware limitations. This novel approach integrates the Whale-Swarm Hybrid Algorithm (WSHA) with the APCR-YOLOv8 model to enhance efficiency and accuracy. For path planning, the placement paths are formulated as a Generalized Traveling Salesman Problem (GTSP) so that solutions can be computed. The Whale Optimization Algorithm (WOA) is improved to balance global and local searches and is combined with an Artificial Bee Colony (ABC) Algorithm and adaptive weight adjustment to quicken convergence and reduce path costs. For precise placement, the YOLOv8 model is first enhanced by adding the SimAM attention mechanism to the C2f module in the detection head, focusing on target features. Secondly, GhoHGNetv2, built with GhostConv, serves as the backbone of YOLOv8 to ensure accuracy while reducing model Params and FLOPs. Finally, a Lightweight Shared Convolutional Detection Head (LSCDHead) further reduces Params and FLOPs through shared convolution. Experimental results show that WSHA reduces path costs by 9.69% and narrows the gap between the best and worst paths by about 34.39%, compared to the Improved Whale Optimization Algorithm (IWOA). APCR-YOLOv8 reduces Params and FLOPs by 44.33% and 34.57%, respectively, with mAP@0.5 increasing from 88.5 to 92.4 and FPS reaching 151.3. This approach satisfies the requirements for real-time responsiveness while effectively preventing missed, false, and duplicate detections during the inspection of emergency airdrop stations. In conclusion, combining bionic optimization algorithms and image processing significantly enhances the efficiency and precision of material placement in emergency management. Full article
(This article belongs to the Special Issue Advanced Research and Application of Unmanned Aerial Vehicles)
Show Figures
Figure 1. Flowchart of the Whale-Swarm Hybrid Algorithm.
Figure 2. APCR-YOLOv8 network structure diagram.
Figure 3. SimAM attention mechanism.
Figure 4. C2f_SimAM: C2f module with the addition of the SimAM attention mechanism.
Figure 5. GhostConv schematic.
Figure 6. GhoHGNetv2 internal module structure: (a) HGStem module structure; (b) GhoHGBlock module structure (shortcut = True); (c) GhoHGBlock module structure (shortcut = False).
Figure 7. LSCDHead structure and internal modules: (a) LSCDHead structure; (b) internal structure of GnConv and ShConv.
Figure 8. Comparison of the variation of the shortest path lengths on the four datasets: (a) City (17,11); (b) City (24,15); (c) City (31,16); (d) City (39,25).
Figure 9. Path-planning diagrams for the four datasets: (a) City (17,11); (b) City (24,15); (c) City (31,16); (d) City (39,25).
Figure 10. City (39,25) path planning and real-space mapping.
Figure 11. Example dataset: (a) original image; (b) original image labelling; (c) data-enhanced image; (d) data-enhanced image labelling. Red boxes indicate the position of the target object to be detected.
Figure 12. Illustration of the physical situation of a UAV load drop.
Figure 13. Comparison of YOLOv8 and APCR-YOLOv8 heat maps: (a) original image from the dataset; (b) heat map of YOLOv8; (c) heat map of APCR-YOLOv8.
Figure 14. Example analyses of emergency airdrop station detection using APCR-YOLOv8: (a) solving the false detection problem by accurately distinguishing between actual and false targets; (b) solving the tiny-target missed detection problem by enhancing the detection of small, difficult targets; (c) solving the repeat detection problem by avoiding repeated detections of the same target; (d) enhancing detection confidence with higher confidence scores in the bounding boxes.
21 pages, 8219 KiB  
Article
An Improved Fire and Smoke Detection Method Based on YOLOv8n for Smart Factories
by Ziyang Zhang, Lingye Tan and Tiong Lee Kong Robert
Sensors 2024, 24(15), 4786; https://doi.org/10.3390/s24154786 - 24 Jul 2024
Viewed by 358
Abstract
Factories play a crucial role in economic and social development. However, fire disasters in factories greatly threaten both human lives and property. Previous studies on fire detection using deep learning mostly focused on wildfire detection and ignored fires that occur in factories. In addition, many studies focus on fire detection while smoke, the important derivative of a fire disaster, is not detected by such algorithms. To better help smart factories monitor fire disasters, this paper proposes an improved fire and smoke detection method based on YOLOv8n. To ensure the quality of the algorithm and training process, a self-made dataset including more than 5000 images and their corresponding labels is created. Then, nine advanced algorithms are selected and tested on the dataset; YOLOv8n exhibits the best detection results in terms of accuracy and detection speed. ConvNeXtV2 is then inserted into the backbone to enhance inter-channel feature competition. RepBlock and SimConv are selected to replace the original Conv and improve computational ability and memory bandwidth. For the loss function, CIoU is replaced by MPDIoU to ensure an efficient and accurate bounding box. Ablation tests show that our improved algorithm achieves better performance in all four accuracy metrics: precision, recall, F1, and mAP@50. Compared with the original model, whose four metrics are approximately 90%, the modified algorithm achieves above 95%; mAP@50 in particular reaches 95.6%, an improvement of approximately 4.5%. Although complexity increases, the requirements of real-time fire and smoke monitoring are satisfied. Full article
(This article belongs to the Section Sensing and Imaging)
Show Figures
Figure 1. The structure of the YOLOv8n algorithm.
Figure 2. The architecture of ConvNeXt V2.
Figure 3. The architecture of SimConv, RepConv, and RepBlock.
Figure 4. Schematic diagram of MPDIoU.
Figure 5. The structure of the improved YOLOv8n for factory fire and smoke detection.
Figure 6. Post-processing of the collected images using Visual Similarity Duplicate Image Finder.
Figure 7. Examples of factory fire disaster images taken indoors.
Figure 8. Examples of factory fire disaster images taken outdoors.
Figure 9. Visualization of the self-made dataset and labels: (a) the number of fire and smoke labels; (b) the sizes of the labels; (c) the distribution of label centroid locations over the image; (d) the distribution of label sizes relative to the image.
Figure 10. Improved YOLOv8 vs. the other methods: bar charts of FPS and mAP@0.5.
Figure 11. Precision-recall curve and precision-confidence curve.
Figure 12. Curves of precision, recall, and mAP over training epochs.
Figure 13. Visual experiments of the improved and original algorithms in various indoor factory environments.
Figure 14. Visual experiments of the improved and original algorithms in various outdoor factory environments.
21 pages, 7359 KiB  
Article
SA-ConvNeXt: A Hybrid Approach for Flower Image Classification Using Selective Attention Mechanism
by Henghui Mo and Linjing Wei
Mathematics 2024, 12(14), 2151; https://doi.org/10.3390/math12142151 - 9 Jul 2024
Viewed by 410
Abstract
In response to the current lack of annotations for flower images and insufficient focus on key image features in traditional fine-grained flower image classification based on deep learning, this study proposes the SA-ConvNeXt flower image classification model. Initially, in the image preprocessing stage, a padding algorithm was used to prevent image deformation and loss of detail caused by scaling. Subsequently, the model was integrated using multi-level feature extraction within the Efficient Channel Attention (ECA) mechanism, forming an M-ECA structure to capture channel features at different levels; a pixel attention mechanism was also introduced to filter out irrelevant or noisy information in the images. Following this, a parameter-free attention module (SimAM) was introduced after deep convolution in the ConvNeXt Block to reweight the input features. SANet, which combines M-ECA and pixel attention mechanisms, was employed at the end of the module to further enhance the model’s dynamic extraction capability of channel and pixel features. Considering the model’s generalization capability, transfer learning was utilized to migrate the pretrained weights of ConvNeXt on the ImageNet dataset to the SA-ConvNeXt model. During training, the Focal Loss function and the Adam optimizer were used to address sample imbalance and reduce gradient fluctuations, thereby enhancing training stability. Finally, the Grad-CAM++ technique was used to generate heatmaps of classification predictions, facilitating the visualization of effective features and deepening the understanding of the model’s focus areas. Comparative experiments were conducted on the Oxford Flowers102 flower image dataset. Compared to existing flower image classification technologies, SA-ConvNeXt performed excellently, achieving a high accuracy of 96.7% and a recall rate of 98.2%, with improvements of 4.0% and 3.7%, respectively, compared to the original ConvNeXt. The results demonstrate that SA-ConvNeXt can effectively capture more accurate key features of flower images, providing an effective technical means for flower recognition and classification. Full article
Show Figures
Figure 1. Process of image preprocessing.
Figure 2. Before and after padding on all sides: (a) original image; (b) image after padding on all four sides.
Figure 3. The architecture of ConvNeXt.
Figure 4. ConvNeXt Block before and after improvement: (a) former ConvNeXt Block; (b) improved ConvNeXt Block.
Figure 5. The architecture of ConvNeXt.
Figure 6. The architecture of M-ECA.
Figure 7. The architecture of SANet.
Figure 8. The architecture of PA.
Figure 9. Oxford Flowers102 dataset status.
Figure 10. Oxford Flowers102 floral dataset.
Figure 11. Average accuracy comparison chart.
Figure 12. Average recall comparison chart.
Figure 13. Average loss rate comparison chart.
Figure 14. Confusion matrix of SA-ConvNeXt when predicting the Oxford Flowers102 test set.
Figure 15. SA-ConvNeXt model detection results.
Figure 16. Heatmap analysis of different models: (a) original; (b) ConvNeXt heat map; (c) SA-ConvNeXt heat map.
0 pages, 15263 KiB  
Article
SLGA-YOLO: A Lightweight Castings Surface Defect Detection Method Based on Fusion-Enhanced Attention Mechanism and Self-Architecture
by Chengjun Wang and Yifan Wang
Sensors 2024, 24(13), 4088; https://doi.org/10.3390/s24134088 - 24 Jun 2024
Viewed by 452
Abstract
Castings’ surface-defect detection is a crucial machine vision-based automation technology. This paper proposes a fusion-enhanced attention mechanism and efficient self-architecture lightweight YOLO (SLGA-YOLO) to overcome the existing target detection algorithms’ poor computational efficiency and low defect-detection accuracy. We used the SlimNeck module to improve the neck module and reduce redundant information interference. The integration of the simplified attention module (SimAM) and Large Separable Kernel Attention (LSKA) strengthens the attention mechanism, improving the detection performance while significantly reducing computational complexity and memory usage. To enhance the generalization ability of the model’s feature extraction, we replaced part of the basic convolutional blocks with the self-designed GhostConvML (GCML) module, based on the addition of p2 detection. We also constructed the Alpha-EIoU loss function to accelerate model convergence. The experimental results demonstrate that the enhanced algorithm increases the average detection accuracy (mAP@0.5) by 3% and the average detection accuracy (mAP@0.5:0.95) by 1.6% on the castings’ surface defects dataset. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)
Show Figures
Figure 1. SLGA-YOLO network structure diagram.
Figure 2. The structures of (a) GSConv, (b) VoV-GSCSP, and (c) DWConv.
Figure 3. (a) LSKA attention mechanism, where ⊗ represents the Hadamard product, k the maximum receptive field, and d the dilation rate; (b) SPPF-LSKA.
Figure 4. SimAM attention module structure.
Figure 5. The structure of GCML.
Figure 6. The structure of CBM.
Figure 7. Scatter plot of comparative experimental results.
Figure 8. Comparison of detection results before and after improvement: (a) mAP@0.5; (b) mAP@0.5:0.95; (c) P-R curve of YOLOv8; (d) P-R curve of the improved YOLOv8 (SLGA-YOLO).
Figure 9. Comparison of detection results: (a) YOLOv8; (b) improved YOLOv8 (SLGA-YOLO).
Figure 10. (a,b) Comparison of detection results between YOLOv8 and the improved YOLOv8 (SLGA-YOLO) on the enhanced dataset, respectively.
Figure 11. Experimental results: (a) mAP@0.5; (b) mAP@0.5:0.95; (c) precision; (d) recall; (e) box loss.
Figure 12. Comparative results of experiments with different models.
13 pages, 5632 KiB  
Article
Defect Identification of 316L Stainless Steel in Selective Laser Melting Process Based on Deep Learning
by Wei Yang, Xinji Gan and Jinqian He
Processes 2024, 12(6), 1054; https://doi.org/10.3390/pr12061054 - 22 May 2024
Cited by 1 | Viewed by 733
Abstract
In additive manufacturing, such as Selective Laser Melting (SLM), identifying fabrication defects poses a significant challenge. Existing identification algorithms often struggle to meet the precision requirements for defect detection. To accurately identify small-scale defects in SLM, this paper proposes a deep learning model based on the original YOLOv5 network architecture for enhanced defect identification. Specifically, we integrate a small target identification layer into the network to improve the recognition of minute anomalies like keyholes. Additionally, a similarity attention module (SimAM) is introduced to enhance the model’s sensitivity to channel and spatial features, facilitating the identification of dense target regions. Furthermore, the SPD-Conv module is employed to reduce information loss within the network and enhance the model’s identification rate. During the testing phase, a set of sample photos is randomly selected to evaluate the efficacy of the proposed model, utilizing training and test sets derived from a pre-existing defect database. The model’s performance in multi-category recognition is measured using the average accuracy metric. Test results demonstrate that the improved YOLOv5 model achieves a mean average precision (mAP) of 89.8%, surpassing the mAP of the original YOLOv5 network by 1.7% and outperforming other identification networks in terms of accuracy. Notably, the improved YOLOv5 model exhibits superior capability in identifying small-sized defects. Full article
(This article belongs to the Special Issue Additive Manufacturing of Materials: Process and Applications)
Show Figures
Figure 1. Defect-type labeling: green boxes denote LOF (lack of fusion), yellow boxes unmelted powder, and blue boxes keyholes.
Figure 2. Model structure with the added small-target detection layer.
Figure 3. Full 3-D weights for attention.
Figure 4. SPD-Conv structure.
Figure 5. SLM defect detection of 316L material based on YOLOv5.
Figure 6. Results of the improved YOLOv5: (a) training and validation loss of the improved model; (b) mAP@0.5 of the model before and after improvement.
Figure 7. Comparison of YOLOv5 model recognition effects: (a–d) recognition by the original YOLOv5; (e–h) recognition by the improved YOLOv5.
17 pages, 3364 KiB  
Article
A Novel Lightweight Model for Underwater Image Enhancement
by Botao Liu, Yimin Yang, Ming Zhao and Min Hu
Sensors 2024, 24(10), 3070; https://doi.org/10.3390/s24103070 - 11 May 2024
Viewed by 1053
Abstract
Underwater images suffer from low contrast and color distortion. In order to improve the quality of underwater images and reduce storage and computational resources, this paper proposes a lightweight model, Rep-UWnet, to enhance underwater images. The model consists of a fully connected convolutional network and three densely connected RepConv blocks in series, with the input image connected to the output of each block by a skip connection. First, the original underwater image undergoes feature extraction by the SimSPPF module and is summed with the original image to produce the network input. Then, the first convolutional layer, with a kernel size of 3 × 3, generates 64 feature maps, and the multi-scale hybrid convolutional attention module enhances the useful features by reweighting the features of different channels. Second, three RepConv blocks are connected to reduce the number of parameters used in extracting features and to increase test speed. Finally, a convolutional layer with 3 kernels generates the enhanced underwater image. Our method reduces the number of parameters from 2.7 M to 0.45 M (around an 83% reduction) yet outperforms state-of-the-art algorithms in extensive experiments. Furthermore, we demonstrate that Rep-UWnet effectively improves high-level vision tasks like edge detection and single-image depth estimation. This method not only surpasses the compared methods in objective quality, but also significantly improves the contrast, colorimetry, and clarity of underwater images in subjective quality. Full article
Show Figures
Figure 1. Structure of the Rep-UWnet model.
Figure 2. (a) The multiscale hybrid convolutional attention module; (b) spatial attention; (c) channel attention.
Figure 3. RepVGG block structure: (a) down sampling; (b) normal; (c) structural re-parameterization process diagram.
Figure 4. Subjective comparison of Rep-UWnet with existing methods and SOTA models for underwater image enhancement on the EUVP and UFO-120 datasets: (a) input; (b) CLAHE [11]; (c) DCP [13]; (d) HE [6]; (e) ILBA [7]; (f) UDCP [6]; (g) Deep SESR [16]; (h) FUnIE-GAN [26]; (i) U-GAN [14]; (j) ours; (k) label.
Figure 5. Ablation experiments on the EUVP dataset, where “w/o” indicates removal of the corresponding loss term: (a) original underwater image; (b) SSIM loss removed; (c) VGG loss removed; (d) MSE loss removed; (e) image generated by the proposed method; (f) reference image.
Figure 6. Example from the EUVP dark dataset with single-image depth estimation and edge detection for real-world underwater images: (a) original image; (b) depth estimation and edge detection of (a); (c) image enhanced by the proposed method; (d) depth estimation and edge detection of (c).
17 pages, 5379 KiB  
Article
RDD-YOLO: Road Damage Detection Algorithm Based on Improved You Only Look Once Version 8
by Yue Li, Chang Yin, Yutian Lei, Jiale Zhang and Yiting Yan
Appl. Sci. 2024, 14(8), 3360; https://doi.org/10.3390/app14083360 - 16 Apr 2024
Viewed by 1329
Abstract
The detection of road damage is highly important for traffic safety and road maintenance. Conventional detection approaches frequently require significant time and expenditure, the accuracy of detection cannot be guaranteed, and they are prone to misdetection or omission problems. Therefore, this paper introduces an enhanced version of the You Only Look Once version 8 (YOLOv8) road damage detection algorithm called RDD-YOLO. First, the simple attention mechanism (SimAM) is integrated into the backbone, which successfully improves the model’s focus on crucial details within the input image, enabling the model to capture features of road damage more accurately, thus enhancing the model’s precision. Second, the neck structure is optimized by replacing traditional convolution modules with GhostConv. This reduces redundant information, lowers the number of parameters, and decreases computational complexity while maintaining the model’s excellent performance in damage recognition. Last, the upsampling algorithm in the neck is improved by replacing the nearest interpolation with more accurate bilinear interpolation. This enhances the model’s capacity to maintain visual details, providing clearer and more accurate outputs for road damage detection tasks. Experimental findings on the RDD2022 dataset show that the proposed RDD-YOLO model achieves an mAP50 and mAP50-95 of 62.5% and 36.4% on the validation set, respectively. Compared to baseline, this represents an improvement of 2.5% and 5.2%. The F1 score on the test set reaches 69.6%, a 2.8% improvement over the baseline. The proposed method can accurately locate and detect road damage, save labor and material resources, and offer guidance for the assessment and upkeep of road damage. Full article
(This article belongs to the Special Issue Deep Learning for Object Detection)
Show Figures
Figure 1. Object detection algorithm classification.
Figure 2. YOLO algorithm evolution timeline [16,18,19,20,22,23].
Figure 3. YOLOv8 network architecture.
Figure 4. Improved YOLOv8 network architecture.
Figure 5. Comparisons of different attention mechanisms across dimensions [2]. The identical color signifies the use of a sole scalar applied to every channel, spatial position, or individual point across those features.
Figure 6. Comparison of Conv and GhostConv [3]: (a) the convolutional layer; (b) the Ghost module.
Figure 7. Nearest interpolation diagram.
Figure 8. Comparison diagram of corner alignment and edge alignment.
Figure 9. Bilinear interpolation diagram.
Figure 10. Example images for road damage categories.
Figure 11. RDD2022 data statistics: distribution of images and labels by country.
Figure 12. IoU calculation diagram.
Figure 13. mAP50 and mAP50-95 curves of different YOLO algorithms on the validation set.
Figure 14. Sample images of road damage detection for each category.
17 pages, 11471 KiB  
Article
CNTCB-YOLOv7: An Effective Forest Fire Detection Model Based on ConvNeXtV2 and CBAM
by Yiqing Xu, Jiaming Li, Long Zhang, Hongying Liu and Fuquan Zhang
Fire 2024, 7(2), 54; https://doi.org/10.3390/fire7020054 - 12 Feb 2024
Cited by 5 | Viewed by 2039
Abstract
In the context of large-scale fire areas and complex forest environments, the task of identifying the subtle features and aspects of fire can pose a significant challenge for the deep learning model. As a result, to enhance the model’s ability to represent features and its precision in detection, this study initially introduces ConvNeXtV2 and Conv2Former to the You Only Look Once version 7 (YOLOv7) algorithm, separately, and then compares the results with the original YOLOv7 algorithm through experiments. After comprehensive comparison, the proposed ConvNeXtV2-YOLOv7 based on ConvNeXtV2 exhibits a superior performance in detecting forest fires. Additionally, in order to further focus the network on the crucial information in the task of detecting forest fires and minimize irrelevant background interference, the efficient layer aggregation network (ELAN) structure in the backbone network is enhanced by adding four attention mechanisms: the normalization-based attention module (NAM), simple attention mechanism (SimAM), global attention mechanism (GAM), and convolutional block attention module (CBAM). The experimental results, which demonstrate the suitability of ELAN combined with the CBAM module for forest fire detection, lead to the proposal of a new method for forest fire detection called CNTCB-YOLOv7. The CNTCB-YOLOv7 algorithm outperforms the YOLOv7 algorithm, with an increase in accuracy of 2.39%, recall rate of 0.73%, and average precision (AP) of 1.14%. Full article
(This article belongs to the Special Issue Intelligent Forest Fire Prediction and Detection)
Show Figures
Figure 1. Dataset of forest fires: (a,b,d) fire images; (c) non-fire image.
Figure 2. YOLOv7 model architecture.
Figure 3. MP module.
Figure 4. SPPCSPC module.
Figure 5. Block structures of ConvNeXt V1 and ConvNeXt V2. In ConvNeXt V2, the GRN layer (in green) is added after the dimension-expansion MLP layer and the LayerScale (in red) is dropped.
Figure 6. Self-attention mechanism and convolutional modulation operation.
Figure 7. The network structure of ConvNeXtV2-YOLOv7.
Figure 8. The structure of the ELAN module with an attention mechanism introduced.
Figure 9. ELAN-CBAM structure.
Figure 10. Test image results with a large range of forest fires: (a,c) YOLOv7 algorithm; (b,d) CNTCB-YOLOv7 algorithm.
Figure 11. Test image results with a complex forest background: (a,c) YOLOv7 algorithm; (b,d) CNTCB-YOLOv7 algorithm.
16 pages, 9203 KiB  
Article
Integration of ShuffleNet V2 and YOLOv5s Networks for a Lightweight Object Detection Model of Electric Bikes within Elevators
by Jingfang Su, Minrui Yang and Xinliang Tang
Electronics 2024, 13(2), 394; https://doi.org/10.3390/electronics13020394 - 18 Jan 2024
Viewed by 1468
Abstract
The entry of electric bikes into elevators poses safety risks. This article proposes a lightweight object detection model for edge deployment in elevator environments specifically designed for electric bikes. Based on the YOLOv5s network, the backbone network replaces the original CSPDarknet53 with a lightweight multilayer ShuffleNet V2 convolutional neural network, achieving a lightweight backbone network. Swin Transformer modules are introduced between layers to enhance the feature expression capability of images, and a SimAM attention mechanism is applied at the end layer to further improve the feature extraction capability of the backbone network. In the neck network, lightweight and depth-balanced GSConv and VoV-GSCSP modules replace several Conv and C3 basic convolutional modules, reducing the parameter count while enhancing the cross-scale connection and fusion capabilities of feature maps. The prediction network uses the faster-converging and more accurate EIOU error function as the position loss function for iterative training. This article conducts various lightweighting comparison experiments and ablation experiments on the improved object detection model. The experimental results demonstrate that the proposed object detection model, with a model size of only 2.6 megabytes and 1.1 million parameters, achieves a frame rate of 106 frames per second and a detection accuracy of 95.5%. This represents an 84.8% reduction in computational load compared to the original YOLOv5s model. The model’s volume and parameter count are reduced by 81.0% and 84.3%, respectively, with only a 0.9% decrease in mAP. The improved object detection model proposed in this paper can meet the real-time detection requirements for electric bikes in elevator scenarios, providing a feasible technical solution for its deployment on edge devices within elevators. Full article
(This article belongs to the Special Issue Novel Methods for Object Detection and Segmentation)
Show Figures
Figure 1. YOLOv5s algorithm framework.
Figure 2. Improved lightweight YOLOv5s model.
Figure 3. ShuffleNet V2 network structure: (a) Unit 1; (b) Unit 2.
Figure 4. Channel shuffle.
Figure 5. Swin Transformer module.
Figure 6. Swin Transformer window division.
Figure 7. SW-MSA window division.
Figure 8. SimAM attention mechanism.
Figure 9. GSConv module.
Figure 10. VoV-GSCSP module: (a) GS Bottleneck; (b) VoV-GSCSP.
Figure 11. The YOLOv5s algorithm: (a) detection results for electric bikes without occlusion; (b) detection results for bicycles; (c–f) detection results for electric bikes with partial occlusion.
Figure 12. The improved YOLOv5s algorithm: (a) detection results for electric bikes without occlusion; (b) detection results for bicycles; (c–f) detection results for electric bikes with partial occlusion.
24 pages, 8995 KiB  
Article
GBSG-YOLOv8n: A Model for Enhanced Personal Protective Equipment Detection in Industrial Environments
by Chenyang Shi, Donglin Zhu, Jiaying Shen, Yangyang Zheng and Changjun Zhou
Electronics 2023, 12(22), 4628; https://doi.org/10.3390/electronics12224628 - 12 Nov 2023
Cited by 1 | Viewed by 2035
Abstract
The timely and accurate detection of whether or not workers in an industrial environment are correctly wearing personal protective equipment (PPE) is paramount for worker safety. However, current PPE detection faces multiple inherent challenges, including complex backgrounds, varying target size ranges, and relatively low accuracy. In response to these challenges, this study presents a novel PPE safety detection model based on YOLOv8n, called GBSG-YOLOv8n. First, the global attention mechanism (GAM) is introduced to enhance the feature extraction capability of the backbone network. Second, the path aggregation network (PANet) structure is optimized in the Neck network, strengthening the model’s feature learning ability and achieving multi-scale feature fusion, further improving detection accuracy. Additionally, a new SimC2f structure has been designed to handle image features and more effectively improve detection efficiency. Finally, GhostConv is adopted to optimize the convolution operations, effectively reducing the model’s computational complexity. Experimental results demonstrate that, compared to the original YOLOv8n model, the proposed GBSG-YOLOv8n model in this study achieved a 3% improvement in the mean Average Precision (mAP), with a significant reduction in model complexity. This validates the model’s practicality in complex industrial environments, enabling a more effective detection of workers’ PPE usage and providing reliable protection for achieving worker safety. This study emphasizes the significant potential of computer vision technology in enhancing worker safety and provides a robust reference for future research regarding industrial safety. Full article
Show Figures
Figure 1. YOLOv8n network structure diagram.
Figure 2. GBSG-YOLOv8n network structure diagram.
Figure 3. GAM structure diagram.
Figure 4. Schematic of the different feature fusion structures: (a) PANet; (b) BiFPN.
Figure 5. The architecture of BiFPN.
Figure 6. ELAN structure.
Figure 7. C2f structure.
Figure 8. SimC2f structure.
Figure 9. (a) Traditional convolution; (b) GhostConv module.
Figure 10. Sampling example.
Figure 11. (a) PPE category diagram; (b) distribution plot of x and y coordinates.
Figure 12. (a) YOLOv8n P-R curve; (b) GBSG-YOLOv8n P-R curve.
Figure 13. Ablation experiment results: bar charts for (a) accuracy, (b) parameters, (c) FLOPs, (d) weight, and (e) inference time.
Figure 14. Comparative experimental results: (a) line graph of accuracy; bar charts for (b) parameters, (c) FLOPs, and (d) weight.
Figure 15. (a) Factory layout; (b) system deployment diagram; (c) system display interface for workers wearing PPE correctly; (d) system display interface for workers wearing PPE incorrectly.
Figure 16. (a–d) Real-time monitoring images of the site.
21 pages, 14737 KiB  
Article
Optimizing Road Safety: Advancements in Lightweight YOLOv8 Models and GhostC2f Design for Real-Time Distracted Driving Detection
by Yingjie Du, Xiaofeng Liu, Yuwei Yi and Kun Wei
Sensors 2023, 23(21), 8844; https://doi.org/10.3390/s23218844 - 31 Oct 2023
Cited by 12 | Viewed by 2794
Abstract
The rapid detection of distracted driving behaviors is crucial for enhancing road safety and preventing traffic accidents. Compared with the traditional methods of distracted-driving-behavior detection, the YOLOv8 model has been proven to possess powerful capabilities, enabling it to perceive global information more swiftly. Currently, the successful application of GhostConv in edge computing and embedded systems further validates the advantages of lightweight design in real-time detection using large models. Effectively integrating lightweight strategies into YOLOv8 models and reducing their impact on model performance has become a focal point in the field of real-time distracted driving detection based on deep learning. Inspired by GhostConv, this paper presents an innovative GhostC2f design, aiming to integrate the idea of linear transformation to generate more feature maps without additional computation into YOLOv8 for real-time distracted-driving-detection tasks. The goal is to reduce model parameters and computational load. Additionally, enhancements have been made to the path aggregation network (PAN) to amplify multi-level feature fusion and contextual information propagation. Furthermore, simple attention mechanisms (SimAMs) are introduced to perform self-normalization on each feature map, emphasizing feature maps with valuable information and suppressing redundant information interference in complex backgrounds. Lastly, the nine distinct distracted driving types in the publicly available SFDDD dataset were expanded to 14 categories, and nighttime scenarios were introduced. The results indicate a 5.1% improvement in model accuracy, with model weight size and computational load reduced by 36.7% and 34.6%, respectively. During 30 real vehicle tests, the distracted-driving-detection accuracy reached 91.9% during daylight and 90.3% at night, affirming the exceptional performance of the proposed model in assisting distracted driving detection when driving and contributing to accident-risk reduction. Full article
(This article belongs to the Section Vehicular Sensing)
Show Figures
Figure 1. Structure diagram of YOLOv8n.
Figure 2. GhostConv structure.
Figure 3. Two lightweight constructions and the C2f construction: (a) GhostBottleneck (stride = 1); (b) GhostBottleneck (stride = 2); (c) C2f.
Figure 4. GhostC2f structure.
Figure 5. Structure diagrams of PAN and BiFPN: (a) structure of PAN; (b) structure of BiFPN.
Figure 6. The BiFPN structure.
Figure 7. Similarity-based attention mechanism structure.
Figure 8. Fourteen distracted driving behaviors.
Figure 9. Data-enhancement diagram: (a) adjusting brightness; (b) adjusting saturation; (c) adding noise; (d) random panning.
Figure 10. Label data volume and label distribution of distracted driving behaviors.
Figure 11. PR diagram of YOLOv8n.
Figure 12. PR diagram of YOLO-LBS.
Figure 13. Comparison of mAP with different model weights.
Figure 14. Feature visualization maps: (a) YOLOv8n + lightweighting; (b) YOLOv8n + lightweighting + BiFPN.
Figure 15. Grad-CAM visualization: (a) YOLOv8n + lightweighting + BiFPN; (b) YOLO-LBS.
Figure 16. Comparison results with mainstream models.
Figure 17. Data-collection procedure.
Figure 18. Test results for seven distracted driving behaviors.
Figure 19. Test results for another seven distracted driving behaviors.
16 pages, 4431 KiB  
Article
Maize Disease Classification System Design Based on Improved ConvNeXt
by Han Li, Mingyang Qi, Baoxia Du, Qi Li, Haozhang Gao, Jun Yu, Chunguang Bi, Helong Yu, Meijing Liang, Guanshi Ye and You Tang
Sustainability 2023, 15(20), 14858; https://doi.org/10.3390/su152014858 - 13 Oct 2023
Cited by 3 | Viewed by 1375
Abstract
Maize diseases have a great impact on agricultural productivity, making the classification of maize diseases a popular research area. Despite notable advancements in maize disease classification achieved via deep learning techniques, challenges such as low accuracy and identification difficulties still persist. To address these issues, this study introduced a convolutional neural network model named Sim-ConvNeXt, which incorporated a parameter-free SimAM attention module. The integration of this attention mechanism enhanced the ability of the downsample module to extract essential features of maize diseases, thereby improving classification accuracy. Moreover, transfer learning was employed to expedite model training and improve the classification performance. To evaluate the efficacy of the proposed model, a publicly accessible dataset with eight different types of maize diseases was utilized. Through the application of data augmentation techniques, including image resizing, hue, cropping, rotation, and edge padding, the dataset was expanded to comprise 17,670 images. Subsequently, a comparative analysis was conducted between the improved model and other models, wherein the approach demonstrated an accuracy rate of 95.2%. Notably, this performance represented a 1.2% enhancement over the ConvNeXt model and a 1.5% improvement over the advanced Swin Transformer model. Furthermore, the precision, recall, and F1 scores of the improved model demonstrated respective increases of 1.5% in each metric compared to the ConvNeXt model. Notably, using the Flask framework, a website for maize disease classification was developed, enabling accurate prediction of uploaded maize disease images. Full article
Show Figures
Figure 1. Design of the Sim-ConvNeXt model.
Figure 2. Inverted bottleneck design in MobileNetV2 and ConvNeXt: (a) MobileNetV2; (b) ConvNeXt.
Figure 3. Micro-design of Swin Transformer and ConvNeXt: (a) Swin Transformer block; (b) ConvNeXt block.
Figure 4. Three types of attention modules: (a) channel-wise attention; (b) spatial-wise attention; (c) full 3-D weights for attention.
Figure 5. Improved downsample module: (a) downsample module; (b) improved downsample module.
Figure 6. Maize disease classification system.
Figure 7. Data augmentation: (a) original; (b) resize; (c) hue; (d) crop; (e) rotate; (f) padding.
Figure 8. Validation accuracy during training of the different models.
Figure 9. Visualizing the classification performance of different models.
Figure 10. Confusion matrices of different models: (a) ResNet34; (b) ResNeXt50; (c) MobileNetV2; (d) DenseNet121; (e) ViT; (f) Swin-T; (g) ConvNeXt-T; (h) Sim-ConvNeXt.
Figure 11. Maize disease prediction results.
15 pages, 7090 KiB  
Article
Improved YOLOv8-Seg Network for Instance Segmentation of Healthy and Diseased Tomato Plants in the Growth Stage
by Xiang Yue, Kai Qi, Xinyi Na, Yang Zhang, Yanhua Liu and Cuihong Liu
Agriculture 2023, 13(8), 1643; https://doi.org/10.3390/agriculture13081643 - 21 Aug 2023
Cited by 25 | Viewed by 7270
Abstract
The spread of infections and rot are crucial factors in the decrease in tomato production. Accurately segmenting the affected tomatoes in real-time can prevent the spread of illnesses. However, environmental factors and surface features can affect tomato segmentation accuracy. This study suggests an improved YOLOv8s-Seg network to perform real-time and effective segmentation of tomato fruit, surface color, and surface features. The feature fusion capability of the algorithm was improved by replacing the C2f module with the RepBlock module (stacked by RepConv), adding SimConv convolution (using the ReLU function instead of the SiLU function as the activation function) before two upsampling in the feature fusion network, and replacing the remaining conventional convolution with SimConv. The F1 score was 88.7%, which was 1.0%, 2.8%, 0.8%, and 1.1% higher than that of the YOLOv8s-Seg algorithm, YOLOv5s-Seg algorithm, YOLOv7-Seg algorithm, and Mask RCNN algorithm, respectively. Meanwhile, the segment mean average precision (segment mAP@0.5) was 92.2%, which was 2.4%, 3.2%, 1.8%, and 0.7% higher than that of the YOLOv8s-Seg algorithm, YOLOv5s-Seg algorithm, YOLOv7-Seg algorithm, and Mask RCNN algorithm. The algorithm can perform real-time instance segmentation of tomatoes with an inference time of 3.5 ms. This approach provides technical support for tomato health monitoring and intelligent harvesting. Full article
Show Figures
Figure 1. Example images: (a) young fruit, (b) immature, (c) half-ripe, (d) ripe, (e,f) ripe and immature, (g) umbilical rot, (h) grey mold, (i) crack, (j) virus disease, (k) late blight, (l) bacterial canker.
Figure 2. Image sharpening: (a,c) original images; (b,d) sharpened images.
Figure 3. Data enhancement: (a) sharpened image; (b) brightness increased; (c) brightness decreased; (d) mirrored; (e) rotated by 180°; (f) rotated by 30°.
Figure 4. Annotation of tomatoes.
Figure 5. Structure of the YOLACT network.
Figure 6. Structure of the tomato segmentation network based on the improved YOLOv8s-Seg. Feature fusion is improved by replacing the C2f module with the RepBlock module, adding SimConv convolution before the two upsampling operations in the neck module, and replacing the remaining conventional convolution with SimConv.
Figure 7. Structure of the C3 module.
Figure 8. Structure of the C2f module in the neck.
Figure 9. Mosaic data enhancement.
Figure 10. Examples of instance segmentation of tomatoes: (a) ripe tomatoes, immature tomatoes shaded by leaves, and a half-ripe tomato with intact fruit characteristics; (b) overlapping immature fruit and a ripe tomato; (c) immature tomatoes affected by changes in light; (d–f) segmentation results for leaf-shaded, overlapping, and light-affected tomatoes; (g) immature and ripe tomatoes affected by changes in angle; (h) immature tomatoes, half-ripe tomatoes, and young fruit; (i) cracked and ripe tomatoes; (j–l) the corresponding segmentation results.
Figure 11. Comparison of segment mAP@0.5 for the five algorithms.
13 pages, 1355 KiB  
Article
A Flame Detection Algorithm Based on Improved YOLOv7
by Guibao Yan, Jialin Guo, Dongyi Zhu, Shuming Zhang, Rui Xing, Zhangshu Xiao and Qichao Wang
Appl. Sci. 2023, 13(16), 9236; https://doi.org/10.3390/app13169236 - 14 Aug 2023
Cited by 1 | Viewed by 1923
Abstract
Flame recognition is of great significance in fire prevention. However, current algorithms for flame detection suffer from problems such as missed and false detections, and their accuracy cannot satisfy the requirements of fire prevention. To address these problems, we propose a flame detection algorithm based on an improved YOLOv7 network. In our algorithm, we replace a convolution of the MP-1 module with a SimAM structure, a parameter-free attention mechanism, which alleviates the missed-detection problem. Furthermore, we use a ConvNeXt-based CNeB module to replace a convolution of the ELAN-W module to increase detection accuracy and reduce false detections in complex environments. Finally, we evaluate the performance of our algorithm on a large number of test cases; the dataset used in our experiments was constructed by combining several publicly available datasets covering various application scenarios. The experimental results indicate that, compared with the original YOLOv7 algorithm, our proposed algorithm achieves a 7% increase in mAP_0.5 and a 4.1% increase in F1 score. Full article
Show Figures
Figure 1. Comparisons of different attention steps: (a) channel-wise attention; (b) spatial-wise attention; (c) full 3-D weights for attention.
Figure 2. ConvNeXt network architecture.
Figure 3. Improved YOLOv7 overall structure.
Figure 4. Comparison of the original MP-1 module and the improved MP-1 module: (a) original MP-1 module; (b) improved MP-1 module.
Figure 5. Comparison of the original ELAN-W module and the improved ELAN-W module: (a) original ELAN-W module; (b) improved ELAN-W module.
Figure 6. Example images from the dataset.
Figure 7. Comparisons of mAP, recall curves, and loss curves of various models: (a) mAP curves; (b) recall curves; (c) loss curves.
Figure 8. Comparison of test results.