Search Results (32,860)

Search Parameters:
Keywords = deep learning

24 pages, 1413 KiB  
Article
Cheminformatic Identification of Tyrosyl-DNA Phosphodiesterase 1 (Tdp1) Inhibitors: A Comparative Study of SMILES-Based Supervised Machine Learning Models
by Conan Hong-Lun Lai, Alex Pak Ki Kwok and Kwong-Cheong Wong
J. Pers. Med. 2024, 14(9), 981; https://doi.org/10.3390/jpm14090981 (registering DOI) - 15 Sep 2024
Abstract
Background: Tyrosyl-DNA phosphodiesterase 1 (Tdp1) repairs damages in DNA induced by abortive topoisomerase 1 activity; however, maintenance of genetic integrity may sustain cellular division of neoplastic cells. It follows that Tdp1-targeting chemical inhibitors could synergize well with existing chemotherapy drugs to deny cancer growth; therefore, identification of Tdp1 inhibitors may advance precision medicine in oncology. Objective: Current computational research efforts focus primarily on molecular docking simulations, though datasets involving three-dimensional molecular structures are often hard to curate and computationally expensive to store and process. We propose the use of simplified molecular input line entry system (SMILES) chemical representations to train supervised machine learning (ML) models, aiming to predict potential Tdp1 inhibitors. Methods: An open-sourced consensus dataset containing the inhibitory activity of numerous chemicals against Tdp1 was obtained from Kaggle. Various ML algorithms were trained, ranging from simple algorithms to ensemble methods and deep neural networks. For algorithms requiring numerical data, SMILES were converted to chemical descriptors using RDKit, an open-sourced Python cheminformatics library. Results: Out of 13 optimized ML models with rigorously tuned hyperparameters, the random forest model gave the best results, yielding a receiver operating characteristics-area under curve of 0.7421, testing accuracy of 0.6815, sensitivity of 0.6444, specificity of 0.7156, precision of 0.6753, and F1 score of 0.6595. Conclusions: Ensemble methods, especially the bootstrap aggregation mechanism adopted by random forest, outperformed other ML algorithms in classifying Tdp1 inhibitors from non-inhibitors using SMILES. The discovery of Tdp1 inhibitors could unlock more treatment regimens for cancer patients, allowing for therapies tailored to the patient’s condition. Full article
(This article belongs to the Special Issue Artificial Intelligence Applications in Precision Oncology)
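A minimal sketch of the pipeline this abstract describes, using RDKit descriptors and a scikit-learn random forest; the SMILES strings, labels, and hyperparameters below are placeholders, not the paper's Kaggle dataset or tuned settings:

```python
from rdkit import Chem
from rdkit.Chem import Descriptors
from sklearn.ensemble import RandomForestClassifier
import numpy as np

def smiles_to_descriptors(smiles):
    """Convert one SMILES string into a fixed-length vector of RDKit descriptors."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None  # invalid SMILES
    return np.array([fn(mol) for _, fn in Descriptors.descList])

# Placeholder molecules and labels (1 = Tdp1 inhibitor, 0 = non-inhibitor);
# real labels would come from the curated consensus dataset.
smiles_list = ["CCO", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O", "CCN(CC)CC"]
labels = [0, 1, 1, 0]

features = [smiles_to_descriptors(s) for s in smiles_list]
X = np.vstack([f for f in features if f is not None])
y = np.array([l for f, l in zip(features, labels) if f is not None])

clf = RandomForestClassifier(n_estimators=500, random_state=42)
clf.fit(X, y)
print(clf.predict_proba(X)[:, 1])  # in practice, score a held-out test split
```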
19 pages, 5161 KiB  
Article
Underwater Acoustic Orthogonal Frequency-Division Multiplexing Communication Using Deep Neural Network-Based Receiver: River Trial Results
by Sabna Thenginthody Hassan, Peng Chen, Yue Rong and Kit Yan Chan
Sensors 2024, 24(18), 5995; https://doi.org/10.3390/s24185995 (registering DOI) - 15 Sep 2024
Abstract
In this article, a deep neural network (DNN)-based underwater acoustic (UA) communication receiver is proposed. Conventional orthogonal frequency-division multiplexing (OFDM) receivers perform channel estimation using linear interpolation. However, due to the significant delay spread in multipath UA channels, the frequency response often exhibits strong non-linearity between pilot subcarriers. Since the channel delay profile is generally unknown, this non-linearity cannot be modeled precisely. A neural network (NN)-based receiver effectively tackles this challenge by learning and compensating for the non-linearity through NN training. The performance of the DNN-based UA communication receiver was tested recently in river trials in Western Australia. The results obtained from the trials prove that the DNN-based receiver performs better than the conventional least-squares (LS) estimator-based receiver. This paper suggests that UA communication using DNN receivers holds great potential for revolutionizing underwater communication systems, enabling higher data rates, improved reliability, and enhanced adaptability to changing underwater conditions. Full article
(This article belongs to the Special Issue Advanced Acoustic Sensing Technology)
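A minimal sketch of the core idea, assuming a simple fully connected network that maps pilot-subcarrier least-squares estimates to a full-band channel estimate; the layer sizes and placeholder data are illustrative, not the paper's receiver design:

```python
import torch
import torch.nn as nn

# Hypothetical sizes: 64 pilot subcarriers in, 256 subcarriers out,
# with real and imaginary parts stacked as separate features.
N_PILOT, N_SUBCARRIER = 64, 256

channel_estimator = nn.Sequential(
    nn.Linear(2 * N_PILOT, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 2 * N_SUBCARRIER),  # full-band channel estimate (re, im)
)

optimizer = torch.optim.Adam(channel_estimator.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder batch: inputs are LS estimates at pilots, targets are the
# true channel responses from simulation or channel sounding.
pilot_ls = torch.randn(32, 2 * N_PILOT)
true_channel = torch.randn(32, 2 * N_SUBCARRIER)
for _ in range(10):
    optimizer.zero_grad()
    loss = loss_fn(channel_estimator(pilot_ls), true_channel)
    loss.backward()
    optimizer.step()
```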
19 pages, 14422 KiB  
Article
YOLO-SegNet: A Method for Individual Street Tree Segmentation Based on the Improved YOLOv8 and the SegFormer Network
by Tingting Yang, Suyin Zhou, Aijun Xu, Junhua Ye and Jianxin Yin
Agriculture 2024, 14(9), 1620; https://doi.org/10.3390/agriculture14091620 (registering DOI) - 15 Sep 2024
Abstract
In urban forest management, individual street tree segmentation is a fundamental method to obtain tree phenotypes, which is especially critical. Most existing tree image segmentation models have been evaluated on smaller datasets and lack experimental verification on larger, publicly available datasets. Therefore, this paper, based on a large, publicly available urban street tree dataset, proposes YOLO-SegNet for individual street tree segmentation. In the first stage of the street tree object detection task, the BiFormer attention mechanism was introduced into the YOLOv8 network to increase the contextual information extraction and improve the ability of the network to detect multiscale and multishaped targets. In the second-stage street tree segmentation task, the SegFormer network was proposed to obtain street tree edge information more efficiently. The experimental results indicate that our proposed YOLO-SegNet method, which combines YOLOv8+BiFormer and SegFormer, achieved a 92.0% mean intersection over union (mIoU), 95.9% mean pixel accuracy (mPA), and 97.4% accuracy on a large, publicly available urban street tree dataset. Compared with those of the fully convolutional neural network (FCN), lite-reduced atrous spatial pyramid pooling (LR-ASPP), pyramid scene parsing network (PSPNet), UNet, DeepLabv3+, and HRNet, the mIoUs of our YOLO-SegNet increased by 10.5, 9.7, 5.0, 6.8, 4.5, and 2.7 percentage points, respectively. The proposed method can effectively support smart agroforestry development. Full article
(This article belongs to the Section Digital Agriculture)
Figure 1. (A) is the number distribution of street tree images; (B) is the street tree image annotation.
Figure 2. Examples of street tree object detection and instance segmentation annotated images for different tree species.
Figure 3. YOLO-SegNet model. The CBS is the basic module, including the Conv2d layer, BatchNorm2d layer, and Sigmoid Linear Unit (SiLU) layer. The function of the CBS module is to introduce a cross-stage partial connection to improve the feature expression ability and information transfer efficiency. The role of the Spatial Pyramid Pooling Fast (SPPF) module is to fuse larger-scale global information to improve the performance of object detection. The bottleneck block can reduce the computational complexity and the number of parameters.
Figure 4. (A) The overall architecture of BiFormer; (B) details of a BiFormer block.
Figure 5. (a) Vanilla attention. (b–d) Local window [40,42], axial stripe [39], and dilated window [41,42]. (e) Deformable attention [43]. (f) Bilevel routing attention, BRA [6].
Figure 6. Gathering key–value pairs in the top k related windows.
Figure 7. (A,B) are the loss function curves of the object detection network on the train and validation sets, respectively; (C,D) are the loss function curves of tree classification on the train and validation sets, respectively; (E–H) are the change curves of the four segmentation indicator values on the validation set, respectively.
Figure 8. (A) Thermal map examples of YOLOv8 series models and YOLOv8m+BiFormer in the training process; (B) example results of the different object detection models on the test set.
Figure 9. (A) The training loss function curves of the segmentation models without the object detection module. (B) The training loss function curves of the segmentation models with the object detection module.
Figure 10. Performance of different segmentation models on the validation and test sets: (A1,A2) the segmentation results on the validation set; (B1,B2) the segmentation results on the test set.
Figure 11. Results of the different segmentation models on the test set.
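A rough sketch of the two-stage detect-then-segment flow described above, with stock YOLOv8 and SegFormer checkpoints standing in for the paper's BiFormer-augmented detector and fine-tuned segmenter; the model names and the synthetic input image are placeholders:

```python
import numpy as np
from ultralytics import YOLO
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation

# Stage 1: tree detection (stock weights stand in for YOLOv8+BiFormer).
detector = YOLO("yolov8m.pt")
# Stage 2: per-crop segmentation (stock ADE20K checkpoint stands in for the
# paper's fine-tuned SegFormer).
processor = SegformerImageProcessor.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
segmenter = SegformerForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)  # placeholder street-tree photo

for box in detector(image)[0].boxes.xyxy.cpu().numpy():
    x1, y1, x2, y2 = box.astype(int)
    crop = image[y1:y2, x1:x2]
    inputs = processor(images=crop, return_tensors="pt")
    logits = segmenter(**inputs).logits   # (1, num_classes, H/4, W/4)
    mask = logits.argmax(dim=1)[0]        # per-pixel class map for this tree crop
```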
22 pages, 1360 KiB  
Article
Evaluation of the Performance of Neural and Non-Neural Methods to Classify the Severity of Work Accidents Occurring in the Footwear Industry Complex
by Jonhatan Magno Norte da Silva, Maria Luiza da Silva Braz, Joel Gomes da Silva, Lucas Gomes Miranda Bispo, Wilza Karla dos Santos Leite and Elamara Marama de Araujo Vieira
Appl. Syst. Innov. 2024, 7(5), 85; https://doi.org/10.3390/asi7050085 (registering DOI) - 15 Sep 2024
Abstract
In the footwear industry, occupational risks are significant, and work accidents are frequent. Professionals in the field prepare documents and reports about these accidents, but the need for more time and resources limits learning based on past incidents. Machine learning (ML) and deep learning (DL) methods have been applied to analyze data from these documents, identifying accident patterns and classifying the damage’s severity. However, evaluating the performance of these methods in different economic sectors is crucial. This study examined neural and non-neural methods for classifying the severity of workplace accidents in the footwear industry complex. The random forest (RF) and extreme gradient boosting (XGBoost) methods were the most effective non-neural methods. The neural methods 1D convolutional neural networks (1D-CNN) and bidirectional long short-term memory (Bi-LSTM) showed superior performance, with parameters above 98% and 99%, respectively, although with a longer training time. It is concluded that using these methods is viable for classifying accidents in the footwear industry. The methods can classify new accidents and simulate scenarios, demonstrating their adaptability and reliability in different economic sectors for accident prevention. Full article
(This article belongs to the Special Issue Advancements in Deep Learning and Its Applications)
19 pages, 3484 KiB  
Article
Efficient Visual-Aware Fashion Recommendation Using Compressed Node Features and Graph-Based Learning
by Umar Subhan Malhi, Junfeng Zhou, Abdur Rasool and Shahbaz Siddeeq
Mach. Learn. Knowl. Extr. 2024, 6(3), 2111-2129; https://doi.org/10.3390/make6030104 (registering DOI) - 15 Sep 2024
Abstract
In fashion e-commerce, predicting item compatibility using visual features remains a significant challenge. Current recommendation systems often struggle to incorporate high-dimensional visual data into graph-based learning models effectively. This limitation presents a substantial opportunity to enhance the precision and effectiveness of fashion recommendations. In this paper, we present the Visual-aware Graph Convolutional Network (VAGCN). This novel framework helps improve how visual features can be incorporated into graph-based learning systems for fashion item compatibility predictions. The VAGCN framework employs a deep-stacked autoencoder to convert the input image’s high-dimensional raw CNN visual features into more manageable low-dimensional representations. In addition to improving feature representation, the GCN can also reason more intelligently about predictions, which would not be possible without this compression. The GCN encoder processes nodes in the graph to capture structural and feature correlation. Following the GCN encoder, the refined embeddings are input to a multi-layer perceptron (MLP) to calculate compatibility scores. The approach extends to using neighborhood information only during the testing phase to help with training efficiency and generalizability in practical scenarios, a key characteristic of our model. By leveraging its ability to capture latent visual features and neighborhood-based learning, VAGCN thoroughly investigates item compatibility across various categories. This method significantly improves predictive accuracy, consistently outperforming existing benchmarks. These contributions tackle significant scalability and computational efficiency challenges, showcasing the potential transformation of recommendation systems through enhanced feature representation, paving the way for further innovations in the fashion domain. Full article
(This article belongs to the Special Issue Machine Learning in Data Science)
Figure 1. The diagram illustrates the model for predicting the compatibility scores of fashion items. (a) Feature extraction using CNN-F architecture produces a 4096-dimensional vector x from each fashion image, capturing detailed image features. (b) A deep-stacked autoencoder transforms these features into a latent space y, optimizing for subsequent processing. (c) We construct a relational graph by effectively merging item interactions (e.g., 'also viewed', 'also bought', 'bought together') with node latent features, thereby enhancing data relational insights. (d) A tailored GCN encoder refines the graph, and then an edge prediction layer computes item compatibility scores.
Figure 2. The diagram illustrates the GCN workflow, which begins with the initial input graph, which contains 256 node features. Subsequent graph convolution layers reduce the dimensionality to 64 features, utilizing ReLU activations for nonlinear transformations. The process concludes in the embedding space, where an MLP decoder utilizes node embeddings to calculate compatibility scores between nodes.
Figure 3. ROC curves for the Women's and Men's category interactions across different k values, showing the AUC metrics.
Figure 4. Training and validation loss trends for the VAGCN, demonstrating effective error minimization and consistent performance throughout the training process.
Figure 5. Test accuracy versus neighborhood size (k-values) for Men's and Women's categories, indicating accuracy stabilization beyond k = 20 for various interaction types.
Figure 6. Training loss and accuracy comparison between the GCN configurations (256, 128, 64) and Uniform GCN (256 × 3), showcasing the superior stability and efficiency of the decreasing layer size model.
Figure 7. Performance comparison of different learning rates, indicating how a 0.01 learning rate achieves an optimal balance between rapid convergence and robust generalization.
Figure 8. Visualization of link predictions in a women's fashion test dataset, illustrating compatibility scores between item pairs. The scores are normalized between 0 and 1, where values closer to 1 indicate high compatibility and values closer to 0 denote low compatibility.
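A compact, hypothetical sketch of the VAGCN idea described above: an encoder compresses 4096-dimensional CNN features to a 256-dimensional latent space, two graph-convolution steps (256 -> 128 -> 64) propagate over a normalized item adjacency, and an MLP scores item pairs. The dimensions follow the abstract and figure captions; the data and adjacency matrix are placeholders:

```python
import torch
import torch.nn as nn

class VAGCNSketch(nn.Module):
    """Compress visual features, propagate over the item graph, score item pairs."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(4096, 1024), nn.ReLU(),
                                     nn.Linear(1024, 256))        # latent features
        self.gcn1 = nn.Linear(256, 128)
        self.gcn2 = nn.Linear(128, 64)
        self.edge_mlp = nn.Sequential(nn.Linear(128, 32), nn.ReLU(),
                                      nn.Linear(32, 1), nn.Sigmoid())

    def forward(self, x, adj_norm, pairs):
        z = self.encoder(x)                          # (n_items, 256)
        z = torch.relu(adj_norm @ self.gcn1(z))      # graph convolution 1
        z = torch.relu(adj_norm @ self.gcn2(z))      # graph convolution 2
        pair_feat = torch.cat([z[pairs[:, 0]], z[pairs[:, 1]]], dim=1)
        return self.edge_mlp(pair_feat).squeeze(-1)  # compatibility score in [0, 1]

n = 10
x = torch.randn(n, 4096)             # raw CNN visual features (placeholder)
adj = torch.eye(n)                   # placeholder normalized adjacency matrix
pairs = torch.tensor([[0, 1], [2, 3]])
scores = VAGCNSketch()(x, adj, pairs)
```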
20 pages, 20184 KiB  
Article
Snow Cover Extraction from Landsat 8 OLI Based on Deep Learning with Cross-Scale Edge-Aware and Attention Mechanism
by Zehao Yu, Hanying Gong, Shiqiang Zhang and Wei Wang
Remote Sens. 2024, 16(18), 3430; https://doi.org/10.3390/rs16183430 (registering DOI) - 15 Sep 2024
Abstract
Snow cover distribution is of great significance for climate change and water resource management. Current deep learning-based methods for extracting snow cover from remote sensing images face challenges such as insufficient local detail awareness and inadequate utilization of global semantic information. In this study, a snow cover extraction algorithm integrating cross-scale edge perception and an attention mechanism on the U-net model architecture is proposed. The cross-scale edge perception module replaces the original jump connection of U-net, enhances the low-level image features by introducing edge detection on the shallow feature scale, and enhances the detail perception via branch separation and fusion features on the deep feature scale. Meanwhile, parallel channel and spatial attention mechanisms are introduced in the model encoding stage to adaptively enhance the model’s attention to key features and improve the efficiency of utilizing global semantic information. The method was evaluated on the publicly available CSWV_S6 optical remote sensing dataset, and the accuracy of 98.14% indicates that the method has significant advantages over existing methods. Snow extraction from Landsat 8 OLI images of the upper reaches of the Irtysh River was achieved with satisfactory accuracy rates of 95.57% (using two, three, and four bands) and 96.65% (using two, three, four, and six bands), indicating its strong potential for automated snow cover extraction over larger areas. Full article
Figure 1. True color CSWV_S6 data synthesized from the red, green, and blue bands (the numbering in the figure corresponds to the original naming in the acquired files).
Figure 2. (a) RGB composite of Landsat 8 imagery (red: band 4, green: band 3, blue: band 2). (b) Land cover types.
Figure 3. CEFCSAU-net network model architecture. The input size was (512, 512, C), where C denotes the number of channels, and experiments in this paper utilized either 3 or 4; during the model's operation on a GPU, intermediate feature maps were stored as tensors.
Figure 4. Attention mechanism module for channel and space mixing. Here, (H, W, C) represent the height, width, and number of channels of the feature data, respectively, with values determined by input features at different stages. CF and CF' denote feature maps from various intermediate operations within the channel attention mechanism. Cat Sf, Sf, Sf' represent feature maps from different intermediate operations of the spatial attention mechanism. The SA feature denotes the feature map post-spatial attention mechanism, the CA feature represents those post-channel attention mechanisms, and the CSA feature illustrates feature maps following the CSA module.
Figure 5. Cross-scale edge-aware feature fusion module. Sobelx F, Sobely F, and Laplacian F denote feature maps resulting from various edge detection operations. Shallow F refers to feature maps following shallow feature convolution. Fusion F illustrates feature maps resulting from the fusion of shallow and deep features. Deep F' represents feature maps after a series of operations on deep features.
Figure 6. Snow extraction results of CSWV_S6 data on different segmentation models (a set of two rows is the same image data, and rows two, four, and six are zoomed-in images of local details corresponding to rows one, three, and five; the blue area is snow, the white area is non-snow, and the red area is false detection).
Figure 7. Snow extraction results from different deep learning models for Landsat 8 OLI imagery (blue areas are snow, white areas are non-snow, and red areas are false detections).
Figure 8. (a) CSWV_S6 test set scores on different models for each type of metric and (b) Landsat 8 OLI test set scores for various metrics on different models.
Figure 9. Score results of the three CSWV_S6 test sets' example data on the evaluation metrics on each model, with 0.08% snow image elements in the first row of data, 0.95% in the second row, and 1.73% in the third row.
Figure 10. Map of the CEFCSAU-net model's snow extraction in the cloud–snow confusion scenario of the CSWV_S6 test set.
Figure 11. Heat map comparing the mean values of ablation experiments on the test set with different datasets: (a) CSWV_S6 dataset, (b) Landsat 8 OLI dataset.
Figure 12. (a) Input data image; (b) feature map of the first 8 channels before the intermediate feature data first pass through the CSA module; (c) feature map of the first 8 channels after the feature data first pass through the CSA module.
Figure 13. Average feature maps following skip connections at various stages under different configurations of the CEFCSAU-net model: (a1–a4) represent the model configuration without both the CSA and CEF modules; (b1–b4) indicate configurations without the CSA module yet including the CEF module; and (c1–c4) depict configurations featuring both CSA and CEF modules. The dimensions of the four columns of feature maps are sequentially 512 × 512, 256 × 256, 128 × 128, and 64 × 64.
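A hypothetical sketch of an edge-aware skip connection in the spirit of the cross-scale edge perception module described above, using fixed Sobel/Laplacian kernels whose responses are fused back into the feature map by a 1x1 convolution; the paper's exact module layout is not reproduced:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeAwareFusion(nn.Module):
    """Depthwise Sobel-x, Sobel-y, and Laplacian responses concatenated with the
    input feature map and fused by a 1x1 convolution."""
    def __init__(self, channels):
        super().__init__()
        sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        sobel_y = sobel_x.t()
        laplace = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])
        kernels = torch.stack([sobel_x, sobel_y, laplace])            # (3, 3, 3)
        # Same fixed kernel set applied depthwise to every input channel.
        self.register_buffer("edge_kernels",
                             kernels.unsqueeze(1).repeat(channels, 1, 1, 1))
        self.channels = channels
        self.fuse = nn.Conv2d(channels * 4, channels, kernel_size=1)

    def forward(self, feat):
        edges = F.conv2d(feat, self.edge_kernels, padding=1, groups=self.channels)
        return self.fuse(torch.cat([feat, edges], dim=1))

out = EdgeAwareFusion(16)(torch.randn(1, 16, 64, 64))   # placeholder feature map
```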
15 pages, 3249 KiB  
Article
The InterVision Framework: An Enhanced Fine-Tuning Deep Learning Strategy for Auto-Segmentation in Head and Neck
by Byongsu Choi, Chris J. Beltran, Sang Kyun Yoo, Na Hye Kwon, Jin Sung Kim and Justin Chunjoo Park
J. Pers. Med. 2024, 14(9), 979; https://doi.org/10.3390/jpm14090979 (registering DOI) - 15 Sep 2024
Abstract
Adaptive radiotherapy (ART) workflows are increasingly adopted to achieve dose escalation and tissue sparing under dynamic anatomical conditions. However, recontouring and time constraints hinder the implementation of real-time ART workflows. Various auto-segmentation methods, including deformable image registration, atlas-based segmentation, and deep learning-based segmentation (DLS), have been developed to address these challenges. Despite the potential of DLS methods, clinical implementation remains difficult due to the need for large, high-quality datasets to ensure model generalizability. This study introduces an InterVision framework for segmentation. The InterVision framework can interpolate or create intermediate visuals between existing images to generate specific patient characteristics. The InterVision model is trained in two steps: (1) generating a general model using the dataset, and (2) tuning the general model using the dataset generated from the InterVision framework. The InterVision framework generates intermediate images between existing patient image slides using deformable vectors, effectively capturing unique patient characteristics. By creating a more comprehensive dataset that reflects these individual characteristics, the InterVision model demonstrates the ability to produce more accurate contours compared to general models. Models are evaluated using the volumetric dice similarity coefficient (VDSC) and the Hausdorff distance 95% (HD95%) for 18 structures in 20 test patients. As a result, the Dice score was 0.81 ± 0.05 for the general model, 0.82 ± 0.04 for the general fine-tuning model, and 0.85 ± 0.03 for the InterVision model. The Hausdorff distance was 3.06 ± 1.13 for the general model, 2.81 ± 0.77 for the general fine-tuning model, and 2.52 ± 0.50 for the InterVision model. The InterVision model showed the best performance compared to the general model. The InterVision framework presents a versatile approach adaptable to various tasks where prior information is accessible, such as in ART settings. This capability is particularly valuable for accurately predicting complex organs and targets that pose challenges for traditional deep learning algorithms. Full article
(This article belongs to the Section Methodology, Drug and Device Discovery)
Figure 1. The proposed InterVision framework. (1) illustrates the general model training using the original dataset; the training and validation sets are divided from the original dataset. (2) illustrates the progress of the general fine-tuning model, which uses one personalized patient dataset for training; another fraction of the personalized patient data is used for evaluation. (3) shows the workflow of the InterVision framework; (3-1), (3-2) and (3-3) show the process of generating the InterVision dataset.
Figure 2. Conceptual representation of generating the InterVision dataset. A deformable vector is created by comparing each slide. Utilizing this deformable vector, we generate intermediate images between each slide, nearly doubling the size of the personalized dataset.
Figure 3. Concept of calculating deformation vectors using control points. Images within the original image are repositioned based on the deformation vectors derived from each control point. The degree of deformation applied to a voxel increases as its proximity to the control point decreases.
Figure 4. The architecture of Swin-Unet comprises an encoder, bottleneck, decoder, and skip connections. All components—the encoder, bottleneck, and decoder—are constructed using Swin Transformer blocks.
Figure 5. Overview of the Swin Transformer block structure.
Figure 6. Visual results of the optic chiasm (a), left cochlea (b) and left parotid (c) achieved by the general model, the general fine-tuning model and the InterVision model, compared with the manual contours in yellow.
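A minimal sketch of the intermediate-slice generation step described above: given a displacement field from deformable registration between two adjacent slides, warping by a fraction of that field yields an intermediate image. The displacement field below is a synthetic placeholder:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def intermediate_slice(slice_a, disp_y, disp_x, alpha=0.5):
    """Warp slice_a by a fraction `alpha` of the displacement field that maps
    slice_a onto the next slide, producing an intermediate slice. The field
    itself would come from deformable registration (not shown here)."""
    h, w = slice_a.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([yy + alpha * disp_y, xx + alpha * disp_x])
    return map_coordinates(slice_a, coords, order=1, mode="nearest")

# Placeholder data: a synthetic slice and a smooth dummy displacement field.
slice_a = np.random.rand(128, 128).astype(np.float32)
disp_y = np.full((128, 128), 2.0)   # e.g., 2-pixel shift toward the next slide
disp_x = np.zeros((128, 128))
mid = intermediate_slice(slice_a, disp_y, disp_x)
```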
18 pages, 2857 KiB  
Article
AnyFace++: Deep Multi-Task, Multi-Domain Learning for Efficient Face AI
by Tomiris Rakhimzhanova, Askat Kuzdeuov and Huseyin Atakan Varol
Sensors 2024, 24(18), 5993; https://doi.org/10.3390/s24185993 (registering DOI) - 15 Sep 2024
Abstract
Accurate face detection and subsequent localization of facial landmarks are mandatory steps in many computer vision applications, such as emotion recognition, age estimation, and gender identification. Thanks to advancements in deep learning, numerous facial applications have been developed for human faces. However, most have to employ multiple models to accomplish several tasks simultaneously. As a result, they require more memory usage and increased inference time. Also, less attention is paid to other domains, such as animals and cartoon characters. To address these challenges, we propose an input-agnostic face model, AnyFace++, to perform multiple face-related tasks concurrently. The tasks are face detection and prediction of facial landmarks for human, animal, and cartoon faces, including age estimation, gender classification, and emotion recognition for human faces. We trained the model using deep multi-task, multi-domain learning with a heterogeneous cost function. The experimental results demonstrate that AnyFace++ generates outcomes comparable to cutting-edge models designed for specific domains. Full article
(This article belongs to the Section Biomedical Sensors)
Figure 1. The AnyFace++ network architecture is built on the YOLOv8 backbone network, includes its two existing output layers (object classification and bounding box regression), and introduces new output layers (facial landmark regression, age regression, gender classification, and emotion classification).
Figure 2. Examples of unlabeled faces in the validation set of Wider Face, detected by AnyFace++. The red bounding boxes are ground truth. The green bounding boxes with confidence scores are predictions: (a) dark faces, (b) blurry faces, and (c) a toy face.
Figure 3. Examples of predictions by the multi-domain, multi-task face AI model, AnyFace++.
Figure 4. Examples of predictions by AnyFace++: (a) underwater animals, (b) dolls, and (c) facelike objects.
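A hypothetical sketch of a heterogeneous multi-task loss in the spirit described above, where tasks without labels for a given sample (e.g., age or emotion for animal and cartoon faces) are masked out; the task names, weighting, and missing-label conventions are illustrative:

```python
import torch
import torch.nn as nn

def heterogeneous_loss(preds, targets):
    """Sum per-task losses while skipping samples whose labels are missing:
    NaN marks a missing regression target, -1 a missing class label."""
    total = preds["age"].new_zeros(())
    # Age regression: only rows with a valid age contribute.
    valid = ~torch.isnan(targets["age"])
    if valid.any():
        total = total + nn.functional.smooth_l1_loss(preds["age"][valid],
                                                     targets["age"][valid])
    # Gender / emotion classification: ignore_index skips unlabeled rows.
    ce = nn.CrossEntropyLoss(ignore_index=-1)
    total = total + ce(preds["gender"], targets["gender"])
    total = total + ce(preds["emotion"], targets["emotion"])
    return total

preds = {"age": torch.randn(4), "gender": torch.randn(4, 2), "emotion": torch.randn(4, 7)}
targets = {"age": torch.tensor([25., float("nan"), 40., float("nan")]),
           "gender": torch.tensor([0, -1, 1, -1]),
           "emotion": torch.tensor([3, -1, 5, -1])}
loss = heterogeneous_loss(preds, targets)
```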
21 pages, 2749 KiB  
Article
Identification of Flow Pressure-Driven Leakage Zones Using Improved EDNN-PP-LCNetV2 with Deep Learning Framework in Water Distribution System
by Bo Dong, Shihu Shu and Dengxin Li
Processes 2024, 12(9), 1992; https://doi.org/10.3390/pr12091992 (registering DOI) - 15 Sep 2024
Abstract
This study introduces a novel deep learning framework for detecting leakage in water distribution systems (WDSs). The key innovation lies in a two-step process: First, the WDS is partitioned using a K-means clustering algorithm based on pressure sensitivity analysis. Then, an encoder–decoder neural network (EDNN) model is employed to extract and process the pressure and flow sensitivities. The core of the framework is the PP-LCNetV2 architecture that ensures the model’s lightweight, which is optimized for CPU devices. This combination ensures rapid, accurate leakage detection. Three cases are employed to evaluate the method. By applying data augmentation techniques, including the demand and measurement noises, the framework demonstrates robustness across different noise levels. Compared with other methods, the results show this method can efficiently detect over 90% of leakage across different operating conditions while maintaining a higher recognition of the magnitude of leakages. This research offers a significant improvement in computational efficiency and detection accuracy over existing approaches. Full article
(This article belongs to the Section Process Control and Monitoring)
Figure 1. The flowchart of the general framework for partitioning and detecting leakages.
Figure 2. The flowchart of EDNN-PP-LCNet: (a) EDNN; (b) PP-LCNetV2.
Figure 3. The partitioning strategy in network A.
Figure 4. The partitioning strategy in network B.
Figure 5. The partitioning strategy in network C.
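A minimal sketch of the first step described above, clustering network nodes into zones by their pressure-sensitivity vectors with K-means; the sensitivity matrix here is random placeholder data rather than the output of a hydraulic model:

```python
import numpy as np
from sklearn.cluster import KMeans

# Rows = nodes, columns = pressure-sensitivity features (e.g., each node's
# pressure response to a unit leak elsewhere); real values would come from
# hydraulic simulation of the WDS.
rng = np.random.default_rng(0)
sensitivity = rng.random((200, 30))      # 200 nodes x 30 sensitivity features

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(sensitivity)
zones = kmeans.labels_                    # zone index assigned to every node
print(np.bincount(zones))                 # node count per leakage-detection zone
```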
21 pages, 3867 KiB  
Article
County-Level Cultivated Land Quality Evaluation Using Multi-Temporal Remote Sensing and Machine Learning Models: From the Perspective of National Standard
by Dingding Duan, Xinru Li, Yanghua Liu, Qingyan Meng, Chengming Li, Guotian Lin, Linlin Guo, Peng Guo, Tingting Tang, Huan Su, Weifeng Ma, Shikang Ming and Yadong Yang
Remote Sens. 2024, 16(18), 3427; https://doi.org/10.3390/rs16183427 (registering DOI) - 15 Sep 2024
Abstract
Scientific evaluation of cultivated land quality (CLQ) is necessary for promoting rational utilization of cultivated land and achieving one of the Sustainable Development Goals (SDGs): Zero Hunger. However, the CLQ evaluation system proposed in previous studies was diversified, and the methods were inefficient. In this study, based on China’s first national standard “Cultivated Land Quality Grade” (GB/T 33469-2016), we constructed a unified county-level CLQ evaluation system by selecting 15 indicators from five aspects—site condition, environmental condition, physicochemical property, nutrient status and field management—and used the Delphi method to calculate the membership degree of the indicators. Taking Jimo district of Shandong Province, China, as a case study, we compared the performance of three machine learning models, including random forest, AdaBoost, and support vector regression, to evaluate CLQ using multi-temporal remote sensing data. The comprehensive index method was used to reveal the spatial distribution of CLQ. The results showed that the CLQ evaluation based on multi-temporal remote sensing data and machine learning model was efficient and reliable, and the evaluation results had a significant positive correlation with crop yield (r was 0.44, p < 0.001). The proportions of cultivated land of high-, medium- and poor-quality were 27.43%, 59.37% and 13.20%, respectively. The CLQ in the western part of the study area was better, while it was worse in the eastern and central parts. The main limiting factors include irrigation capacity and texture configuration. Accordingly, a series of targeted measures and policies were suggested, such as strengthening the construction of farmland water conservancy facilities, deep tillage of soil and continuing to construct well-facilitated farmland. This study proposed a fast and reliable method for evaluating CLQ, and the results are helpful to promote the protection of cultivated land and ensure food security. Full article
Figure 1. Summary map of the study area. (a) Geographical location of Shandong province in China, (b) geographical location of Jimo district in Shandong province, (c) terrain feature of Jimo district and (d) spatial distribution of cultivated land and soil sampling points.
Figure 2. Technology roadmap.
Figure 3. Optimal prediction results of CLQ evaluation indicators: (a) soil organic matter (SOM), (b) soil pH, (c) available phosphorus (AP), (d) available potassium (AK) and (e) soil bulk density (SBD).
Figure 4. Relationship between crop yield, CLQ index (a) and CLQ grade (b).
Figure 5. Spatial distribution of CLQ grade and level in Jimo district. DX: Daxin Street; LIS: Lingshan Street; LC: Lancun Street; TJ: Tongji Street; CH: Chaohai Street; TH: Tianheng town; JK: Jinkou town; BA: Beian Street; LOS: Longshan Street; HX: Huanxiu Street; YSD: Yifengdian town; ASW: Aoshanwei Street; DBL: Duanbolan town; LQ: Longquan Street; and WQ: Wenquan Street.
Figure 6. Spatial distribution of CLQ factor obstacle degree.
Figure 7. Average and maximum obstacle degrees of CLQ evaluation indicators.
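A small, hypothetical sketch of the comprehensive index method mentioned above: the CLQ index of one land unit as the weighted sum of indicator membership degrees, then binned into grades. The indicator names, weights, memberships, and grade thresholds are illustrative, not values from GB/T 33469-2016 or the study:

```python
import numpy as np

# Illustrative subset of indicators (the study uses 15) with Delphi-style
# weights and membership degrees for one cultivated-land unit.
weights = {"irrigation_capacity": 0.12, "soil_organic_matter": 0.10,
           "texture_configuration": 0.09, "soil_ph": 0.07}
membership = {"irrigation_capacity": 0.6, "soil_organic_matter": 0.8,
              "texture_configuration": 0.5, "soil_ph": 0.9}

clq_index = sum(weights[k] * membership[k] for k in weights)
# Bin the continuous index into quality grades (thresholds illustrative only).
grade = int(np.digitize(clq_index, bins=[0.20, 0.25, 0.30])) + 1
print(clq_index, grade)
```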
18 pages, 588 KiB  
Article
A Combinatorial Strategy for API Completion: Deep Learning and Heuristics
by Yi Liu, Yiming Yin, Jia Deng, Weimin Li and Zhichao Peng
Electronics 2024, 13(18), 3669; https://doi.org/10.3390/electronics13183669 (registering DOI) - 15 Sep 2024
Abstract
Remembering software library components and mastering their application programming interfaces (APIs) is a daunting task for programmers, due to the sheer volume of available libraries. API completion tools, which predict subsequent APIs based on code context, are essential for improving development efficiency. Existing API completion techniques, however, face specific weaknesses that limit their performance. Pattern-based code completion methods that rely on statistical information excel in extracting common usage patterns of API sequences. However, they often struggle to capture the semantics of the surrounding code. In contrast, deep-learning-based approaches excel in understanding the semantics of the code but may miss certain common usages that can be easily identified by pattern-based methods. Our insight into overcoming these challenges is based on the complementarity between these two types of approaches. This paper proposes a combinatorial method of API completion that aims to exploit the strengths of both pattern-based and deep-learning-based approaches. The basic idea is to utilize a confidence-based selector to determine which type of approach should be utilized to generate predictions. Pattern-based approaches will only be applied if the frequency of a particular pattern exceeds a pre-defined threshold, while in other cases, deep learning models will be utilized to generate the API completion results. The results showed that our approach dramatically improved the accuracy and mean reciprocal rank (MRR) in large-scale experiments, highlighting its utility. Full article
Figure 1. Schematic diagram of the Transformer model structure.
Figure 2. An example where the Transformer model predicts correctly but the n-gram method does not, where the red ? denotes the API to be completed.
Figure 3. An example where the n-gram method predicts correctly but the Transformer model does not, where the red ? denotes the API to be completed.
Figure 4. Overview of the combinatorial strategy that combines deep learning and heuristics for API completion.
Figure 5. Schematic diagram of an API call sequence.
Figure 6. Schematic diagram of the combinatorial strategy for API prediction.
Figure 7. The accuracy (left) and MRR (right) of the 5-gram, Transformer, and DLH-API models.
Figure 8. The accuracy of absolute frequency (left) and relative frequency (right) at different thresholds.
Figure 9. Comparison of accuracy (left) and MRR (right) for different values of n.
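A minimal sketch of the confidence-based selector described above: the n-gram prediction is used when its pattern frequency clears a threshold, otherwise a deep model is consulted. The counts, threshold, and the stubbed Transformer call are placeholders:

```python
from collections import Counter

# Bigram pattern counts mined from a training corpus (illustrative values).
ngram_counts = Counter({
    ("list.add", "list.size"): 42,
    ("map.put", "map.get"): 5,
})

def ngram_predict(prev_api):
    """Most frequent successor of prev_api and its observed frequency."""
    candidates = {seq[1]: c for seq, c in ngram_counts.items() if seq[0] == prev_api}
    if not candidates:
        return None, 0
    best = max(candidates, key=candidates.get)
    return best, candidates[best]

def transformer_predict(context):
    """Stand-in for the deep model; a real system would run a Transformer here."""
    return "iterator.next"

def complete_api(context, threshold=10):
    api, freq = ngram_predict(context[-1])
    if api is not None and freq >= threshold:
        return api                       # frequent pattern: trust the n-gram
    return transformer_predict(context)  # rare or unseen pattern: use the deep model

print(complete_api(["list.add"]))   # -> "list.size" (frequency 42 >= threshold)
print(complete_api(["map.put"]))    # -> deep-model fallback (frequency 5 < 10)
```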
18 pages, 1502 KiB  
Article
Efficient Multi-View Graph Convolutional Network with Self-Attention for Multi-Class Motor Imagery Decoding
by Xiyue Tan, Dan Wang, Meng Xu, Jiaming Chen and Shuhan Wu
Bioengineering 2024, 11(9), 926; https://doi.org/10.3390/bioengineering11090926 (registering DOI) - 15 Sep 2024
Abstract
Research on electroencephalogram-based motor imagery (MI-EEG) can identify the limbs of subjects that generate motor imagination by decoding EEG signals, which is an important issue in the field of brain–computer interface (BCI). Existing deep-learning-based classification methods have not been able to entirely employ the topological information among brain regions, and thus, the classification performance needs further improving. In this paper, we propose a multi-view graph convolutional attention network (MGCANet) with residual learning structure for multi-class MI decoding. Specifically, we design a multi-view graph convolution spatial feature extraction method based on the topological relationship of brain regions to achieve more comprehensive information aggregation. During the modeling, we build an adaptive weight fusion (Awf) module to adaptively merge feature from different brain views to improve classification accuracy. In addition, the self-attention mechanism is introduced for feature selection to expand the receptive field of EEG signals to global dependence and enhance the expression of important features. The proposed model is experimentally evaluated on two public MI datasets and achieved a mean accuracy of 78.26% (BCIC IV 2a dataset) and 73.68% (OpenBMI dataset), which significantly outperforms representative comparative methods in classification accuracy. Comprehensive experiment results verify the effectiveness of our proposed method, which can provide novel perspectives for MI decoding. Full article
(This article belongs to the Section Biosignal Processing)
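A tiny, hypothetical sketch of an adaptive weight fusion (Awf) step as described above: learnable scalar weights, normalized by softmax, merge features from several brain-region views; the view count and feature sizes are placeholders:

```python
import torch
import torch.nn as nn

class AdaptiveWeightFusion(nn.Module):
    """Learnable, softmax-normalized weights that merge multi-view feature maps."""
    def __init__(self, n_views):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_views))

    def forward(self, views):                    # views: (n_views, batch, features)
        w = torch.softmax(self.logits, dim=0)    # one weight per brain view
        return torch.einsum("v,vbf->bf", w, views)

fused = AdaptiveWeightFusion(3)(torch.randn(3, 8, 64))   # 3 views, batch of 8
```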
27 pages, 19966 KiB  
Article
An Underwater Crack Detection System Combining New Underwater Image-Processing Technology and an Improved YOLOv9 Network
by Xinbo Huang, Chenxi Liang, Xinyu Li and Fei Kang
Sensors 2024, 24(18), 5981; https://doi.org/10.3390/s24185981 (registering DOI) - 15 Sep 2024
Viewed by 194
Abstract
Underwater cracks are difficult to detect and observe, posing a major challenge to crack detection. Currently, deep learning-based underwater crack detection methods rely heavily on a large number of crack images that are difficult to collect due to their complex and hazardous underwater environments. This study proposes a new underwater image-processing method that combines a novel white balance method and bilateral filtering denoising method to transform underwater crack images into high-quality above-water images with original crack features. Crack detection is then performed based on an improved YOLOv9-OREPA model. Through experiments, it is found that the new image-processing method proposed in this study significantly improves the evaluation indicators of new images, compared with other methods. The improved YOLOv9-OREPA also exhibits a significantly improved performance. The experimental results demonstrate that the method proposed in this study is a new approach suitable for detecting underwater cracks in dams and achieves the goal of transforming underwater images into above-water images. Full article
Figure 1. Underwater imaging principle: (a) underwater light loss diagram; (b) underwater attenuation rate graph of different light rays.
Figure 2. Principle of the new white balance method.
Figure 3. Five white balance algorithms: (a) mean white balance; (b) perfect reflection; (c) ours; (d) color cast; (e) gray world.
Figure 4. Four denoising methods: (a) bilateral filtering; (b) median filtering; (c) mean filtering; (d) Gaussian filtering.
Figure 5. New underwater image-processing workflow.
Figure 6. Infrastructure diagram of YOLOv9.
Figure 7. Overall architecture of PGI.
Figure 8. Structure diagrams of three network architectures: (a) CSPNet; (b) GELAN; (c) ELAN.
Figure 9. Specific steps of the OREPA process.
Figure 10. Four components proposed in OREPA: (a) frequency prior filter; (b) linear depth separable convolution; (c) 1 × 1 convolution with heavy parameters; (d) linear deep stem cells.
Figure 11. Comparison of different convolutional layers during training: (a) standard convolutional layer (without intermediate parameterization); (b) typical reparameterization; (c) OREPA.
Figure 12. Infrastructure diagram of the YOLOv9-OREPA model.
Figure 13. Calculation principle of the loss function.
Figure 14. Experimental facilities: (a) laboratory pool; (b) crack defect wall; (c) underwater robots used.
Figure 15. Experimental process image.
Figure 16. Five underwater images with crack features under low light conditions.
Figure 17. Image white balance operations: (a) original image; (b) mean white balance; (c) perfect reflection; (d) ours; (e) color cast; (f) gray world.
Figure 18. The white balance processing results of the remaining four original images: (a) processing results of Figure 16b; (b) processing results of Figure 16c; (c) processing results of Figure 16d; (d) processing results of Figure 16e.
Figure 19. Images processed via our proposed method: (a) original picture from Figure 16a; (b) original picture from Figure 16b; (c) original picture from Figure 16c; (d) original picture from Figure 16d; (e) original picture from Figure 16e.
Figure 20. Image denoising process diagram: (a) input image; (b) bilateral filtering; (c) median filtering; (d) mean filtering (1 × 1 convolutional kernel); (e) mean filtering (3 × 3 convolutional kernel); (f) mean filtering (5 × 5 convolutional kernel); (g) mean filtering (7 × 7 convolutional kernel); (h) Gaussian filtering (1 × 1 convolutional kernel); (i) Gaussian filtering (3 × 3 convolutional kernel); (j) Gaussian filtering (5 × 5 convolutional kernel); (k) Gaussian filtering (7 × 7 convolutional kernel).
Figure 21. Various performance indicators of different models: (a) Train_loss; (b) Val_loss; (c) precision; (d) recall; (e) mAP0.5; (f) mAP0.5:0.95.
Figure 22. Other performance indicators of different models: (a) the precision–recall curve; (b) F1 curve.
Figure 23. Crack detection results of different YOLOv9 models.
Figure 24. Crack detection results of two experimental images.
Figure 25. The detection process of other underwater and surface crack images: (a,b) underwater crack images under low light conditions; (c,d) underwater crack images with significant color differences; (e,f) above-water images.
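A rough sketch of the preprocessing stage described above, with a simple gray-world-style white balance standing in for the paper's novel white-balance method, followed by OpenCV bilateral filtering; the input image is a synthetic placeholder (in practice it would be read with cv2.imread):

```python
import cv2
import numpy as np

def white_balance_gray_world(bgr):
    """Scale each color channel so its mean matches the overall mean intensity."""
    img = bgr.astype(np.float32)
    means = img.reshape(-1, 3).mean(axis=0)
    img *= means.mean() / means
    return np.clip(img, 0, 255).astype(np.uint8)

# Placeholder for an underwater crack photograph.
underwater = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)

balanced = white_balance_gray_world(underwater)
denoised = cv2.bilateralFilter(balanced, d=9, sigmaColor=75, sigmaSpace=75)
# `denoised` would then be passed to the improved YOLOv9-OREPA detector.
```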
16 pages, 2651 KiB  
Article
Application of the ALRW-DDPG Algorithm in Offshore Oil–Gas–Water Separation Control
by Xiaoyong He, Han Pang, Boying Liu and Yuqing Chen
Energies 2024, 17(18), 4623; https://doi.org/10.3390/en17184623 (registering DOI) - 14 Sep 2024
Viewed by 277
Abstract
With the offshore oil–gas fields entering a decline phase, the high-efficiency separation of oil–gas–water mixtures becomes a significant challenge. As essential equipment for separation, the three-phase separators play a key role in offshore oil–gas production. However, level control is critical in the operation of three-phase gravity separators on offshore facilities, as it directly affects the efficacy and safety of the separation process. This paper introduces an advanced deep deterministic policy gradient with the adaptive learning rate weights (ALRW-DDPG) control algorithm, which improves the convergence and stability of the conventional DDPG algorithm. An adaptive learning rate weight function has been meticulously designed, and an ALRW-DDPG algorithm network has been constructed to simulate three-phase separator liquid level control. The effectiveness of the ALRW-DDPG algorithm is subsequently validated through simulation experiments. The results show that the ALRW-DDPG algorithm achieves a 15.38% improvement in convergence rate compared to the traditional DDPG algorithm, and the control error is significantly smaller than that of PID and DDPG algorithms. Full article
(This article belongs to the Special Issue Advances in Ocean Energy Technologies and Applications)
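A hypothetical sketch of adaptive learning-rate weighting applied to a DDPG-style optimizer: the base learning rate is rescaled each episode by a weight that decays with training progress and responds to reward stability. The weight function below is illustrative only, not the paper's ALRW formulation:

```python
import math
import torch

def alrw(episode, total_episodes, recent_reward_std, base_lr=1e-3):
    """Illustrative adaptive learning-rate weight: smooth cosine decay over
    training, scaled up while recent rewards are still unstable."""
    progress = episode / total_episodes
    decay = 0.5 * (1 + math.cos(math.pi * progress))        # decays toward 0
    stability = 1.0 / (1.0 + math.exp(-recent_reward_std))  # larger while unstable
    return base_lr * decay * stability

actor = torch.nn.Linear(4, 1)   # placeholder for the DDPG actor network
optimizer = torch.optim.Adam(actor.parameters(), lr=1e-3)

for episode in range(100):
    # ... run the episode, update the replay buffer, train actor/critic ...
    recent_reward_std = 1.0      # placeholder statistic from recent episodes
    for group in optimizer.param_groups:
        group["lr"] = alrw(episode, 100, recent_reward_std)
```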
22 pages, 3992 KiB  
Article
A Lightweight Cotton Verticillium Wilt Hazard Level Real-Time Assessment System Based on an Improved YOLOv10n Model
by Juan Liao, Xinying He, Yexiong Liang, Hui Wang, Haoqiu Zeng, Xiwen Luo, Xiaomin Li, Lei Zhang, He Xing and Ying Zang
Agriculture 2024, 14(9), 1617; https://doi.org/10.3390/agriculture14091617 (registering DOI) - 14 Sep 2024
Viewed by 228
Abstract
Compared to traditional manual methods for assessing the cotton verticillium wilt (CVW) hazard level, utilizing deep learning models for foliage segmentation can significantly improve the evaluation accuracy. However, instance segmentation methods for images with complex backgrounds often suffer from low accuracy and delayed segmentation. To address this issue, an improved model, YOLO-VW, with high accuracy, high efficiency, and a light weight, was proposed for CVW hazard level assessment based on the YOLOv10n model. (1) It replaced conventional convolutions with the lightweight GhostConv, reducing the computational time. (2) The STC module based on the Swin Transformer enhanced the expression of foliage and disease spot boundary features, further reducing the model size. (3) It integrated a squeeze-and-excitation (SE) attention mechanism to suppress irrelevant background information. (4) It employed the stochastic gradient descent (SGD) optimizer to enhance the performance and shorten the detection time. The improved CVW severity assessment model was then deployed on a server, and a real-time detection application (APP) for CVW severity assessment was developed based on this model. The results indicated the following. (1) The YOLO-VW model achieved a mean average precision (mAP) of 89.2% and a frame per second (FPS) rate of 157.98 f/s in assessing CVW, representing improvements of 2.4% and 21.37 f/s over the original model, respectively. (2) The YOLO-VW model’s parameters and floating point operations per second (FLOPs) were 1.59 M and 7.8 G, respectively, compressed by 44% and 33.9% compared to the original YOLOv10n model. (3) After deploying the YOLO-VW model on a smartphone, the processing time for each image was 2.42 s, and the evaluation accuracy under various environmental conditions reached 85.5%, representing a 15% improvement compared to the original YOLOv10n model. Based on these findings, YOLO-VW meets the requirements for real-time detection, offering greater robustness, efficiency, and portability in practical applications. This model provides technical support for controlling CVW and developing cotton varieties resistant to verticillium wilt. Full article
(This article belongs to the Section Digital Agriculture)
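A small, hypothetical sketch of the final assessment step implied above: once foliage and disease-spot masks are segmented, the CVW hazard level follows from the infected-area ratio. The grade thresholds are illustrative, not the paper's grading standard:

```python
import numpy as np

def cvw_hazard_level(foliage_mask, lesion_mask):
    """Grade severity from the fraction of foliage pixels covered by lesions."""
    foliage_px = np.count_nonzero(foliage_mask)
    if foliage_px == 0:
        return 0
    ratio = np.count_nonzero(lesion_mask & foliage_mask) / foliage_px
    thresholds = [0.25, 0.50, 0.75]           # grade boundaries (illustrative)
    return int(np.searchsorted(thresholds, ratio, side="right"))

# Placeholder masks: a 40x40 foliage region with a 20x20 infected patch.
foliage = np.zeros((64, 64), dtype=bool); foliage[10:50, 10:50] = True
lesion = np.zeros((64, 64), dtype=bool); lesion[10:30, 10:30] = True
print(cvw_hazard_level(foliage, lesion))      # infected ratio 0.25 -> grade 1
```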