Search Results (717)

Search Parameters:
Keywords = MobileNet V2

17 pages, 3843 KiB  
Article
An Efficient One-Dimensional Texture Representation Approach for Lung Disease Diagnosis
by Abrar Alabdulwahab, Hyun-Cheol Park, Heon Jeong and Sang-Woong Lee
Appl. Sci. 2024, 14(22), 10661; https://doi.org/10.3390/app142210661 - 18 Nov 2024
Abstract
The remarkable increase in published medical imaging datasets for chest X-rays has significantly improved the performance of deep learning techniques to classify lung diseases efficiently. However, large datasets require special arrangements to make them suitable, accessible, and practically usable in remote clinics and emergency rooms. Additionally, they increase the computational time and image-processing complexity. This study investigates the efficiency of converting the 2D chest X-ray into one-dimensional texture representation data using descriptive statistics and local binary patterns, enabling the use of feed-forward neural networks to efficiently classify lung diseases within a short time and with cost effectiveness. This method bridges diagnostic gaps in healthcare services and improves patient outcomes in remote hospitals and emergency rooms. It could also reinforce the crucial role of technology in advancing healthcare. Utilizing the Guangzhou and PA datasets, our one-dimensional texture representation achieved 99% accuracy with a training time of 10.85 s and 0.19 s for testing. In the PA dataset, it achieved 96% accuracy with a training time of 38.14 s and a testing time of 0.17 s, outperforming EfficientNet, EfficientNet-V2-Small, and MobileNet-V3-Small. Therefore, this study suggests that the one-dimensional texture representation is fast and effective for lung disease classification.
(This article belongs to the Section Computing and Artificial Intelligence)
Figures:
Figure 1: The overall methodology for 1DTR uses statistical methods and LBP to extract the texture features of the input images.
Figure 2: The process involves extracting features from an input image by calculating its mean, standard deviation, and binary pattern. Interpolation is used to ensure all features are the same length. The 1D data are then preprocessed using a standard scaler, and the string labels are converted into integers using a label encoder. Finally, the preprocessed data are passed to an FFNN model.
Figure 3: We normalized the histogram of LBP values to use it as a 1D feature. The x-axis represents the number of bins and the y-axis represents the frequency.
Figure 4: Guangzhou dataset samples: Normal, Bacteria, and Virus. Where R is the right side of the body.
Figure 5: PA dataset samples: Normal, COVID, and Pneumonia. Where R is the right side of the body and L is the left side of the body.
Figure 6: Feature visualization of Guangzhou dataset and PA dataset.
Figure 7: Confusion matrix of each experiment, comparing the performance of 1DTR with the neural network models (EfficientNet, EfficientNet-V2-Small, and MobileNet-V3-Small). (a) 1DTR Guangzhou dataset confusion matrix. (b) EfficientNet Guangzhou dataset confusion matrix. (c) EfficientNet-V2-Small Guangzhou dataset confusion matrix. (d) MobileNet-V3-Small Guangzhou dataset confusion matrix. (e) 1DTR PA dataset confusion matrix. (f) EfficientNet PA dataset confusion matrix. (g) EfficientNet-V2-Small PA dataset confusion matrix. (h) MobileNet-V3-Small PA dataset confusion matrix.
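
As a rough illustration of the pipeline this abstract describes, the sketch below builds a 1D texture vector from descriptive statistics plus an LBP histogram and trains a feed-forward classifier. It assumes grayscale images as NumPy arrays; the LBP parameters, bin count, and MLP layer sizes are illustrative choices, not the authors' settings.

```python
# Sketch: 1D texture features (descriptive statistics + LBP histogram) fed to a
# feed-forward classifier. Parameters are illustrative, not the paper's settings.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

def texture_1d(image, p=8, r=1.0, bins=64):
    """Turn a 2D grayscale image into a fixed-length 1D texture vector."""
    stats = np.array([image.mean(), image.std(), np.median(image),
                      image.min(), image.max()], dtype=np.float64)
    lbp = local_binary_pattern(image, P=p, R=r, method="uniform")
    hist, _ = np.histogram(lbp, bins=bins, density=True)   # normalized LBP histogram
    return np.concatenate([stats, hist])

def fit_classifier(images, labels):
    X = np.stack([texture_1d(img) for img in images])
    y = LabelEncoder().fit_transform(labels)                # string labels -> integers
    clf = make_pipeline(StandardScaler(),
                        MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500))
    return clf.fit(X, y)
```
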
29 pages, 2424 KiB  
Article
Hybrid-DC: A Hybrid Framework Using ResNet-50 and Vision Transformer for Steel Surface Defect Classification in the Rolling Process
by Minjun Jeong, Minyeol Yang and Jongpil Jeong
Electronics 2024, 13(22), 4467; https://doi.org/10.3390/electronics13224467 - 14 Nov 2024
Viewed by 291
Abstract
This study introduces Hybrid-DC, a hybrid deep-learning model integrating ResNet-50 and Vision Transformer (ViT) for high-accuracy steel surface defect classification. Hybrid-DC leverages ResNet-50 for efficient feature extraction at both low and high levels and utilizes ViT’s global context learning to enhance classification precision. A unique hybrid attention layer and an attention fusion mechanism enable Hybrid-DC to adapt to the complex, variable patterns typical of steel surface defects. Experimental evaluations demonstrate that Hybrid-DC achieves substantial accuracy improvements and significantly reduced loss compared to traditional models like MobileNetV2 and ResNet, with a validation accuracy reaching 0.9944. The results suggest that this model, characterized by rapid convergence and stable learning, can be applied for real-time quality control in steel manufacturing and other high-precision industries, enhancing automated defect detection efficiency. Full article
(This article belongs to the Special Issue Fault Detection Technology Based on Deep Learning)
Figures:
Figure 1: Steel manufacturing process.
Figure 2: Hybrid-DC framework.
Figure 3: Six types of defects.
Figure 4: MobileNetV2 model.
Figure 5: ResNet model.
Figure 6: ViT model.
Figure 7: Hybrid-DC model.
Figure 8: MobileNetV2 model.
Figure 9: ResNet model.
Figure 10: ViT model.
Figure 11: Hybrid-DC model.
Figure 12: Crazing defect.
Figure 13: Inclusion defect.
Figure 14: Patches defect.
Figure 15: Pitted defect.
Figure 16: Rolled defect.
Figure 17: Scratches defect.
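
A minimal PyTorch sketch of the dual-backbone idea: pooled ResNet-50 and ViT-B/16 features concatenated and classified by a linear head. The plain concatenation stands in for Hybrid-DC's hybrid attention layer and attention fusion mechanism, which are not reproduced; the six-class output matches the defect count in Figure 3, while the 224-pixel input size is an assumption.

```python
import torch
import torch.nn as nn
from torchvision import models

class ResNetViTFusion(nn.Module):
    """Concatenate ResNet-50 and ViT-B/16 features; a linear head classifies defects.
    Simplified stand-in for Hybrid-DC's attention-based fusion."""
    def __init__(self, num_classes=6):
        super().__init__()
        self.resnet = models.resnet50(weights=None)
        self.resnet.fc = nn.Identity()          # 2048-d pooled CNN features
        self.vit = models.vit_b_16(weights=None)
        self.vit.heads = nn.Identity()          # 768-d class-token features
        self.head = nn.Linear(2048 + 768, num_classes)

    def forward(self, x):                       # x: (B, 3, 224, 224)
        feats = torch.cat([self.resnet(x), self.vit(x)], dim=1)
        return self.head(feats)

logits = ResNetViTFusion()(torch.randn(2, 3, 224, 224))   # -> shape (2, 6)
```
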
27 pages, 6796 KiB  
Article
A Hybrid Deep Learning and Machine Learning Approach with Mobile-EfficientNet and Grey Wolf Optimizer for Lung and Colon Cancer Histopathology Classification
by Raquel Ochoa-Ornelas, Alberto Gudiño-Ochoa and Julio Alberto García-Rodríguez
Cancers 2024, 16(22), 3791; https://doi.org/10.3390/cancers16223791 - 11 Nov 2024
Viewed by 444
Abstract
Background: Lung and colon cancers are among the most prevalent and lethal malignancies worldwide, underscoring the urgent need for advanced diagnostic methodologies. This study aims to develop a hybrid deep learning and machine learning framework for the classification of Colon Adenocarcinoma, Colon Benign Tissue, Lung Adenocarcinoma, Lung Benign Tissue, and Lung Squamous Cell Carcinoma from histopathological images. Methods: Current approaches primarily rely on the LC25000 dataset, which, due to image augmentation, lacks the generalizability required for real-time clinical applications. To address this, Contrast Limited Adaptive Histogram Equalization (CLAHE) was applied to enhance image quality, and 1000 new images from the National Cancer Institute GDC Data Portal were introduced into the Colon Adenocarcinoma, Lung Adenocarcinoma, and Lung Squamous Cell Carcinoma classes, replacing augmented images to increase dataset diversity. A hybrid feature extraction model combining MobileNetV2 and EfficientNetB3 was optimized using the Grey Wolf Optimizer (GWO), resulting in the Lung and Colon histopathological classification technique (MEGWO-LCCHC). Cross-validation and hyperparameter tuning with Optuna were performed on various machine learning models, including XGBoost, LightGBM, and CatBoost. Results: The MEGWO-LCCHC technique achieved high classification accuracy, with the lightweight DNN model reaching 94.8%, LightGBM at 93.9%, XGBoost at 93.5%, and CatBoost at 93.3% on the test set. Conclusions: The findings suggest that our approach enhances classification performance and offers improved generalizability for real-world clinical applications. The proposed MEGWO-LCCHC framework shows promise as a robust tool in cancer diagnostics, advancing the application of AI in oncology. Full article
(This article belongs to the Special Issue Image Analysis and Machine Learning in Cancers)
Figures:
Figure 1: Overview of the proposed MEGWO-LCCHC model for colon and lung cancer detection.
Figure 2: Contrast enhancement of histopathological images using CLAHE: (a) Original image of lung adenocarcinoma tissue; (b) Enhanced image demonstrating improved local contrast and detail in tissue structures.
Figure 3: Sample histopathological images of cancer tissues: (a) Enhanced images of colon adenocarcinoma and colon benign tissue using CLAHE; (b) Enhanced images of lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue using CLAHE.
Figure 4: Confusion matrices illustrating the classification performance of the MEGWO-LCCHC approach across different classifiers: (a) XGBoost during Training; (b) XGBoost during Testing; (c) LightGBM during Training; (d) LightGBM during Testing; (e) CatBoost during Training; (f) CatBoost during Testing; (g) Lightweight DNN during Training; (h) Lightweight DNN during Testing.
Figure 5: Average evaluation metrics of the MEGWO-LCCHC method across the 80:20 training and testing phases for all classifiers: (a) Metrics during the training phase; (b) Metrics during the testing phase.
Figure 6: Confidence interval analysis of testing classification metrics for the MEGWO-LCCHC system in colon and lung cancer detection models: (a) Precision; (b) Recall; (c) F1-score.
Figure 7: Loss and accuracy curves for the MEGWO-LCCHC method with the lightweight DNN: (a) Loss curve; (b) Accuracy curve.
Figure 8: ROC and PR curve analysis of the MEGWO-LCCHC system using an 80:20 training/testing split: (a) ROC curve for CatBoost in the testing phase; (b) PR curve for CatBoost in the testing phase; (c) ROC curve for the Lightweight DNN in the testing phase; (d) PR curve for the Lightweight DNN in the testing phase.
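
A hedged sketch of the hybrid feature-extraction stage: frozen MobileNetV2 and EfficientNet-B3 backbones supply concatenated deep features for a gradient-boosted classifier. The GWO-based optimization, CLAHE preprocessing, and Optuna tuning described in the abstract are omitted, and `X_img`/`y` are assumed to be prepared elsewhere.

```python
import torch
import torch.nn as nn
from torchvision import models
from lightgbm import LGBMClassifier

# Frozen dual-backbone feature extractor (MobileNetV2 + EfficientNet-B3); the GWO
# feature-selection and Optuna tuning steps of MEGWO-LCCHC are not reproduced here.
mobilenet = models.mobilenet_v2(weights="IMAGENET1K_V1")
mobilenet.classifier = nn.Identity()                 # 1280-d pooled features
effnet = models.efficientnet_b3(weights="IMAGENET1K_V1")
effnet.classifier = nn.Identity()                    # 1536-d pooled features

@torch.no_grad()
def hybrid_features(batch):                          # batch: (B, 3, 224, 224) tensor
    mobilenet.eval(); effnet.eval()
    return torch.cat([mobilenet(batch), effnet(batch)], dim=1).numpy()

# X_img: preprocessed image tensor, y: integer labels (assumed to exist):
# clf = LGBMClassifier(n_estimators=300).fit(hybrid_features(X_img), y)
```
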
32 pages, 6809 KiB  
Article
Integrating Deep Learning and Energy Management Standards for Enhanced Solar–Hydrogen Systems: A Study Using MobileNetV2, InceptionV3, and ISO 50001:2018
by Salaki Reynaldo Joshua, Yang Junghyun, Sanguk Park and Kihyeon Kwon
Hydrogen 2024, 5(4), 819-850; https://doi.org/10.3390/hydrogen5040043 - 10 Nov 2024
Viewed by 954
Abstract
This study addresses the growing need for effective energy management solutions in university settings, with particular emphasis on solar–hydrogen systems. The study’s purpose is to explore the integration of deep learning models, specifically MobileNetV2 and InceptionV3, in enhancing fault detection capabilities in AIoT-based environments, while also customizing ISO 50001:2018 standards to align with the unique energy management needs of academic institutions. Our research employs comparative analysis of the two deep learning models in terms of their performance in detecting solar panel defects and assessing accuracy, loss values, and computational efficiency. The findings reveal that MobileNetV2 achieves 80% accuracy, making it suitable for resource-constrained environments, while InceptionV3 demonstrates superior accuracy of 90% but requires more computational resources. The study concludes that both models offer distinct advantages based on application scenarios, emphasizing the importance of balancing accuracy and efficiency when selecting appropriate models for solar–hydrogen system management. This research highlights the critical role of continuous improvement and leadership commitment in the successful implementation of energy management standards in universities. Full article
Figures:
Figure 1: Design approach graph.
Figure 2: Organization Structure at Kangwon National University Samcheok Campus.
Figure 3: The University Organization Structure by Adapting ISO 50001:2018.
Figure 4: Training data.
Figure 5: Loss and accuracy result: (a) MobileNetV2, (b) InceptionV3.
Figure 6: Prediction result: (a) MobileNetV2, (b) InceptionV3. Green Color: Correct Prediction, Red Color: Wrong Prediction.
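
For context, a generic transfer-learning setup of the kind compared here: a pretrained MobileNetV2 with its feature extractor frozen and a new head for solar-panel defect classes (InceptionV3 can be adapted with the same pattern). The class count and learning rate are assumptions, not values from the study.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_defect_classifier(num_classes=4, freeze_backbone=True):
    """MobileNetV2 transfer learning: keep ImageNet features, retrain only the head.
    num_classes is an assumption; InceptionV3 follows the same head-replacement idea."""
    model = models.mobilenet_v2(weights="IMAGENET1K_V1")
    if freeze_backbone:
        for p in model.features.parameters():
            p.requires_grad = False
    model.classifier[1] = nn.Linear(model.classifier[1].in_features, num_classes)
    return model

model = build_defect_classifier()
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)  # lr is illustrative
```
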
22 pages, 12107 KiB  
Article
Deep Learning-Based Classification of Macrofungi: Comparative Analysis of Advanced Models for Accurate Fungi Identification
by Sifa Ozsari, Eda Kumru, Fatih Ekinci, Ilgaz Akata, Mehmet Serdar Guzel, Koray Acici, Eray Ozcan and Tunc Asuroglu
Sensors 2024, 24(22), 7189; https://doi.org/10.3390/s24227189 - 9 Nov 2024
Viewed by 571
Abstract
This study focuses on the classification of six different macrofungi species using advanced deep learning techniques. Fungi species, such as Amanita pantherina, Boletus edulis, Cantharellus cibarius, Lactarius deliciosus, Pleurotus ostreatus and Tricholoma terreum were chosen based on their ecological importance and distinct morphological characteristics. The research employed 5 different machine learning techniques and 12 deep learning models, including DenseNet121, MobileNetV2, ConvNeXt, EfficientNet, and swin transformers, to evaluate their performance in identifying fungi from images. The DenseNet121 model demonstrated the highest accuracy (92%) and AUC score (95%), making it the most effective in distinguishing between species. The study also revealed that transformer-based models, particularly the swin transformer, were less effective, suggesting room for improvement in their application to this task. Further advancements in macrofungi classification could be achieved by expanding datasets, incorporating additional data types such as biochemical, electron microscopy, and RNA/DNA sequences, and using ensemble methods to enhance model performance. The findings contribute valuable insights into both the use of deep learning for biodiversity research and the ecological conservation of macrofungi species. Full article
Figures:
Figure 1: Overview of datasets utilized for training AI algorithms, presented from a macroscopic perspective.
Figure 2: Validation accuracy.
Figure 3: ROC curve.
Figure 4: Images without Grad-CAM visualization.
Figure 5: ConvNeXt Grad-CAM visualization.
Figure 6: EfficientNet Grad-CAM visualization.
Figure 7: DenseNet121, InceptionV3, and InceptionResNetV2 Grad-CAM visualization.
Figure 8: MobileNetV2, ResNet152, and Xception Grad-CAM visualization.
Figure 9: Different levels of Gaussian white noise [40].
Figure 10: DenseNet121 and MobileNetV2 Grad-CAM visualization on SNR-10 noisy images.
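
The robustness experiments in Figures 9 and 10 corrupt images with Gaussian white noise at fixed signal-to-noise ratios; a small sketch of that corruption step follows, assuming float NumPy images (the SNR levels shown are illustrative).

```python
import numpy as np

def add_gaussian_noise_snr(image, snr_db=10, rng=None):
    """Add zero-mean Gaussian white noise so the result has the requested SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(image.astype(np.float64) ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=image.shape)
    return image + noise

# Example: corrupt the test set at several SNR levels before re-running evaluation.
# noisy = [add_gaussian_noise_snr(img, snr_db=s) for s in (20, 10, 5) for img in images]
```
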
18 pages, 8490 KiB  
Article
Wildfire Identification Based on an Improved MobileNetV3-Small Model
by Guo-Xing Shi, Yi-Na Wang, Zhen-Fa Yang, Ying-Qing Guo and Zhi-Wei Zhang
Forests 2024, 15(11), 1975; https://doi.org/10.3390/f15111975 - 8 Nov 2024
Viewed by 289
Abstract
In this paper, an improved MobileNetV3-Small algorithm model is proposed for the problem of poor real-time wildfire identification based on convolutional neural networks (CNNs). Firstly, a wildfire dataset is constructed and subsequently expanded through image enhancement techniques. Secondly, an efficient channel attention mechanism (ECA) is utilised instead of the Squeeze-and-Excitation (SE) module within the MobileNetV3-Small model to enhance the model’s identification speed. Lastly, a support vector machine (SVM) is employed to replace the classification layer of the MobileNetV3-Small model, with principal component analysis (PCA) applied before the SVM to reduce the dimensionality of the features, thereby enhancing the SVM’s identification efficiency. The experimental results demonstrate that the improved model achieves an accuracy of 98.75% and an average frame rate of 93. Compared to the initial model, the mean frame rate has been elevated by 7.23. The wildfire identification model designed in this paper improves the speed of identification while maintaining accuracy, thereby advancing the development and application of CNNs in the field of wildfire monitoring. Full article
(This article belongs to the Section Natural Hazards and Risk Management)
Figures:
Figure 1: Images of wildfire.
Figure 2: Images of routine forest.
Figure 3: Images of interference forest.
Figure 4: Resultant diagram of adaptive histogram equalisation.
Figure 5: Comparison of before and after treatment.
Figure 6: Simplified architecture diagram of MobileNetV3-Small.
Figure 7: Structure diagram of bottleneck.
Figure 8: Schematic diagram of ordinary convolution and DSC.
Figure 9: Structure diagram of SE attention mechanism.
Figure 10: Sketch of residual and inverted residual structure.
Figure 11: Structure diagram of ECA attention mechanism.
Figure 12: Sketch of improved MobileNetV3-Small structure.
Figure 13: Accuracy curves for different learning rates.
Figure 14: Comparison diagram of the loss values before and after image enhancement.
Figure 15: Comparison diagram of accuracy (a), precision (b), recall (c), and F1 score (d).
Figure 16: ROC curves based on ECA, ECA_SVM and EPS models.
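
A compact PyTorch sketch of the ECA channel-attention block that replaces the SE module in the improved MobileNetV3-Small; the kernel size is the common default from the ECA paper, not necessarily this study's setting, and the downstream PCA + SVM stage is only noted in a comment.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: global average pooling, a 1D convolution across
    channels, then a sigmoid gate (no channel dimensionality reduction)."""
    def __init__(self, k_size=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=k_size // 2, bias=False)
        self.gate = nn.Sigmoid()

    def forward(self, x):                                  # x: (B, C, H, W)
        y = self.pool(x)                                   # (B, C, 1, 1)
        y = self.conv(y.squeeze(-1).transpose(-1, -2))     # (B, 1, C)
        y = self.gate(y.transpose(-1, -2).unsqueeze(-1))   # (B, C, 1, 1)
        return x * y

# In the paper's pipeline, backbone features would then be reduced with PCA and
# classified by an SVM (e.g., scikit-learn PCA + SVC).
out = ECA()(torch.randn(2, 40, 14, 14))                    # shape preserved
```
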
19 pages, 7911 KiB  
Article
A Multiclassification Model for Skin Diseases Using Dermatoscopy Images with Inception-v2
by Shulong Zhi, Zhenwei Li, Xiaoli Yang, Kai Sun and Jiawen Wang
Appl. Sci. 2024, 14(22), 10197; https://doi.org/10.3390/app142210197 - 6 Nov 2024
Viewed by 449
Abstract
Skin cancer represents a significant global public health concern, with over five million new cases diagnosed annually. If not diagnosed at an early stage, skin diseases have the potential to pose a significant threat to human life. In recent years, deep learning has increasingly been used in dermatological diagnosis. In this paper, a multiclassification model based on the Inception-v2 network and the focal loss function is proposed on the basis of deep learning, and the ISIC 2019 dataset is optimised using data augmentation and hair removal to achieve seven classifications of dermatological images and generate heat maps to visualise the predictions of the model. The results show that the model has an average accuracy of 89.04%, a precision of 87.37%, a recall of 90.15%, and an F1-score of 88.76%. The accuracy rates of ResNeXt101, MobileNetV2, VGG19, and ConvNet are 88.50%, 85.30%, 88.57%, and 86.90%, respectively. These results show that our proposed model performs better than the above models and performs well in classifying dermatological images, which has significant application value.
Figures:
Figure 1: General flowchart of the proposed method.
Figure 2: Diagram of the original Inception model.
Figure 3: Inception-v2 model diagram.
Figure 4: Subplot of the focal loss calculation equation.
Figure 5: Examples of seven dermatologic images.
Figure 6: Example of image enhancement. (a) Original image in the base training set; (b) image after data enhancement.
Figure 7: Distribution of dataset after image enhancement.
Figure 8: Example of hair removal.
Figure 9: Confusion matrix plots and ROC curves for classification results obtained using the focal loss function. (a,b) Resnet-50, (c,d) Densenet, and (e,f) Inception-v2.
Figure 10: Accuracy curves for classification results obtained using the focal loss function. (a) Resnet-50, (b) Densenet, and (c) Inception-v2.
Figure 11: Loss curves for classification results obtained using the focal loss function. (a) Resnet-50, (b) Densenet, and (c) Inception-v2.
Figure 12: Histogram of classification results based on focal loss. (a) Resnet-50, (b) Densenet, and (c) Inception-v2.
Figure 13: Grad-CAM heat maps of dermatological images for the network proposed in this paper.
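
A short PyTorch sketch of the multi-class focal loss paired here with Inception-v2; gamma and alpha are the commonly cited defaults rather than the paper's tuned values.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Multi-class focal loss: FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t).
    Down-weights easy examples so training focuses on hard, minority-class cases."""
    log_probs = F.log_softmax(logits, dim=1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)   # log p_t
    pt = log_pt.exp()
    return (-alpha * (1.0 - pt) ** gamma * log_pt).mean()

# Example with 7 skin-lesion classes and a batch of 4:
loss = focal_loss(torch.randn(4, 7), torch.tensor([0, 3, 6, 2]))
```
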
22 pages, 5584 KiB  
Article
Enhanced Magnetic Resonance Imaging-Based Brain Tumor Classification with a Hybrid Swin Transformer and ResNet50V2 Model
by Abeer Fayez Al Bataineh, Khalid M. O. Nahar, Hayel Khafajeh, Ghassan Samara, Raed Alazaidah, Ahmad Nasayreh, Ayah Bashkami, Hasan Gharaibeh and Waed Dawaghreh
Appl. Sci. 2024, 14(22), 10154; https://doi.org/10.3390/app142210154 - 6 Nov 2024
Viewed by 474
Abstract
Brain tumors can be serious; consequently, rapid and accurate detection is crucial. Nevertheless, a variety of obstacles, such as poor imaging resolution, doubts over the accuracy of data, a lack of diverse tumor classes and stages, and the possibility of misunderstanding, present challenges to achieving an accurate and final diagnosis. Effective brain cancer detection is crucial for patients’ safety and health. Deep learning systems provide the capability to assist radiologists in quickly and accurately detecting diagnoses. This study presents an innovative deep learning approach that utilizes the Swin Transformer. The suggested method entails integrating the Swin Transformer with the pretrained deep learning model ResNet50V2, called SwT+ResNet50V2. The objective of this modification is to decrease memory utilization, enhance classification accuracy, and reduce training complexity. The self-attention mechanism of the Swin Transformer identifies distant relationships and captures the overall context. ResNet50V2 improves both accuracy and training speed by extracting adaptive features from the Swin Transformer’s dependencies. We evaluate the proposed framework using two publicly accessible brain magnetic resonance imaging (MRI) datasets, including two and four distinct classes, respectively. Employing data augmentation and transfer learning techniques enhances model performance, leading to more dependable and cost-effective training. The suggested model achieves an impressive accuracy of 99.9% on the binary-labeled dataset and 96.8% on the four-labeled dataset, outperforming the VGG16, MobileNetV2, ResNet50V2, EfficientNetV2B3, ConvNeXtTiny, and convolutional neural network (CNN) algorithms used for comparison. This demonstrates that the Swin Transformer, when combined with ResNet50V2, is capable of accurately diagnosing brain tumors. This method leverages the combination of SwT+ResNet50V2 to create an innovative diagnostic tool. Radiologists have the potential to accelerate and improve the detection of brain tumors, leading to improved patient outcomes and reduced risks.
(This article belongs to the Special Issue Advances in Bioinformatics and Biomedical Engineering)
Figures:
Figure 1: Workflow diagram of the proposed brain tumor detection method.
Figure 2: Architecture of Swin Transformer.
Figure 3: Architecture of Resnet50V2.
Figure 4: Instances of the types of brain tumor in MRI images.
Figure 5: Performance evaluation on Bra35H dataset.
Figure 6: Training and validation metrics (accuracy and loss) for (SwT+Resnet50V2) on Bra35H dataset.
Figure 7: Comparison of confusion matrices for all models using the Bra35H dataset.
Figure 8: Performance evaluation on Kaggle dataset.
Figure 9: Training and validation metrics (accuracy and loss) for (SwT+Resnet50V2) on Kaggle dataset.
Figure 10: Comparison of confusion matrices for all models using the Kaggle dataset.
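
A small sketch of the augmentation-plus-normalization pipeline implied by the abstract's use of data augmentation and transfer learning on MRI slices; the specific transforms and parameters are assumptions, and the resulting tensors would feed the SwT+ResNet50V2 hybrid.

```python
from torchvision import transforms

# Illustrative training-split augmentation for MRI slices; parameters are assumptions.
train_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ColorJitter(brightness=0.1, contrast=0.1),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
# Applied per image (e.g., via torchvision.datasets.ImageFolder(root, transform=train_tf))
# before batches reach the hybrid classifier.
```
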
23 pages, 5919 KiB  
Article
Research on Soybean Seedling Stage Recognition Based on Swin Transformer
by Kai Ma, Jinkai Qiu, Ye Kang, Liqiang Qi, Wei Zhang, Song Wang and Xiuying Xu
Agronomy 2024, 14(11), 2614; https://doi.org/10.3390/agronomy14112614 - 6 Nov 2024
Viewed by 613
Abstract
Accurate identification of the second and third compound leaf periods of soybean seedlings is a prerequisite to ensure that soybeans are chemically weeded after seedling at the optimal application period. Accurate identification of the soybean seedling period is susceptible to natural light and complex field background factors. A transfer learning-based Swin-T (Swin Transformer) network is proposed to recognize different stages of the soybean seedling stage. A drone was used to collect images of soybeans at the true leaf stage, the first compound leaf stage, the second compound leaf stage, and the third compound leaf stage, and data enhancement methods such as image rotation and brightness enhancement were used to expand the dataset, simulate the drone’s collection of images at different shooting angles and weather conditions, and enhance the adaptability of the model. The field environment and shooting equipment directly affect the quality of the captured images, and in order to test the anti-interference ability of different models, the Gaussian blur method was used to blur the images of the test set to different degrees. The Swin-T model was optimized by introducing transfer learning and combining hyperparameter combination experiments and optimizer selection experiments. The performance of the optimized Swin-T model was compared with the MobileNetV2, ResNet50, AlexNet, GoogleNet, and VGG16Net models. The results show that the optimized Swin-T model has an average accuracy of 98.38% in the test set, which is an improvement of 11.25%, 12.62%, 10.75%, 1.00%, and 0.63% compared with the MobileNetV2, ResNet50, AlexNet, GoogleNet, and VGG16Net models, respectively. The optimized Swin-T model is best in terms of recall and F1 score. In the performance degradation test of the motion blur level model, the maximum degradation accuracy, overall degradation index, and average degradation index of the optimized Swin-T model were 87.77%, 6.54%, and 2.18%, respectively. The maximum degradation accuracy was 7.02%, 7.48%, 10.15%, 3.56%, and 2.5% higher than the MobileNetV2, ResNet50, AlexNet, GoogleNet, and VGG16Net models, respectively. In the performance degradation test of the Gaussian fuzzy level models, the maximum degradation accuracy, overall degradation index, and average degradation index of the optimized Swin-T model were 94.3%, 3.85%, and 1.285%, respectively. Compared with the MobileNetV2, ResNet50, AlexNet, GoogleNet, and VGG16Net models, the maximum degradation accuracy was 12.13%, 15.98%, 16.7%, 2.2%, and 1.5% higher, respectively. Taking into account various degradation indicators, the Swin-T model can still maintain high recognition accuracy and demonstrate good anti-interference ability even when inputting blurry images caused by interference in shooting. It can meet the recognition of different growth stages of soybean seedlings in complex environments, providing a basis for post-seedling chemical weed control during the second and third compound leaf stages of soybeans. Full article
(This article belongs to the Section Precision and Digital Agriculture)
Figures:
Figure 1: Partially acquired visible spectral image of soybean seedlings. (a) Image of the true leaf period of soybeans. (b) Image of the first compound leaf stage of soybeans. (c) Image of the second compound leaf stage of soybeans. (d) Image of the third compound leaf stage of soybeans.
Figure 2: Image samples of soybean seedlings at different growth stages. (a) Sample of the true leaf stage of soybeans. (b) Sample of the first compound leaf stage of soybeans. (c) Sample of the second compound leaf stage of soybeans. (d) Sample of the third compound leaf stage of soybeans.
Figure 3: Example of data augmentation for the second compound leaf period of soybeans. (a) Original figure. (b) HSV data enhancement. (c) Revolved image. (d) Contrast enhancement. (e) Brightness adjustment.
Figure 4: Different blur radius r processing effects: (a) original figure; (b) r = 1; (c) r = 2; (d) r = 3; and (e) r = 4.
Figure 5: The motion blur processing effect of different blur levels: (a) original figure; (b) f = 1; (c) f = 2; (d) f = 3; and (e) f = 4.
Figure 6: General framework diagram of the Swin-T network.
Figure 7: The implementation process of the patch merging layer in the Swin-T architecture.
Figure 8: Swin-T Block structure in the Swin-T architecture.
Figure 9: MSA and W-MSA window partitioning mechanism. (a) MSA. (b) W-MSA.
Figure 10: Description of the shift window process. (a) SW-MSA. (b) Cyclic shift. (c) Masked MSA. (d) Reverse cyclic shift.
Figure 11: Comparison of the accuracy and loss values of different optimizers. (a) Optimizer accuracy value. (b) Optimizer loss value.
Figure 12: Comparison of accuracy and loss values of different model training sets. (a) Training set accuracy. (b) Training set loss.
Figure 13: Confusion matrix plot of the recognition results from different classification models. (a) MobileNetV2; (b) ResNet50; (c) AlexNet; (d) GoogleNet; (e) VGG16Net; and (f) optimized Swin-T.
Figure 14: The influence of different levels of motion blur on the classification accuracy of different models.
Figure 15: Classification accuracy of different models under different blur radii.
Figure 16: Comparison of the recognition results of the Swin-T model trained on different data. (a) Recognition results of the Swin-T model without data enhancement training. (b) Recognition results of the Swin-T model trained with data augmentation.
Figure 17: Comparison of the thermal power of different models at different soybean seedling stages.
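
A sketch of the blur-degradation test used to probe anti-interference ability: each test image is Gaussian-blurred at increasing radii before re-evaluation. PIL is assumed for image handling, the radii mirror the r = 1–4 levels of Figure 4, and `evaluate()` is a placeholder.

```python
from PIL import Image, ImageFilter

def blur_levels(image_path, radii=(1, 2, 3, 4)):
    """Return one Gaussian-blurred copy of the image per blur radius."""
    img = Image.open(image_path).convert("RGB")
    return {r: img.filter(ImageFilter.GaussianBlur(radius=r)) for r in radii}

# Degradation test: re-run the trained classifier on each blurred copy and compare
# accuracy against the clean test set to obtain per-level accuracy drops.
# for r, blurred in blur_levels("soybean.jpg").items():       # path is illustrative
#     acc[r] = evaluate(model, blurred)                       # evaluate() not shown
```
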
15 pages, 4974 KiB  
Article
High-Precision and Lightweight Model for Rapid Safety Helmet Detection
by Xuejun Jia, Xiaoxiong Zhou, Chunyi Su, Zhihan Shi, Xiaodong Lv, Chao Lu and Guangming Zhang
Sensors 2024, 24(21), 6985; https://doi.org/10.3390/s24216985 - 30 Oct 2024
Viewed by 399
Abstract
This paper presents significant improvements in the accuracy and computational efficiency of safety helmet detection within industrial environments through the optimization of the you only look once version 5 small (YOLOv5s) model structure and the enhancement of its loss function. We introduce the convolutional block attention module (CBAM) to bolster the model’s sensitivity to key features, thereby enhancing detection accuracy. To address potential performance degradation issues associated with the complete intersection over union (CIoU) loss function in the original model, we implement the modified penalty-decay intersection over union (MPDIoU) loss function to achieve more stable and precise bounding box regression. Furthermore, considering the original YOLOv5s model’s large parameter count, we adopt a lightweight design using the MobileNetV3 architecture and replace the original squeeze-and-excitation (SE) attention mechanism with CBAM, significantly reducing computational complexity. These improvements reduce the model’s computational cost from 15.7 GFLOPs to 5.7 GFLOPs while increasing the mean average precision (mAP) from 82.34% to 91.56%, demonstrating its superior performance and potential value in practical industrial applications.
Figures:
Figure 1: YOLOv5 network structure.
Figure 2: CBAM module structure.
Figure 3: MPDIoU schematic diagram.
Figure 4: Improved MobileNetV3 network structure.
Figure 5: Training samples.
Figure 6: PR curve.
Figure 7: mAP@0.5 curve.
Figure 8: Comparison of basic YOLOv5s and improved YOLOv5s examples. (Left) Basic YOLOv5s; (right) improved YOLOv5s.
Figure 9: Images of the self-made small target dataset.
Figure 10: Self-made dataset training results.
Figure 11: Small object detection images. (Upper) Basic YOLOv5s; (down) improved YOLOv5s.
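
A PyTorch sketch of a CBAM block of the kind inserted into YOLOv5s: channel attention from pooled descriptors followed by 7×7 spatial attention. The reduction ratio and kernel size follow the original CBAM paper rather than this study, and the MPDIoU loss is not reproduced.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention, then spatial attention."""
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        self.mlp = nn.Sequential(                      # shared MLP for avg/max descriptors
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * self.sigmoid(avg + mx)                 # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * self.sigmoid(self.spatial(s))       # spatial attention

out = CBAM(64)(torch.randn(1, 64, 32, 32))             # shape preserved
```
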
27 pages, 1362 KiB  
Article
Real-Time Forest Fire Detection with Lightweight CNN Using Hierarchical Multi-Task Knowledge Distillation
by Ismail El-Madafri, Marta Peña and Noelia Olmedo-Torre
Fire 2024, 7(11), 392; https://doi.org/10.3390/fire7110392 - 30 Oct 2024
Viewed by 563
Abstract
Forest fires pose a significant threat to ecosystems, property, and human life, making their early and accurate detection crucial for effective intervention. This study presents a novel, lightweight approach to real-time forest fire detection that is optimized for resource-constrained devices like drones. The method integrates multi-task knowledge distillation, transferring knowledge from a high-performance DenseNet201 teacher model that was trained on a hierarchically structured wildfire dataset. The dataset comprised primary classes (fire vs. non-fire) and detailed subclasses that account for confounding elements such as smoke, fog, and reflections. The novelty of this approach lies in leveraging knowledge distillation to transfer the deeper insights learned by the DenseNet201 teacher model—specifically, the auxiliary task of recognizing the confounding elements responsible for false positives—into a lightweight student model, enabling it to achieve a similar robustness without the need for complex architectures. Using this distilled knowledge, we trained a MobileNetV3-based student model, which was designed to operate efficiently in real-time while maintaining a low computational overhead. To address the challenge of false positives caused by visually similar non-fire elements, we introduced the Confounding Element Specificity (CES) metric. This novel metric, made possible by the hierarchical structure of the wildfire dataset, is unique in its focus on evaluating how well the model distinguishes actual fires from the confounding elements that typically result in false positives within the negative class. The proposed approach outperformed the baseline methods—including single-task learning and direct multi-task learning—achieving a primary accuracy of 93.36%, an F1-score of 91.57%, and a higher MES score, demonstrating its enhanced robustness and reliability in diverse environmental conditions. This work bridges the gap between advanced deep learning techniques and practical, scalable solutions for environmental monitoring. Future research will focus on integrating multi-modal data and developing adaptive distillation techniques to further enhance the model’s performance in real-time applications. Full article
Figures:
Figure 1: Images from the wildfire dataset showcasing various scenarios: an active wildfire with smoke (top-left), smoke from a fire that is difficult to detect (top-right), misleading elements that may resemble smoke or fire (bottom-left), and a mixed landscape view (bottom-right). These examples highlight the diversity of conditions and perspectives essential for training the detection model.
Figure 2: Flowchart of the proposed wildfire detection approach, showing the stages from teacher model training to knowledge distillation and final student model optimization.
Figure 3: Grad-CAM visualizations showing the areas of focus for the fire classification model across different scenarios. These images are randomly selected from the test dataset and represent a variety of fire and non-fire situations, including true positives, true negatives, and false positives. The original image is shown alongside the Grad-CAM heatmap, which highlights the regions that contributed to the model’s decision. The images are numbered from left to right, top to bottom.
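
A minimal PyTorch sketch of the distillation step: the student is trained on a blend of hard-label cross-entropy and temperature-scaled KL divergence against frozen teacher logits. The temperature, weighting, and the paper's hierarchical auxiliary task are simplifications or assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with temperature-scaled KL to the teacher.
    T and alpha are illustrative; the hierarchical multi-task heads are omitted."""
    hard = F.cross_entropy(student_logits, targets)
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                                   # standard T^2 scaling
    return alpha * hard + (1.0 - alpha) * soft

# Per batch: the teacher runs in eval mode under torch.no_grad(); only the student updates.
loss = distillation_loss(torch.randn(8, 2), torch.randn(8, 2), torch.randint(0, 2, (8,)))
```
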
15 pages, 18517 KiB  
Article
Rice Leaf Disease Classification—A Comparative Approach Using Convolutional Neural Network (CNN), Cascading Autoencoder with Attention Residual U-Net (CAAR-U-Net), and MobileNet-V2 Architectures
by Monoronjon Dutta, Md Rashedul Islam Sujan, Mayen Uddin Mojumdar, Narayan Ranjan Chakraborty, Ahmed Al Marouf, Jon G. Rokne and Reda Alhajj
Technologies 2024, 12(11), 214; https://doi.org/10.3390/technologies12110214 - 29 Oct 2024
Viewed by 1349
Abstract
Classifying rice leaf diseases in agricultural technology helps to maintain crop health and to ensure a good yield. In this work, deep learning algorithms were, therefore, employed for the identification and classification of rice leaf diseases from images of crops in the field. The initial algorithmic phase involved image pre-processing of the crop images, using a bilateral filter to improve image quality. The effectiveness of this step was measured by using metrics like the Structural Similarity Index (SSIM) and the Peak Signal-to-Noise Ratio (PSNR). Following this, this work employed advanced neural network architectures for classification, including Cascading Autoencoder with Attention Residual U-Net (CAAR-U-Net), MobileNetV2, and Convolutional Neural Network (CNN). The proposed CNN model stood out, since it demonstrated exceptional performance in identifying rice leaf diseases, with test Accuracy of 98% and high Precision, Recall, and F1 scores. This result highlights that the proposed model is particularly well suited for rice leaf disease classification. The robustness of the proposed model was validated through k-fold cross-validation, confirming its generalizability and minimizing the risk of overfitting. This study not only focused on classifying rice leaf diseases but also has the potential to benefit farmers and the agricultural community greatly. This work highlights the advantages of custom CNN models for efficient and accurate rice leaf disease classification, paving the way for technology-driven advancements in farming practices. Full article
(This article belongs to the Section Information and Communication Technologies)
Figures:
Figure 1: Proposed methodology for rice leaf disease classification.
Figure 2: Sample of dataset.
Figure 3: The CNN model architecture visualization.
Figure 4: The CAAR-U-Net model architecture.
Figure 5: The MobileNetV2 model architecture.
Figure 6: The CAAR-U-Net model’s (a) training and validation loss and (b) training and validation accuracy curves.
Figure 7: The MobileNetV2 model’s (a) training and validation loss and (b) training and validation accuracy curves.
Figure 8: The CNN model’s (a) training and validation loss and (b) training and validation accuracy curves.
Figure 9: Confusion matrix for the CNN (a) and MobileNetV2 (b) models.
Figure 10: ROC for CNN (a) and MobileNetV2 (b) models.
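
A sketch of the preprocessing and quality-check step described above: bilateral filtering with OpenCV, scored against the original image with PSNR and SSIM from scikit-image. The filter parameters and file path are illustrative.

```python
import cv2
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def denoise_and_score(bgr_image, d=9, sigma_color=75, sigma_space=75):
    """Bilateral filter (edge-preserving smoothing) plus PSNR/SSIM quality metrics."""
    filtered = cv2.bilateralFilter(bgr_image, d, sigma_color, sigma_space)
    psnr = peak_signal_noise_ratio(bgr_image, filtered)
    ssim = structural_similarity(bgr_image, filtered, channel_axis=2)
    return filtered, psnr, ssim

# img = cv2.imread("rice_leaf.jpg")              # path is illustrative
# filtered, psnr, ssim = denoise_and_score(img)
```
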
16 pages, 3950 KiB  
Article
MRI-Driven Alzheimer’s Disease Diagnosis Using Deep Network Fusion and Optimal Selection of Feature
by Muhammad Umair Ali, Shaik Javeed Hussain, Majdi Khalid, Majed Farrash, Hassan Fareed M. Lahza and Amad Zafar
Bioengineering 2024, 11(11), 1076; https://doi.org/10.3390/bioengineering11111076 - 28 Oct 2024
Viewed by 648
Abstract
Alzheimer’s disease (AD) is a degenerative neurological condition characterized by cognitive decline, memory loss, and reduced everyday function, which eventually causes dementia. Symptoms develop years after the disease begins, making early detection difficult. While AD remains incurable, timely detection and prompt treatment can substantially slow its progression. This study presented a framework for automated AD detection using brain MRIs. Firstly, the deep network information (i.e., features) were extracted using various deep-learning networks. The information extracted from the best deep networks (EfficientNet-b0 and MobileNet-v2) were merged using the canonical correlation approach (CCA). The CCA-based fused features resulted in an enhanced classification performance of 94.7% with a large feature vector size (i.e., 2532). To remove the redundant features from the CCA-based fused feature vector, the binary-enhanced WOA was utilized for optimal feature selection, which yielded an average accuracy of 98.12 ± 0.52 (mean ± standard deviation) with only 953 features. The results were compared with other optimal feature selection techniques, showing that the binary-enhanced WOA results are statistically significant (p < 0.01). The ablation study was also performed to show the significance of each step of the proposed methodology. Furthermore, the comparison shows the superiority and high classification performance of the proposed automated AD detection approach, suggesting that the hybrid approach may help doctors with dementia detection and staging. Full article
Figures:
Figure 1: AD detection and staging using deep feature fusion and optimal feature selection approach.
Figure 2: Deep feature extraction using modified AlexNet using transfer learning.
Figure 3: Classification performance comparison of various deep features for AD detection.
Figure 4: (a) Number of features used to subclassify brain MRI images; (b) processing time taken by each approach.
Figure 5: Result for dementia identification and staging using a hybrid deep feature fusion and optimal feature selection approach.
Figure 6: Ablation study results for AD detection (for ten runs). MN-v2, MobileNet-v2; EN-b0, EfficientNet-b0; CCA, canonical correlation analysis; WOA, whale optimization algorithm; b-EWOA, binary-enhanced whale optimization algorithm.
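
A sketch of CCA-based fusion of two deep-feature matrices (e.g., EfficientNet-b0 and MobileNet-v2 activations) with scikit-learn; the number of canonical components is an assumption, and the binary-enhanced WOA selection stage is not reproduced.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def cca_fuse(feats_a, feats_b, n_components=128):
    """Project two deep-feature matrices (n_samples x dims) onto their canonical
    directions and concatenate the projections as the fused representation."""
    cca = CCA(n_components=n_components, max_iter=1000)
    a_c, b_c = cca.fit_transform(feats_a, feats_b)
    return np.hstack([a_c, b_c])                  # fused vector, 2 * n_components dims

# feats_a: EfficientNet-b0 features, feats_b: MobileNet-v2 features (assumed arrays).
# fused = cca_fuse(feats_a, feats_b)
# The binary-enhanced WOA would then select an optimal subset of these columns.
```
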
19 pages, 5336 KiB  
Article
Enhancing Situational Awareness with VAS-Compass Net for the Recognition of Directional Vehicle Alert Sounds
by Chiun-Li Chin, Jun-Ren Chen, Wan-Xuan Lin, Hsuan-Chiao Hung, Shang-En Chiang, Chih-Hui Wang, Liang-Ching Lee and Shing-Hong Liu
Sensors 2024, 24(21), 6841; https://doi.org/10.3390/s24216841 - 24 Oct 2024
Viewed by 592
Abstract
People with hearing impairments often face increased risks related to traffic accidents due to their reduced ability to perceive surrounding sounds. Given the cost and usage limitations of traditional hearing aids and cochlear implants, this study aims to develop a sound alert assistance system (SAAS) to enhance situational awareness and improve travel safety for people with hearing impairments. We proposed the VAS-Compass Net (Vehicle Alert Sound–Compass Net), which integrates three lightweight convolutional neural networks: EfficientNet-lite0, MobileNetV3-Small, and GhostNet. Through employing a fuzzy ranking ensemble technique, our proposed model can identify different categories of vehicle alert sounds and directions of sound sources on an edge computing device. The experimental dataset consisted of images derived from the sounds of approaching police cars, ambulances, fire trucks, and car horns from various directions. The audio signals were converted into spectrogram images and Mel-frequency cepstral coefficient images, and they were fused into a complete image using image stitching techniques. We successfully deployed our proposed model on a Raspberry Pi 5 microcomputer, paired with a customized smartwatch to realize an SAAS. Our experimental results demonstrated that VAS-Compass Net achieved an accuracy of 84.38% based on server-based computing and an accuracy of 83.01% based on edge computing. Our proposed SAAS has the potential to significantly enhance the situational awareness, alertness, and safety of people with hearing impairments on the road. Full article
(This article belongs to the Special Issue Wearable Robotics and Assistive Devices)
Figures:
Figure 1: The architecture and the operational process of the proposed SAAS.
Figure 2: Audio signals are converted to image samples in four steps, including audio data collection, preprocessing and augmentation, image formation, and image stitching. The x-axis of the MFCC images represents the time sequence, and the y-axis represents the 13 MFCCs. The x-axis of the spectrogram images represents the time sequence, and the y-axis represents the spectrum. For a detailed explanation, please refer to Section 3.4.
Figure 3: Clarity scores of audio signals at different augmented factors.
Figure 4: The stitched images include three spectrogram images and three MFCC images, respectively. In each image, the x-axis represents the time sequence, while the y-axis represents the spectrum or the MFCCs. For a detailed explanation, please refer to Section 3.4.
Figure 5: The architecture of VAS-Compass Net.
Figure 6: The circuit diagram of the alert output device in the SAAS.
Figure 7: The main hardware components of the SAAS and their ideal placement on the user’s body.
Figure 8: The confusion matrix of VAS-Compass Net in the server-based environment.
Figure 9: The confusion matrix of VAS-Compass Net in the edge computing device.
Figure 10: Spectrogram images of a police car siren collected by the device (a) on the left side, (b) at the rear, and (c) on the right side. In each spectrogram image, the x-axis represents the time sequence and the y-axis represents the spectrum.
Figure 11: MFCC images of a police car siren collected by the device (a) on the left side, (b) at the rear, and (c) on the right side. In each MFCC image, the x-axis represents the time sequence and the y-axis represents the spectrum.
Figure 12: Different stitching images for spectrogram images and MFCC images: (a) Spectral_H, (b) MFCC_H, (c) Spectral_V, and (d) MFCC_V.
Figure 13: The SAAS indicates a police siren approaching the user from the left side.
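
A sketch of the signal-to-image step described above: a Mel spectrogram and 13 MFCCs computed with librosa and stacked into one stitched array for the CNN ensemble. The sampling rate, Mel-band count, and vertical stitching layout are assumptions rather than the paper's exact configuration.

```python
import numpy as np
import librosa

def audio_to_stitched_features(path, sr=22050, n_mels=64, n_mfcc=13):
    """Convert an alert-sound clip into a stitched spectrogram + MFCC feature array."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.power_to_db(
        librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels), ref=np.max)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Vertical stitching: stack the two time-aligned representations row-wise.
    return np.vstack([mel, mfcc])                 # shape: (n_mels + n_mfcc, frames)

# features = audio_to_stitched_features("siren_left.wav")   # path is illustrative
```
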
14 pages, 12763 KiB  
Article
Semantic Segmentation Model-Based Boundary Line Recognition Method for Wheat Harvesting
by Qian Wang, Wuchang Qin, Mengnan Liu, Junjie Zhao, Qingzhen Zhu and Yanxin Yin
Agriculture 2024, 14(10), 1846; https://doi.org/10.3390/agriculture14101846 - 19 Oct 2024
Viewed by 724
Abstract
The wheat harvesting boundary line is vital reference information for the path tracking of an autonomously driving combine harvester. However, unfavorable factors, such as a complex light environment, tree shade, weeds, and wheat stubble color interference in the field, make it challenging to identify the wheat harvest boundary line accurately and quickly. Therefore, this paper proposes a harvest boundary line recognition model for wheat harvesting based on the MV3_DeepLabV3+ network framework, which can quickly and accurately complete the identification in complex environments. The model uses the lightweight MobileNetV3_Large as the backbone network and the LeakyReLU activation function to avoid the neural death problem. Depth-separable convolution is introduced into Atrous Spatial Pyramid Pooling (ASPP) to reduce the complexity of network parameters. The cubic B-spline curve-fitting method extracts the wheat harvesting boundary line. A prototype harvester for wheat harvesting boundary recognition was built, and field tests were conducted. The test results show that the wheat harvest boundary line recognition model proposed in this paper achieves a segmentation accuracy of 98.04% for unharvested wheat regions in complex environments, with an IoU of 95.02%. When the combine harvester travels at 0~1.5 m/s, the normal speed for operation, the average processing time and pixel error for a single image are 0.15 s and 7.3 pixels, respectively. This method could achieve high recognition accuracy and fast recognition speed. This paper provides a practical reference for the autonomous harvesting operation of a combine harvester. Full article
(This article belongs to the Special Issue Agricultural Collaborative Robots for Smart Farming)
Figures:
Figure 1: MV3-DeepLabV3+ model structure.
Figure 2: Bneck structure.
Figure 3: Cubic B-spline sampling algorithm’s boundary-line-fitting results.
Figure 4: Combine harvester field collection data.
Figure 5: Cubic B-spline sampling algorithm’s boundary-line-fitting results. Keys: (a) image labeling information, (b) strong light, (c) backlight, (d) shadow occlusion, (e) weak light, (f) front lighting, (g) land edge.
Figure 6: Comparison of segmentation effects of different semantic segmentation models.
Figure 7: Fitting boundary lines using cubic B-spline sampling algorithm.
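
A sketch of the two stages the abstract describes, using the stock torchvision DeepLabV3/MobileNetV3-Large head (not the paper's modified MV3_DeepLabV3+) to predict a two-class mask and SciPy to fit a cubic B-spline through extracted boundary points; the class count, smoothing factor, and boundary-point ordering are assumptions.

```python
import numpy as np
import torch
from torchvision.models.segmentation import deeplabv3_mobilenet_v3_large
from scipy.interpolate import splprep, splev

# Stock torchvision DeepLabV3 with a MobileNetV3-Large backbone, re-headed for two
# classes (harvested vs. unharvested); not the paper's modified MV3_DeepLabV3+.
model = deeplabv3_mobilenet_v3_large(weights=None, weights_backbone=None,
                                     num_classes=2).eval()

@torch.no_grad()
def predict_mask(image_tensor):                    # (1, 3, H, W), normalized
    return model(image_tensor)["out"].argmax(dim=1)[0].numpy()

def fit_boundary(points, n_samples=200):
    """Fit a cubic B-spline through ordered (x, y) boundary points and resample it."""
    x, y = points[:, 0].astype(float), points[:, 1].astype(float)
    tck, _ = splprep([x, y], k=3, s=len(x))        # smoothing factor is illustrative
    u = np.linspace(0.0, 1.0, n_samples)
    bx, by = splev(u, tck)
    return np.stack([bx, by], axis=1)
```
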