Search Results (8,801)

Search Parameters:
Keywords = convolution neural networks (CNNs)

24 pages, 3755 KiB  
Article
Artificial Intelligence-Empowered Doppler Weather Profile for Low-Earth-Orbit Satellites
by Ekta Sharma, Ravinesh C. Deo, Christopher P. Davey and Brad D. Carter
Sensors 2024, 24(16), 5271; https://doi.org/10.3390/s24165271 (registering DOI) - 14 Aug 2024
Abstract
Low-Earth-orbit (LEO) satellites are widely acknowledged as a promising infrastructure solution for global Internet of Things (IoT) services. However, the Doppler effect presents a significant challenge in the context of long-range (LoRa) modulation uplink connectivity. This study comprehensively examines the operational efficiency of LEO satellites concerning the Doppler weather effect, with state-of-the-art artificial intelligence techniques. Two LEO satellite constellations—Globalstar and the International Space Station (ISS)—were detected and tracked using ground radars in Perth and Brisbane, Australia, for 24 h starting 1 January 2024. The study involves modelling the constellation, calculating latency, and frequency offset and designing a hybrid Iterative Input Selection–Long Short-Term Memory Network (IIS-LSTM) integrated model to predict the Doppler weather profile for LEO satellites. The IIS algorithm selects relevant input variables for the model, while the LSTM algorithm learns and predicts patterns. This model is compared with Convolutional Neural Network and Extreme Gradient Boosting (XGBoost) models. The results show that the packet delivery rate is above 91% for the sensitive spread factor 12 with a bandwidth of 11.5 MHz for Globalstar and 145.8 MHz for ISS NAUKA. The carrier frequency for ISS orbiting at 402.3 km is 631 MHz and 500 MHz for Globalstar at 1414 km altitude, aiding in combating packet losses. The ISS-LSTM model achieved an accuracy of 97.51% and a loss of 1.17% with signal-to-noise ratios (SNRs) ranging from 0–30 dB. The XGB model has the fastest testing time, attaining ≈0.0997 s for higher SNRs and an accuracy of 87%. However, in lower SNR, it proves to be computationally expensive. IIS-LSTM attains a better computation time for lower SNRs at ≈0.4651 s, followed by XGB at ≈0.5990 and CNN at ≈0.6120 s. The study calls for further research on LoRa Doppler analysis, considering atmospheric attenuation, and relevant space parameters for future work. Full article
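The frequency-offset modelling above rests on the classical Doppler relation f_d = (v_r / c) · f_c. As a rough illustration (not the authors' code), the sketch below estimates the uplink offset from the satellite's radial velocity; the carrier frequency and velocity values are assumptions chosen only for scale.

```python
# Hedged sketch: Doppler offset from radial velocity, f_d = (v_r / c) * f_c.
# Carrier and velocity values are illustrative assumptions, not the paper's settings.
C = 299_792_458.0  # speed of light, m/s

def doppler_shift(radial_velocity_mps: float, carrier_hz: float) -> float:
    """Instantaneous Doppler offset in Hz for a given radial velocity."""
    return (radial_velocity_mps / C) * carrier_hz

if __name__ == "__main__":
    carrier_hz = 631e6                      # assumed carrier frequency, Hz
    for v in (-7500.0, 0.0, 7500.0):        # typical LEO radial-velocity extremes, m/s
        print(f"v_r = {v:8.1f} m/s -> Doppler offset = {doppler_shift(v, carrier_hz)/1e3:8.2f} kHz")
```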
(This article belongs to the Section Remote Sensors)
25 pages, 8503 KiB  
Article
A Deep Learning Quantile Regression Photovoltaic Power-Forecasting Method under a Priori Knowledge Injection
by Xiaoying Ren, Yongqian Liu, Fei Zhang and Lingfeng Li
Energies 2024, 17(16), 4026; https://doi.org/10.3390/en17164026 - 14 Aug 2024
Abstract
Accurate and reliable PV power probabilistic-forecasting results can help grid operators and market participants better understand and cope with PV energy volatility and uncertainty and improve the efficiency of energy dispatch and operation, which plays an important role in application scenarios such as power market trading, risk management, and grid scheduling. In this paper, an innovative deep learning quantile regression ultra-short-term PV power-forecasting method is proposed. This method employs a two-branch deep learning architecture to forecast the conditional quantile of PV power; one branch is a QR-based stacked conventional convolutional neural network (QR_CNN), and the other is a QR-based temporal convolutional network (QR_TCN). The stacked CNN is used to focus on learning short-term local dependencies in PV power sequences, and the TCN is used to learn long-term temporal constraints between multi-feature data. These two branches extract different features from input data with different prior knowledge. By jointly training the two branches, the model is able to learn the probability distribution of PV power and obtain discrete conditional quantile forecasts of PV power in the ultra-short term. Then, based on these conditional quantile forecasts, a kernel density estimation method is used to estimate the PV power probability density function. The proposed method innovatively employs two ways of a priori knowledge injection: constructing a differential sequence of historical power as an input feature to provide more information about the ultrashort-term dynamics of the PV power and, at the same time, dividing it, together with all the other features, into two sets of inputs that contain different a priori features according to the demand of the forecasting task; and the dual-branching model architecture is designed to deeply match the data of the two sets of input features to the corresponding branching model computational mechanisms. The two a priori knowledge injection methods provide more effective features for the model and improve the forecasting performance and understandability of the model. The performance of the proposed model in point forecasting, interval forecasting, and probabilistic forecasting is comprehensively evaluated through the case of a real PV plant. The experimental results show that the proposed model performs well on the task of ultra-short-term PV power probabilistic forecasting and outperforms other state-of-the-art deep learning models in the field combined with QR. The proposed method in this paper can provide technical support for application scenarios such as energy scheduling, market trading, and risk management on the ultra-short-term time scale of the power system. Full article
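A minimal sketch of the pinball (quantile) loss that quantile-regression branches such as QR_CNN and QR_TCN are typically trained with; the quantile grid and tensor shapes below are assumptions, not the paper's configuration.

```python
# Hedged sketch of a pinball (quantile) loss for probabilistic PV power forecasting.
import torch

def pinball_loss(pred: torch.Tensor, target: torch.Tensor, quantiles: torch.Tensor) -> torch.Tensor:
    """pred: (batch, n_quantiles), target: (batch, 1), quantiles: (n_quantiles,)."""
    diff = target - pred                                    # broadcast over quantiles
    loss = torch.maximum(quantiles * diff, (quantiles - 1.0) * diff)
    return loss.mean()

quantiles = torch.linspace(0.05, 0.95, 19)                  # assumed quantile grid
pred = torch.randn(32, 19)                                  # dummy quantile forecasts
target = torch.randn(32, 1)                                 # dummy PV power targets
print(pinball_loss(pred, target, quantiles))
```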
(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)
Show Figures — Figures 1–14: overall research flowchart of the proposed method; structure of the TCN model; schematic of the proposed model; trends in the input feature variables over five consecutive days in dataset 1; comparison of the PV power series with its first-order difference series; parameter settings and data flow for each model; five-day point predictions of all models on datasets 1–3; five-day interval forecasts of the proposed model on datasets 1–3; comparison of the three forecasting results of each model on the three datasets; probability density curves for the proposed model on dataset 1 at nine sampling points.
13 pages, 1615 KiB  
Article
Semi-Supervised Left-Atrial Segmentation Based on Squeeze–Excitation and Triple Consistency Training
by Dongsheng Wang, Tiezhen Xv, Jianshen Li, Jiehui Liu, Jinxi Guo and Lijie Yang
Symmetry 2024, 16(8), 1041; https://doi.org/10.3390/sym16081041 - 14 Aug 2024
Abstract
Convolutional neural networks (CNNs) have achieved remarkable success in fully supervised medical image segmentation tasks. However, the acquisition of large quantities of homogeneous labeled data is challenging, making semi-supervised training methods that rely on a small amount of labeled data and pseudo-labels increasingly popular in recent years. Most existing semi-supervised learning methods, however, underestimate the importance of the unlabeled regions during training. This paper posits that these regions may contain crucial information for minimizing the model’s uncertainty prediction. To enhance the segmentation performance of the left-atrium database, this paper proposes a triple consistency segmentation network based on the squeeze-and-excitation mechanism (SETC-Net). Specifically, the paper constructs a symmetric architectural unit called SEConv, which adaptively recalibrates the feature responses in the channel direction by modeling the inter-channel correlations. This allows the network to adaptively weigh each channel according to the task’s needs, thereby emphasizing or suppressing different feature channels. Moreover, SETC-Net is composed of an encoder and three slightly different decoders, which convert the prediction discrepancies among the three decoders into unsupervised loss through a constructed iterative pseudo-labeling scheme, thus encouraging consistent and low-entropy predictions. This allows the model to gradually capture generalized features from these challenging unmarked regions. We evaluated the proposed SETC-Net on the public left-atrium (LA) database. The proposed method achieved an excellent Dice score of 91.14% using only 20% of the labeled data. The experiments demonstrate that the proposed SETC-Net outperforms seven current semi-supervised methods in left-atrium segmentation and is one of the best semi-supervised segmentation methods on the LA database. Full article
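A minimal squeeze-and-excitation block in the spirit of the SEConv unit described above, shown in 2D for brevity (the LA volumes are 3D); the channel count and reduction ratio are illustrative assumptions.

```python
# Hedged sketch of a squeeze-and-excitation block (channel recalibration).
import torch
from torch import nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                 # squeeze: global spatial average
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                                   # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                        # excitation: recalibrate channels

x = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(x).shape)                                 # torch.Size([2, 64, 32, 32])
```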
(This article belongs to the Section Computer)
Show Figures — Figures 1–4: squeeze-and-excitation block; SEConv block; SETC-Net network structure diagram; segmentation results on the LA database for UA-MT, SASSNet, DTC, and SETC-Net against the ground truth, with 10% and 20% labeled data.
20 pages, 18987 KiB  
Article
Convolutional Neural Network and Ensemble Learning-Based Unmanned Aerial Vehicles Radio Frequency Fingerprinting Identification
by Yunfei Zheng, Xuejun Zhang, Shenghan Wang and Weidong Zhang
Drones 2024, 8(8), 391; https://doi.org/10.3390/drones8080391 - 13 Aug 2024
Viewed by 334
Abstract
With the rapid development of the unmanned aerial vehicles (UAVs) industry, there is increasing demand for UAV surveillance technology. Automatic Dependent Surveillance-Broadcast (ADS-B) provides accurate monitoring of UAVs. However, the system cannot encrypt messages or verify identity. To address the issue of identity spoofing, radio frequency fingerprinting identification (RFFI) is applied for ADS-B transmitters to determine the true identities of UAVs through physical layer security technology. This paper develops an ensemble learning ADS-B radio signal recognition framework. Firstly, the research analyzes the data content characteristics of the ADS-B signal and conducts segment processing to eliminate the possible effects of the signal content. To extract features from different signal segments, a method merging end-to-end and non-end-to-end data processing is approached in a convolutional neural network. Subsequently, these features are fused through EL to enhance the robustness and generalizability of the identification system. Finally, the proposed framework’s effectiveness is evaluated using collected ADS-B data. The experimental results indicate that the recognition accuracy of the proposed ELWAM-CNN method can reach up to 97.43% and have better performance at different signal-to-noise ratios compared to existing methods using machine learning. Full article
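A hedged sketch of the per-segment CNN plus soft-voting fusion idea: a small 1D CNN scores each ADS-B signal segment and the segment probabilities are averaged. The segment lengths, channel counts, and fusion rule are assumptions, not the ELWAM-CNN settings.

```python
# Hedged sketch: one small 1D CNN per signal segment, fused by averaging softmax outputs.
import torch
from torch import nn

class SegmentCNN(nn.Module):
    def __init__(self, n_classes: int = 21):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(2, 32, kernel_size=7, padding=3), nn.ReLU(),   # I/Q -> 32 channels
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).squeeze(-1))

segments = [torch.randn(8, 2, 256) for _ in range(3)]       # 3 dummy segments, batch of 8
models = [SegmentCNN() for _ in segments]                    # one classifier per segment
probs = torch.stack([m(s).softmax(dim=-1) for m, s in zip(models, segments)])
print(probs.mean(dim=0).argmax(dim=-1))                      # fused (soft-voted) prediction
```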
(This article belongs to the Special Issue Physical-Layer Security in Drone Communications)
Show Figures — Figures 1–17: ADS-B signal format; content characteristics of ADS-B signals; ADS-B signal transmission block; flow chart of RFFI; preprocessing of ADS-B signals; VGG- and ResNet-based CNN framework models; ensemble learning flow; ADS-B signals from six different transmitters; comparison of the original and noise-added signals; ADS-B signal segmentation fragments SET 1–SET 3.2; parameter settings for the modified VGG and ResNet networks; accuracy of each primary classifier for different data segments; performance comparison of the ensemble and primary classifiers; performance of combined classifiers under different data preprocessing; ensemble learning versus other methods; recognition accuracy at 2–8 kHz frequency offsets; confusion matrices for RFF of 21 ADS-B transmitter classes at 30 dB and 0 dB SNR.
21 pages, 3057 KiB  
Article
Automated Multi-Class Facial Syndrome Classification Using Transfer Learning Techniques
by Fayroz F. Sherif, Nahed Tawfik, Doaa Mousa, Mohamed S. Abdallah and Young-Im Cho
Bioengineering 2024, 11(8), 827; https://doi.org/10.3390/bioengineering11080827 - 13 Aug 2024
Viewed by 197
Abstract
Genetic disorders affect over 6% of the global population and pose substantial obstacles to healthcare systems. Early identification of these rare facial genetic disorders is essential for managing related medical complexities and health issues. Many people consider the existing screening techniques inadequate, often leading to a diagnosis several years after birth. This study evaluated the efficacy of deep learning-based classifier models for accurately recognizing dysmorphic characteristics using facial photos. This study proposes a multi-class facial syndrome classification framework that encompasses a unique combination of diseases not previously examined together. The study focused on distinguishing between individuals with four specific genetic disorders (Down syndrome, Noonan syndrome, Turner syndrome, and Williams syndrome) and healthy controls. We investigated how well fine-tuning a few well-known convolutional neural network (CNN)-based pre-trained models—including VGG16, ResNet-50, ResNet152, and VGG-Face—worked for the multi-class facial syndrome classification task. We obtained the most encouraging results by adjusting the VGG-Face model. The proposed fine-tuned VGG-Face model not only demonstrated the best performance in this study, but it also performed better than other state-of-the-art pre-trained CNN models for the multi-class facial syndrome classification task. The fine-tuned model achieved both accuracy and an F1-Score of 90%, indicating significant progress in accurately detecting the specified genetic disorders. Full article
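A hedged sketch of the transfer-learning setup: load an ImageNet-pretrained backbone, freeze its convolutional features, and replace the classifier head for five classes (four syndromes plus healthy controls). VGG-Face is not bundled with torchvision, so plain VGG16 weights serve as a stand-in here.

```python
# Hedged sketch of CNN fine-tuning for 5-class facial syndrome classification.
import torch
from torch import nn
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for p in model.features.parameters():
    p.requires_grad = False                                 # freeze convolutional layers

model.classifier[6] = nn.Linear(model.classifier[6].in_features, 5)   # new 5-class head

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4       # assumed learning rate
)
print(model(torch.randn(1, 3, 224, 224)).shape)             # torch.Size([1, 5])
```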
Show Figures — Figures 1–12: distribution of the dataset by age, gender, and ethnicity; the proposed multi-class syndrome classification; training data distribution before and after augmentation; training-phase performance of the VGG16, ResNet-50, ResNet-152, and VGG-Face models; comparative ROC curves for Down, Turner, Williams, and Noonan syndromes and for healthy controls.
19 pages, 1604 KiB  
Article
An Efficient AdaBoost Algorithm for Enhancing Skin Cancer Detection and Classification
by Seham Gamil, Feng Zeng, Moath Alrifaey, Muhammad Asim and Naveed Ahmad
Algorithms 2024, 17(8), 353; https://doi.org/10.3390/a17080353 - 12 Aug 2024
Viewed by 404
Abstract
Skin cancer is a prevalent and perilous form of cancer and presents significant diagnostic challenges due to its high costs, dependence on medical experts, and time-consuming procedures. The existing diagnostic process is inefficient and expensive, requiring extensive medical expertise and time. To tackle these issues, researchers have explored the application of artificial intelligence (AI) tools, particularly machine learning techniques such as shallow and deep learning, to enhance the diagnostic process for skin cancer. These tools employ computer algorithms and deep neural networks to identify and categorize skin cancer. However, accurately distinguishing between skin cancer and benign tumors remains challenging, necessitating the extraction of pertinent features from image data for classification. This study addresses these challenges by employing Principal Component Analysis (PCA), a dimensionality-reduction approach, to extract relevant features from skin images. Additionally, accurately classifying skin images into malignant and benign categories presents another obstacle. To improve accuracy, the AdaBoost algorithm is utilized, which amalgamates weak classification models into a robust classifier with high accuracy. This research introduces a novel approach to skin cancer diagnosis by integrating Principal Component Analysis (PCA), AdaBoost, and EfficientNet B0, leveraging artificial intelligence (AI) tools. The novelty lies in the combination of these techniques to develop a robust and accurate system for skin cancer classification. The advantage of this approach is its ability to significantly reduce costs, minimize reliance on medical experts, and expedite the diagnostic process. The developed model achieved an accuracy of 93.00% using the DermIS dataset and demonstrated excellent precision, recall, and F1-score values, confirming its ability to correctly classify skin lesions as malignant or benign. Additionally, the model achieved an accuracy of 91.00% using the ISIC dataset, which is widely recognized for its comprehensive collection of annotated dermoscopic images, providing a robust foundation for training and validation. These advancements have the potential to significantly enhance the efficiency and accuracy of skin cancer diagnosis and classification. Ultimately, the integration of AI tools and techniques in skin cancer diagnosis can lead to cost reduction and improved patient outcomes, benefiting both patients and healthcare providers. Full article
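A minimal scikit-learn sketch of the PCA + AdaBoost stage; dummy feature vectors stand in for the EfficientNet-B0 features, and all hyperparameters are illustrative assumptions rather than the study's tuned values.

```python
# Hedged sketch: PCA for dimensionality reduction followed by an AdaBoost classifier.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4096))                 # dummy image feature vectors
y = rng.integers(0, 2, size=200)                 # 0 = benign, 1 = malignant (dummy labels)

clf = make_pipeline(
    StandardScaler(),
    PCA(n_components=50),                        # dimensionality reduction
    AdaBoostClassifier(n_estimators=200, learning_rate=0.5),
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```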
Show Figures — Figures 1–4: the proposed model; melanoma and benign skin-lesion images; visualizations of the performance metrics of the classifier algorithms.
25 pages, 13951 KiB  
Article
1D-CNN-Transformer for Radar Emitter Identification and Implemented on FPGA
by Xiangang Gao, Bin Wu, Peng Li and Zehuan Jing
Remote Sens. 2024, 16(16), 2962; https://doi.org/10.3390/rs16162962 - 12 Aug 2024
Viewed by 301
Abstract
Deep learning has brought great development to radar emitter identification technology. In addition, specific emitter identification (SEI), as a branch of radar emitter identification, has also benefited from it. However, the complexity of most deep learning algorithms makes it difficult to adapt to the requirements of the low power consumption and high-performance processing of SEI on embedded devices, so this article proposes solutions from the aspects of software and hardware. From the software side, we design a Transformer variant network, lightweight convolutional Transformer (LW-CT) that supports parameter sharing. Then, we cascade convolutional neural networks (CNNs) and the LW-CT to construct a one-dimensional-CNN-Transformer(1D-CNN-Transformer) lightweight neural network model that can capture the long-range dependencies of radar emitter signals and extract signal spatial domain features meanwhile. In terms of hardware, we design a low-power neural network accelerator based on an FPGA to complete the real-time recognition of radar emitter signals. The accelerator not only designs high-efficiency computing engines for the network, but also devises a reconfigurable buffer called “Ping-pong CBUF” and two-level pipeline architecture for the convolution layer for alleviating the bottleneck caused by the off-chip storage access bandwidth. Experimental results show that the algorithm can achieve a high recognition performance of SEI with a low calculation overhead. In addition, the hardware acceleration platform not only perfectly meets the requirements of the radar emitter recognition system for low power consumption and high-performance processing, but also outperforms the accelerators in other papers in terms of the energy efficiency ratio of Transformer layer processing. Full article
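A rough sketch of a 1D-CNN front end feeding a Transformer encoder, in the spirit of the cascaded model above; layer sizes, sequence length, and class count are assumptions, and the LW-CT parameter-sharing scheme is not reproduced.

```python
# Hedged sketch of a 1D-CNN + Transformer-encoder classifier for radar emitter signals.
import torch
from torch import nn

class CNNTransformer1D(nn.Module):
    def __init__(self, n_classes: int = 6, d_model: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(                     # local (spatial-domain) features
            nn.Conv1d(1, d_model, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=5, stride=2, padding=2), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)   # long-range dependencies
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.cnn(x).transpose(1, 2)               # (batch, seq, d_model)
        z = self.encoder(z).mean(dim=1)               # average over the sequence
        return self.head(z)

print(CNNTransformer1D()(torch.randn(4, 1, 1024)).shape)    # torch.Size([4, 6])
```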
Show Figures — Figures 1–19: overall architecture of the accelerator; waveform of the normalized LFM signal; whole neural network architecture and structure of the ResD1D block; structure of LW-CT; structure of the Central Logic; instruction encoding format; two-stage pipeline architecture for convolution; CONV1D calculation order; structure of the CONV1D module; structures of the PE cluster, PE, and MPM; PE-cluster convolution versus traditional convolution; structure of the MHSA module; structure of the Self-attention Processing Module; structure of the FC module; radar emitter signal waveforms of six radar individuals at −6 dB SNR; classification performance of different models under −10 dB to 4 dB with 48, 96, 192, and 384 convolutional channels; test accuracy, parameters, and operations for different channel numbers; recognition performance of different models; FPGA implementation breakdowns of DSP blocks and block RAMs.
26 pages, 9128 KiB  
Article
AI-Based Visual Early Warning System
by Zeena Al-Tekreeti, Jeronimo Moreno-Cuesta, Maria Isabel Madrigal Garcia and Marcos A. Rodrigues
Informatics 2024, 11(3), 59; https://doi.org/10.3390/informatics11030059 - 12 Aug 2024
Viewed by 347
Abstract
Facial expressions are a universally recognised means of conveying internal emotional states across diverse human cultural and ethnic groups. Recent advances in understanding people’s emotions expressed through verbal and non-verbal communication are particularly noteworthy in the clinical context for the assessment of patients’ health and well-being. Facial expression recognition (FER) plays an important and vital role in health care, providing communication with a patient’s feelings and allowing the assessment and monitoring of mental and physical health conditions. This paper shows that automatic machine learning methods can predict health deterioration accurately and robustly, independent of human subjective assessment. The prior work of this paper is to discover the early signs of deteriorating health that align with the principles of preventive reactions, improving health outcomes and human survival, and promoting overall health and well-being. Therefore, methods are developed to create a facial database mimicking the underlying muscular structure of the face, whose Action Unit motions can then be transferred to human face images, thus displaying animated expressions of interest. Then, building and developing an automatic system based on convolution neural networks (CNN) and long short-term memory (LSTM) to recognise patterns of facial expressions with a focus on patients at risk of deterioration in hospital wards. This research presents state-of-the-art results on generating and modelling synthetic database and automated deterioration prediction through FEs with 99.89% accuracy. The main contributions to knowledge from this paper can be summarized as (1) the generation of visual datasets mimicking real-life samples of facial expressions indicating health deterioration, (2) improvement of the understanding and communication with patients at risk of deterioration through facial expression analysis, and (3) development of a state-of-the-art model to recognize such facial expressions using a ConvLSTM model. Full article
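A minimal ConvLSTM cell sketch (gates computed with convolutions rather than matrix multiplies), illustrating the kind of spatio-temporal unit a ConvLSTM recognizer builds on; channel sizes, kernel size, and frame resolution are assumptions, not the paper's configuration.

```python
# Hedged sketch of a ConvLSTM cell processing a short sequence of video frames.
import torch
from torch import nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch: int, hidden_ch: int, kernel: int = 3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch, kernel, padding=kernel // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g                      # update cell state
        h = o * torch.tanh(c)                  # new hidden state
        return h, c

cell = ConvLSTMCell(in_ch=3, hidden_ch=16)
h = torch.zeros(1, 16, 64, 64)
c = torch.zeros(1, 16, 64, 64)
for frame in torch.randn(10, 1, 3, 64, 64):    # 10 dummy video frames
    h, c = cell(frame, (h, c))
print(h.shape)                                  # torch.Size([1, 16, 64, 64])
```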
Show Figures — Graphical abstract and Figures 1–17: facial-expression areas that reveal whether a patient is deteriorating (neutral versus final-stage avatars); the five classes with their Action Unit combinations; video frames after FOMM transfer of expressions from avatars to real facial images; samples of the five classes of facial frames; facial-frame samples per class after face-mesh pre-processing; number and ratio of samples per class; training and test dataset sizes; training samples before and after oversampling; structure of ConvLSTM; the proposed model architecture; accuracy, precision, and recall of the model; loss, mean square error, and mean absolute error; confusion matrix; evaluation of the model; classification report; ROC and precision–recall curves; prediction accuracy on unseen data for classes FD1, FD2-L, FD2-R, FD3-L, and FD3-R.
22 pages, 18817 KiB  
Article
Innovative Noise Extraction and Denoising in Low-Dose CT Using a Supervised Deep Learning Framework
by Wei Zhang, Abderrahmane Salmi, Chifu Yang and Feng Jiang
Electronics 2024, 13(16), 3184; https://doi.org/10.3390/electronics13163184 - 12 Aug 2024
Viewed by 283
Abstract
Low-dose computed tomography (LDCT) imaging is a critical tool in medical diagnostics due to its reduced radiation exposure. However, this reduction often results in increased noise levels, compromising image quality and diagnostic accuracy. Despite advancements in denoising techniques, a robust method that effectively balances noise reduction and detail preservation remains a significant need. Current denoising algorithms frequently fail to maintain the necessary balance between suppressing noise and preserving crucial diagnostic details. Addressing this gap, our study focuses on developing a deep learning-based denoising algorithm that enhances LDCT image quality without losing essential diagnostic information. Here we present a novel supervised learning-based LDCT denoising algorithm that employs innovative noise extraction and denoising techniques. Our method significantly enhances LDCT image quality by incorporating multiple attention mechanisms within a U-Net-like architecture. Our approach includes a noise extraction network designed to capture diverse noise patterns precisely. This network is integrated into a comprehensive denoising system consisting of a generator network, a discriminator network, and a feature extraction AutoEncoder network. The generator network removes noise and produces high-quality CT images, while the discriminator network differentiates real images from denoised ones, improving the realism of the outputs. The AutoEncoder network ensures the preservation of image details and diagnostic integrity. Our method improves the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) by 7.777 and 0.128 compared to LDCT, by 0.483 and 0.064 compared to residual encoder–decoder convolutional neural network (RED-CNN), by 4.101 and 0.017 compared to Wasserstein generative adversarial network–visual geometry group (WGAN-VGG), and by 3.895 and 0.011 compared to Wasserstein generative adversarial network–autoencoder (WGAN-AE). This demonstrates that our method has a significant advantage in enhancing the signal-to-noise ratio of images. Extensive experiments on multiple standard datasets demonstrate our method’s superior performance in noise suppression and image quality enhancement compared to existing techniques. Our findings significantly impact medical imaging, particularly improving LDCT scan diagnostic accuracy. The enhanced image clarity and detail preservation offered by our method open new avenues for clinical applications and research. This improvement in LDCT image quality promises substantial contributions to clinical diagnostics, disease detection, and treatment planning, ensuring high-quality diagnostic outcomes while minimizing patient radiation exposure. Full article
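Because the comparison above is reported in PSNR and SSIM, here is a short numpy sketch of the PSNR computation between a denoised slice and its normal-dose reference; the dynamic range and image shapes are assumptions for illustration.

```python
# Hedged sketch: peak signal-to-noise ratio between a reference and a test image.
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(0)
ndct = rng.random((512, 512))                                  # stand-in normal-dose image in [0, 1]
ldct = np.clip(ndct + rng.normal(0, 0.05, ndct.shape), 0, 1)   # noisy low-dose stand-in
print(f"PSNR(LDCT vs NDCT) = {psnr(ndct, ldct):.2f} dB")
```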
(This article belongs to the Special Issue Advanced Internet of Things Solutions and Technologies)
Show Figures — Figures 1–11: overall network structure of noise extraction; autoencoder network structure; overall structure of the proposed denoising network; generator diagram; discriminator diagram; LDCT and NDCT abdominal samples from the AAPM-Mayo dataset; input and label images for the noise extraction network; training loss curve; predictions of networks with various attention-mechanism combinations on test images; comparison of LDCT denoising results using different methods; local magnification of the denoising comparisons.
20 pages, 16267 KiB  
Article
Multi-Scale Detail–Noise Complementary Learning for Image Denoising
by Yan Cui, Mingyue Shi and Jielin Jiang
Appl. Sci. 2024, 14(16), 7044; https://doi.org/10.3390/app14167044 - 11 Aug 2024
Viewed by 460
Abstract
Deep convolutional neural networks (CNNs) have demonstrated significant potential in enhancing image denoising performance. However, most denoising methods fuse different levels of features through long and short skip connections, easily generating a lot of redundant information, thereby weakening the complementarity of different levels of features, resulting in the loss of image details. In this paper, we propose a multi-scale detail–noise complementary learning (MDNCL) network for additive white Gaussian noise removal and real-world noise removal. The MDNCL network comprises two branches, namely the Detail Feature Learning Branch (DLB) and the Noise Learning Branch (NLB). Specifically, a loss function is applied to guide the complementary learning of image detail features and noisy mappings in these two branches. This learning approach effectively balances noise reduction and detail restoration, especially when dealing with high ratios of noise. To enhance the complementarity of features between different network layers and avoid redundant information, we designed a Feature Subtraction Unit (FSU) to capture the differences in features across the DLB network layers. Our extensive experimental evaluations demonstrate that the MDNCL approach achieves impressive denoising performance and outperforms other popular denoising methods. Full article
(This article belongs to the Special Issue Advances in Neural Networks and Deep Learning)
Show Figures — Figures 1–11: overall framework of the proposed method (MDNCL architecture and dense block); schematic of the DLB; schematic of the FSU; denoising comparisons with methods including BM3D, DnCNN, FFDNet, ADNet, BRDNet, and MWDCNN on Set12 (noise level 15), BSD68 and CBSD68 (25), and Kodak24 and McMaster (50); denoising results on SIDD; ablation results for the NLB, DLB, and FSU components on Set12; denoising results on Set12 (noise level 25) for different values of μ.
18 pages, 12276 KiB  
Article
Early Poplar (Populus) Leaf-Based Disease Detection through Computer Vision, YOLOv8, and Contrast Stretching Technique
by Furkat Bolikulov, Akmalbek Abdusalomov, Rashid Nasimov, Farkhod Akhmedov and Young-Im Cho
Sensors 2024, 24(16), 5200; https://doi.org/10.3390/s24165200 - 11 Aug 2024
Viewed by 320
Abstract
Poplar (Populus) trees play a vital role in various industries and in environmental sustainability. They are widely used for paper production, timber, and as windbreaks, in addition to their significant contributions to carbon sequestration. Given their economic and ecological importance, effective disease management is essential. Convolutional Neural Networks (CNNs), particularly adept at processing visual information, are crucial for the accurate detection and classification of plant diseases. This study introduces a novel dataset of manually collected images of diseased poplar leaves from Uzbekistan and South Korea, enhancing the geographic diversity and application of the dataset. The disease classes consist of “Parsha (Scab)”, “Brown-spotting”, “White-Gray spotting”, and “Rust”, reflecting common afflictions in these regions. This dataset will be made publicly available to support ongoing research efforts. Employing the advanced YOLOv8 model, a state-of-the-art CNN architecture, we applied a Contrast Stretching technique prior to model training in order to enhance disease detection accuracy. This approach not only improves the model’s diagnostic capabilities but also offers a scalable tool for monitoring and treating poplar diseases, thereby supporting the health and sustainability of these critical resources. This dataset, to our knowledge, will be the first of its kind to be publicly available, offering a valuable resource for researchers and practitioners worldwide. Full article
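A small sketch of the Contrast Stretching pre-processing step: a linear rescale of pixel intensities between low and high percentiles. The percentile cut-offs are assumed values, since the study's exact stretching parameters are not given here.

```python
# Hedged sketch of percentile-based contrast stretching for leaf images.
import numpy as np

def contrast_stretch(img: np.ndarray, low_pct: float = 2.0, high_pct: float = 98.0) -> np.ndarray:
    """Linearly map the [p_low, p_high] intensity range to the full 0-255 range."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    stretched = (img.astype(np.float32) - lo) / max(hi - lo, 1e-6)
    return (np.clip(stretched, 0.0, 1.0) * 255).astype(np.uint8)

rng = np.random.default_rng(0)
leaf = rng.integers(60, 180, size=(256, 256, 3), dtype=np.uint8)   # dummy low-contrast image
out = contrast_stretch(leaf)
print(leaf.min(), leaf.max(), "->", out.min(), out.max())
```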
(This article belongs to the Section Smart Agriculture)
Show Figures — Figures 1–12: description of the proposed method; classes of diseases on the leaves; labelled poplar leaves before augmentation; training dataset after augmentation; architecture of YOLOv8; Contrast Stretching image technique; precision–confidence, precision–recall, recall–confidence, and F1–confidence curves; correlogram; accuracy of the proposed method.
40 pages, 27981 KiB  
Article
Pyramid Cascaded Convolutional Neural Network with Graph Convolution for Hyperspectral Image Classification
by Haizhu Pan, Hui Yan, Haimiao Ge, Liguo Wang and Cuiping Shi
Remote Sens. 2024, 16(16), 2942; https://doi.org/10.3390/rs16162942 - 11 Aug 2024
Viewed by 269
Abstract
Convolutional neural networks (CNNs) and graph convolutional networks (GCNs) have made considerable advances in hyperspectral image (HSI) classification. However, most CNN-based methods learn features at a single-scale in HSI data, which may be insufficient for multi-scale feature extraction in complex data scenes. To learn the relations among samples in non-grid data, GCNs are employed and combined with CNNs to process HSIs. Nevertheless, most methods based on CNN-GCN may overlook the integration of pixel-wise spectral signatures. In this paper, we propose a pyramid cascaded convolutional neural network with graph convolution (PCCGC) for hyperspectral image classification. It mainly comprises CNN-based and GCN-based subnetworks. Specifically, in the CNN-based subnetwork, a pyramid residual cascaded module and a pyramid convolution cascaded module are employed to extract multiscale spectral and spatial features separately, which can enhance the robustness of the proposed model. Furthermore, an adaptive feature-weighted fusion strategy is utilized to adaptively fuse multiscale spectral and spatial features. In the GCN-based subnetwork, a band selection network (BSNet) is used to learn the spectral signatures in the HSI using nonlinear inter-band dependencies. Then, the spectral-enhanced GCN module is utilized to extract and enhance the important features in the spectral matrix. Subsequently, a mutual-cooperative attention mechanism is constructed to align the spectral signatures between BSNet-based matrix with the spectral-enhanced GCN-based matrix for spectral signature integration. Abundant experiments performed on four widely used real HSI datasets show that our model achieves higher classification accuracy than the fourteen other comparative methods, which shows the superior classification performance of PCCGC over the state-of-the-art methods. Full article
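A minimal graph-convolution step of the form H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), the basic operation behind GCN-based subnetworks such as the one above; the toy graph and feature sizes are illustrative assumptions, not the PCCGC configuration.

```python
# Hedged sketch of one graph-convolution (GCN) propagation step with symmetric normalization.
import numpy as np

def gcn_layer(adj: np.ndarray, feats: np.ndarray, weight: np.ndarray) -> np.ndarray:
    a_hat = adj + np.eye(adj.shape[0])                    # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt              # symmetric normalization
    return np.maximum(a_norm @ feats @ weight, 0.0)       # propagate + ReLU

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)               # 4-node toy graph
feats = rng.normal(size=(4, 8))                           # 8 spectral features per node
weight = rng.normal(size=(8, 4))                          # learnable projection (random here)
print(gcn_layer(adj, feats, weight).shape)                # (4, 4)
```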
Show Figures — Figures 1–17: overall structure of the PCCGC; detailed structures of the SpePHC and SpaPHC blocks; the spectral-enhanced GCN module; the mutual-cooperative attention mechanism; full-pixel classification maps for the PU, Houston, Honghu, and IP data scenes comparing fourteen methods, the proposed model, and the ground truth; training and validation accuracy under weights α_1 and α_2; OA under different numbers of GCN layers; OA of the learning rate under different epochs; OA of the compared methods under different numbers of training samples; feature visualization maps on the IP data scene; ablation experiments (model_0 through model_8) on the four data scenes; features before and after the spectral-enhanced GCN module.
35 pages, 3152 KiB  
Review
Deep Learning Models for PV Power Forecasting: Review
by Junfeng Yu, Xiaodong Li, Lei Yang, Linze Li, Zhichao Huang, Keyan Shen, Xu Yang, Xu Yang, Zhikang Xu, Dongying Zhang and Shuai Du
Energies 2024, 17(16), 3973; https://doi.org/10.3390/en17163973 - 10 Aug 2024
Viewed by 524
Abstract
Accurate forecasting of photovoltaic (PV) power is essential for grid scheduling and energy management. In recent years, deep learning technology has made significant progress in time-series forecasting, offering new solutions for PV power forecasting. This study provides a systematic review of deep learning models for PV power forecasting, concentrating on comparisons of the features, advantages, and limitations of different model architectures. First, we analyze the commonly used datasets for PV power forecasting. Additionally, we provide an overview of mainstream deep learning model architectures, including multilayer perceptron (MLP), recurrent neural networks (RNN), convolutional neural networks (CNN), and graph neural networks (GNN), and explain their fundamental principles and technical features. Moreover, we systematically organize the research progress of deep learning models based on different architectures for PV power forecasting. This study indicates that different deep learning model architectures have their own advantages in PV power forecasting. MLP models have strong nonlinear fitting capabilities, RNN models can capture long-term dependencies, CNN models can automatically extract local features, and GNN models have unique advantages for modeling spatiotemporal characteristics. This manuscript provides a comprehensive research survey for PV power forecasting using deep learning models, helping researchers and practitioners to gain a deeper understanding of the current applications, challenges, and opportunities of deep learning technology in this area. Full article
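As an illustration of the TCN-style convolutions the review covers, here is a causal, dilated 1-D convolution block that only attends to past time steps; the channel counts, dilation, and sequence length are assumed values for the sketch.

```python
# Hedged sketch of a causal, dilated Conv1d of the kind used in TCN forecasters.
import torch
from torch import nn

class CausalConv1d(nn.Module):
    """Conv1d that only looks at past time steps (left padding, then standard convolution)."""
    def __init__(self, in_ch: int, out_ch: int, kernel: int = 3, dilation: int = 2):
        super().__init__()
        self.pad = (kernel - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel, dilation=dilation)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = nn.functional.pad(x, (self.pad, 0))            # pad only on the left (the past)
        return self.conv(x)

x = torch.randn(8, 4, 96)                                   # 8 series, 4 features, 96 time steps
print(CausalConv1d(4, 16)(x).shape)                          # torch.Size([8, 16, 96])
```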
(This article belongs to the Topic Clean and Low Carbon Energy, 2nd Volume)
Show Figures
Figure 1: Multilayer perceptron with a single hidden layer containing five hidden units.
Figure 2: A network module for RNNs.
Figure 3: The architecture of an LSTM model.
Figure 4: Unit structure of the GRU network.
Figure 5: The basic structure of a convolutional neural network.
Figure 6: The method of network information propagation in the TCN model.
Figure 7: The approach for calculating an element of the output tensor.
Figure 8: Diagram showing the relationship between two consecutive output elements and their respective input subsequences.
Figure 9: The case of multiple input channels.
Figure 10: A classic graph neural network.
Figure 11: The modeling process of the spatial-temporal graph neural network (ST-GNN).
Figure 12: Using RNNs as components of an ST-GNN.
Figure 13: All references and PV power forecasting references.
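As a purely illustrative companion to the RNN family surveyed in this review, the following minimal PyTorch sketch shows how an LSTM-based forecaster can map a window of past measurements to a PV power forecast. The window length, feature count, and layer sizes are assumptions chosen for the example and are not taken from the reviewed papers.

# Minimal sketch of an LSTM forecaster for PV power (illustrative only).
# Window length, feature set, and layer sizes are assumed, not from the review.
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, n_features: int = 4, hidden_size: int = 64, horizon: int = 1):
        super().__init__()
        # LSTM encodes a window of past measurements (e.g., irradiance, temperature)
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size,
                            num_layers=2, batch_first=True, dropout=0.1)
        # Linear head maps the last hidden state to the forecast horizon
        self.head = nn.Linear(hidden_size, horizon)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])  # forecast from the final time step

if __name__ == "__main__":
    model = LSTMForecaster()
    window = torch.randn(8, 24, 4)       # 8 samples, 24 past steps, 4 features
    print(model(window).shape)           # torch.Size([8, 1])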
31 pages, 15968 KiB  
Article
Advanced Forecasting of Drought Zones in Canada Using Deep Learning and CMIP6 Projections
by Keyvan Soltani, Afshin Amiri, Isa Ebtehaj, Hanieh Cheshmehghasabani, Sina Fazeli, Silvio José Gumiere and Hossein Bonakdari
Climate 2024, 12(8), 119; https://doi.org/10.3390/cli12080119 - 10 Aug 2024
Viewed by 403
Abstract
This study addresses the critical issue of drought zoning in Canada using advanced deep learning techniques. Drought, exacerbated by climate change, significantly affects ecosystems, agriculture, and water resources. Canadian Drought Monitor (CDM) data provided by the Canadian government and ERA5-Land daily data were [...] Read more.
This study addresses the critical issue of drought zoning in Canada using advanced deep learning techniques. Drought, exacerbated by climate change, significantly affects ecosystems, agriculture, and water resources. Canadian Drought Monitor (CDM) data provided by the Canadian government and ERA5-Land daily data were utilized to generate a comprehensive time series of mean monthly precipitation and air temperature for 199 sample locations in Canada from 1979 to 2023. These data were processed in the Google Earth Engine (GEE) environment and used to develop a Convolutional Neural Network (CNN) model to estimate CDM values, thereby filling gaps in historical drought data. The CanESM5 climate model, as assessed in the IPCC Sixth Assessment Report, was employed under four climate change scenarios to predict future drought conditions. Our CNN model forecasts CDM values up to 2100, enabling accurate drought zoning. The results reveal significant trends in temperature changes, indicating areas most vulnerable to future droughts, while precipitation shows a slow increasing trend. Our analysis indicates that under extreme climate scenarios, certain regions may experience a significant increase in the frequency and severity of droughts, necessitating proactive planning and mitigation strategies. These findings are critical for policymakers and stakeholders in designing effective drought management and adaptation programs. Full article
Show Figures
Figure 1: The relief map of the study area. The black dots show the distribution of sample points across Canada.
Figure 2: Schematic of the CNN's structure.
Figure 3: Research flowchart.
Figure 4: (a) Classification accuracy for 1022 ELM models; (b) area under the curve (AUC) for 1022 ELM models.
Figure 5: Zoning of projected average annual precipitation anomalies in Canada (2024–2100) compared to the observed period (1983–2023) using CanESM5 from CMIP6.
Figure 6: Zoning of projected average annual temperature anomalies in Canada (2024–2100) compared to the observed period (1983–2023) using CanESM5 from CMIP6.
Figure 7: Canada's percentage of drought-affected area, Pr, and T under the SSP126 scenario: a comparative analysis of 2003–2023 versus 2024–2100, examining historical trends and future projections using CanESM5 within the CMIP6 framework.
Figure 8: Canada's percentage of drought-affected area, Pr, and T under the SSP245 scenario: a comparative analysis of 2003–2023 versus 2024–2100, examining historical trends and future projections using CanESM5 within the CMIP6 framework.
Figure 9: Canada's percentage of drought-affected area, Pr, and T under the SSP370 scenario: a comparative analysis of 2003–2023 versus 2024–2100, examining historical trends and future projections using CanESM5 within the CMIP6 framework.
Figure 10: Canada's percentage of drought-affected area, Pr, and T under the SSP585 scenario: a comparative analysis of 2003–2023 versus 2024–2100, examining historical trends and future projections using CanESM5 within the CMIP6 framework.
Figure 11: Policy implications for Canada in addressing climate change and drought conditions.
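As a rough illustration of the kind of model the abstract describes, the sketch below maps a window of monthly precipitation and temperature values to a single drought-index estimate with a small 1-D CNN. The window length, channel sizes, and output form are illustrative assumptions, not the architecture reported by the authors.

# Illustrative 1-D CNN that regresses a CDM-like drought index from a window of
# monthly precipitation and temperature values. Sizes are assumptions.
import torch
import torch.nn as nn

class DroughtCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels=2, out_channels=16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),      # pool over the time dimension
        )
        self.regressor = nn.Linear(32, 1)  # single drought-index output

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 2, window); channels are precipitation and temperature
        z = self.features(x).squeeze(-1)
        return self.regressor(z)

if __name__ == "__main__":
    model = DroughtCNN()
    batch = torch.randn(4, 2, 12)          # 4 locations, 12 monthly steps each
    print(model(batch).shape)               # torch.Size([4, 1])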
10 pages, 1304 KiB  
Article
Age and Sex Estimation in Children and Young Adults Using Panoramic Radiographs with Convolutional Neural Networks
by Tuğçe Nur Şahin and Türkay Kölüş
Appl. Sci. 2024, 14(16), 7014; https://doi.org/10.3390/app14167014 - 9 Aug 2024
Viewed by 389
Abstract
Image processing with artificial intelligence has shown significant promise in various medical imaging applications. The present study aims to evaluate the performance of 16 different convolutional neural networks (CNNs) in predicting age and gender from panoramic radiographs in children and young adults. The [...] Read more.
Image processing with artificial intelligence has shown significant promise in various medical imaging applications. The present study aims to evaluate the performance of 16 different convolutional neural networks (CNNs) in predicting age and gender from panoramic radiographs in children and young adults. The networks tested included DarkNet-19, DarkNet-53, Inception-ResNet-v2, VGG-19, DenseNet-201, ResNet-50, GoogLeNet, VGG-16, SqueezeNet, ResNet-101, ResNet-18, ShuffleNet, MobileNet-v2, NasNet-Mobile, AlexNet, and Xception. These networks were trained on a dataset of 7336 radiographs from individuals aged between 5 and 21. Age and gender estimation accuracy and mean absolute age prediction errors were evaluated on 340 radiographs. Statistical analyses were conducted using Shapiro–Wilk, one-way ANOVA, and Tukey tests (p < 0.05). The gender prediction accuracy and the mean absolute age prediction error were, respectively, 87.94% and 0.582 for DarkNet-53, 86.18% and 0.427 for DarkNet-19, 84.71% and 0.703 for GoogLeNet, 81.76% and 0.756 for DenseNet-201, 81.76% and 1.115 for ResNet-18, 80.88% and 0.650 for VGG-19, 79.41% and 0.988 for SqueezeNet, 79.12% and 0.682 for Inception-Resnet-v2, 78.24% and 0.747 for ResNet-50, 77.35% and 1.047 for VGG-16, 76.47% and 1.109 for Xception, 75.88% and 0.977 for ResNet-101, 73.24% and 0.894 for ShuffleNet, 72.35% and 1.206 for AlexNet, 71.18% and 1.094 for NasNet-Mobile, and 62.94% and 1.327 for MobileNet-v2. No statistical difference in age prediction performance was found between DarkNet-19 and DarkNet-53, which demonstrated the most successful age estimation results. Despite these promising results, all tested CNNs performed below 90% accuracy and were not deemed suitable for clinical use. Future studies should continue with more-advanced networks and larger datasets. Full article
(This article belongs to the Special Issue Oral Diseases: Diagnosis and Therapy)
Show Figures
Figure 1: Detailed prediction distribution of test radiographs by age and sex groups for the DarkNet-53 network. Correct predictions are shown in shades of blue, while incorrect predictions are displayed in shades of red. For example, all 10 radiographs of 5-year-old girls (05F) were correctly predicted by DarkNet-53. However, among the ten radiographs of 6-year-old females (06F), only one was correctly predicted, while one was predicted as a 5-year-old male (05M), five as 6-year-old males (06M), one as a 7-year-old female (07F), and two as 8-year-old females (08F).
Figure 2: Distribution of predictions for the GoogLeNet network. Correct predictions are shown in shades of blue, while incorrect predictions are displayed in shades of red.
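As a hedged illustration of the transfer-learning setup the study describes, the sketch below replaces the classification head of a pretrained CNN so that it predicts joint age-and-sex classes from a radiograph. ResNet-18 from torchvision is used here only as a convenient stand-in for the 16 backbones the study tested (DarkNet-53, GoogLeNet, etc.), and the 34-class layout (17 ages times 2 sexes over the 5-21 range) is an assumption inferred from the abstract and Figure 1, not the authors' exact configuration.

# Transfer-learning sketch: pretrained backbone with a new head for age/sex
# classes. ResNet-18 is a stand-in; the class count is an illustrative assumption.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 17 * 2   # one class per (age, sex) combination, ages 5..21 (assumed)

def build_model() -> nn.Module:
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    # Replace the ImageNet head with a head sized for the age/sex classes
    backbone.fc = nn.Linear(backbone.fc.in_features, NUM_CLASSES)
    return backbone

if __name__ == "__main__":
    model = build_model()
    radiograph = torch.randn(1, 3, 224, 224)   # radiograph resized to 224x224
    logits = model(radiograph)
    print(logits.shape)                         # torch.Size([1, 34])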