
Advancements in Deep Learning and Its Applications

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 May 2025 | Viewed by 4897

Special Issue Editors


Dr. Patrícia Ramos
Guest Editor
1. CEOS.PP, ISCAP, Polytechnic of Porto, 4465-004 Porto, Portugal
2. INESC TEC, 4200-465 Porto, Portugal
Interests: statistical modelling; forecasting; optimization; machine learning

Dr. Jose Manuel Oliveira
Guest Editor
1. Faculty of Economics, University of Porto, 4200-464 Porto, Portugal
2. INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal
Interests: time series forecasting; machine learning; deep learning; data science; big data

Special Issue Information

Dear Colleagues,

Deep Learning is a subfield of Machine Learning that has seen significant advancements over the past few years, thanks to the availability of large amounts of data, faster computing hardware, and improved algorithms. The advancements in Deep Learning have revolutionized several fields, including image recognition, speech recognition, natural language processing, robotics, and healthcare. The development of Convolutional Neural Networks, Recurrent Neural Networks, and Deep Reinforcement Learning has significantly improved the performance of Deep Learning models in these areas. As Deep Learning continues to grow, we can expect to see even more breakthroughs in various applications, which will have a profound impact on our lives.

Given this context, this Special Issue calls for a more critical discussion of, and perspective on, the practical implementation of Artificial Intelligence and Deep Learning in real-world scenarios and on recent advances in leveraging these technologies, and aims to disseminate the knowledge acquired. We encourage authors to submit original research articles that tackle crucial issues and contribute innovative concepts, methodologies, applications, trends, and knowledge to the field. Review articles presenting the current state of the art are also warmly welcomed.

You may choose our Joint Special Issue in Applied System Innovation.

Dr. Patrícia Ramos
Dr. Jose Manuel Oliveira
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning applications
  • artificial intelligence
  • neural network architectures
  • transformers
  • generative models
  • real-world AI implementation

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)


Research

24 pages, 5323 KiB  
Article
AI- and Deep Learning-Powered Driver Drowsiness Detection Method Using Facial Analysis
by Tahesin Samira Delwar, Mangal Singh, Sayak Mukhopadhyay, Akshay Kumar, Deepak Parashar, Yangwon Lee, Md Habibur Rahman, Mohammad Abrar Shakil Sejan and Jee Youl Ryu
Appl. Sci. 2025, 15(3), 1102; https://doi.org/10.3390/app15031102 - 22 Jan 2025
Viewed by 1192
Abstract
The significant number of road traffic accidents caused by fatigued drivers presents substantial risks to the public’s overall safety. In recent years, there has been a notable convergence of intelligent cameras and artificial intelligence (AI), leading to significant advancements in identifying driver drowsiness. Advances in computer vision technology allow for the identification of driver drowsiness by monitoring facial expressions such as yawning, eye movements, and head movements. These physical indications, together with assessments of the driver’s physiological condition and behavior, aid in assessing fatigue and lowering the likelihood of drowsy driving-related incidents. This study presents an extensive variety of meticulously designed algorithms that were thoroughly analyzed to assess their effectiveness in detecting drowsiness. At the core of this attempt lay the essential concept of feature extraction, an efficient technique for isolating facial and ocular regions from a particular set of input images. Following this, various deep learning models, such as a traditional CNN, VGG16, and MobileNet, facilitated detecting drowsiness. Among these approaches, the MobileNet model was a valuable choice for drowsiness detection in drivers due to its real-time processing capability and suitability for deployment in resource-constrained environments, with the highest achieved accuracy of 92.75%. Full article
(This article belongs to the Special Issue Advancements in Deep Learning and Its Applications)
Figures

Figure 1: Working framework for driver drowsiness detection.
Figure 2: Different states of drowsiness in drivers.
Figure 3: Graphical abstract of proposed work.
Figure 4: Sample images from Kaggle dataset (0-500x0-400 pixels).
Figure 5: Sample images from Health Informatics dataset (0-60x0-50 pixels).
Figure 6: (A) Facial feature detection and (B) localization for feature extraction.
Figure 7: The layered architecture of the CNN model.
Figure 8: Accuracy vs. epochs of CNN model.
Figure 9: Loss vs. epochs of CNN model.
Figure 10: Layer architecture of VGG16 model.
Figure 11: Accuracy vs. epochs of VGG16 architecture.
Figure 12: Loss vs. epochs of VGG16 architecture.
Figure 13: Layer architecture of MobileNet model.
Figure 14: Accuracy vs. epochs of MobileNet model.
Figure 15: Loss vs. epochs of MobileNet model.
Figure 16: Challenges in detecting drowsiness.
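The abstract attributes MobileNet's suitability for real-time, resource-constrained deployment to its lightweight design. The sketch below illustrates the general idea behind that efficiency (not the authors' exact model): it counts the weights of a standard convolution against the depthwise-separable factorization MobileNet is built on. The 256-channel, 3x3 layer is an assumed example.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a MobileNet-style factorization: a depthwise k x k
    convolution (one filter per input channel) followed by a 1 x 1
    pointwise convolution that mixes channels."""
    return c_in * k * k + c_in * c_out

# Assumed example layer: 256 -> 256 channels with 3 x 3 kernels.
standard = conv_params(256, 256, 3)                  # 589,824 weights
separable = depthwise_separable_params(256, 256, 3)  # 67,840 weights
saving = standard / separable                        # ~8.7x fewer weights
```

An order-of-magnitude reduction per layer is what makes in-vehicle, camera-based inference feasible on embedded hardware.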
21 pages, 5845 KiB  
Article
FPGA-QNN: Quantized Neural Network Hardware Acceleration on FPGAs
by Mustafa Tasci, Ayhan Istanbullu, Vedat Tumen and Selahattin Kosunalp
Appl. Sci. 2025, 15(2), 688; https://doi.org/10.3390/app15020688 - 12 Jan 2025
Viewed by 1077
Abstract
Recently, convolutional neural networks (CNNs) have received a massive amount of interest due to their ability to achieve high accuracy in various artificial intelligence tasks. With the development of complex CNN models, a significant drawback is their high computational burden and memory requirements. The performance of a typical CNN model can be enhanced by the improvement of hardware accelerators. Practical implementations on field-programmable gate arrays (FPGAs) have the potential to reduce resource utilization while maintaining low power consumption. Nevertheless, when implementing complex CNN models on FPGAs, these may require further computational and memory capacities, exceeding the capacity available on many current FPGAs. An effective solution to this issue is to use quantized neural network (QNN) models to remove the burden of full-precision weights and activations. This article proposes an accelerator design framework for FPGAs, called FPGA-QNN, with particular value in reducing the high computational burden and memory requirements of implementing CNNs. To approach this goal, FPGA-QNN exploits the basics of QNN models by converting the high burden of full-precision weights and activations into integer operations. The FPGA-QNN framework comes with 12 accelerators based on multi-layer perceptron (MLP) and LeNet CNN models, each of which is associated with a specific combination of quantization and folding. The outputs from the performance evaluations on the Xilinx PYNQ Z1 development board proved the superiority of FPGA-QNN in terms of resource utilization and energy efficiency in comparison to several recent approaches. The proposed MLP model classified the FashionMNIST dataset at a speed of 953 kFPS with 1019 GOPs while consuming 2.05 W. Full article
(This article belongs to the Special Issue Advancements in Deep Learning and Its Applications)
Figures

Figure 1: A view of the entire system.
Figure 2: (a) Single-layer perceptron vs. (b) multi-layer perceptron.
Figure 3: A demonstration of the LeNet-5 architecture.
Figure 4: The running mechanism of the QAT and PTQ quantization directions.
Figure 5: A demonstration of 8-bit quantization in Brevitas.
Figure 6: An example scenario: (a) standard quantization with 32-bit weights; (b) BNN with 1-bit weights.
Figure 7: The main body of the FINN framework with 4 steps.
Figure 8: (a) Single processing element (PE) and (b) matrix–vector–threshold unit in the FINN framework.
Figure 9: Illustration of four separate matrix multiplications resulting from folding.
Figure 10: The MLP model implemented for acceleration.
Figure 11: The LeNet model implemented for acceleration.
Figure 12: Training-phase accuracy and loss plots for the quantized MLP and LeNet models.
Figure 13: Transformations applied to models in FINN.
Figure 14: The blocks of the developed accelerator hardware.
Figure 15: FPGA and CPU accuracy graph for the MLP and LeNet models.
Figure 16: Accuracy and timing analysis of FPGA and CPU platforms to observe the effects of precision and folding configurations.
Figure 17: FPGA resource consumption by model, precision, and folding.
Figure 18: Xilinx Vivado power estimation tool and actual power measurement based on quantization levels.
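The framework's central move is converting full-precision weights and activations into integer operations. As a hedged illustration of the underlying idea (generic symmetric per-tensor 8-bit quantization, not the Brevitas/FINN toolchain the paper actually uses), the following simulates a quantize–dequantize round trip on a random weight matrix:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization: a single float scale maps
    the weight range onto the signed 8-bit integer grid."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.round(w / scale).clip(-127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the integers."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

storage_ratio = w.nbytes / q.nbytes        # 4x smaller than float32
max_err = float(np.abs(w - w_hat).max())   # bounded by scale / 2
```

The 4x storage reduction (and cheap integer arithmetic) is what lets models that would overflow FPGA block RAM at full precision fit on-chip; the rounding error stays within half a quantization step.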
14 pages, 9837 KiB  
Article
Class Activation Map Guided Backpropagation for Discriminative Explanations
by Yongjie Liu, Wei Guo, Xudong Lu, Lanju Kong and Zhongmin Yan
Appl. Sci. 2025, 15(1), 379; https://doi.org/10.3390/app15010379 - 3 Jan 2025
Viewed by 581
Abstract
The interpretability of neural networks has garnered significant attention. In the domain of computer vision, gradient-based feature attribution techniques like RectGrad have been proposed to utilize saliency maps to demonstrate feature contributions to predictions. Despite advancements, RectGrad falls short in category discrimination, producing similar saliency maps across categories. This paper pinpoints the ineffectiveness of threshold-based strategies in RectGrad for distinguishing feature gradients and introduces Class activation map Guided BackPropagation (CGBP) to tackle the issue. CGBP leverages class activation maps during backpropagation to enhance gradient selection, achieving consistent improvements across four models (VGG16, VGG19, ResNet50, and ResNet101) on ImageNet’s validation set. Notably, on VGG16, CGBP improves SIC, AIC, and IS scores by 10.3%, 11.5%, and 4.5%, respectively, compared to RectGrad while maintaining competitive DS performance. Moreover, CGBP demonstrates greater sensitivity to model parameter changes than RectGrad, as confirmed by a sanity check. The proposed method has broad applicability in scenarios like model debugging, where it identifies causes of misclassification, and medical image diagnosis, where it enhances user trust by aligning visual explanations with clinical insights. Full article
(This article belongs to the Special Issue Advancements in Deep Learning and Its Applications)
Figures

Figure 1: Visualization of saliency maps and gradients generated by the RectGrad method for detailed analysis. (a) Original input image showing a zebra and an African elephant. (b) Saliency maps for the categories of zebra (left) and African elephant (right) generated by RectGrad. (c) Visualization of gradients from the last ReLU layer of the feature modules of VGG16.
Figure 2: CGBP uses gradient backpropagation to calculate feature importance scores and generates a target (hostile) class activation map. Threshold-based spatial and feature masking are applied, followed by RectGrad, to produce the target (hostile) saliency map.
Figure 3: CGBP produces the final saliency map by comparing the target and hostile saliency maps and masking attribution scores that are lower in the hostile saliency map.
Figure 4: Visualization of multiple samples using different methods to generate saliency maps, with each category indicated on the left side of the corresponding sample. The top four rows display results from VGG16, while the bottom four rows present results from ResNet50, separated by a red line.
Figure 5: The influence of the cascading randomization test on the saliency maps generated by GBP, RectGrad, and CGBP.
Figure 6: Saliency map visualizations for model debugging.
Figure 7: Saliency map visualizations for glaucoma diagnosis using different methods.
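CGBP's key ingredient is using a class activation map to decide which gradients survive backpropagation, restoring the category discrimination RectGrad lacks. The sketch below is a simplified, hypothetical illustration of that ingredient only: the generic CAM weighting (importance-weighted sum of feature maps, ReLU, normalize) plus a threshold mask, not the authors' full algorithm.

```python
import numpy as np

def class_activation_map(activations: np.ndarray,
                         class_weights: np.ndarray) -> np.ndarray:
    """CAM-style map: weight each (H, W) feature map by its importance
    for the target class, sum, keep positive evidence, normalize."""
    cam = np.tensordot(class_weights, activations, axes=1)  # (K,)x(K,H,W)->(H,W)
    cam = np.maximum(cam, 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam

def spatial_mask(cam: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Binary mask that would gate gradients during backpropagation."""
    return (cam >= threshold).astype(np.float32)

# Toy example: two 4x4 feature maps, each firing at one location.
acts = np.zeros((2, 4, 4))
acts[0, 1, 1] = 1.0   # evidence for the target class
acts[1, 3, 3] = 1.0   # evidence against it (negative weight below)
cam = class_activation_map(acts, np.array([2.0, -1.0]))
mask = spatial_mask(cam)
```

Because the mask depends on class-specific weights, the same activations yield different masks for "zebra" and "African elephant", which is exactly the discrimination the paper is after.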
21 pages, 17134 KiB  
Article
BAT-Transformer: Prediction of Bus Arrival Time with Transformer Encoder for Smart Public Transportation System
by Suhyun Jeong, Changsong Oh and Jongpil Jeong
Appl. Sci. 2024, 14(20), 9488; https://doi.org/10.3390/app14209488 - 17 Oct 2024
Viewed by 1571
Abstract
In urban public transportation systems, the accuracy of bus arrival time prediction is crucial to reduce passenger waiting time, increase satisfaction, and ensure efficient transportation operations. However, traditional bus information systems (BISs) rely on neural network models with limited prediction accuracy, and some public transportation systems have non-fixed or irregular arrival times, making it difficult to directly apply traditional prediction models. Therefore, we used a Transformer Encoder model to effectively learn the long-term dependencies of time series data, and a multi-head attention mechanism to reduce the root mean square error (RMSE) and lower the mean absolute percentage error (MAPE) compared to other models, improving prediction performance. The model was trained on real bus-operation data collected from a public data portal covering the Gangnam-gu area of Seoul, Korea; data preprocessing included missing-value handling, normalization, one-hot encoding, and resampling. A linear projection process, a learnable positional-encoding technique, and a fully connected layer were applied to the transformer-encoder model to capture the time series data more precisely. We therefore propose BAT-Transformer, a method that applies a linear projection process, a learnable positional-encoding technique, and a fully connected layer to bus data. It is expected to help optimize public transportation systems and show its applicability in various urban environments. Full article
(This article belongs to the Special Issue Advancements in Deep Learning and Its Applications)
Figures

Figure 1: Overall structure of a smart public transportation system: organic connection of real-time data collection, analysis, and service provision.
Figure 2: BAT-Transformer architecture.
Figure 3: Stage 1: data preprocessing.
Figure 4: Input and output data features.
Figure 5: Comparison graph of boardings and alightings by time of day.
Figure 6: Stage 2: time prediction.
Figure 7: Data-collection and -conversion process using the Public Data Portal API.
Figure 8: Distribution of data before and after resampling by time zone: the blue line shows before resampling and the red line shows after resampling.
Figure 9: Loss of training and validation: the blue line shows the change in training values and the yellow line shows the change in validation values.
Figure 10: RMSE of training and validation: the blue line shows the change in training values and the yellow line shows the change in validation values.
Figure 11: MAPE of training and validation: the blue line shows the change in training values and the yellow line shows the change in validation values.
Figure 12: Comparison of RMSE: the red line represents FCNN, the blue line represents the transformer encoder, and the green line represents BAT-Transformer.
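At the heart of the encoder the abstract describes is attention over the embedded time series, which is what lets the model capture long-term dependencies in bus-operation data. The sketch below shows single-head scaled dot-product attention over a toy sequence of feature vectors; the dimensions, random features, and additive positional term are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q: np.ndarray, k: np.ndarray, v: np.ndarray):
    """Scaled dot-product attention: every time step forms a weighted
    average over all steps, capturing long-range dependencies."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)       # (T, T) pairwise similarities
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights

# Toy input: 6 time steps of 8-dim features (stop, time of day,
# boardings, ... after an assumed linear projection).
rng = np.random.default_rng(1)
x = rng.normal(size=(6, 8))
pos = 0.1 * rng.normal(size=(6, 8))  # stand-in for a learnable positional encoding
h, attn = attention(x + pos, x + pos, x + pos)
```

In the full model a fully connected head would map the final encoded representation to a predicted arrival time; multi-head attention simply runs several such maps in parallel on projected subspaces and concatenates the results.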