Application of Artificial Intelligence in Image Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 May 2025 | Viewed by 4551

Special Issue Editor


Dr. Qian Jiang
Guest Editor
School of Software, Yunnan University, Kunming 650000, China
Interests: deep learning; image processing; fuzzy sets; information fusion

Special Issue Information

Dear Colleagues,

The field of image processing has long been at the forefront of leveraging artificial intelligence (AI) to revolutionize how we analyze, interpret, and utilize visual data. As we continue to push the boundaries of what is possible with AI, the integration of advanced algorithms and machine learning models has opened up new horizons in the realm of image analysis, leading to innovative applications across various industries. This has been facilitated by the rapid advancements in computational power, the availability of large annotated datasets, and the development of sophisticated AI techniques such as deep learning, computer vision, and pattern recognition.

The adoption of AI in image processing has triggered a wave of innovation that spans from medical imaging and diagnostics to security surveillance, autonomous vehicles, and digital media. The ability of AI to recognize patterns, classify images, and detect anomalies has far-reaching implications for enhancing efficiency, accuracy, and speed in image analysis tasks. In the face of these technological advancements, there is a growing need for research that explores the potential of AI in image processing, addresses the challenges associated with its implementation, and identifies new opportunities for innovation. The integration of AI with image processing technologies not only transforms the way we process visual information but also raises important questions about data privacy, ethical considerations, and the future of work in related fields.

This Special Issue on "Application of Artificial Intelligence in Image Processing" invites submissions that delve into the latest research and development regarding the application of AI to image processing. We welcome contributions that cover a wide range of topics, including (but not limited to) the following:

  • Utilizing AI for image recognition and classification;
  • Deep learning applications in medical imaging and diagnostics;
  • AI-driven techniques for object detection and segmentation in images;
  • Computer vision systems for security and surveillance;
  • AI in the enhancement and restoration of digital media;
  • Ethical considerations and challenges in AI-driven image processing;
  • Integration of AI with Internet of Things (IoT) for real-time image analysis;
  • Applications of AI in autonomous vehicle imaging systems;
  • AI and the future of image forensics and authentication;
  • Case studies in AI application for environmental monitoring and agriculture;
  • Exploring the role of AI in creative industries for image generation;
  • Sustainable AI practices in image processing;
  • Customer experience enhancement through AI-powered image personalization.

We encourage researchers, academics, and industry professionals to share their insights, findings, and innovative applications of AI in image processing. This Special Issue aims to provide a comprehensive overview of the current landscape and future directions in the field, offering a platform for the exchange of knowledge and the inspiration of new ideas and solutions.

Dr. Qian Jiang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • image processing
  • computer vision
  • machine learning
  • image recognition
  • image classification
  • medical imaging
  • object detection
  • image generation
  • application of AI-driven image processing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)


Research

16 pages, 3603 KiB  
Article
Improvement of a Subpixel Convolutional Neural Network for a Super-Resolution Image
by Muhammed Fatih Ağalday and Ahmet Çinar
Appl. Sci. 2025, 15(5), 2459; https://doi.org/10.3390/app15052459 - 25 Feb 2025
Viewed by 161
Abstract
Super-resolution technologies are among the tools used in image restoration, which aims to reconstruct high-resolution content from low-resolution images. They are useful wherever low-resolution content needs to be enhanced, with applications in areas such as face recognition, medical imaging, and satellite imaging. Deep neural network models for single-image super-resolution are quite successful in terms of computational performance. In many such models, however, the low-resolution image is first upscaled with a method such as bicubic interpolation; because the super-resolution process then runs in the high-resolution space, it incurs extra memory cost and computational complexity. In our proposed model, the low-resolution image is fed directly into a convolutional neural network to reduce computational complexity, and a subpixel convolution layer learns an array of filters that upscale the low-resolution feature maps into the high-resolution image. We add convolution layers to the efficient subpixel convolutional neural network (ESPCN) model and, to avoid losing gradient information, pass each layer's features forward from the previous layer to the next upper layer. The resulting R-ESPCN model proposed in this paper is remodeled to reduce the time the real-time subpixel convolutional neural network needs to perform super-resolution on images. The results show that our method significantly improves accuracy and demonstrates the applicability of deep learning methods in the field of image data processing.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
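For readers unfamiliar with subpixel convolution, the core idea this paper builds on can be sketched in a few lines of PyTorch: all convolutions run in the low-resolution space, and a final pixel-shuffle layer rearranges channels into the upscaled image. This is a minimal illustration of the ESPCN-style pipeline, not the authors' R-ESPCN; the layer widths, activations, and scale factor below are assumptions.

```python
# Minimal ESPCN-style sketch: convolutions operate in the low-resolution
# space, and a final sub-pixel (pixel-shuffle) layer rearranges channels
# into an r-times larger image. Widths are illustrative, not R-ESPCN's.
import torch
import torch.nn as nn

class SubPixelSR(nn.Module):
    def __init__(self, scale: int = 3, channels: int = 1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=5, padding=2), nn.Tanh(),
            nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.Tanh(),
            # Produce channels * r^2 feature maps for the shuffle step.
            nn.Conv2d(32, channels * scale**2, kernel_size=3, padding=1),
        )
        self.shuffle = nn.PixelShuffle(scale)  # (C*r^2, H, W) -> (C, rH, rW)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.shuffle(self.body(x))

lr = torch.randn(1, 1, 64, 64)     # low-resolution input
sr = SubPixelSR(scale=3)(lr)       # -> torch.Size([1, 1, 192, 192])
print(sr.shape)
```

Because the network never materializes intermediate high-resolution feature maps, the memory and compute savings the abstract describes follow directly from this layout.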
Figures:
Figure 1. Comparison of SRCNN and bicubic interpolation [14].
Figure 2. Comparison of the residual blocks in SRResNet and EDSR [16].
Figure 3. Residual learning: a building block [19].
Figure 4. Flow chart of the ESPCN [17].
Figure 5. Flowchart of the proposed R-ESPCN.
Figure 6. Structure of the proposed R-ESPCN model.
Figure 7. Training results of (a) the ESPCN model and (b) the R-ESPCN model.
Figure 8. Comparison of ESPCN and R-ESPCN models on the Set5 dataset: baby, bird, butterfly, face, and woman.
Figure 9. Comparison of ESPCN and R-ESPCN models on the Set14 dataset: baboon, comic, Barbara, pepper, zebra, and flowers.
Figure 10. Comparison of ESPCN and R-ESPCN models on the security camera image dataset.
Figure 11. Comparison of ESPCN and R-ESPCN models on the medical image dataset with the DICOM extension.
Figure 12. Comparison of ESPCN and R-ESPCN models on images belonging to the article authors.
15 pages, 3184 KiB  
Article
A Lightweight Single-Image Super-Resolution Method Based on the Parallel Connection of Convolution and Swin Transformer Blocks
by Tengyun Jing, Cuiyin Liu and Yuanshuai Chen
Appl. Sci. 2025, 15(4), 1806; https://doi.org/10.3390/app15041806 - 10 Feb 2025
Viewed by 419
Abstract
In recent years, with the development of deep learning technologies, Vision Transformers combined with Convolutional Neural Networks (CNNs) have made significant progress in single-image super-resolution (SISR). However, existing methods still face issues such as incomplete high-frequency information reconstruction, training instability caused by residual connections, and insufficient cross-window information exchange. To address these problems and better leverage both local and global information, this paper proposes a super-resolution reconstruction network based on the Parallel Connection of Convolution and Swin Transformer Block (PCCSTB) to model the local and global features of an image. Through a parallel structure of channel feature-enhanced convolution and a Swin Transformer, the network extracts, enhances, and fuses local and global information. Additionally, this paper designs a fusion module to integrate the global and local information extracted by the CNNs. The experimental results show that the proposed network effectively balances SR performance against network complexity, achieving good results in the lightweight SR domain. For instance, in the 4× super-resolution experiment on the Urban100 dataset, the network achieves an inference speed of 55 frames per second under the same device conditions, more than seven times as fast as the state-of-the-art Shifted Window-based Image Restoration (SwinIR) network. Moreover, the network's Peak Signal-to-Noise Ratio (PSNR) outperforms SwinIR by 0.29 dB at the 4× scale on the Set5 dataset, indicating that the network performs high-resolution image reconstruction efficiently.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
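The parallel local/global design the abstract describes can be illustrated with a generic skeleton: a convolutional branch for local features, an attention branch for global context, and a 1×1 fusion layer over their concatenation. This is only a sketch of the idea; the actual PCCSTB uses windowed Swin Transformer attention and channel feature-enhanced convolutions, and every dimension below is an assumption.

```python
# Illustrative parallel local/global block: a CNN branch and a (plain,
# non-windowed) attention branch see the same features, then a 1x1
# convolution fuses them. Not the paper's PCCSTB, just its skeleton.
import torch
import torch.nn as nn

class ParallelLocalGlobalBlock(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.local = nn.Sequential(                    # local branch (CNN)
            nn.Conv2d(dim, dim, 3, padding=1), nn.GELU(),
            nn.Conv2d(dim, dim, 3, padding=1),
        )
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Conv2d(2 * dim, dim, 1)         # 1x1 fusion conv

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local(x)
        tokens = x.flatten(2).transpose(1, 2)          # (B, HW, C)
        glob, _ = self.attn(tokens, tokens, tokens)    # global branch
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        return x + self.fuse(torch.cat([local, glob], dim=1))

y = ParallelLocalGlobalBlock()(torch.randn(1, 64, 32, 32))
print(y.shape)  # torch.Size([1, 64, 32, 32])
```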
Figures:
Figure 1. Parallel Connection Convolution Swin Transformer Super-Resolution Net.
Figure 2. Shallow feature extraction block.
Figure 3. Parallel Connection of Convolution and Swin Transformer Block (PCCSTB).
Figure 4. Fusion block.
Figure 5. Comparison of reconstruction results between advanced algorithms and PCCSTSR on the Urban100 dataset (×4).
Figure 6. Comparison of reconstruction results among the original high-resolution image, the low-resolution image, and PCCSTSR on the Urban100 dataset (×4).
16 pages, 1820 KiB  
Article
GAN-Based Map Generation Technique of Aerial Image Using Residual Blocks and Canny Edge Detector
by Jongwook Si and Sungyoung Kim
Appl. Sci. 2024, 14(23), 10963; https://doi.org/10.3390/app142310963 - 26 Nov 2024
Viewed by 700
Abstract
As the importance of meticulous and precise map creation grows in modern Geographic Information Systems (GISs), urban planning, disaster response, and other domains, the need for sophisticated map generation technology has become increasingly evident. In response to this demand, this paper puts forward a technique based on Generative Adversarial Networks (GANs) for converting aerial imagery into high-quality maps. The proposed method, comprising a generator and a discriminator, introduces novel strategies to overcome existing challenges, namely the use of a Canny edge detector and residual blocks. The proposed loss function enhances the generator's performance by assigning greater weight to edge regions via the Canny edge map and by eliminating superfluous information. This approach improves the visual quality of the generated maps and ensures that fine details are captured accurately. The experimental results demonstrate that the method generates maps of superior visual quality, achieving outstanding performance compared with existing methodologies, and that the proposed technology has significant potential for practical application in a range of real-world scenarios.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
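The edge-weighted loss idea, using a Canny edge map to emphasize edge regions during generator training, can be sketched directly with OpenCV and PyTorch. The weight value, thresholds, and blending below are illustrative assumptions, not the paper's exact loss; in a full model this term would be added to the usual adversarial loss.

```python
# Sketch of an edge-weighted reconstruction loss: a Canny edge map of
# the target up-weights edge pixels in an L1 term. Weight value and
# Canny thresholds are assumptions, not the authors' formula.
import cv2
import numpy as np
import torch

def edge_weighted_l1(fake: torch.Tensor, real: torch.Tensor,
                     edge_weight: float = 5.0) -> torch.Tensor:
    """fake, real: (B, 1, H, W) tensors with values in [0, 1]."""
    weights = []
    for img in real.detach().cpu().numpy():
        gray = (img[0] * 255).astype(np.uint8)
        edges = cv2.Canny(gray, 100, 200) / 255.0     # 1.0 on edge pixels
        weights.append(1.0 + (edge_weight - 1.0) * edges)
    w = torch.tensor(np.stack(weights)[:, None], dtype=fake.dtype,
                     device=fake.device)
    return (w * (fake - real).abs()).mean()

loss = edge_weighted_l1(torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64))
print(loss.item())
```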
Figures:
Figure 1. The overall process of the proposed method.
Figure 2. Comparison of generation results between the proposed method and related works ((a) original aerial image, (b) original map image, (c) Pix2Pix [6], (d) CycleGAN [7], (e) SMAPGAN [16], (f) our research).
Figure 3. Examples of map images generated using the proposed method.
15 pages, 2189 KiB  
Article
Entropy-Based Ensemble of Convolutional Neural Networks for Clothes Texture Pattern Recognition
by Reham Al-Majed and Muhammad Hussain
Appl. Sci. 2024, 14(22), 10730; https://doi.org/10.3390/app142210730 - 20 Nov 2024
Viewed by 714
Abstract
Automatic clothes pattern recognition is important for assisting visually impaired people and for real-world applications such as e-commerce and personal fashion recommendation systems, and it has attracted increasing interest from researchers. It is a challenging texture classification problem in that even images of the same texture class exhibit a high degree of intraclass variation; moreover, images of clothes patterns may be taken in unconstrained illumination environments. Machine learning methods proposed for this problem mostly rely on handcrafted features and traditional classification methods, while prior works utilizing deep learning have achieved poor recognition performance. We propose a deep learning method based on an ensemble of convolutional neural networks that requires no feature engineering while extracting robust local and global features of clothes patterns. The ensemble classifier employs a pre-trained ResNet50 with a non-local (NL) block, a squeeze-and-excitation (SE) block, and a coordinate attention (CA) block as base learners. To fuse the individual decisions of the base learners, we introduce a simple and effective fusion technique based on entropy voting, which incorporates the uncertainty in each base learner's decision. We validate the proposed method on benchmark clothes pattern datasets with six categories: solid, striped, checkered, dotted, zigzag, and floral. The proposed method achieves promising results with limited computational and data resources, reaching accuracies of 98.18% on the GoogleClothingDataset and 96.03% on the CCYN dataset.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
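Entropy voting, as the abstract describes it, weights each base learner's prediction by how certain that learner is. The sketch below shows one plausible form of such a rule; the exact weighting used in the paper may differ, and the inline base-learner labels are only for illustration.

```python
# Sketch of entropy-weighted fusion: predictions with lower entropy
# (more certainty) get more influence. The 1/(1+H) weighting is an
# assumption, not necessarily the authors' formula.
import torch

def entropy_vote(probs: list[torch.Tensor]) -> torch.Tensor:
    """probs: list of (B, C) softmax outputs from the base learners."""
    fused = torch.zeros_like(probs[0])
    for p in probs:
        entropy = -(p * p.clamp_min(1e-12).log()).sum(dim=1, keepdim=True)
        fused += p / (1.0 + entropy)      # low entropy -> higher weight
    return fused.argmax(dim=1)            # fused class decision

p1 = torch.softmax(torch.randn(4, 6), dim=1)   # e.g., ResNet50 + SE
p2 = torch.softmax(torch.randn(4, 6), dim=1)   # e.g., ResNet50 + CA
p3 = torch.softmax(torch.randn(4, 6), dim=1)   # e.g., ResNet50 + NL
print(entropy_vote([p1, p2, p3]))              # class index per sample
```

Unlike plain majority voting, this rule lets a single confident learner outvote two uncertain ones, which is the behavior the abstract's "incorporates the uncertainties" phrasing points at.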
Figures:
Figure 1. Examples of six classes of clothes patterns: checkered, floral, dotted, solid, striped, and zigzag.
Figure 2. High-level depiction of the architecture of the proposed ensemble classifier.
Figure 3. Detail of the ResNet50 architecture.
Figure 4. Architecture of the bottleneck residual block.
Figure 5. ResNet50 with SE blocks. ResG_i is the i-th group of ResNet blocks.
Figure 6. ResNet50 with CA block. ResG_i is the i-th group of ResNet blocks.
Figure 7. ResNet50 with NL block. ResG_i is the i-th group of ResNet blocks.
Figure 8. Performance of the two ensemble classifiers: (a) accuracy of ensemble learner 1 and its base learners; (b) accuracy of ensemble learner 2 and its base learners.
Figure 9. Venn diagram of base learners' errors: (a) error analysis of the base learners of ensemble classifier 1; (b) error analysis of the base learners of ensemble classifier 2.
Figure 10. Confusion matrix showing the decision making of the ensemble classifier.
Figure 11. Performance of the base learners for each class.
19 pages, 829 KiB  
Article
A New Image Oversampling Method Based on Influence Functions and Weights
by Jun Ye, Shoulei Lu and Jiawei Chen
Appl. Sci. 2024, 14(22), 10553; https://doi.org/10.3390/app142210553 - 15 Nov 2024
Viewed by 632
Abstract
Although imbalanced data have been studied for many years, data imbalance remains a major problem in the development of machine learning and artificial intelligence, and the growth of deep learning has further expanded its impact, so studying imbalanced data classification is of practical significance. We propose an image oversampling algorithm based on the influence function and sample weights. Our scheme not only synthesizes high-quality minority-class samples but also preserves the original features and information of minority-class images. To address the lack of visually reasonable features when SMOTE synthesizes images directly, we modify a pre-trained model by removing its pooling and fully connected layers, extract the important features of an image by convolution, apply SMOTE interpolation to the extracted features to derive synthesized image features, and feed these features into a DCGAN generator, which maps them into the high-dimensional image space to produce a realistic image. To verify that our scheme synthesizes high-quality images and thus improves classification accuracy, we conduct experiments on the processed CIFAR10, CIFAR100, and ImageNet-LT datasets.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
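The feature-space SMOTE step at the heart of this pipeline can be sketched directly: interpolate between a minority-class feature vector and one of its k nearest neighbors. The parameters below follow standard SMOTE; the extractor producing the feature vectors (the paper's truncated pre-trained CNN) and the downstream DCGAN are out of scope for this sketch.

```python
# Sketch of SMOTE-style interpolation in a learned feature space: each
# synthetic vector lies on the segment between a random minority-class
# sample and one of its k nearest neighbors. Standard SMOTE rule; the
# feature extractor is a stand-in.
import torch

def smote_features(feats: torch.Tensor, k: int = 5,
                   n_new: int = 10) -> torch.Tensor:
    """feats: (N, D) minority-class feature vectors, N > k."""
    dists = torch.cdist(feats, feats)                      # (N, N)
    knn = dists.topk(k + 1, largest=False).indices[:, 1:]  # drop self
    new = []
    for _ in range(n_new):
        i = torch.randint(len(feats), (1,)).item()
        j = knn[i][torch.randint(k, (1,)).item()]
        lam = torch.rand(1)                                # in [0, 1)
        new.append(feats[i] + lam * (feats[j] - feats[i]))
    return torch.stack(new)                                # (n_new, D)

synthetic = smote_features(torch.randn(20, 128))
print(synthetic.shape)  # torch.Size([10, 128])
```

In the paper's scheme, these synthetic feature vectors are then decoded back to images by the DCGAN generator rather than used as classifier inputs directly.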
Figures:
Figure 1. Imbalanced data resolution.
Figure 2. Image feature extraction network architecture.
Figure 3. Visualization of image features.
Figure 4. SMOTE interpolation for synthetic feature visualization.
Figure 5. DCGAN generator architecture.
Figure 6. Oversampling in the CIFAR-10 image dataset.
Figure 7. Accuracy for each category in the CIFAR-10 dataset at an imbalance ratio of 50.
Figure 8. Accuracy rates of different schemes on the ImageNet-LT dataset.
26 pages, 4018 KiB  
Article
A MediaPipe Holistic Behavior Classification Model as a Potential Model for Predicting Aggressive Behavior in Individuals with Dementia
by Ioannis Galanakis, Rigas Filippos Soldatos, Nikitas Karanikolas, Athanasios Voulodimos, Ioannis Voyiatzis and Maria Samarakou
Appl. Sci. 2024, 14(22), 10266; https://doi.org/10.3390/app142210266 - 7 Nov 2024
Viewed by 1224
Abstract
This paper introduces a classification model that detects and classifies argumentative behaviors between two individuals by means of a machine learning application based on the MediaPipe Holistic model. The approach distinguishes two classes based on the behavior of the two individuals, argumentative and non-argumentative, corresponding to verbal argumentative behavior. Using a dataset extracted from video frames of hand gestures, body stance, and facial expressions, together with their corresponding landmarks, three classification models were trained and evaluated. The results indicate that the Random Forest Classifier outperformed the other two, classifying argumentative behaviors with 68.07% accuracy and non-argumentative behaviors with 94.18% accuracy. There is thus future scope for advancing this classification model into a prediction model, with the aim of predicting aggressive behavior in patients with dementia before its onset.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
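The landmark-to-classifier pipeline the abstract outlines can be sketched with the real MediaPipe and scikit-learn APIs. For brevity, only pose landmarks are used here, and the file names and labels are placeholders; the paper also draws on hand and face landmarks.

```python
# Sketch of the pipeline: MediaPipe Holistic turns a frame into pose
# landmark coordinates, which feed a scikit-learn Random Forest.
# File names and labels below are placeholders.
import cv2
import mediapipe as mp
import numpy as np
from sklearn.ensemble import RandomForestClassifier

holistic = mp.solutions.holistic.Holistic(static_image_mode=True)

def frame_to_features(bgr_frame: np.ndarray) -> np.ndarray:
    """Flatten pose landmarks into one vector (zeros if none detected)."""
    results = holistic.process(cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks is None:
        return np.zeros(33 * 3)                 # 33 pose landmarks x (x,y,z)
    return np.array([[lm.x, lm.y, lm.z]
                     for lm in results.pose_landmarks.landmark]).ravel()

# X: feature vectors from labeled frames; y: 1 = argumentative, 0 = not.
X = np.stack([frame_to_features(cv2.imread(f)) for f in ["a.jpg", "b.jpg"]])
y = np.array([1, 0])
clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print(clf.predict(X))
```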
Figures:
Figure 1. Argumentative image dataset sample.
Figure 2. Non-argumentative image dataset sample.
Figure 3. Cross-validation metrics for the three models.
Figure 4. AUC scores of the three trained models. A model making random guesses (one with no discriminative power) is represented by the diagonal dashed blue line from the bottom left (0, 0) to the top right (1, 1); the ROC curve of any model that outperforms the random one lies above this diagonal.
Figure 5. Confusion matrix of the Random Forest Classifier after training.
Figure 6. Confusion matrix of Gradient Boosting after training.
Figure 7. Confusion matrix of the Ridge Classifier after training.
Figure 8. Learning curve of the Random Forest Classifier after training.
Figure 9. Learning curve of Gradient Boosting after training.
Figure 10. Learning curve of the Ridge Classifier after training.
Figure 11. Paired t-test statistic results across all models and metrics.
Figure 12. Confusion matrix of the Random Forest Classifier after testing.
Figure 13. ROC AUC score of the Random Forest Classifier after testing.
Figure 14. Final model evaluation metrics.
Figure 15. Probability range/count of correct argumentative and non-argumentative predictions per 0.1 accuracy range, with 1.0 being the perfect accuracy score.