Advanced Convolutional Neural Network (CNN) Technology in Object Detection and Data Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 July 2025 | Viewed by 14118

Special Issue Editor


Guest Editor
INRIA (Institut National de Recherche en Informatique et en Automatique), Le Chesnay, France
Interests: attention mechanism; reinforcement learning; generative learning; causal inference

Special Issue Information

Dear Colleagues,

Convolutional neural networks (CNNs) and related deep neural networks have seen great success in machine learning and computer vision. Advanced CNNs, such as Fast R-CNN and Faster R-CNN, have achieved breakthrough performance in object detection. More recently, transformer models have been widely applied to classification, object detection, and multimodal machine learning tasks.

To further advance the research and application of advanced deep neural networks in computer vision, this Special Issue aims to collect contributions on advanced deep neural networks and algorithms in the field of computer vision and related areas. We encourage the submission of research papers on topics including, but not restricted to, object detection, image segmentation, and classification.

Dr. Shiyang Yan
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • convolutional neural network
  • computer vision
  • object detection
  • image segmentation
  • image classification

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)


Research

18 pages, 3452 KiB  
Article
Pupil Refinement Recognition Method Based on Deep Residual Network and Attention Mechanism
by Zehui Chen, Changyuan Wang and Gongpu Wu
Appl. Sci. 2024, 14(23), 10971; https://doi.org/10.3390/app142310971 - 26 Nov 2024
Cited by 1 | Viewed by 542
Abstract
This study aims to capture subtle changes in the pupil, identify relatively weak inter-class changes, extract more abstract and discriminative pupil features, and develop a pupil refinement recognition method based on attention mechanisms. A pupil refinement recognition model is established on a deep learning framework with the ResNet101 deep residual network as the backbone. The image preprocessing module preprocesses pupil images captured in the infrared spectrum, removing internal noise from the pupil images. The ResNet101 backbone captures subtle changes in the pupil, identifies weak inter-class changes, and extracts different features of the pupil image. A channel attention module screens the pupil features to obtain key pupil features, and an external attention module enhances the expression of key pupil feature information and extracts more abstract and discriminative pupil features. Finally, a Softmax classifier processes the extracted features and outputs the refined pupil recognition result. Experimental results show that this method effectively preprocesses infrared pupil images, extracts pupil features, and achieves good fine-grained pupil recognition performance.
Figures:
Figure 1: Structure diagram of fine pupil recognition model.
Figure 2: Channel attention module.
Figure 3: External attention module.
Figure 4: Pupil image preprocessing results in visible and near-infrared spectral domains. (a) Raw near-infrared images. (b) Preprocessed near-infrared images.
Figure 5: Partial pupil feature extraction results.
Figure 6: t-SNE visualization.
Figure 7: Fine pupil recognition effect of different methods. (a) Iris recognition method of ripple transformation. (b) CNN's iris recognition method. (c) Iris recognition method for residual images. (d) Iris recognition method based on local gradient pattern. (e) Iris recognition method for Levenshtein distance. (f) Iris recognition method used in this paper.
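
For readers who want a concrete picture of the channel attention step described above, the following is a minimal PyTorch sketch of a generic squeeze-and-excitation-style channel attention module; it illustrates the general idea only, not the authors' exact implementation, and the channel and reduction sizes are assumptions.

```python
# Hypothetical sketch of a channel-attention module of the kind described in the
# abstract above (squeeze-and-excitation style); the paper's exact design may differ.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # global average pooling per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # reweight ("screen") channel features

# Example: reweighting a ResNet101-style feature map of shape (batch, 2048, 7, 7)
features = torch.randn(2, 2048, 7, 7)
attended = ChannelAttention(2048)(features)
print(attended.shape)  # torch.Size([2, 2048, 7, 7])
```

In this kind of setup, the learned per-channel weights act as the "screening" the abstract refers to, amplifying informative channels and suppressing noisy ones.
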
17 pages, 4745 KiB  
Article
Implementing YOLO Convolutional Neural Network for Seed Size Detection
by Jakub Pawłowski, Marcin Kołodziej and Andrzej Majkowski
Appl. Sci. 2024, 14(14), 6294; https://doi.org/10.3390/app14146294 - 19 Jul 2024
Viewed by 1273
Abstract
The article presents research on the application of image processing techniques and convolutional neural networks (CNNs) for the detection and measurement of seed sizes, focusing on coffee and white bean seeds. The primary objective of the study is to evaluate the potential of using CNNs to develop tools that automate seed recognition and measurement in images. A database was created containing photographs of coffee and white bean seeds with precise annotations of their location and type. Image processing techniques and You Only Look Once v8 (YOLOv8) models were employed to analyze the seeds' position, size, and type. A detailed comparison of the effectiveness and performance of the applied methods was conducted. The experiments demonstrated that the best-trained CNN model achieved a segmentation accuracy of 90.1% IoU, with an average seed size error of 0.58 mm. The conclusions indicate significant potential for using image processing techniques and CNN models to automate seed analysis, which could increase the efficiency and accuracy of these processes.
Figures:
Figure 1: A block diagram of the operations performed by the program for detecting the actual width, height, and area of seeds.
Figure 2: Examples of images containing seeds of (a) white beans and (b) coffee.
Figure 3: Segmentation results for white bean seeds.
Figure 4: Segmentation results for coffee beans.
Figure 5: Detection and segmentation of multiple coffee beans.
Figure 6: Detection and segmentation of multiple white bean seeds.
Figure 7: Dependence of IoU on the number of white bean seeds in the image.
Figure 8: Dependence of IoU on the number of coffee seeds in the image.
Figure 9: Dependence of relative error on the number of white bean seeds in the image.
Figure 10: Dependence of relative error on the number of coffee seeds in the image.
Figure 11: Dependence of IoU on the number of coffee and white bean seeds in the image.
Figure 12: Dependence of relative error on the number of coffee and white bean seeds in the image.
Figure 13: Comparison of YOLO methods and typical seed area detection methods.
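
The measurement task in this paper reduces, at evaluation time, to comparing predicted masks against ground truth with IoU and converting pixel extents into millimetres. Below is a minimal sketch of that post-processing under an assumed pixel-to-millimetre calibration; the calibration constant and helper names are illustrative, not taken from the paper.

```python
# Minimal sketch of the kind of post-processing the abstract describes: turning a
# predicted segmentation mask into a seed size in millimetres and scoring it with IoU.
# The mm_per_pixel value and function names are illustrative assumptions.
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over Union between two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / float(union) if union else 0.0

def seed_size_mm(mask: np.ndarray, mm_per_pixel: float) -> tuple[float, float]:
    """Width and height (mm) of the mask's axis-aligned bounding box."""
    ys, xs = np.nonzero(mask)
    width_px = xs.max() - xs.min() + 1
    height_px = ys.max() - ys.min() + 1
    return width_px * mm_per_pixel, height_px * mm_per_pixel

# Toy example: a 40 x 25 pixel "seed" imaged at an assumed 0.1 mm per pixel
mask = np.zeros((100, 100), dtype=bool)
mask[30:55, 20:60] = True
print(mask_iou(mask, mask))     # 1.0
print(seed_size_mm(mask, 0.1))  # (4.0, 2.5)
```
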
15 pages, 2625 KiB  
Article
Segmentation of Liver Tumors by Monai and PyTorch in CT Images with Deep Learning Techniques
by Sabir Muhammad and Jing Zhang
Appl. Sci. 2024, 14(12), 5144; https://doi.org/10.3390/app14125144 - 13 Jun 2024
Cited by 2 | Viewed by 2832
Abstract
Image segmentation and identification are crucial to modern medical image processing techniques. This research provides a novel and effective method for identifying and segmenting liver tumors from public CT images. Our approach leverages the hybrid ResUNet model, a combination of the ResNet and UNet models, developed with the Monai and PyTorch frameworks. The ResNet deep dense network architecture is implemented on public CT scans using the MSD Task03 Liver dataset. The novelty of our method lies in several key aspects. First, we introduce innovative enhancements to the ResUNet architecture, optimizing its performance specifically for liver tumor segmentation. Additionally, by harnessing the capabilities of Monai, we streamline the implementation process, eliminating the need for manual script writing and enabling faster, more efficient model development and optimization. Preparing images for analysis by the deep neural network involves several steps: data augmentation, Hounsfield unit (HU) windowing, and image normalization. ResUNet network performance is measured using the Dice coefficient (DC). This approach, which utilizes residual connections, has proven to be more reliable than other existing techniques, achieving DC values of 0.98 for detecting liver tumors and 0.87 for segmentation. Both qualitative and quantitative evaluations show promising results regarding model precision and accuracy. These results suggest the proposed method could increase the precision and accuracy of liver tumor detection and liver segmentation, which could help in the early diagnosis and treatment of liver cancer and ultimately improve patient prognosis.
Figures:
Figure 1: ResUNet architecture representation.
Figure 2: Overview of the liver tumor segmentation workflow.
Figure 3: Windowing steps to obtain an isolated liver.
Figure 4: Slice with different transforms.
Figure 5: Due to simple transformation, the same slice of the patient can show two different body parts.
Figure 6: Illustration of the ResUNet architecture output.
Figure 7: Dice loss and metrics trained over 100 epochs during model training.
Figure 8: Slices sample of the patient liver tumor segmentation mask.
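
Two of the steps named in this abstract, Hounsfield-unit windowing and the Dice coefficient (DC), can be illustrated with a short plain-PyTorch sketch. The window range below is an assumed example, not the value used in the paper, and the authors rely on Monai's built-in transforms and metrics rather than hand-written code like this.

```python
# A minimal sketch, assuming a plain-PyTorch implementation, of HU windowing and the
# Dice coefficient described in the abstract above; values are illustrative only.
import torch

def hu_window(ct: torch.Tensor, hu_min: float = -200.0, hu_max: float = 250.0) -> torch.Tensor:
    """Clip CT intensities to a Hounsfield-unit window and rescale to [0, 1]."""
    ct = ct.clamp(hu_min, hu_max)
    return (ct - hu_min) / (hu_max - hu_min)

def dice_coefficient(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred = pred.float().flatten()
    target = target.float().flatten()
    inter = (pred * target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# Toy example: the Dice of a random binary volume with itself is 1.0
mask = (torch.rand(1, 1, 64, 64, 64) > 0.5)
print(dice_coefficient(mask, mask).item())              # ~1.0
print(hu_window(torch.tensor([-1000.0, 0.0, 500.0])))   # tensor([0.0000, 0.4444, 1.0000])
```
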
17 pages, 3884 KiB  
Article
A Method for Underwater Biological Detection Based on Improved YOLOXs
by Heng Wang, Pu Zhang, Mengnan You and Xinyuan You
Appl. Sci. 2024, 14(8), 3196; https://doi.org/10.3390/app14083196 - 10 Apr 2024
Cited by 3 | Viewed by 1310
Abstract
This article proposes a lightweight underwater biological target detection network based on an improved YOLOXs, addressing the challenges of complex and dynamic underwater environments, the limited memory of underwater devices, and constrained computational capabilities. First, in the backbone network, GhostConv and GhostBottleneck are introduced to replace the standard convolutions and the Bottleneck1 structure in CSPBottleneck_1, significantly reducing the model's parameter count and computational load and facilitating the construction of a lightweight network. Next, in the feature fusion network, a Contextual Transformer block replaces the 3 × 3 convolution in CSPBottleneck_2, enhancing self-attention learning by leveraging the rich context between input keys and improving the model's representational capacity. Finally, the Focal_EIoU localization loss is employed to replace the IoU loss, enhancing the model's robustness and generalization ability and leading to faster and more accurate convergence during training. Our experimental results demonstrate that, compared to the YOLOXs model, the proposed YOLOXs-GCE achieves a 1.1% improvement in mAP while reducing the parameter count by 24.47%, the computational load by 26.39%, and the model size by 23.87%. This effectively enhances the detection performance of the model, making it suitable for complex and dynamic underwater environments as well as underwater devices with limited memory, and the model meets the requirements of underwater target detection tasks.
Figures:
Figure 1: YOLOX network structure.
Figure 2: Mosaic data augmentation splicing effect illustration.
Figure 3: CSPBottleneck_1 structure diagram.
Figure 4: SPP network structure diagram.
Figure 5: Ghost Module structure diagram.
Figure 6: Two GhostBottleneck structures.
Figure 7: The schematic diagram of the PAFPN network.
Figure 8: (a) Schematic diagram of CSPBottleneck_2 structure. (b) Schematic diagram of CoTBottleneck structure.
Figure 9: Contextual Transformer (CoT) block.
Figure 10: Dataset samples, namely, (a) fish, (b) jellyfish, (c) penguins, (d) sharks, (e) puffins, (f) stingrays, (g) starfish.
Figure 11: Distribution of labels.
Figure 12: PR curves for the four experimental groups' test results. (a) PR curve for YOLOXs's test results; (b) PR curve for Experiment 1's test results; (c) PR curve for Experiment 2's test results; (d) PR curve for Experiment 3's test results.
Figure 13: Loss function performance comparison plot.
Figure 14: Comparison chart of detection effect. (a) YOLOXs detection results; (b) detection results of YOLOXs after improvement.
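
The GhostConv idea referenced above (from the GhostNet design) generates part of the output feature maps with a small ordinary convolution and the rest with a cheap depthwise convolution, which is where the parameter and FLOP savings come from. A hedged sketch follows; the channel sizes and activation choices are illustrative, not the paper's exact configuration.

```python
# Sketch of a Ghost convolution: a cheap depthwise convolution generates extra
# "ghost" feature maps from a small primary convolution. Illustrative only.
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 1, ratio: int = 2):
        super().__init__()
        primary_ch = out_ch // ratio
        self.primary = nn.Sequential(                       # ordinary convolution, few channels
            nn.Conv2d(in_ch, primary_ch, kernel_size, padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(primary_ch),
            nn.SiLU(),
        )
        self.cheap = nn.Sequential(                         # depthwise conv makes the "ghost" maps
            nn.Conv2d(primary_ch, out_ch - primary_ch, 3, padding=1,
                      groups=primary_ch, bias=False),
            nn.BatchNorm2d(out_ch - primary_ch),
            nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)         # primary + ghost feature maps

x = torch.randn(1, 64, 80, 80)
print(GhostConv(64, 128)(x).shape)   # torch.Size([1, 128, 80, 80])
```
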
27 pages, 28358 KiB  
Article
Fast Coherent Video Style Transfer via Flow Errors Reduction
by Li Wang, Xiaosong Yang and Jianjun Zhang
Appl. Sci. 2024, 14(6), 2630; https://doi.org/10.3390/app14062630 - 21 Mar 2024
Viewed by 1417
Abstract
For video style transfer, naively applying still-image techniques to process a video frame by frame independently often causes flickering artefacts. Some works incorporate optical flow into the design of a temporal constraint loss to secure temporal consistency. However, these works still suffer from incoherence (including ghosting artefacts) where large motions or occlusions occur, as optical flow fails to detect the boundaries of objects accurately. To address this problem, we propose a novel framework consisting of two stages: (1) creating new initialization images from the proposed mask techniques, which significantly reduce the flow errors; and (2) processing these initialized images iteratively with the proposed losses to obtain stylized videos free of artefacts, which also increases the speed from over 3 min per frame to less than 2 s per frame for the gradient-based optimization methods. Specifically, we propose a multi-scale mask fusion scheme to reduce untraceable flow errors and obtain an incremental mask to reduce ghosting artefacts. In addition, a multi-frame mask fusion scheme is designed to reduce traceable flow errors. In our proposed losses, the Sharpness Losses deal with potential image blurriness artefacts over long-range frames, and the Coherent Losses restrict temporal consistency at both the multi-frame RGB level and the feature level. Overall, our approach produces stable video stylization outputs even in large-motion or occlusion scenarios. The experiments demonstrate that the proposed method outperforms state-of-the-art video style transfer methods qualitatively and quantitatively on the MPI Sintel dataset.
Figures:
Figure 1: Flickering artefacts in video style transfer. The first row shows two original consecutive video frames (left) and the style image (right). The second row shows the temporal incoherence artefact by Johnson et al. [10]. The green and purple rectangles indicate the different appearances (texture patterns and colours) between these two stylized outputs, which exhibit flickering artefacts. The third row shows the stable results produced by our method, where the outputs preserve consistent texture appearances.
Figure 2: Prerequisite. We test the straightforward idea by recomposing pasted stylized content in untraceable flow regions (see zoom-in rectangles), which shows that the composition result preserves the consistency of content structures better than the warped image w^t, as w^t has unexpected content structures in the red box and duplicated ones in the orange box. ⊗ denotes the warp operation, which warps f_s^{t-1} into w^t with F^t, and ⊕ denotes element-wise addition in this paper.
Figure 3: Texture discontinuity problem. Naively combining w^t and f_s^t via M^t into flow regions causes the texture discontinuity problem. For example, in the green rectangle, the gray colours preserved from w^t lose the consistency of the texture context (in red colours) and look like noise artefacts.
Figure 4: Image blurriness artefacts. Images in the upper rows are original video frames. The blurriness artefacts become more obvious with time step.
Figure 5: System overview. Starting from three consecutive frames, our system takes the corresponding per-frame stylized f_s^t, mask M^t, and warped image w^t as inputs, then computes the initialization x̂_init^t for the gradient-based optimization video stabilization network.
Figure 6: Recurrent strategy for video style transfer.
Figure 7: Network architecture overview. During optimization, the network takes x̂_init^t obtained from initial generation, the current per-frame stylized result f_s^t, mask M^t, and a warped image w^t as inputs, and gradually optimizes the initial image x̂_init^t into x̂_out^t based on gradients computed from the losses.
Figure 8: The process of multi-scale mask fusion and incremental mask. The unexpected untraceable flow errors (see red rectangles) are fixed in this step. The fused mask after the multi-scale scheme may cause worse ghosting artefacts as the flow-untraceable regions become thinner than before; thus, the incremental mask is proposed to thicken the boundaries (see green rectangles).
Figure 9: Initialization generation. M^t is a single-channel per-pixel mask obtained from mask generation. Note that the generated x̂_init^t contains far fewer errors than the warped image w^t in the purple and red rectangles, so far fewer iterations are needed to compensate for correct pixel values.
Figure 10: The decrease in traceable flow errors (white regions on the right side) by using the proposed initialization. The rectangles indicate the error difference between the initialization x̂_init^t and x_init^t without our mask generation. Fusion in a maximum/minimum value manner means retaining the maximum/minimum values from those masks at each pixel location.
Figure 11: Qualitative ablation study on the proposed mask techniques on the Alley_2 scene from the MPI Sintel dataset [66]. The naive method using the flow mask produced by [24] causes ghosting artefacts (see unexpected grids and curves in red and orange rectangles). The multi-scale scheme causes worse ghosting artefacts. By gradually adding the incremental mask and multi-frame mask fusion techniques, the unexpected grids and curves are effectively mitigated, producing better visual quality without ghosting artefacts.
Figure 12: The effect of image sharpness. The top rows are original video frames, the middle rows are outputs without Sharpness Losses, and the bottom rows are outputs with Sharpness Losses. The red rectangles indicate the difference in image sharpness.
Figure 13: The effect of temporal consistency. The per-frame processing methods are Johnson et al. [10] and Huang et al. [13]. The red and green rectangles indicate the discontinuous texture appearances.
Figure 14: Comparison to the latest state-of-the-art methods [36,37,38,62] on the bike-packing scene from the DAVIS 2017 dataset [66]. The red rectangles indicate the differences in long-term temporal consistency results.
Figure 15: Comparison to Li et al. [35] on the Soapbox scene from the DAVIS 2017 dataset [66]. Both the red and green rectangles indicate the differences in two adjacent stabilization results. Please see the Supplementary Video for better observations.
Figure 16: Comparison to Ruder et al. [30] on the Ambush_4 scene from the MPI Sintel dataset [66]. The per-frame processing method for both methods is Johnson et al. [10]. The rectangles indicate the difference in consistent texture appearances over a few adjacent stabilization results. Orange rectangles show that the texture appearances around large occlusion boundaries (including traceable flow errors) by Ruder et al. [30] are discontinuous with the context, while purple boxes demonstrate that our textures are consistent with the context. Please see the Supplementary Video.
Figure 17: Ablation study on Sharpness Losses on the Alley_2 scene from the MPI Sintel dataset [66]. A higher ARISM score is better. The outputs with our Sharpness Losses achieve higher ARISM scores than those without pixel loss and Sharpness Losses, which indicates that the Perceptual Losses and Pixel Loss in the proposed Sharpness Losses both contribute to reducing blurriness artefacts.
Figure 18: Ablation study on image quality assessment of the Alley_2 scene from the MPI Sintel dataset [66]. A lower score indicates better visual image quality. Note that adding the multi-scale scheme (magenta line) causes image quality loss (higher score) compared to the naive method (blue line), while adding the incremental mask and multi-frame fusion (red and green lines) achieves lower scores than the naive method (blue line).
Figure 19: User study. Wang et al. [36] received the most votes in terms of temporal consistency, but our proposed method obtained the majority of votes for overall preference and style transformation effect compared to six other state-of-the-art methods. The number of votes for each method is consistent with Table 3 [30,35,36,37,38,62].
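
The temporal machinery this abstract relies on can be summarized as: warp the previous stylized frame to the current frame with optical flow, then penalize differences only where the flow is traceable (the mask regions). The sketch below illustrates those two ingredients under assumed tensor conventions; it is not the authors' implementation, and the flow format and mask semantics are assumptions.

```python
# Minimal sketch of flow warping plus a masked temporal-consistency loss.
# Assumes flow is in pixels with shape (B, 2, H, W) and mask == 1 where flow is traceable.
import torch
import torch.nn.functional as F

def warp(prev_stylized: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp the previous stylized frame to the current frame using optical flow."""
    b, _, h, w = prev_stylized.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + flow      # (B, 2, H, W)
    # normalise pixel coordinates to [-1, 1] for grid_sample
    grid_x = 2.0 * grid[:, 0] / (w - 1) - 1.0
    grid_y = 2.0 * grid[:, 1] / (h - 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)                          # (B, H, W, 2)
    return F.grid_sample(prev_stylized, grid, align_corners=True)

def temporal_loss(current: torch.Tensor, warped_prev: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Mean squared difference, restricted to flow-traceable pixels."""
    diff = (mask * (current - warped_prev)) ** 2
    return diff.sum() / (mask.sum() * current.shape[1]).clamp(min=1.0)

b, h, w = 1, 64, 64
prev_frame = torch.rand(b, 3, h, w)
flow = torch.zeros(b, 2, h, w)           # zero flow: the warp is (approximately) the identity
mask = torch.ones(b, 1, h, w)
print(temporal_loss(prev_frame, warp(prev_frame, flow), mask).item())  # ~0.0
```
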
23 pages, 7009 KiB  
Article
PGDS-YOLOv8s: An Improved YOLOv8s Model for Object Detection in Fisheye Images
by Degang Yang, Jie Zhou, Tingting Song, Xin Zhang and Yingze Song
Appl. Sci. 2024, 14(1), 44; https://doi.org/10.3390/app14010044 - 20 Dec 2023
Cited by 4 | Viewed by 3991
Abstract
Recently, object detection has become a research hotspot in computer vision, and it typically deals with regular images with small viewing angles. To obtain a field of view without blind spots, fisheye cameras, which have a wide viewing angle, and unmanned aerial vehicles equipped with fisheye cameras are used. However, distorted and discontinuous objects appear in the captured fisheye images due to the unique viewing angle of fisheye cameras, which poses a significant challenge to existing object detectors. To solve this problem, this paper proposes the PGDS-YOLOv8s model for detecting distorted and discontinuous objects in fisheye images. First, two novel downsampling modules are proposed. Among them, the Max Pooling and Ghost's Downsampling (MPGD) module effectively extracts the essential feature information of distorted and discontinuous objects, and the Average Pooling and Ghost's Downsampling (APGD) module acquires rich global features and reduces the feature loss of distorted and discontinuous objects. In addition, the proposed C2fs module uses Squeeze-and-Excitation (SE) blocks to model the interdependence of the channels and acquire richer gradient flow information about the features, providing a better understanding of the contextual information in fisheye images. Subsequently, an SE block is added after the Spatial Pyramid Pooling Fast (SPPF) module, improving the model's ability to capture features of distorted, discontinuous objects. Moreover, the UAV-360 dataset is created for object detection in fisheye images. Finally, experiments show that the proposed PGDS-YOLOv8s model improves mAP@0.5 by 19.8% and mAP@0.5:0.95 by 27.5% on the VOC-360 dataset compared to the original YOLOv8s model. In addition, the improved model achieves 89.0% mAP@0.5 and 60.5% mAP@0.5:0.95 on the UAV-360 dataset. Furthermore, on the MS-COCO 2017 dataset, the PGDS-YOLOv8s model improves AP by 1.4%, AP50 by 1.7%, and AP75 by 1.2% compared with the original YOLOv8s model.
Figures:
Figure 1: Detection process using the improved model.
Figure 2: Fisheye images. (a) Original binocular fisheye image; (b) equirectangular projection image converted from the binocular fisheye image.
Figure 3: The labelimg tool labels objects in an equirectangular projected image; the green box is the bounding box for the labeling.
Figure 4: Some examples of the VOC-360 dataset. (a,b) are synthesized fisheye images.
Figure 5: The network structure of PGDS-YOLOv8s.
Figure 6: Illustration of the convolutional layer and Ghost module generating the same number of feature maps separately. (a) Convolutional layer; (b) Ghost module.
Figure 7: The structure of the MPGD module.
Figure 8: The structure of the APGD module.
Figure 9: A Squeeze-and-Excitation block.
Figure 10: The structure of the C2fs module and the SE-Bottleneck module.
Figure 11: Comparison of test results of four models on the VOC-360 test set. (a) YOLOv8s model; (b) YOLOv8s + MPGD + APGD model; (c) YOLOv8s + C2fs + SPPFSE model; (d) PGDS-YOLOv8s model.
Figure 12: Comparison of the test results of the two models on the UAV-360 test set. (a) YOLOv8s model; (b) PGDS-YOLOv8s model.
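
The abstract names MPGD (Max Pooling and Ghost's Downsampling) without giving its layer layout, so the following is only one plausible reading of the general idea: downsample once by max pooling and once by a Ghost-style strided convolution, then concatenate the two branches. Every structural choice below is an assumption for illustration, not the paper's definition of MPGD.

```python
# Hedged sketch of a pooling-plus-Ghost-style downsampling module; layer layout,
# channel splits, and activations are assumptions, not the paper's MPGD design.
import torch
import torch.nn as nn

class PoolGhostDownsample(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        branch_ch = out_ch // 2
        self.pool_branch = nn.Sequential(
            nn.MaxPool2d(kernel_size=2, stride=2),                    # downsample by pooling
            nn.Conv2d(in_ch, branch_ch, 1, bias=False),
            nn.BatchNorm2d(branch_ch), nn.SiLU(),
        )
        self.ghost_branch = nn.Sequential(
            nn.Conv2d(in_ch, branch_ch // 2, 1, bias=False),          # small primary conv
            nn.BatchNorm2d(branch_ch // 2), nn.SiLU(),
            nn.Conv2d(branch_ch // 2, branch_ch // 2, 3, stride=2, padding=1,
                      groups=branch_ch // 2, bias=False),             # cheap depthwise, stride 2
            nn.BatchNorm2d(branch_ch // 2), nn.SiLU(),
            nn.Conv2d(branch_ch // 2, branch_ch, 1, bias=False),
            nn.BatchNorm2d(branch_ch), nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([self.pool_branch(x), self.ghost_branch(x)], dim=1)

x = torch.randn(1, 128, 160, 160)
print(PoolGhostDownsample(128, 256)(x).shape)   # torch.Size([1, 256, 80, 80])
```
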
13 pages, 2895 KiB  
Article
Improved Lightweight Multi-Target Recognition Model for Live Streaming Scenes
by Zongwei Li, Kai Qiao, Jianing Chen, Zhenyu Li and Yanhui Zhang
Appl. Sci. 2023, 13(18), 10170; https://doi.org/10.3390/app131810170 - 10 Sep 2023
Viewed by 1405
Abstract
Nowadays, the commercial potential of live e-commerce is being continuously explored, and machine vision algorithms are gradually attracting the attention of marketers and researchers. During live streaming, the visuals can be effectively captured by algorithms, thereby providing additional data support. Considering the diversity of live streaming devices, this paper proposes an extremely lightweight and high-precision model to meet different requirements in live streaming scenarios. Building upon YOLOv5s, we incorporate the MobileNetV3 module and the CA attention mechanism to optimize the model. Furthermore, we construct a multi-object dataset specific to live streaming scenarios, including anchor facial expressions and commodities. A series of experiments demonstrate that our model achieves a 0.4% improvement in accuracy compared to the original model while reducing the model weight to 10.52% of the original.
Figures:
Figure 1: Improved yolov5s-MobileNetV3-CA network architecture (* MobileNet_Block: [out_ch, hidden_ch, kernel_size, stride, use_se, use_hs]).
Figure 2: MobileNetV3 network structure.
Figure 3: The mAP0.5 histories of the yolov5s and the Yolov5s-MobileNetV3-CA model.
Figure 4: Comparison of accuracy density and mAP0.5 parameter values for the two models.
Figure 5: The losses of the Yolov5s-MobileNetV3-CA model.
Figure 6: Heat map visualization results of the learning effects of different attention mechanisms.
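
Assuming, as is common in the YOLOv5 literature, that the "CA attention mechanism" here denotes coordinate attention (Hou et al., 2021), the sketch below shows the general mechanism: features are pooled separately along height and width so that the attention weights retain positional information. The reduction factor and activation are illustrative choices, not the paper's settings.

```python
# Sketch of coordinate attention (CA), assuming CA = coordinate attention; illustrative only.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),
            nn.BatchNorm2d(mid),
            nn.Hardswish(),
        )
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        pooled_h = x.mean(dim=3, keepdim=True)                            # (B, C, H, 1): pool along width
        pooled_w = x.mean(dim=2, keepdim=True)                            # (B, C, 1, W): pool along height
        y = torch.cat([pooled_h, pooled_w.permute(0, 1, 3, 2)], dim=2)    # (B, C, H+W, 1)
        y = self.conv1(y)
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                             # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))         # (B, C, 1, W)
        return x * a_h * a_w                                              # position-aware reweighting

x = torch.randn(2, 64, 40, 40)
print(CoordinateAttention(64)(x).shape)   # torch.Size([2, 64, 40, 40])
```
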