Object Detection and Image Classification

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 March 2025 | Viewed by 7476

Special Issue Editors


Dr. Patrick Wong
Guest Editor
School of Computing & Communications, The Open University, Walton Hall, Kents Hill, Milton Keynes MK7 6AA, UK
Interests: image processing; object detection and tracking; computer vision; automatic umpiring; anomaly detection; deepfake detection

Prof. Dr. Yifan Zhao
Guest Editor
Faculty of Engineering and Applied Sciences, Cranfield University, Cranfield MK43 0AL, UK
Interests: machine learning; artificial intelligence; human factors; pattern recognition; digital twins; instrumentation, sensors and measurement science; systems engineering; through-life engineering services

Special Issue Information

Dear Colleagues,

Rapid advances in machine learning and artificial intelligence over the last decade have enabled various objects in images to be effectively identified and classified. This advancement makes the detection of objects possible in various application domains, such as detecting cancerous cells in microscopic images, classifying plants and insects in natural environments, identifying astronomical objects in space and distinguishing deepfake images from real ones. In some cases, these detections are more accurate than those of human experts. However, various challenges still need to be resolved before automatic object detection applications can be widely deployed. These challenges include improved detection accuracy and reliability, explainability and acceptability.

This Special Issue invites high-quality papers that present novel ideas in object detection and classification, the explanation of detection decisions and the improvement of acceptability in any application domain. Areas relevant to this Special Issue include, but are not limited to, the following:

  • Object detection and tracking;
  • Classification of images;
  • Deepfake detection;
  • Explainable AI on object detection;
  • Object localization in images;
  • Augmented reality;
  • Autonomous vehicles and robots;
  • Umpire Decision Review System;
  • Remote sensing;
  • Disease detection and diagnosis;
  • Biometrics.

Dr. Patrick Wong
Prof. Dr. Yifan Zhao
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • object detection
  • object tracking
  • image classification
  • deepfake detection
  • explainable AI
  • object localization
  • augmented reality
  • autonomous vehicles
  • autonomous robots
  • umpire decision review system
  • remote sensing
  • disease detection
  • biometrics

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)

Research

13 pages, 3531 KiB  
Article
Multi-Scale Feature Fusion and Context-Enhanced Spatial Sparse Convolution Single-Shot Detector for Unmanned Aerial Vehicle Image Object Detection
by Guimei Qi, Zhihong Yu and Jian Song
Appl. Sci. 2025, 15(2), 924; https://doi.org/10.3390/app15020924 - 18 Jan 2025
Viewed by 623
Abstract
Accurate and efficient object detection in UAV images is a challenging task due to the diversity of target scales and the massive number of small targets. This study investigates enhancing the detection head with sparse convolution, demonstrating its effectiveness in achieving an optimal balance between accuracy and efficiency. Nevertheless, the sparse convolution method encounters challenges related to the inadequate incorporation of global contextual information and exhibits network inflexibility attributable to its fixed mask ratios. To address the above issues, the MFFCESSC-SSD, a novel single-shot detector (SSD) with multi-scale feature fusion and context-enhanced spatial sparse convolution, is proposed in this paper. First, a global context-enhanced group normalization (CE-GN) layer is developed to address the issue of information loss resulting from the convolution process applied exclusively to the masked region. Subsequently, a dynamic masking strategy is designed to determine the optimal mask ratios, thereby ensuring compact foreground coverage that enhances both accuracy and efficiency. Experiments on two datasets (i.e., VisDrone and ARH2000; the latter dataset was created by the researchers) demonstrate that the MFFCESSC-SSD remarkably outperforms the SSD and numerous conventional object detection algorithms in terms of accuracy and efficiency. Full article
(This article belongs to the Special Issue Object Detection and Image Classification)
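The dynamic spatial masking and global-context idea summarized in the abstract above can be illustrated with a few lines of PyTorch: score each spatial position from feature statistics, keep only the top fraction as foreground for the convolution, and add back a mean-pooled context vector. This is a minimal sketch of the general concept only; the class name SparseContextHead, the scoring heuristic, the fixed mask_ratio default, and the additive fusion are illustrative assumptions, not the authors' CE-GN or MFFCESSC-SSD implementation.

```python
import torch
import torch.nn as nn


class SparseContextHead(nn.Module):
    """Toy head block: convolve only where a dynamic spatial mask is active,
    then re-inject a mean-pooled global context vector (hypothetical sketch)."""

    def __init__(self, channels: int, mask_ratio: float = 0.3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.mask_ratio = mask_ratio  # fraction of spatial positions kept as foreground

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        score = x.abs().mean(dim=1, keepdim=True)              # per-position importance, (b, 1, h, w)
        k = max(1, int(self.mask_ratio * h * w))
        kth = score.flatten(2).topk(k, dim=2).values[..., -1]   # k-th largest score per image
        mask = (score >= kth.view(b, 1, 1, 1)).float()          # dynamic foreground mask
        sparse_out = self.conv(x * mask)                        # convolve masked features only
        context = x.mean(dim=(2, 3), keepdim=True)              # global context vector
        return sparse_out + context                             # compensate for masked-out information


if __name__ == "__main__":
    feats = torch.randn(2, 64, 40, 40)          # hypothetical FPN feature map
    print(SparseContextHead(64)(feats).shape)   # torch.Size([2, 64, 40, 40])
```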
Figures

Figure 1. A visualization of objects in a sample image from VisDrone2019 (classical UAV dataset), and a comparison of objects in the UAV image and COCO datasets. The number of objects in each VisDrone2019 sample is uniformly distributed between 10 and 300, while the number of objects in each COCO sample is mostly less than 20. The percentage of small objects (with a ratio of 0.05 to the entire background) in VisDrone is up to 72.45%.
Figure 2. The MFFCESSC-SSD framework based on the SSD (highlighted in green). MFF aims to utilize information from different feature maps and suppress the impact of background noise by using AU blocks; CESSC replaces the detection head in each FPN layer by using a mask feature H_i and a global feature G_i. The mask ratio of H_i is a spatial sparse mask generated from the feature statistics for each layer.
Figure 3. Visualized thermal comparison of UAV images before and after MFF processing.
Figure 4. Visualization of detection results. Yellow ovals highlight small objects in the validation set that cannot be detected by the SSD. By adding MFF to the SSD, the model generated denser detection boxes on VisDrone2019 and successfully detected small objects in both datasets.
Figure 5. Comparison of detection results of different algorithms using the VisDrone2019 dataset. The green boxes are the detected targets, and the MFFCESSC-SSD has the lowest leakage rate.
17 pages, 17602 KiB  
Article
Enhancing Detection of Pedestrians in Low-Light Conditions by Accentuating Gaussian–Sobel Edge Features from Depth Maps
by Minyoung Jung and Jeongho Cho
Appl. Sci. 2024, 14(18), 8326; https://doi.org/10.3390/app14188326 - 15 Sep 2024
Cited by 1 | Viewed by 1335
Abstract
Owing to the low detection accuracy of camera-based object detection models, various fusion techniques with Light Detection and Ranging (LiDAR) have been attempted. This has resulted in improved detection of objects that are difficult to detect due to partial occlusion by obstacles or unclear silhouettes. However, the detection performance remains limited in low-light environments where small pedestrians are located far from the sensor or pedestrians have difficult-to-estimate shapes. This study proposes an object detection model that employs a Gaussian–Sobel filter. This filter combines Gaussian blurring, which suppresses the effects of noise, and a Sobel mask, which accentuates object features, to effectively utilize depth maps generated by LiDAR for object detection. The model performs independent pedestrian detection using the real-time object detection model You Only Look Once v4, based on RGB images obtained using a camera and depth maps preprocessed by the Gaussian–Sobel filter, and estimates the optimal pedestrian location using non-maximum suppression. This enables accurate pedestrian detection while maintaining a high detection accuracy even in low-light or external-noise environments, where object features and contours are not well defined. The test evaluation results demonstrated that the proposed method achieved at least 1–7% higher average precision than the state-of-the-art models under various environments. Full article
(This article belongs to the Special Issue Object Detection and Image Classification)
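The Gaussian–Sobel preprocessing described above amounts to Gaussian blurring followed by a Sobel gradient on the depth map. A minimal OpenCV sketch of that combination is shown below; the function name, kernel size, sigma, and min–max normalization are illustrative assumptions rather than the authors' exact filter settings.

```python
import cv2
import numpy as np


def gaussian_sobel(depth_map: np.ndarray, ksize: int = 3, sigma: float = 1.0) -> np.ndarray:
    """Suppress noise with Gaussian blurring, then accentuate edges with a Sobel mask."""
    blurred = cv2.GaussianBlur(depth_map, (ksize, ksize), sigma)
    gx = cv2.Sobel(blurred, cv2.CV_64F, 1, 0, ksize=ksize)     # horizontal gradient
    gy = cv2.Sobel(blurred, cv2.CV_64F, 0, 1, ksize=ksize)     # vertical gradient
    magnitude = cv2.magnitude(gx, gy)                          # combined edge strength
    return cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)


# Placeholder depth map; in practice it comes from projecting the LiDAR point cloud
# onto the camera image plane before preprocessing.
depth = np.random.randint(0, 256, (375, 1242), dtype=np.uint8)
edge_map = gaussian_sobel(depth)
```

The preprocessed edge map and the RGB image would then each be passed to the detector, with non-maximum suppression merging the two sets of candidate boxes, as the abstract describes.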
Figures

Figure 1. Block diagram of the proposed multi-sensor-based detection model.
Figure 2. Process for generating a depth map for image registration: (a) RGB image; (b) PCD projected on RGB image; (c) depth map.
Figure 3. Preprocessing of depth maps using the Gaussian–Sobel filter: (a) depth map; (b) depth map after Gaussian filtering; (c) depth map after Gaussian–Sobel filtering; (d) depth map after Canny edge filtering.
Figure 4. Flowchart for non-maximum suppression (NMS).
Figure 5. Comparison of pedestrian detection performance of the proposed model and similar models at 100% brightness: (a) depth map; (b) RGB + depth map; (c) Maragos and Pessoa [12]; (d) Deng [13]; (e) Ali and Clausi [14]; (f) proposed model.
Figure 6. Comparison of pedestrian detection performance of the proposed model and similar models at 40% brightness level: (a) depth map; (b) RGB + depth map; (c) Maragos and Pessoa [12]; (d) Deng [13]; (e) Ali and Clausi [14]; (f) proposed model.
Figure 7. Comparison of the pedestrian detection performance of the proposed model and similar models at 40% brightness and 0.5% noise level: (a) depth map; (b) RGB + depth map; (c) Maragos and Pessoa [12]; (d) Deng [13]; (e) Ali and Clausi [14]; (f) proposed model.
26 pages, 14527 KiB  
Article
SimMolCC: A Similarity of Automatically Detected Bio-Molecule Clusters between Fluorescent Cells
by Shun Hattori, Takafumi Miki, Akisada Sanjo, Daiki Kobayashi and Madoka Takahara
Appl. Sci. 2024, 14(17), 7958; https://doi.org/10.3390/app14177958 - 6 Sep 2024
Viewed by 829
Abstract
In the field of studies on the “Neural Synapses” in the nervous system, experts manually (or pseudo-automatically) detect the bio-molecule clusters (e.g., of proteins) in many TIRF (Total Internal Reflection Fluorescence) images of a fluorescent cell and analyze their static/dynamic behaviors. This paper proposes a novel method for the automatic detection of the bio-molecule clusters in a TIRF image of a fluorescent cell and conducts several experiments on its performance, e.g., mAP @ IoU (mean Average Precision @ Intersection over Union) and F1-score @ IoU, as an objective/quantitative means of evaluation. As a result, the best of the proposed methods achieved 0.695 as its mAP @ IoU = 0.5 and 0.250 as its F1-score @ IoU = 0.5 and would have to be improved, especially with respect to its recall @ IoU. However, the proposed method can automatically detect bio-molecule clusters that are not necessarily circular and not always uniform in size, and it can output various histograms and heatmaps for novel deeper analyses of the automatically detected bio-molecule clusters, whereas the particles detected by the Mosaic Particle Tracker 2D/3D, which is one of the most conventional methods for experts, can only be circular and uniform in size. In addition, this paper defines and validates a novel similarity of automatically detected bio-molecule clusters between fluorescent cells, i.e., SimMolCC, and also shows some examples of SimMolCC-based applications. Full article
(This article belongs to the Special Issue Object Detection and Image Classification)
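The evaluation above relies on mAP @ IoU and F1-score @ IoU over the detected clusters. As a hedged sketch of what an F1-score at a fixed IoU threshold involves, the snippet below greedily matches predicted and ground-truth bounding boxes; the greedy matching strategy and helper names are illustrative assumptions, not the authors' evaluation code.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def f1_at_iou(predictions, ground_truths, threshold=0.5):
    """F1-score with greedy one-to-one matching at a given IoU threshold."""
    matched, tp = set(), 0
    for pred in predictions:
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(ground_truths):
            if j in matched:
                continue
            overlap = iou(pred, gt)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_iou >= threshold:        # count a true positive only above the threshold
            matched.add(best_j)
            tp += 1
    precision = tp / max(len(predictions), 1)
    recall = tp / max(len(ground_truths), 1)
    return 2 * precision * recall / max(precision + recall, 1e-9)


print(f1_at_iou([(0, 0, 10, 10), (20, 20, 30, 30)], [(1, 1, 10, 10)], threshold=0.5))
```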
Figures

Figure 1. Direct imaging of bio-molecule clusters in a fluorescent cell, e.g., in the presynaptic terminal of a neuron cell, using TIRF (Total Internal Reflection Fluorescence) microscopy and detecting them manually by human experts or automatically by the proposed method.
Figure 2. An overview of the proposed method for an input TIRF image of fluorescent cell #6 to automatically detect its bio-molecule clusters by Steps 1 to 4 and to output histograms, heatmaps, and its global feature vector for SimMolCC by Step 5.
Figure 3. Comparison of a bio-molecule cluster's features between the proposed method and Mosaic Particle Tracker 2D/3D [13] for an input TIRF image of fluorescent cell #6.
Figure 4. Step 1 has two sub-steps, Step 1(a) and Step 1(b), to segment the target cell in an input TIRF image (fluorescent cell #6) as precisely as possible.
Figure 5. A flowchart of Step 1(a) with the histogram of fluorescence intensity of each pixel ∈ 512 × 512 [pixels] in an input TIRF image of fluorescent cell #6.
Figure 6. Step 2 has three sub-steps, Step 2(a), Step 2(b), and Step 2(c), to segment the regions of bio-molecule clusters in an input TIRF image of fluorescent cell #6 as precisely as possible.
Figure 7. Step 3 has four sub-steps, Step 3(a), Step 3(b), Step 3(c), and Step 3(d), to divide the regions of bio-molecule clusters in an input TIRF image of fluorescent cell #6 as precisely as possible, and finally each bio-molecule cluster is independent and assigned a sequential ID.
Figure 8. Step 4 has four sub-steps, Step 4(a), Step 4(b), Step 4(c), and Step 4(d), to filter bio-molecule clusters by four kinds of heuristic rules.
Figure 9. The four kinds of histograms of the size/area, fluorescence intensity, ratio of area to Bounding Box, and ratio of width to height of Bounding Box of each of the 237 automatically detected bio-molecule clusters in an input TIRF image of fluorescent cell #6 at Step 4(d), with the kernel size kernel_size = 3 for OpenCV's Laplacian operator and flagged as "3rd".
Figure 10. The six kinds of heatmaps between four kinds of features, such as the area, fluorescence intensity, ratio of area to Bounding Box, and ratio of width to height of Bounding Box, of each of the 237 automatically detected bio-molecule clusters in an input TIRF image of fluorescent cell #6.
Figure 11. The averaged image (i.e., an input image for the proposed method), the averaged image with its particles detected by the Mosaic Particle Tracker 2D/3D [13], and the averaged image with its particles filtered manually (i.e., a ground truth for the proposed method) of an input movie (fluorescent cell #6 or #14).
Figure 12. The mAP @ IoU and F1-score @ IoU of the proposed methods of Step 4(c) and Step 4(d) flagged as "1st" or "3rd", obtained by manually optimizing the kernel size kernel_size for OpenCV's Laplacian operator [69], cv2.Laplacian(), at Step 2(b).
Figure 13. The ground truth, Step 4(c), and Step 4(d) flagged as "1st" or "3rd" of an input image (fluorescent cell #6 or #14) for automatic detection of bio-molecule clusters with kernel_size = 1 for OpenCV's Laplacian operator [69], cv2.Laplacian(), at Step 2(b). (a) Cell #6; (b) Cell #14.
Figure 14. The mAP and F1-score @ IoU of Step 4(d) flagged as "3rd" and n = 5 depend on the kernel size kernel_size for OpenCV's Laplacian operator [69], cv2.Laplacian(), at Step 2(b).
Figure 15. The mAP and F1-score @ IoU of Step 4(d) flagged as "3rd" and kernel_size = 5 depend on the number of sampled histograms, n, for Step 4(d).
Figure 16. The mAP @ IoU and F1-score @ IoU of the proposed methods from Step 3(d) to Step 4(c), obtained by manually optimizing the kernel size kernel_size for OpenCV's Laplacian operator [69], cv2.Laplacian(), at Step 2(b).
Figure 17. The Pearson Correlation Coefficient between two human subjects' 11-grade similarity and the proposed similarity, SimMolCC, depends on the kernel size kernel_size for OpenCV's Laplacian operator [69], cv2.Laplacian(), at Step 2(b). (a) A comparison between histograms when Step 3(d) is constantly adopted. (b) A comparison between steps when ratio_area_BB is constantly adopted.
Figure 18. The scatter plots of two human subjects' 11-grade similarity and the proposed SimMolCC, or the converted SimMolCC' obtained from the proposed SimMolCC by simple linear regression.
Figure 19. An example result of similarity-based retrieval (ranking) by inputting a TIRF image (fluorescent cell #14) as a query and calculating its SimMolCC' with the other 14 TIRF images.
15 pages, 3110 KiB  
Article
Knowledge Embedding Relation Network for Small Data Defect Detection
by Jinjia Ruan, Jin He, Yao Tong, Yuchuan Wang, Yinghao Fang and Liang Qu
Appl. Sci. 2024, 14(17), 7922; https://doi.org/10.3390/app14177922 - 5 Sep 2024
Viewed by 744
Abstract
In industrial vision, the lack of defect samples is one of the key constraints on depth vision quality inspection. This paper mainly studies defect detection under a small training set, trying to reduce the dependence of the model on defect samples by using normal samples. Therefore, we propose a Knowledge-Embedding Relational Network (KRN): firstly, unsupervised clustering and convolution features are used to model the knowledge of normal samples; at the same time, based on CNN feature extraction assisted by image segmentation, the conv feature is obtained from the backbone network; then, we build the relationship between knowledge and prediction samples through covariance, embed the knowledge, further mine the correlation using a Gram operation, normalize the power of the high-order features obtained by covariance, and finally send them to the prediction network. Our KRN has three attractive characteristics: (I) Knowledge Modeling uses the unsupervised clustering algorithm to statistically model the standard samples so as to reduce the dependence of the model on defect data. (II) Covariance-based Knowledge Embedding and the Gram Operation capture the second-order statistics of knowledge features and predicted image features to deeply mine the robust correlation. (III) Power Normalizing suppresses the burstiness of covariance module learning and the complexity of the feature space. KRN outperformed several advanced baselines with small training sets on the DAGM 2007, KSDD, and Steel datasets. Full article
(This article belongs to the Special Issue Object Detection and Image Classification)
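Characteristics (II) and (III) above revolve around second-order (covariance/Gram) statistics of flattened conv features followed by power normalization. The NumPy sketch below shows one conventional way to compute such an embedding; the centering, signed square root, and L2 normalization are common choices and assumptions here, not necessarily the exact KRN operations.

```python
import numpy as np


def second_order_embedding(features: np.ndarray, eps: float = 1e-9) -> np.ndarray:
    """Covariance/Gram statistics of a (C, H, W) feature map with power normalization."""
    c, h, w = features.shape
    x = features.reshape(c, h * w)                  # flatten spatial dimensions into vectors
    x = x - x.mean(axis=1, keepdims=True)           # center -> covariance rather than raw Gram
    cov = (x @ x.T) / (h * w - 1)                   # (C, C) second-order statistics
    pn = np.sign(cov) * np.sqrt(np.abs(cov))        # power normalization suppresses burstiness
    return pn / (np.linalg.norm(pn) + eps)          # L2-normalize the embedding


feat = np.random.randn(256, 14, 14).astype(np.float32)   # placeholder backbone feature map
print(second_order_embedding(feat).shape)                 # (256, 256)
```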
Figures

Figure 1. The architecture of general Defect Detection models. (1) Encoding Network; (2) Knowledge Embedding, consisting of the knowledge model, correlation fusion module, and Gram Operation; (3) Predictive Network.
Figure 2. Knowledge mining based on the embedded relation module. The input is the conv characteristics of the predicted samples and the prior knowledge processed into tensors. The correlation between the two is captured with the help of the covariance operation and then fused in the form of attention.
Figure 3. Gram Operation module. We flatten conv features into feature vectors, capture second-order features by the covariance operation, and then send them to PN. Finally, self-replication is carried out for the subsequent diversity and fusion promotion operation.
Figure 4. Examples of images, defects, and detections with segmentation output from the DAGM (top), KolektorSDD (middle), and Steel (bottom) datasets.
Figure 5. Results for smaller training set sizes on DAGM, KSDD, and Steel. The three figures above show the change curve of mAP with the number of positive samples, and the three figures below show the corresponding FP + FN.
17 pages, 3554 KiB  
Article
Robot Operating Systems–You Only Look Once Version 5–Fleet Efficient Multi-Scale Attention: An Improved You Only Look Once Version 5-Lite Object Detection Algorithm Based on Efficient Multi-Scale Attention and Bounding Box Regression Combined with Robot Operating Systems
by Haiyan Wang, Zhan Shi, Guiyuan Gao, Chuang Li, Jian Zhao and Zhiwei Xu
Appl. Sci. 2024, 14(17), 7591; https://doi.org/10.3390/app14177591 - 28 Aug 2024
Viewed by 1101
Abstract
This paper primarily investigates enhanced object detection techniques for indoor service mobile robots. Robot operating systems (ROS) supply rich sensor data, which boost the models' ability to generalize. However, the model's performance might be hindered by constraints in the processing power, memory capacity, and communication capabilities of robotic devices. To address these issues, this paper proposes an improved you only look once version 5 (YOLOv5)-Lite object detection algorithm based on efficient multi-scale attention and bounding box regression combined with ROS. The algorithm incorporates efficient multi-scale attention (EMA) into the traditional YOLOv5-Lite model and replaces the C3 module with a lightweight C3Ghost module to reduce computation and model size during the convolution process. To enhance bounding box localization accuracy, modified precision-defined intersection over union (MPDIoU) is employed to optimize the model, resulting in the ROS–YOLOv5–FleetEMA model. The results indicated that relative to the conventional YOLOv5-Lite model, the ROS–YOLOv5–FleetEMA model enhanced the mean average precision (mAP) by 2.7% post-training, reduced giga floating-point operations per second (GFLOPS) by 13.2%, and decreased the number of parameters by 15.1%. In light of these experimental findings, the model was incorporated into ROS, leading to the development of a ROS-based object detection platform that offers rapid and precise object detection capabilities. Full article
(This article belongs to the Special Issue Object Detection and Image Classification)
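For the bounding-box regression term, the abstract relies on MPDIoU. The sketch below follows the commonly published MPDIoU formulation (plain IoU penalized by the normalized squared distances between corresponding corners); whether this matches the authors' modified variant is an assumption, and the loss is shown only as an illustration.

```python
import torch


def mpdiou_loss(pred: torch.Tensor, target: torch.Tensor, img_w: int, img_h: int, eps: float = 1e-9) -> torch.Tensor:
    """MPDIoU-style loss for boxes given as (x1, y1, x2, y2) tensors of shape (N, 4)."""
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    norm = img_w ** 2 + img_h ** 2                                             # image diagonal squared
    d1 = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2   # top-left corner distance
    d2 = (pred[:, 2] - target[:, 2]) ** 2 + (pred[:, 3] - target[:, 3]) ** 2   # bottom-right corner distance
    mpdiou = iou - d1 / norm - d2 / norm
    return (1.0 - mpdiou).mean()


pred = torch.tensor([[10.0, 10.0, 50.0, 60.0]])
gt = torch.tensor([[12.0, 8.0, 48.0, 62.0]])
print(mpdiou_loss(pred, gt, img_w=640, img_h=640))
```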
Figures

Figure 1. YOLOv5-Lite network structure.
Figure 2. Basic units of ShuffleNet V2. (a) Deep stacking module Stage 1; (b) deep stacking module Stage 2.
Figure 3. Efficient multi-scale attention.
Figure 4. Traditional convolution and GhostNet convolution processes.
Figure 5. Ghost module.
Figure 6. Hardware structure of Ackerman differential car.
Figure 7. Workflow of object detection function.
Figure 8. Ablation experimental results.
Figure 9. ROS-based object detection platform.
17 pages, 9871 KiB  
Article
Vision AI System Development for Improved Productivity in Challenging Industrial Environments: A Sustainable and Efficient Approach
by Changmo Yang, JinSeok Kim, DongWeon Kang and Doo-Seop Eom
Appl. Sci. 2024, 14(7), 2750; https://doi.org/10.3390/app14072750 - 25 Mar 2024
Cited by 3 | Viewed by 1514
Abstract
This study presents a development plan for a vision AI system to enhance productivity in industrial environments, where environmental control is challenging, by using AI technology. An image pre-processing algorithm was developed using a mobile robot that can operate in complex environments alongside workers to obtain high-quality learning and inspection images. Additionally, the proposed architecture for sustainable AI system development included cropping the inspection part images to minimize the technology development time and investment costs and to enable the reuse of images. The algorithm was retrained using mixed learning data to maintain and improve its performance in industrial fields. This AI system development architecture effectively addresses the challenges faced in applying AI technology at industrial sites and was demonstrated through experimentation and application. Full article
(This article belongs to the Special Issue Object Detection and Image Classification)
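Figure 8 below contrasts training from scratch with transfer learning on a ResNet-101 using small OK/NG image sets. A minimal torchvision sketch of such a transfer-learning setup follows; the frozen backbone, two-class head, optimizer, and dummy batch are illustrative assumptions rather than the authors' training configuration (a recent torchvision with the weights API is assumed).

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pretrained ResNet-101, freeze the backbone,
# and retrain only a new OK/NG classification head (hypothetical setup).
model = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
for param in model.parameters():
    param.requires_grad = False                       # freeze pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 2)         # new head: OK vs. NG

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a batch of cropped inspection-part images.
images = torch.randn(8, 3, 224, 224)                  # stand-in for cropped part images
labels = torch.randint(0, 2, (8,))                    # 0 = OK, 1 = NG
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```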
Figures

Figure 1. Flowchart of the proposed AI system for industrial site inspection.
Figure 2. Integrated vision AI inspection system flowchart for quality testing in industrial environments.
Figure 3. Improvement in repeat positioning accuracy with the centering technique using the Fiducial Mark.
Figure 4. (a) Image registration and cropping using the proposed algorithm; (b) quality scoring of cropped images; (c) changes in AI accuracy according to cropped image quality.
Figure 5. Evaluation of SURF algorithm performance with mixed image search area and image search exclusion area.
Figure 6. Image brightness correction using the histogram matching algorithm.
Figure 7. Inspection part unit cropping images for reuse of car assembly part types and learning images.
Figure 8. Comparison of results between training from scratch and transfer learning using OK (20 images) and NG (20 images) training data with the resnet101 model.
Figure 9. For each set of 100 images captured by the robot at each position, the T-Matrix can be used to extract the range of augmentations.
Figure 10. Using the range of image deviation caused by robot position errors as image augmentation parameters.
Figure 11. Comparison graph of learning accuracy for each representative network according to an image augmentation error range of ±5%, T-Matrix (auto), and the engineer's experience level.
Figure 12. Create categories of similar parts, repeatedly learn with mixed categories, evaluate the accuracy of each part, and use the algorithm of the part with the highest performance.
Figure 13. Performance improvement in algorithms through finding the optimal AI algorithm by the proposed similar part mixing.
Figure 14. Algorithm development is shortened through transfer learning with same/similar part algorithms, and the AI algorithm is improved through continuous accumulation of automobile assembly part image data.