Current Advances in 3D Scene Classification and Object Recognition

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 April 2025 | Viewed by 3273

Special Issue Editors


Guest Editor
University Institute for Computer Research, University of Alicante, P.O. Box 99, 03080 Alicante, Spain
Interests: machine learning; computer vision; pattern recognition; gesture recognition; object recognition; neural networks; artificial intelligence

Guest Editor
University Institute for Computer Research, University of Alicante, P.O. Box 99, 03080 Alicante, Spain
Interests: computer science and artificial intelligence

Guest Editor
University Institute for Computer Research, University of Alicante, P.O. Box 99, 03080 Alicante, Spain
Interests: computer vision; deep learning; 3D object recognition; mapping; navigation; robotics

Special Issue Information

Dear Colleagues,

Deep learning algorithms have significantly transformed the landscape of recognition technologies. These algorithms now excel in contemporary settings, often outperforming humans in tasks such as image classification. Their reach is not limited to classifying flat images: they also perform impressively on more complex operations such as pixel-wise classification and object detection.

However, the scenario shifts when dealing with three-dimensional data. Tasks involving the recognition of objects in range images, depth maps, point clouds, and stereo images introduce unique challenges. Despite substantial progress and dedicated efforts in the field, these challenges are yet to be fully mastered. The complexities of three-dimensional data processing demand innovative approaches and refined algorithms to achieve results comparable to those obtained with two-dimensional data.

In response to these ongoing developments, we are proud to announce a Special Issue entitled "Current Advances in 3D Scene Classification and Object Recognition". This edition is dedicated to exploring groundbreaking and rigorous research that integrates various learning paradigms with three-dimensional data across multiple contexts. This Special Issue aims to highlight novel methodologies and significant advancements in several areas, including, but not limited to, enhanced algorithms for depth perception, improved techniques for processing point clouds, and innovative applications in stereo imaging. This issue will serve as a platform for scholars and practitioners to disseminate their findings and contribute to the evolving field of 3D data analysis.

Specifically, this Special Issue will cover the following topics, among others:

  • Deep learning and machine learning with point clouds;
  • Deep learning and machine learning with depth maps;
  • Deep learning and machine learning with stereo vision;
  • Three-dimensional keypoint detectors and descriptors;
  • Novel representations of 3D data;
  • Three-dimensional data summarization and compression;
  • Object detection taking any 3D data as input;
  • Point-wise classification taking any 3D data as input.

Dr. Francisco Gomez-Donoso
Dr. Gonzalez-Serrano German
Dr. Félix Escalona Moncholí
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • 3D scene understanding
  • 3D object detection
  • 3D data
  • point clouds
  • depth maps

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)


Research

23 pages, 3232 KiB  
Article
Comparative Analysis of LiDAR and Photogrammetry for 3D Crime Scene Reconstruction
by Fatemah M. Sheshtar, Wajd M. Alhatlani, Michael Moulden and Jong Hyuk Kim
Appl. Sci. 2025, 15(3), 1085; https://doi.org/10.3390/app15031085 - 22 Jan 2025
Viewed by 1105
Abstract
Accurate and fast 3D mapping of crime scenes is crucial in law enforcement, and first responders often need to document scenes in detail under challenging conditions and within a limited time. Traditional methods often fail to capture the details required to understand these scenes comprehensively. This study investigates the effectiveness of recent mobile phone-based mapping technologies equipped with a LiDAR (Light Detection and Ranging) sensor. The performance of LiDAR and pure photogrammetry is evaluated under different illumination (day and night) and scanning conditions (slow and fast scanning) in a mock-up crime scene. The results reveal that mapping using an iPhone LiDAR in daylight conditions with 5 min of fast scanning shows the best results, yielding 0.1084 m of error. Also, the cloud-to-cloud distance showed that 90% of the point clouds exhibited under 0.1224 m of error, demonstrating the utility of these tools for rapid and portable scanning in crime scenes.
(This article belongs to the Special Issue Current Advances in 3D Scene Classification and Object Recognition)
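For readers interested in reproducing a metric of this kind, the cloud-to-cloud error reported in the abstract can be approximated with a nearest-neighbour search from the evaluated scan to a reference scan. The sketch below is a minimal illustration under that assumption; the function and the array names `scan_xyz` and `reference_xyz` are ours for illustration and are not the authors' pipeline.

```python
# Illustrative sketch (not the authors' code): cloud-to-cloud distance between
# an evaluated scan and a reference scan, using a nearest-neighbour search.
import numpy as np
from scipy.spatial import cKDTree

def cloud_to_cloud_errors(scan_xyz: np.ndarray, reference_xyz: np.ndarray) -> np.ndarray:
    """For every point of the evaluated scan, return the distance (in metres)
    to its nearest neighbour in the reference cloud."""
    tree = cKDTree(reference_xyz)          # spatial index over the reference cloud
    distances, _ = tree.query(scan_xyz)    # 1-NN distance for each evaluated point
    return distances

# Hypothetical usage with two (N, 3) arrays of XYZ coordinates:
# errors = cloud_to_cloud_errors(scan_xyz, reference_xyz)
# print("mean absolute error:", errors.mean())
# print("90th percentile    :", np.quantile(errors, 0.90))
```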
Figure captions:
  • Figure 1: A mock-up crime scene, showing a mannequin, markers, blood spatters, and a LIDAR scanner for ground truth comparison. The scene was prepared by crime scene professionals and has been utilized for training law enforcement officers.
  • Figure 2: A laser ruler and tape measure are used to measure between markers.
  • Figure 3: Measurements of five pairs of markers using CloudCompare.
  • Figure 4: Three-dimensional map results of Model 1 using LiDAR with four different scanning scenarios. Although the nighttime scans are visually poor, their maps are more accurate due to the minimal sunlight interference.
  • Figure 5: Three-dimensional map results of Model 2 using photogrammetry with four scanning scenarios, showing the best results in daytime.
  • Figure 6: Comparison of MAE for Model 1 (LiDAR) and Model 2 (photogrammetry) across four scanning scenarios (the lower, the better), showing the best LiDAR result at nighttime with fast scanning and the best photogrammetry result at daytime with slow scanning.
  • Figure 7: Cloud-to-cloud distance heatmap (in meters) (left) and related histogram showing that most points (90%) fall under 0.12 m of error (right). The regions of high error come from the sensing boundary or blind spots during scanning.
  • Figure 8: A 3D reconstructed footprint as a piece of evidence at the scene.
  • Figure 9: Failed image alignments in a marker on the floor and an AprilTag on the wall (with arrows).

16 pages, 2646 KiB  
Article
Performance Comparison of Vertex Block Descent and Position Based Dynamics Algorithms Using Cloth Simulation in Unity
by Jun Ma, Nak-Jun Sung, Min-Hyung Choi and Min Hong
Appl. Sci. 2024, 14(23), 11072; https://doi.org/10.3390/app142311072 - 28 Nov 2024
Viewed by 837
Abstract
This paper presents a comparative study of the Vertex Block Descent (VBD) and Position-Based Dynamics (PBD) algorithms, focusing on their performance in physical simulation tasks. Unity, a versatile physics engine, served as the simulation platform for the experiments. Among various types of physical simulations of deformable objects, fluids, and cloth dynamics, cloth simulations were chosen for implementation with both algorithms. The experimental setup ensured identical parameters, including time steps and movement behavior, for both algorithms across scenarios involving hanging, object-to-object collisions, and self-collisions. The results indicate that while the performance difference in frames per second (fps) between the two algorithms is negligible for simulations with a small number of nodes, the VBD algorithm consistently outperforms the PBD algorithm as the node count increases. Furthermore, this study provides practical guidelines for maintaining real-time performance, detailing the maximum node count each algorithm can support, while sustaining a minimum threshold of 30 fps, which is necessary for real-time applications. The comparison was conducted using CPU-based computation to establish a baseline for future studies in GPU-accelerated environments, where parallel processing is expected to further highlight the performance advantages of VBD. Future work will extend this research by evaluating additional physical simulation models, including the Mass-Spring System and Extended Position-Based Dynamics (XPBD), and developing optimizations to enhance the efficiency and scalability of these algorithms.
(This article belongs to the Special Issue Current Advances in 3D Scene Classification and Object Recognition)
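As background for the comparison above, the core operation of Position-Based Dynamics on a cloth edge is a distance-constraint projection that moves particle positions directly. The following minimal sketch illustrates that generic projection step; the function and parameter names are ours, and this is not the Unity implementation evaluated in the paper.

```python
# Minimal, generic PBD distance-constraint projection (illustrative only).
import numpy as np

def project_distance_constraint(p1, p2, inv_m1, inv_m2, rest_length, stiffness=1.0):
    """Move two particle positions so the edge between them returns towards its
    rest length, weighted by inverse masses (PBD-style position projection)."""
    delta = p2 - p1
    current = np.linalg.norm(delta)
    if current < 1e-9 or inv_m1 + inv_m2 == 0.0:
        return p1, p2                      # degenerate edge or two pinned particles
    direction = delta / current
    correction = stiffness * (current - rest_length) / (inv_m1 + inv_m2)
    p1 = p1 + inv_m1 * correction * direction   # pull p1 towards p2 if stretched
    p2 = p2 - inv_m2 * correction * direction   # pull p2 towards p1 if stretched
    return p1, p2

# Hypothetical usage for one over-stretched cloth edge with unit masses:
# p1, p2 = np.array([0.0, 0.0, 0.0]), np.array([0.0, -1.2, 0.0])
# p1, p2 = project_distance_constraint(p1, p2, 1.0, 1.0, rest_length=1.0)
```

In a full PBD solver this projection is applied iteratively over all cloth edges within each time step, which is the per-node work whose cost grows with the node counts compared in the article.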
Figure captions:
  • Figure 1: Result of the hanging cloth simulation using PBD (top) and VBD (bottom).
  • Figure 2: Graph of average fps for each node in the hanging cloth simulation.
  • Figure 3: Result of the object-to-object collision cloth simulation using PBD (top) and VBD (bottom).
  • Figure 4: Graph of average fps for each node in the object-to-object collision simulation.
  • Figure 5: Result of the self-collision cloth simulation using PBD (top) and VBD (bottom).
  • Figure 6: Graph of average fps for each node in the self-collision simulation.
  • Figure 7: The 64 × 64 node, 0.02 s time step cloth simulation using PBD (left) and VBD (right).
  • Figure 8: The 16 × 16 node, 0.1 s time step cloth simulation using PBD (left) and VBD (right).

23 pages, 10682 KiB  
Article
VFLD: Voxelized Fractal Local Descriptor
by Francisco Gomez-Donoso, Felix Escalona, Florian Dargère and Miguel Cazorla
Appl. Sci. 2024, 14(20), 9414; https://doi.org/10.3390/app14209414 - 15 Oct 2024
Viewed by 781
Abstract
A variety of methods for 3D object recognition and registration based on a deep learning pipeline have recently emerged. Nonetheless, these methods require large amounts of data that are not easy to obtain, sometimes rendering them virtually useless in real-life scenarios due to a lack of generalization capabilities. To counter this, we propose a novel local descriptor that takes advantage of the fractal dimension. For each 3D point, we create a descriptor by computing the fractal dimension of the neighbors at different radii. Our method has many benefits, such as being agnostic to the sensor of choice and noise, up to a level, and having few parameters to tinker with. Furthermore, it requires no training and does not rely on semantic information. We test our descriptor using well-known datasets and it largely outperforms Fast Point Feature Histogram, which is the state-of-the-art descriptor for 3D data. We also apply our descriptor to a registration pipeline and achieve accurate three-dimensional representations of the scenes, which are captured with a commercial sensor.
(This article belongs to the Special Issue Current Advances in 3D Scene Classification and Object Recognition)
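The descriptor rests on the box-counting estimate of the fractal dimension: a point set is voxelised at progressively finer grids, and the slope of the log–log curve of occupied voxels versus inverse voxel size gives the dimension. The sketch below is a minimal, generic illustration of that estimate; the function and the parameter `n_iters` are our assumptions, not the released VFLD code.

```python
# Illustrative box-counting estimate of the fractal dimension of a 3D point set.
import numpy as np

def fractal_dimension(points: np.ndarray, n_iters: int = 7) -> float:
    """Estimate the box-counting dimension of an (N, 3) point set by counting
    occupied voxels at progressively finer grid resolutions."""
    mins = points.min(axis=0)
    extent = float((points.max(axis=0) - mins).max())  # bounding-box edge length
    log_inv_size, log_counts = [], []
    for k in range(1, n_iters + 1):
        n_divisions = 2 ** k                            # grid gets finer each iteration
        voxel_size = extent / n_divisions
        voxel_ids = np.floor((points - mins) / voxel_size).astype(int)
        occupied = len(np.unique(voxel_ids, axis=0))    # number of occupied boxes
        log_inv_size.append(np.log(1.0 / voxel_size))
        log_counts.append(np.log(occupied))
    # The slope of the fitted line in the log-log plot is the fractal dimension.
    slope, _ = np.polyfit(log_inv_size, log_counts, 1)
    return float(slope)

# Hypothetical usage on a random neighbourhood of 3D points:
# rng = np.random.default_rng(0)
# print(fractal_dimension(rng.random((2000, 3))))
```

In the paper's setting, an estimate of this kind would be computed on the neighbours of each 3D point at several search radii and the resulting values stacked into the local descriptor.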
Figure captions:
  • Figure 1: Visualization of the occupied boxes (blue) of a point cloud after applying the voxel grid. In black, the main bounding box of the object that marks the size for the divisions.
  • Figure 2: Effects of the nIters parameter in the box-counting process, from the original point cloud (leftmost) to the generated grid with 100 iterations (rightmost).
  • Figure 3: Plot of the computed fractal dimension for nIters = {3, 5, 7, 15} for a random set of points. This is a log–log plot in which the X-axis is the inverse of the voxel size in the box-counting method and the Y-axis is the number of occupied boxes. Note the difference in the slope (FD, fractal dimension) of the fitted line when nIters is set too high.
  • Figure 4: Diagram of the VFLD generation process.
  • Figure 5: Visualization of the steps that comprise the computation of the descriptor. (a) The surrounding points at different radii are obtained; two radii are used in this example for visualization purposes. (b) Box counting is used to obtain the leaf size and the occupied boxes of each subset; four iterations are visualized. (c) The log–log curve is generated for the data obtained and a line is fitted; its slope is the fractal dimension.
  • Figure 6: Random samples of the ModelNet10 dataset.
  • Figure 7: Random samples of the Simple Figures dataset.
  • Figure 8: Examples of ScanNet RGB-D scenes, viewed from above.
  • Figure 9: Random examples of point clouds included in the ViDRILO dataset.
  • Figure 10: Accuracy (top) and precision–recall (bottom) curves for different starting values of the search radii for the ModelNet (leftmost) and Simple Figures (rightmost) datasets.
  • Figure 11: Accuracy (top) and precision–recall (bottom) curves for different increment values of the search radii for the ModelNet (leftmost) and Simple Figures (rightmost) datasets.
  • Figure 12: Accuracy (top) and precision–recall (bottom) curves for different numbers of box-counting iterations for the ModelNet (leftmost) and Simple Figures (rightmost) datasets.
  • Figure 13: Accuracy (top) and precision–recall (bottom) curves for different numbers of search radii for the ModelNet (leftmost) and Simple Figures (rightmost) datasets.
  • Figure 14: Accuracy (top) and precision–recall (bottom) curves for different densities of the sampling process for the ModelNet (leftmost) and Simple Figures (rightmost) datasets.
  • Figure 15: Result of applying different Gaussian noise levels F = (0, 0.5, 1, 3) to a random sample of the ModelNet10 dataset.
  • Figure 16: Accuracy (top) and precision–recall (bottom) curves for different noise levels added to the ModelNet (leftmost) and Simple Figures (rightmost) datasets.
  • Figure 17: Accuracy (top) and precision–recall (bottom) curves for different state-of-the-art methods for the ModelNet (leftmost) and Simple Figures (rightmost) datasets. (a) ModelNet accuracy; (b) Simple Figures accuracy; (c) ModelNet precision–recall; (d) Simple Figures precision–recall.
  • Figure 18: ScanNet scene splitting steps. (a) The initial scene is downsampled into a point cloud of 12,000 points. (b) The 2D minimum-area rectangle including all the points is obtained. (c) This two-dimensional rectangle is divided into four equal parts. (d) Three more clouds are created by iterating clockwise from the leftmost cloud.
  • Figure 19: Two types of downsampling used for the evaluation protocol. (a) A ScanNet scene obtained with uniform downsampling. (b) A ScanNet scene obtained with voxelized downsampling.
  • Figure 20: Number of scenes (ordinate) for each error rate interval (abscissa) of VFLD (blue) and FPFH (red) for the registration evaluation protocol on the uniform environment.
  • Figure 21: Number of scenes (ordinate) for each error rate interval (abscissa) of VFLD (blue) and FPFH (red) for the registration evaluation protocol on the voxelized environment.
  • Figure 22: Registration results of two different environments using VFLD as the descriptor of choice for the feature-matching step. Four clouds, each in a different color, are shown for each example. (a–d, h–k) are color images from sequences of the dataset, and (e–g, l–n) are the three-dimensional reconstructions of the scene achieved using the proposed VFLD.