Search Results (756)

Search Parameters:
Keywords = point supervised

17 pages, 710 KiB  
Article
Solving the Control Synthesis Problem Through Supervised Machine Learning of Symbolic Regression
by Askhat Diveev, Elena Sofronova and Nurbek Konyrbaev
Mathematics 2024, 12(22), 3595; https://doi.org/10.3390/math12223595 (registering DOI) - 17 Nov 2024
Viewed by 90
Abstract
This paper considers the control synthesis problem and its solution using symbolic regression. Symbolic regression methods, previously called genetic programming methods, allow a computer to find not only the parameters of a given regression function but also its structure. Unlike other works on solving the control synthesis problem with symbolic regression, the novelty of this paper is that, for the first time, a training dataset is employed to address the general control synthesis problem. Initially, the optimal control problem is solved from each point in a given set of initial states, resulting in a collection of control functions expressed as functions of time. A reference model is then integrated into the control object model, which generates optimal motion trajectories using the derived optimal control functions. The control synthesis problem is framed as an approximation task over all optimal trajectories, where the control function is sought as a function of the deviation of the object from the specified terminal state. The optimization criterion for solving the synthesis problem is the accuracy of the object’s movement along the optimal trajectory. The paper includes an example of solving the control synthesis problem for a mobile robot using a supervised machine learning method. A relatively new symbolic regression method, variational complete binary genetic programming, is studied and proposed for solving the control synthesis problem. Full article
(This article belongs to the Special Issue Advanced Computational Intelligence)
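The general workflow described in this abstract (fit a control law to optimal-control training data as a function of the state deviation) can be sketched with an off-the-shelf symbolic regression library. The sketch below uses gplearn's SymbolicRegressor as a stand-in for the paper's variational complete binary genetic programming, and the training pairs follow a made-up toy law; both are assumptions for illustration only.

```python
# Sketch: fit a control law u = g(state deviation) from precomputed optimal
# trajectories by symbolic regression. gplearn's SymbolicRegressor stands in
# for the paper's variational complete binary genetic programming (assumption).
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(0)

# Toy stand-in for the training set described in the abstract: deviations
# (x - x_terminal) sampled along optimal trajectories, paired with the optimal
# control applied at that point (here a made-up ground-truth law).
deviation = rng.uniform(-1.0, 1.0, size=(500, 2))            # [dx, dv]
u_optimal = -1.5 * deviation[:, 0] - 0.8 * deviation[:, 1]    # hypothetical law

est = SymbolicRegressor(
    population_size=500,
    generations=20,
    function_set=("add", "sub", "mul"),
    parsimony_coefficient=0.01,
    random_state=0,
)
est.fit(deviation, u_optimal)

print("evolved control law:", est._program)          # best symbolic expression
print("control at deviation [0.3, -0.2]:", est.predict([[0.3, -0.2]]))
```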
22 pages, 27622 KiB  
Article
Integrated Assessment of Security Risk Considering Police Resources
by Jieying Chen, Weihong Li, Yaxing Li and Yebin Chen
ISPRS Int. J. Geo-Inf. 2024, 13(11), 415; https://doi.org/10.3390/ijgi13110415 (registering DOI) - 16 Nov 2024
Viewed by 262
Abstract
The existing research on security risk often focuses on specific types of crime, overlooking an integrated assessment of security risk by leveraging existing police resources. Thus, we draw on crime geography theories, integrating public security business data, socioeconomic data, and spatial analysis techniques, to identify integrated risk points and areas by examining the distribution of police resources and related factors and their influence on security risk. The findings indicate that security risk areas encompass high-incidence areas of public security issues, locations with concentrations of dangerous individuals and key facilities, and regions with a limited police presence, characterized by dense populations, diverse urban functions, high crime probabilities, and inadequate supervision. While both police resources and security risk are concentrated in urban areas, the latter exhibits a more scattered distribution on the urban periphery, suggesting opportunities to optimize resource allocation by extending police coverage to risk hotspots lacking patrol stations. Notably, Level 1 security risk areas often coincide with areas lacking a police presence, underscoring the need for strategic resource allocation. By comprehensively assessing the impact of police resources and public security data on spatial risk distribution, this study provides valuable insights for public security management and police operations. Full article
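One elementary spatial-analysis step consistent with this abstract, estimating incident density and flagging high-risk cells that lie far from any patrol station, can be sketched as follows. The coordinates, the 1 km coverage radius, and the 95th-percentile hotspot cutoff are hypothetical choices, not the study's parameters.

```python
# Sketch: flag security-risk hotspots that lie farther than a coverage radius
# from any patrol station. Incident/station coordinates and the radius are
# hypothetical; the study's actual risk model integrates many more factors.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
incidents = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(300, 2))  # x, y (km)
stations = np.array([[0.5, 0.5], [-1.0, 0.2]])                    # patrol stations

# Kernel density estimate of incident intensity evaluated on a grid.
kde = gaussian_kde(incidents.T)
xs, ys = np.meshgrid(np.linspace(-3, 3, 60), np.linspace(-3, 3, 60))
grid = np.vstack([xs.ravel(), ys.ravel()])
density = kde(grid).reshape(xs.shape)

# Hotspot = top 5% density cells; uncovered = farther than 1 km from a station.
hot = density > np.quantile(density, 0.95)
dist_to_station = np.min(
    np.linalg.norm(grid.T[:, None, :] - stations[None, :, :], axis=2), axis=1
).reshape(xs.shape)
uncovered_hotspots = hot & (dist_to_station > 1.0)
print("uncovered hotspot cells:", int(uncovered_hotspots.sum()))
```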
13 pages, 5412 KiB  
Article
Supervised Contrastive Learning for 3D Cross-Modal Retrieval
by Yeon-Seung Choo, Boeun Kim, Hyun-Sik Kim and Yong-Suk Park
Appl. Sci. 2024, 14(22), 10322; https://doi.org/10.3390/app142210322 - 10 Nov 2024
Viewed by 409
Abstract
Interoperability between different virtual platforms requires the ability to search and transfer digital assets across platforms. Digital assets in virtual platforms are represented in different forms or modalities, such as images, meshes, and point clouds. The cross-modal retrieval of three-dimensional (3D) object representations is challenging due to the diversity of data representations, which makes discovering a common feature space difficult. Recent studies have focused on obtaining feature consistency within the same classes and modalities using cross-modal center loss. However, center features are sensitive to hyperparameter variations, making cross-modal center loss susceptible to performance degradation. This paper proposes a new 3D cross-modal retrieval method that uses cross-modal supervised contrastive learning (CSupCon) and the fixed projection head (FPH) strategy. Contrastive learning mitigates the influence of hyperparameters by maximizing feature distinctiveness. The FPH strategy prevents gradient updates in the projection network, enabling the focused training of the backbone networks. The proposed method shows a mean average precision (mAP) increase of 1.17 and 0.14 in 3D cross-modal object retrieval experiments on the ModelNet10 and ModelNet40 datasets compared to state-of-the-art (SOTA) methods. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
Show Figures
Figure 1. Traditional SimCLR [8] method using single-modal data augmentations (left). The augmented data (marked with ′ and ″) are used in supervised learning to aggregate representation features. Supervised contrastive learning (SupCon) [14] adapted SimCLR to supervised learning tasks in a single modality (middle). Our method, CSupCon, applies contrastive learning to cross-modal tasks (right). The numbers represent different data instances, the rectangles and circles different modalities, and the colors different classes. The blue and red lines indicate positive and negative instances.
Figure 2. Overview of the proposed method. In the feature extraction stage, two augmented instances x′ and x″ are generated from input x, and embedding features v′ and v″ are extracted from each modality by its corresponding backbone network. Cross-modal supervised contrastive learning (CSupCon) pushes features of different classes apart and pulls features of the same class together. In the fixed projection head (FPH) strategy, the v′ features are used to predict semantic labels for classification.
Figure 3. The visualization result of feature clustering from the ModelNet40 test data.
Figure 4. The results of cross-modal retrieval using the proposed method from the ModelNet40 test data.
Figure 5. The result of cross-modal retrieval on the ModelNet40 test data by class. The illustration depicts classes sorted by the amount of training data and their corresponding mAPs. In general, results are not favorable for classes with a small number of training samples (fewer than 200 in this example; classes inside the blue dotted rectangle).
13 pages, 8320 KiB  
Technical Note
Unmanned Aerial Vehicle-Neural Radiance Field (UAV-NeRF): Learning Multiview Drone Three-Dimensional Reconstruction with Neural Radiance Field
by Li Li, Yongsheng Zhang, Zhipeng Jiang, Ziquan Wang, Lei Zhang and Han Gao
Remote Sens. 2024, 16(22), 4168; https://doi.org/10.3390/rs16224168 - 8 Nov 2024
Viewed by 328
Abstract
In traditional 3D reconstruction using UAV images, only radiance information, which is treated as a geometric constraint, is used in feature matching, allowing for the restoration of the scene’s structure. After introducing radiance supervision, NeRF can adjust the geometry in the fixed-ray direction, resulting in a smaller search space and higher robustness. Considering the lack of NeRF construction methods for aerial scenarios, we propose a new NeRF point sampling method, which is generated using a UAV imaging model, compatible with a global geographic coordinate system, and suitable for a UAV view. We found that NeRF is optimized entirely based on the radiance while ignoring the direct geometry constraint. Therefore, we designed a radiance correction strategy that considers the incidence angle. Our method can complete point sampling in a UAV imaging scene, as well as simultaneously perform digital surface model construction and ground radiance information recovery. When tested on self-acquired datasets, the NeRF variant proposed in this paper achieved better reconstruction accuracy than the original NeRF-based methods. It also reached a level of precision comparable to that of traditional photogrammetry methods, and it is capable of outputting a surface albedo that includes shadow information. Full article
Show Figures
Graphical abstract
Figure 1. The main motivation for our proposed method. We analyzed NeRF's 3D reconstruction workflow (c) from a photogrammetric perspective (b) and found that the latter uses reprojection errors to geometrically adjust the ray direction, while the former can adjust the transmittance with the help of radiance, thereby narrowing the search space to a single ray. However, NeRF methods designed specifically for drone imaging scenarios are rare. In addition, NeRF does not consider the influence of geometric structure changes on the radiance (a); we therefore designed a new geographical NeRF point sampling method for UAVs and introduced the photogrammetric incidence-angle model to optimize NeRF radiance features, completing end-to-end 3D reconstruction and radiance acquisition (d).
Figure 2. Main workflow of the multitask UAV-NeRF. We used traditional photogrammetry methods to sample the NeRF points, and then used a geometric imaging model to perform radiation correction and decoding. "MLP" denotes the multilayer perceptrons used to decode the different radiation information.
Figure 3. The study location, along with the general collection pattern and flight lines for the drone imagery.
Figure 4. Intuitive performance comparison of different methods. These experiments were conducted on the DengFeng and XinMi areas and demonstrate the improvement in 3D surface construction achieved by our proposed method. From (a) to (d): (a) the original imagery captured by the drone, (b) the corresponding ground-truth DSM obtained from LiDAR, (c) the DSM predicted by the method proposed in this paper, and (d) the DSM obtained using the CC method. Another representation of the results, with additional details, is provided in Table 1.
Figure 5. The UAV image, albedo, shadow scalar s, and transient scalar β.
15 pages, 6362 KiB  
Article
Impact of a 20-Week Resistance Training Program on the Force–Velocity Profile in Novice Lifters Using Isokinetic Two-Point Testing
by Joffrey Drigny, Nicolas Pamart, Hélène Azambourg, Marion Remilly, Emmanuel Reboursière, Antoine Gauthier and Amir Hodzic
J. Funct. Morphol. Kinesiol. 2024, 9(4), 222; https://doi.org/10.3390/jfmk9040222 - 5 Nov 2024
Viewed by 553
Abstract
Objectives: This study aimed to assess the impact of a 20-week resistance training program on force–velocity (F-V) parameters using an isokinetic two-point method and comparing one-repetition maximum (1-RM) methods in novice lifters. Methods: Previously untrained individuals completed a supervised, three-session weekly resistance training program involving concentric, eccentric, and isometric phases, repeated every 2 to 4 weeks. Isokinetic dynamometry measured the strength of elbow flexors/extensors at 60°/s and 150°/s, and knee flexors/extensors at 60°/s and 240°/s at Baseline, 3 months, and 5 months. F-V parameters, including maximal theoretical force (F0) and the F-V slope, were calculated. Participants also performed 1-RM tests for the upper and lower limbs. Repeated measures ANOVA with effect size (η2 > 0.14 as large) was used to analyze changes in F-V parameters and repeated measures correlation was used to test their association with 1-RM outcomes. Results: Eighteen male participants (22.0 ± 3.4 years) were analyzed. F0 significantly increased for all muscle groups (η2 = 0.423 to 0.883) except elbow flexors. F-V slope significantly decreased (steeper) for knee extensors and flexors (η2 = 0.348 to 0.695). Knee extensors showed greater F0 gains and steeper F-V slopes than flexors (η2 = 0.398 to 0.686). F0 gains were associated with 1-RM changes (r = 0.38 to 0.83), while F-V slope changes correlated only with lower limb 1-RM (r = −0.37 to −0.68). Conclusions: The 20-week resistance training program significantly increased F0 and shifted the F-V profile towards a more “force-oriented” state in knee muscles. These changes correlated with improved 1-RM performance. Future studies should include longer follow-ups and control groups. Full article
Show Figures
Figure 1. Chronological overview of the 20-week resistance training program with 1-RM assessment.
Figure 2. Setup and positioning for knee and elbow flexion/extension testing.
Figure 3. Example of the linear regression models obtained from the force and velocity data during knee extension, using data measured at 60°/s and 240°/s (data from one participant, assessed at Baseline).
Figure 4. Force–velocity parameter changes illustrated through linear regression models of force and velocity data during (a) elbow extension and flexion and (b) knee extension and flexion isokinetic tasks, with repeated measures ANOVA. ext: extensors.
15 pages, 230 KiB  
Article
Male and Female Perceptions of Supervision During Strength Training
by Luke Carlson, Maria Hauger, Grace Vaughan-Wenner and James P. Fisher
Sports 2024, 12(11), 301; https://doi.org/10.3390/sports12110301 - 5 Nov 2024
Viewed by 750
Abstract
A cross-sectional survey was distributed to 1322 members of a 1-on-1 personalized strength training studio. A total of 366 respondents (n = 134 male and n = 232 female), all aged over 20 years, reported considerable training experience, with 55% of the males and 42% of the females reporting 5+ years of experience. The data were analyzed and reported descriptively, with differences >5% identified based on the use of a 5-point Likert scale, the sample size, and the nature of the observations. Disparities between the males and females were identified: the males reported higher perceptions of managing effort, technique, and programming without supervision compared to the females, while safety was noted as being more important to the females. Qualitatively, additional themes were raised, including an analogy between the personal relationship of trainer and trainee and that between medical professionals and patients. This was supported by participants' accounts of how supervised strength training helped them maintain quality of life in aging and recover from medical conditions and injury. The data are discussed in the context of a previous body of literature suggesting that males falsely report higher levels of confidence in tasks compared to females, particularly in relation to effort, role models, and verbal encouragement. We posit that the greater confidence expressed by males at least partially explains their greater engagement in strength training practices, as well as the higher level of participation in supervised strength training by females. This research should benefit strength training practitioners by enhancing their understanding and expectations of clients, and may offer insight into engaging more people in strength training. Full article
22 pages, 16745 KiB  
Article
Unsupervised PolSAR Image Classification Based on Superpixel Pseudo-Labels and a Similarity-Matching Network
by Lei Wang, Lingmu Peng, Rong Gui, Hanyu Hong and Shenghui Zhu
Remote Sens. 2024, 16(21), 4119; https://doi.org/10.3390/rs16214119 - 4 Nov 2024
Viewed by 794
Abstract
Supervised polarimetric synthetic aperture radar (PolSAR) image classification demands a large amount of precisely labeled data. However, such data are difficult to obtain. Therefore, many unsupervised methods have been proposed for unsupervised PolSAR image classification. The classification maps of unsupervised methods contain many high-confidence samples. These samples, which are often ignored, can be used as supervisory information to improve classification performance on PolSAR images. This study proposes a new unsupervised PolSAR image classification framework that combines high-confidence superpixel pseudo-labeled samples with semi-supervised classification methods. The experiments indicated that this framework achieves greater effectiveness in unsupervised PolSAR image classification. First, superpixel segmentation was performed on PolSAR images, and the geometric centers of the superpixels were generated. Second, the classification maps of rotation-domain deep mutual information (RDDMI), an unsupervised PolSAR image classification method, were used as the pseudo-labels of the central points of the superpixels. Finally, the unlabeled samples and the high-confidence pseudo-labeled samples were used to train a strong semi-supervised method, similarity matching (SimMatch). Experiments on three real PolSAR datasets illustrated that, compared with RDDMI, the accuracy of the proposed method increased by 1.70%, 0.99%, and 0.80%. The proposed framework provides significant performance improvements and is an efficient method for improving unsupervised PolSAR image classification. Full article
(This article belongs to the Special Issue SAR in Big Data Era III)
Show Figures
Figure 1. The five parts of the framework. The Wide ResNet model adopts the classic wide residual networks (WRNs) [37]. The backbone extracts useful features from the input data to obtain an embedding vector. L_s, L_u, and L_in represent the supervised loss, the unsupervised loss, and the similarity distribution, respectively.
Figure 2. The pseudo-label generation structure of SimMatch. SimMatch generates semantic and instance pseudo-labels using weakly augmented views and calculates semantic and instance similarities through class centers. These two similarities are then propagated to each other using expansion and aggregation to obtain better pseudo-labels.
Figure 3. The propagation of pseudo-label information. As the example in the red box shows, if the semantic and instance similarities differ, the histogram is flatter; if they are similar, the resulting histogram is sharper.
Figure 4. RS-2 Flevoland dataset. (a) Pauli pseudo-color image. (b) Ground-truth map.
Figure 5. RS-2 Wuhan dataset. (a) Pauli pseudo-color image. (b) Ground-truth map. (c) An optical image of ROI_1. (d) An optical image of ROI_2.
Figure 6. AIRSAR Flevoland dataset. (a) Pauli pseudo-color image. (b) Ground-truth map.
Figure 7. Classification results on the RS-2 Flevoland dataset. The black boxes show where SP-SIM produces finer classification results than RDDMI. (a) Ground-truth map. (b) Wishart. (c) RDDMI. (d) SP-SIM.
Figure 8. Classification results on the RS-2 Wuhan dataset. (a) Ground-truth map. (b) Wishart. (c) RDDMI. (d) SP-SIM.
Figure 9. Similar backscattering properties on the AIRSAR Flevoland dataset. (a) Four classes with similar backscattering properties. (b) Water. (c) Bare soil. (d) Lucerne. (e) Rape seed.
Figure 10. Classification results on the AIRSAR Flevoland dataset. (a) Ground-truth map. (b) Wishart. (c) RDDMI. (d) SP-SIM.
17 pages, 7527 KiB  
Article
Improving Safety in High-Altitude Work: Semantic Segmentation of Safety Harnesses with CEMFormer
by Qirui Zhou and Dandan Liu
Symmetry 2024, 16(11), 1449; https://doi.org/10.3390/sym16111449 - 1 Nov 2024
Viewed by 472
Abstract
The symmetry between production efficiency and safety is a crucial aspect of industrial operations. To enhance the identification of proper safety harness use by workers at height, this study introduces a machine vision approach as a substitute for manual supervision. By focusing on the safety rope that connects the worker to an anchor point, we propose a semantic segmentation mask annotation principle to evaluate proper harness use. We introduce CEMFormer, a novel semantic segmentation model utilizing ConvNeXt as the backbone, which surpasses the traditional ResNet in accuracy. Efficient Multi-Scale Attention (EMA) is incorporated to optimize channel weights and integrate spatial information. Mask2Former serves as the segmentation head, enhanced by Poly Loss for classification and Log-Cosh Dice Loss for mask loss, thereby improving training efficiency. Experimental results indicate that CEMFormer achieves a mean accuracy of 92.31%, surpassing the baseline and five state-of-the-art models. Ablation studies underscore the contribution of each component to the model’s accuracy, demonstrating the effectiveness of the proposed approach in ensuring worker safety. Full article
(This article belongs to the Section Computer)
Show Figures
Figure 1. The overall structure of CEMFormer.
Figure 2. An example of annotations in the original dataset: (a) the original size; (b) an enlarged view of the local region.
Figure 3. An example of annotations in the new dataset: (a) an unsecured safety rope; (b) a secured safety rope; (c) an enlarged view of (b).
Figure 4. The structure of the ConvNeXt block.
Figure 5. The structure of the Efficient Multi-Scale Attention (EMA) block.
Figure 6. The structure of the DETR encoder.
Figure 7. The structure of the decoder block.
Figure 8. An example image.
Figure 9. An example semantic segmentation result.
19 pages, 1689 KiB  
Article
PE-MCAT: Leveraging Image Sensor Fusion and Adaptive Thresholds for Semi-Supervised 3D Object Detection
by Bohao Li, Shaojing Song and Luxia Ai
Sensors 2024, 24(21), 6940; https://doi.org/10.3390/s24216940 - 29 Oct 2024
Viewed by 481
Abstract
Existing 3D object detection frameworks in sensor-based applications heavily rely on large-scale annotated data to achieve optimal performance. However, obtaining such annotations from sensor data—like LiDAR or image sensors—is both time-consuming and costly. Semi-supervised learning offers an efficient solution to this challenge and holds significant potential for sensor-driven artificial intelligence (AI) applications. While it reduces the need for labeled data, semi-supervised learning still depends on a small amount of labeled samples for training. In the initial stages, relying on such limited samples can adversely affect the effective training of student–teacher networks. In this paper, we propose PE-MCAT, a semi-supervised 3D object detection method that generates high-precision pseudo-labels. First, to address the challenges of insufficient local feature capture and poor robustness in point cloud data, we introduce a point enrichment module. This module incorporates information from image sensors and combines multiple fusion methods over local and self-features to directly enhance the quality of point clouds and pseudo-labels, compensating for the limitations posed by using only a few labeled samples. Second, we explore the relationship between the teacher network and the pseudo-labels it generates. We propose a multi-class adaptive threshold strategy to initially filter and create a high-quality pseudo-label set, and a joint variable threshold strategy to refine this set further, enhancing the selection of superior pseudo-labels. Extensive experiments demonstrate that PE-MCAT consistently outperforms recent state-of-the-art methods across different datasets. Specifically, on the KITTI dataset and using only 2% of labeled samples, our method improved the mean Average Precision (mAP) by 0.7% for cars, 3.7% for pedestrians, and 3.0% for cyclists. Full article
(This article belongs to the Section Sensing and Imaging)
Show Figures
Figure 1. Comparison of the PE-MCAT pseudo-label generation strategy with previous methods. (a) Previous methods pass the input data directly through a teacher network to generate initial pseudo-labels and then use a fixed threshold to select high-quality ones. (b) PE-MCAT first processes the input data with point enrichment; the enhanced data are then fed to the teacher network to generate initial pseudo-labels, and the multi-class adaptive threshold (MCAT) determines a more precise dynamic threshold used to filter them.
Figure 2. Overview of the proposed method pipeline.
Figure 3. Comparison of the three-dimensional scales of the three-frame point clouds.
Figure 4. Comparison of the Manhattan distance, Euclidean distance, and prominent features in the three-dimensional scales of the three-frame point cloud scenes.
Figure 5. 3D spherical neighborhood visualization.
Figure 6. Three types of adaptive thresholding curves.
Figure 7. Scores and thresholds from k-means clustering for the car category.
Figure 8. Comparison of cyclist detection results.
Figure 9. Comparison of car detection results.
Figure 10. Comparison of pedestrian detection results.
22 pages, 9696 KiB  
Article
Text-Enhanced Graph Attention Hashing for Cross-Modal Retrieval
by Qiang Zou, Shuli Cheng, Anyu Du and Jiayi Chen
Entropy 2024, 26(11), 911; https://doi.org/10.3390/e26110911 - 27 Oct 2024
Viewed by 618
Abstract
Deep hashing technology, known for its low-cost storage and rapid retrieval, has become a focal point in cross-modal retrieval research as multimodal data continue to grow. However, existing supervised methods often overlook noisy labels and multiscale features in different modal datasets, leading to higher information entropy in the generated hash codes and features, which reduces retrieval performance. The variation in text annotation information across datasets further increases the information entropy during text feature extraction, resulting in suboptimal outcomes. Consequently, reducing the information entropy in text feature extraction, supplementing text feature information, and enhancing the retrieval efficiency of large-scale media data are critical challenges in cross-modal retrieval research. To tackle these, this paper introduces the Text-Enhanced Graph Attention Hashing for Cross-Modal Retrieval (TEGAH) framework. TEGAH incorporates a deep text feature extraction network and a multiscale label region fusion network to minimize information entropy and optimize feature extraction. Additionally, a Graph-Attention-based modal feature fusion network is designed to efficiently integrate multimodal information, enhance the affinity of the network for different modes, and retain more semantic information. Extensive experiments on three multilabel datasets demonstrate that the TEGAH framework significantly outperforms state-of-the-art cross-modal hashing methods. Full article
(This article belongs to the Section Multidisciplinary Applications)
Show Figures
Figure 1. The overall framework of TEGAH, divided into five parts: (1) Image-Net: a Swin Transformer-Small (SwinT-S) model extracts semantic features from images and maps them into the feature space; (2) Graph Attention Feature Fusion Module (GAFM): a feature fusion and alignment network that weights and merges image and text features to address semantic discrepancies between modalities; (3) Multiscale Label Area Hybrid Network (MLAH): uses multiscale features across four layers with multiscale attention to mitigate insufficient textual information; (4) Deep Text Feature Extraction Network (DTFEN): improves on traditional methods by capturing high-quality textual features; (5) Hash Learning Module: transforms features into hash codes through nonlinear mappings, trained with a combination of cosine-weighted triplet loss, label distillation loss, Wasserstein loss, and quantization loss, each designed to enhance the extraction, fusion, and representation of multimodal features and thereby improve the accuracy and efficiency of cross-modal hash retrieval.
Figure 2. The Graph Attention Feature Fusion Module (GAFM) integrates and aligns image and text features through the interaction of the Layerwise Propagation Rule (LPR) and a Gated Recurrent Unit (GRU), employing Local Linear Fusion (LLF) to mine multiscale information internally. The features are ultimately fed into the GAT to generate predicted pseudo-labels.
Figure 3. The Multiscale Label Area Hybrid Network (MLAH) consists of a feature extraction module followed by four hierarchical multiscale attention modules, ultimately integrated through weighted fusion.
Figure 4. The Deep Text Feature Extraction Network (DTFEN) comprises two deep extraction modules and an autoencoder.
Figure 5. PR curves at 32 and 64 bits on the MIRFLICKR-25K dataset.
Figure 6. PR curves at 32 and 64 bits on the NUS-WIDE dataset.
Figure 7. PR curves at 32 and 64 bits on the MS-COCO dataset.
Figure 8. Using the TEGAH framework, original samples are encoded and retrieval is performed within the MS-COCO dataset, employing 64-bit hash codes to obtain the top 5 results. Returned samples marked in blue are relevant to the query sample.
Figure 9. Grad-CAM visualizations of 10 images randomly selected from the three datasets.
22 pages, 6160 KiB  
Article
WaterGPT: Training a Large Language Model to Become a Hydrology Expert
by Yi Ren, Tianyi Zhang, Xurong Dong, Weibin Li, Zhiyang Wang, Jie He, Hanzhi Zhang and Licheng Jiao
Water 2024, 16(21), 3075; https://doi.org/10.3390/w16213075 - 27 Oct 2024
Viewed by 900
Abstract
This paper introduces WaterGPT, a language model designed for complex multimodal tasks in hydrology. WaterGPT is applied in three main areas: (1) processing and analyzing data such as images and text in water resources, (2) supporting intelligent decision-making for hydrological tasks, and (3) enabling interdisciplinary information integration and knowledge-based Q&A. The model has achieved promising results. One core aspect of WaterGPT involves the meticulous segmentation of training data for the supervised fine-tuning phase, sourced from real-world data and annotated with high quality using both manual methods and GPT-series model annotations. These data are carefully categorized into four types: knowledge-based, task-oriented, negative samples, and multi-turn dialogues. Additionally, another key component is the development of a multi-agent framework called Water_Agent, which enables WaterGPT to intelligently invoke various tools to solve complex tasks in the field of water resources. This framework handles multimodal data, including text and images, allowing for deep understanding and analysis of complex hydrological environments. Based on this framework, WaterGPT has achieved over a 90% success rate in tasks such as object detection and waterbody extraction. For the waterbody extraction task, using Dice and mIoU metrics, WaterGPT’s performance on high-resolution images from 2013 to 2022 has remained stable, with accuracy exceeding 90%. Moreover, we have constructed a high-quality water resources evaluation dataset, EvalWater, which covers 21 categories and approximately 10,000 questions. Using this dataset, WaterGPT achieved the highest accuracy to date in the field of water resources, reaching 83.09%, which is about 17.83 points higher than GPT-4. Full article
Show Figures
Figure 1. SFT training data production process.
Figure 2. EvalWater dataset.
Figure 3. Water_Agent framework diagram.
Figure 4. Subtasks supported by Water_Agent.
Figure 5. Calculation flowchart.
Figure 6. Training process change curves.
Figure 7. Evaluation results of each model, classified by EvalWater category.
Figure 8. Comparison chart of evaluation results for the various models.
Figure 9. Water_Agent operation diagram.
Figure 10. Accuracy of water-body extraction in different years.
Figure 11. Completion rates of different models on simple and complex hydrology tasks.
Figure 12. Average completion rate of different models on overall hydrology tasks.
Figure 13. GPT-4 evaluation results of model answer quality along different dimensions.
Figure 14. GPT-4's overall evaluation results of model answer quality.
Figure 15. Example outputs of GPT-4 and WaterGPT.
18 pages, 39884 KiB  
Article
CLOUDSPAM: Contrastive Learning On Unlabeled Data for Segmentation and Pre-Training Using Aggregated Point Clouds and MoCo
by Reza Mahmoudi Kouhi, Olivier Stocker, Philippe Giguère and Sylvie Daniel
Remote Sens. 2024, 16(21), 3984; https://doi.org/10.3390/rs16213984 - 26 Oct 2024
Viewed by 799
Abstract
SegContrast first paved the way for contrastive learning on outdoor point clouds. Its original formulation targeted individual scans in applications like autonomous driving and object detection. However, mobile mapping purposes such as digital twin cities and urban planning require large-scale dense datasets to capture the full complexity and diversity present in outdoor environments. In this paper, the SegContrast method is revisited and adapted to overcome its limitations associated with mobile mapping datasets, namely the scarcity of contrastive pairs and memory constraints. To overcome the scarcity of contrastive pairs, we propose the merging of heterogeneous datasets. However, this merging is not a straightforward procedure due to the variety of size and number of points in the point clouds of these datasets. Therefore, a data augmentation approach is designed to create a vast number of segments while optimizing the size of the point cloud samples to the allocated memory. This methodology, called CLOUDSPAM, guarantees the performance of the self-supervised model for both small- and large-scale mobile mapping point clouds. Overall, the results demonstrate the benefits of utilizing datasets with a wide range of densities and class diversity. CLOUDSPAM matched the state of the art on the KITTI-360 dataset, with a 63.6% mIoU, and came in second place on the Toronto-3D dataset. Finally, CLOUDSPAM achieved competitive results against its fully supervised counterpart with only 10% of labeled data. Full article
Show Figures
Figure 1. An overview of CLOUDSPAM. Leveraging the proposed data augmentation method, heterogeneous mobile mapping point clouds are merged for pre-training with MoCo (Momentum Contrast). During the pre-training phase, the "query partitions" are the positive pairs processed by the encoder, while the "memory bank" contains the negative pairs fed to the momentum encoder. Fine-tuning is then conducted separately for each dataset using the labeled partitions generated by the proposed data augmentation method.
Figure 2. Segmentation of the KITTI-360 dataset (a) without and (b) with the proposed data augmentation. The ground segment, computed using RANSAC, is shown in gray; all other segments, computed using the DBSCAN algorithm, are shown in other colors.
Figure 3. Visualization of (a) one partition extracted from the aggregated KITTI-360 dataset using the proposed partitioning approach and (b) its associated segments. White and purple squares represent the seed points selected with the FPS approach over this area. Colors in (a) represent true labels; colors in (b) represent different segments.
Figure 4. Overview of the three learning strategies used in the comparative study. "Baseline" refers to supervised training. "DA supervised" is equivalent to the baseline but uses labeled partitions generated with the proposed data augmentation approach. "CLOUDSPAM" refers to self-supervised pre-training with MoCo on unlabeled partitions, followed by supervised fine-tuning on labeled partitions, with both provided by the proposed data augmentation approach.
Figure 5. Comparison of mIoU (%) scores of CLOUDSPAM per pre-training epoch for each of the 6 data regimes on (a) the test set of the Toronto-3D dataset and (b) the validation set of the Paris-Lille-3D dataset.
Figure 6. Two overlapping partitions generated by the proposed data augmentation approach. Each color represents a different segment. The same object can appear in two different segments across partitions, such as the car outlined by the red square.
Figure 7. Inference results of the CLOUDSPAM strategy on the KITTI-360 (KIT-360), Toronto-3D (T3D), and Paris-Lille-3D (PL3D) test sets for every investigated data regime, compared to the ground truth (GT). The ground truth of the Paris-Lille-3D test set was not provided by the authors.
20 pages, 790 KiB  
Article
A Study of the Mechanisms of Government Embedment and Organizational Environment Related to Supervisory Effectiveness in the Collective Governance of Rural Residential Land
by Zhongjian Yang, Hong Tang, Wenxiang Zhao and Ruiping Ran
Land 2024, 13(11), 1760; https://doi.org/10.3390/land13111760 - 26 Oct 2024
Viewed by 378
Abstract
Rural Residential Land is a kind of common-pool resource, and the implementation of its collective governance is a response to the dilemmas of utilizing the land. Taking the common-pool-resource attribute of Rural Residential Land as its entry point, this paper draws on Autonomous Governance Theory and utilizes an OLS model and a mediated-effect model to explore the influence of Government Embedment on supervisory effectiveness in the collective governance of Rural Residential Land, based on fieldwork data from 450 farming households in three districts (cities and counties) in Sichuan Province. The study shows that Government Embedment, the Technological Environment, and the Cultural Environment can significantly enhance supervisory effectiveness in the collective governance of Rural Residential Land, while the Resource Environment has a negative effect on it. The Resource Environment has a masking effect on the influence of Government Embedment on supervisory effectiveness, while the Technological Environment and the Cultural Environment play a partial mediating role. The anonymous whistleblowing mechanism positively moderates the influence of the Technological and Cultural Environments on supervisory effectiveness. Additionally, there is obvious locational heterogeneity in the influence of Government Embedment and the Organizational Environment on supervisory effectiveness. Therefore, a coordinated supervision and sanctioning mechanism should be constructed in towns and villages to promote the integration and complementarity of formal and informal systems; a sound mechanism for the anonymous reporting of illegal and irregular use of Rural Residential Land should be established; and the effective implementation of the collective governance of Rural Residential Land should be promoted in accordance with differences in village location. Full article
Show Figures
Figure 1. Research framework.
Figure 2. Investigation area.
33 pages, 6528 KiB  
Article
TVGeAN: Tensor Visibility Graph-Enhanced Attention Network for Versatile Multivariant Time Series Learning Tasks
by Mohammed Baz
Mathematics 2024, 12(21), 3320; https://doi.org/10.3390/math12213320 - 23 Oct 2024
Viewed by 502
Abstract
This paper introduces Tensor Visibility Graph-enhanced Attention Networks (TVGeAN), a novel graph autoencoder model specifically designed for MTS learning tasks. The underlying approach of TVGeAN is to combine the power of complex networks in representing time series as graphs with the strengths of Graph Neural Networks (GNNs) in learning from graph data. TVGeAN consists of two new main components. The first is TVG, which extends the capabilities of visibility graph algorithms in representing MTSs by converting them into weighted temporal graphs in which both the nodes and the edges are tensors. Each node in the TVG represents the MTS observations at a particular time, while the weights of the edges are defined based on the visibility angle algorithm. The second main component is GeAN, a novel graph attention mechanism developed to seamlessly integrate the temporal interactions represented in the nodes and edges of the graphs into the core learning process. GeAN achieves this by using the outer product to quantify the pairwise interactions of nodes and edges at a fine-grained level and a bilinear model to effectively distil the knowledge interwoven in these representations. From an architectural point of view, TVGeAN builds on the autoencoder approach complemented by sparse and variational learning units. The sparse learning unit promotes inductive learning in TVGeAN, and the variational learning unit endows TVGeAN with generative capabilities. The performance of the TVGeAN model is extensively evaluated against four widely cited MTS benchmarks for both supervised and unsupervised learning tasks. The results of these evaluations show the high performance of TVGeAN across various MTS learning tasks. In particular, TVGeAN achieves an average root mean square error of 6.8 on the C-MAPSS dataset (i.e., regression learning tasks) and a precision close to one on the SMD, MSL, and SMAP datasets (i.e., anomaly detection learning tasks), which are better results than most published works. Full article
(This article belongs to the Section Mathematics and Computer Science)
Show Figures
Figure 1. High-level abstraction of the proposed model. The raw MTS datasets are passed to the TVG algorithms, which transform them into graphs; these are then processed by the three main parts of the proposed model (the encoder, the stochastic layer, and the decoder) to produce the synthesized graphs.
Figure 2. High-level abstraction of the TVG algorithm, showing how the raw MTS is mapped into a graph.
Figure 3. Normalized MAE vs. number of epochs for the seven datasets: (a) FD001, (b) FD002, (c) FD003, (d) FD004, (e) SMD, (f) MSL, (g) SMAP.
Figure 4. Snapshots of graphs generated by the (a) TVG, (b) monoplex, and (c) multiplex algorithms for a subset of SMD.
Figure 5. Normalized RMAE vs. number of epochs for the seven datasets: (a) FD001, (b) FD002, (c) FD003, (d) FD004, (e) SMD, (f) MSL, (g) SMAP.
21 pages, 1883 KiB  
Article
Adaptive Point Learning with Uncertainty Quantification to Generate Margin Lines on Prepared Teeth
by Ammar Alsheghri, Yoan Ladini, Golriz Hosseinimanesh, Imane Chafi, Julia Keren, Farida Cheriet and François Guibault
Appl. Sci. 2024, 14(20), 9486; https://doi.org/10.3390/app14209486 - 17 Oct 2024
Viewed by 887
Abstract
During a crown generation procedure, dental technicians depend on commercial software to generate a margin line that defines the design boundary for the crown. Margin line generation remains a non-reproducible, inconsistent, and challenging procedure. In this work, we propose to generate margin line points on prepared teeth meshes using adaptive point learning inspired by the AdaPoinTr model. We extracted ground-truth margin lines as point clouds from the prepared teeth and crown bottom meshes. The chamfer distance (CD) and infoCD loss functions were used to train a supervised deep learning model that outputs a margin line as a point cloud. To enhance the generation results, the deep learning model was trained on three different resolutions of the target margin lines, which were used to back-propagate the losses. Five folds were trained and an ensemble model was constructed. The training and test sets contained 913 and 134 samples, respectively, covering all teeth positions. Intraoral scanning was used to collect all samples. Our post-processing removes outlier points based on local point density and principal component analysis (PCA), followed by a spline prediction. Comparing our final spline predictions with the ground-truth margin lines using CD, we achieved a median distance of 0.137 mm; the median Hausdorff distance was 0.242 mm. We also propose a novel confidence metric for uncertainty quantification of generated margin lines during deployment, defined based on the percentage of outliers removed during the post-processing stage. The proposed end-to-end framework helps dental professionals generate and evaluate margin lines consistently. The findings underscore the potential of deep learning to revolutionize the detection and extraction of 3D landmarks, offering personalized and robust methods to meet the increasing demands for precision and efficiency in the medical field. Full article
Show Figures
Figure 1. Converting die meshes to point clouds and downsampling the point clouds to 10,000 points.
Figure 2. Extracting ground-truth margin lines. A crown bottom is first extracted from a crown designed by a dental technician. The internal edge of the crown bottom's lower horizontal thickness coincides with the margin line on the dental preparation. The internal points are extracted, projected onto the die, and augmented to represent the margin line.
Figure 3. AdaPoinTr architecture, showing the forward pass in blue arrows and the backpropagation pass in red.
Figure 4. One case augmented 20 times.
Figure 5. Identifying outliers with (a) local density only; (b) local density and PCA; (c) the first component of PCA. Purple represents outliers in (a,b). With both local density and PCA, fewer outliers are observed.
Figure 6. Illustration of the post-processing procedures to remove outliers.
Figure 7. Predicted margin line point clouds of four test cases at different positions compared with the ground truth. Red is the prediction; green is the ground truth.
Figure 8. Qualitative comparison of margin lines obtained using the proposed framework, showing the predicted points with outliers highlighted, the predicted margin line splines with outliers (baseline), the predicted splines after outlier removal, and the ground-truth margin lines. The chamfer distance and confidence metric are also given for each test case.
Figure 9. Qualitative and quantitative results comparing the margin line predictions of our proposed model with the ground truth.
Figure 10. Challenging test case showing the margin line prediction (dotted) compared with the ground truth (solid), both overlaid on the die. Contours of the mean curvature of the die mesh are shown: blue represents high curvature and red low curvature. The die geometry is also shown without contours.
Figure A1. Worst margin line point cloud prediction recorded on a test case considered a special case.
Figure A2. Representative frequencies of (a) CD values and (b) percentage of outliers for the test set obtained using the fold-2 model. CD values start from 0.062 mm because the prediction never matches the ground truth exactly.
Figure A3. Representative CD training and validation loss curves.
Figure A4. Ordering the point cloud using a travelling salesman algorithm. The last 10 points of the point cloud are shown in red and the first 10 in blue. Note that one red point is far from where it should be.