Search Results (368)

Search Parameters:
Keywords = local binary pattern

21 pages, 4502 KiB  
Article
MTC-GAN Bearing Fault Diagnosis for Small Samples and Variable Operating Conditions
by Jinghua Li, Yonghe Wei and Xiaojiao Gu
Appl. Sci. 2024, 14(19), 8791; https://doi.org/10.3390/app14198791 - 29 Sep 2024
Viewed by 400
Abstract
In response to the challenges of bearing fault diagnosis under small sample sizes and variable operating conditions, this paper proposes a novel method based on the two-dimensional analysis of vibration acceleration signals and a Multi-Task Conditional Generative Adversarial Network (MTC-GAN). This method first constructs two-dimensional images of vibration signals by leveraging the physical properties of the bearing acceleration signals and employs Local Binary Patterns (LBP) to extract subtle texture features from these images, thereby generating fault feature signatures with high discriminative power across different operating conditions. Subsequently, MTC-GAN is utilized for data augmentation, and the trained discriminator is used to perform fault classification tasks, improving classification accuracy under conditions with small sample sizes. Experimental results demonstrate that the proposed method achieves excellent fault diagnosis accuracy and robustness under both small sample sizes and varying operating conditions. Compared to traditional methods, this approach exhibits higher efficiency and reliability in handling complex operating conditions and data scarcity.
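As a rough illustration of the LBP texture-signature step described above, the sketch below (assuming scikit-image and a precomputed 2D vibration image; the signal-to-image construction and the MTC-GAN stages are specific to the paper and omitted) extracts a uniform-LBP histogram:

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_signature(vibration_image, P=8, R=1):
    """Uniform-LBP histogram of a 2D vibration image.

    vibration_image: 2D uint8 array built from an acceleration signal
    (the signal-to-image step is article-specific and assumed done).
    """
    lbp = local_binary_pattern(vibration_image, P, R, method="uniform")
    # Uniform LBP with P neighbors yields P + 2 distinct codes.
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

# Example on a random stand-in for a 64x64 vibration image:
img = (np.random.rand(64, 64) * 255).astype(np.uint8)
print(lbp_signature(img))
```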
Figures:
Figure 1: Generator structure diagram.
Figure 2: Discriminator structure diagram.
Figure 3: Proposed framework diagram for bearing fault diagnosis under small sample size and variable operating conditions.
Figure 4: Case Western Reserve University (CWRU) bearing test bench.
Figure 5: The vibration images for (a) the inner race fault, (b) the ball fault, and (c) the outer race fault.
Figure 6: The LBP images for (a) the inner race fault, (b) the ball fault, and (c) the outer race fault.
Figure 7: Test set accuracy curves with different numbers of additional generated samples at 1797 rpm.
Figure 8: The generator loss and discriminator loss curves of the MTC-GAN model.
Figure 9: Final accuracy vs. initial accuracy at different rotational speeds.
Figure 10: Final accuracy vs. initial accuracy under variable operating conditions.
Figure 11: Confusion matrix for classification results: (a) Scenario 1; (b) Scenario 2; (c) Scenario 3; (d) Scenario 4.
20 pages, 27260 KiB  
Article
An Improved Product Defect Detection Method Combining Centroid Distance and Textural Information
by Haorong Wu, Xiaoxiao Li, Fuchun Sun, Limin Huang, Tao Yang, Yuechao Bian and Qiurong Lv
Electronics 2024, 13(19), 3798; https://doi.org/10.3390/electronics13193798 - 25 Sep 2024
Viewed by 405
Abstract
To address the problems of a high mismatching rate and susceptibility to noise and grayscale transformation, this paper proposes an improved product defect detection method that combines centroid distance and textural information. After image preprocessing, an improved fuzzy C-means clustering method extracts closed contour features. A contour centroid-distance descriptor is then used for bidirectional matching, yielding robust coarse matching contour pairs. After the coarse matching pairs are screened, refined matching results are obtained with an improved local binary pattern operator. Finally, good and defective industrial products are distinguished by checking whether the number of fine matching pairs is consistent with the number of template contours. Four experiments were designed: closed contour extraction, rotation-robust matching, grayscale-difference-robust matching, and defect detection on three different products. The results show that the improved method is robust to rotation and grayscale differences, achieves a detection accuracy above 90%, and takes at most 362.6 ms per detection, meeting the requirements of industrial real-time inspection.
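The centroid-distance matching step can be sketched generically as follows (a simplified signature with circular-shift matching for rotation tolerance; the paper's exact descriptor, screening rules, and improved-LBP refinement are not reproduced):

```python
import numpy as np

def centroid_distance_descriptor(contour, n_samples=64):
    """Centroid-distance signature of a closed contour.

    contour: (N, 2) array of (x, y) points. Distances from the centroid
    are resampled to a fixed length and normalized for scale invariance.
    """
    centroid = contour.mean(axis=0)
    d = np.linalg.norm(contour - centroid, axis=1)
    idx = np.linspace(0, len(d) - 1, n_samples).astype(int)
    d = d[idx]
    return d / (d.max() + 1e-12)

def match_score(desc_a, desc_b):
    """Smallest L2 distance over circular shifts, giving tolerance to the
    contour starting point (and hence to rotation)."""
    return min(np.linalg.norm(desc_a - np.roll(desc_b, s))
               for s in range(len(desc_b)))
```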
Figures:
Figure 1: An improved algorithmic flow diagram of the closed contour matching method for detecting defects in industrial products.
Figure 2: Schematic diagram of the original LBP thresholding process.
Figure 3: LBP rotation-invariant mode.
Figure 4: Improved LBP operator in the fine matching model.
Figure 5: The matching process from "Rough" to "Fine" based on the combined centroid distance and textural information.
Figure 6: The misjudgment of good products because of distractors being too close.
29 pages, 4861 KiB  
Article
A New Approach for Effective Retrieval of Medical Images: A Step towards Computer-Assisted Diagnosis
by Suchita Sharma and Ashutosh Aggarwal
J. Imaging 2024, 10(9), 210; https://doi.org/10.3390/jimaging10090210 - 26 Aug 2024
Viewed by 647
Abstract
The biomedical imaging field has grown enormously in the past decade. In the era of digitization, the demand for computer-assisted diagnosis is increasing day by day. The COVID-19 pandemic further emphasized how retrieving meaningful information from medical repositories can aid in improving the quality of patients' diagnoses. Content-based retrieval of medical images therefore has a very prominent role in fulfilling the ultimate goal of developing automated computer-assisted diagnosis systems. This paper presents a content-based medical image retrieval system that extracts multi-resolution, noise-resistant, rotation-invariant texture features in the form of a novel pattern descriptor, MsNrRiTxP, from medical images. In the proposed approach, the input medical image is first decomposed into three neutrosophic images on its transformation into the neutrosophic domain. Afterwards, three distinct pattern descriptors, i.e., MsTrP, NrTxP, and RiTxP, are derived at multiple scales from the three neutrosophic images. The proposed MsNrRiTxP descriptor is obtained by scale-wise concatenation of the joint histograms of MsTrP×RiTxP and NrTxP×RiTxP. To demonstrate the efficacy of the proposed system, medical images of different modalities, i.e., CT and MRI, from four test datasets are considered in the experimental setup. The retrieval performance of the proposed approach is exhaustively compared with several existing, recent, and state-of-the-art local binary pattern-based variants. The retrieval rates obtained by the proposed approach on the noise-free and noisy variants of the test datasets are substantially higher than those of the compared methods.
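A hedged sketch of the scale-wise joint-histogram construction, substituting scikit-image's built-in LBP variants for the paper's novel MsTrP/NrTxP/RiTxP patterns (which are not reproduced here):

```python
import numpy as np
from skimage.feature import local_binary_pattern

def joint_histogram(map_a, map_b, bins_a, bins_b):
    """Joint histogram of two integer pattern maps, flattened to a vector."""
    h, _, _ = np.histogram2d(map_a.ravel(), map_b.ravel(),
                             bins=[bins_a, bins_b],
                             range=[[0, bins_a], [0, bins_b]])
    return (h / h.sum()).ravel()

def multiscale_descriptor(image, radii=(1, 2, 3, 4), P=8):
    """Scale-wise concatenation of joint histograms of two LBP variants,
    standing in for the paper's pattern pairs."""
    parts = []
    for r in radii:
        riu = local_binary_pattern(image, P, r, method="uniform")      # rotation-invariant uniform
        nri = local_binary_pattern(image, P, r, method="nri_uniform")  # 59 codes for P = 8
        parts.append(joint_histogram(riu, nri, P + 2, P * (P - 1) + 3))
    return np.concatenate(parts)
```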
Figures:
Figure 1: Neutrosophic images of an input medical image transformed into the neutrosophic domain: (a) sample noise-free and noisy medical images, (b) truth image (T_NS), (c) indeterminacy image (I_NS), and (d) falsity image (F_NS).
Figure 2: A sample image patch (around a center pixel p_c, highlighted in red) from a noise-free and a noisy image, illustrating the noise robustness of the TrP_r pattern; the patches are shown at four scales S_1, S_2, S_3, and S_4, corresponding to r = 1, 2, 3, 4, respectively.
Figure 3: Computation of the proposed TrP_r pattern for the center pixel p_c at scales S_1-S_4 (r = 1, 2, 3, 4) on the noise-free and noisy image patches of Figure 2: (a) neighbor vectors p_r for the noise-free patch; (b) median-quantized neighbor vectors mqp_r for the noise-free patch; (c) the proposed TrP_r binary pattern; (d) median-quantized neighbor vectors mqp_r for the noisy patch; (e) neighbor vectors p_r for the noisy patch.
Figure 4: Sample image from each class of (a) the Emphysema CT database, (b) the NEMA CT database, (c) the OASIS MRI database, and (d) the NEMA MRI database.
Figure 5: Sample noisy image from each class of the same four databases.
Figure 6: Query results of the proposed method for noise-free query images on the four databases.
Figure 7: Query results of the proposed method for noisy query images on the four databases.
Figure 8: Retrieval performance of the proposed approach compared with all other methods in terms of avgP on noisy and noise-free images of the four test datasets.
Figure 9: Retrieval performance of the proposed approach compared with all other methods in terms of MavgP on noisy and noise-free images of the four test datasets.
Figure 10: Retrieval performance of the proposed approach compared with all other methods in terms of CV (coefficient of variation) on noisy and noise-free images of the four test datasets.
33 pages, 30114 KiB  
Article
Exploring the Influence of Object, Subject, and Context on Aesthetic Evaluation through Computational Aesthetics and Neuroaesthetics
by Fangfu Lin, Wanni Xu, Yan Li and Wu Song
Appl. Sci. 2024, 14(16), 7384; https://doi.org/10.3390/app14167384 - 21 Aug 2024
Viewed by 719
Abstract
Background: In recent years, computational aesthetics and neuroaesthetics have provided novel insights into understanding beauty. Building upon the findings of traditional aesthetics, this study combines these two research methods to explore an interdisciplinary approach to studying aesthetics. Method: Abstract artworks were used as experimental materials. Drawing on traditional aesthetics, features of composition, tone, and texture were selected, and computational aesthetic methods were employed to map these features to physical quantities: blank space, the gray histogram, the Gray Level Co-occurrence Matrix (GLCM), the Local Binary Pattern (LBP), and Gabor filters. An electroencephalogram (EEG) experiment was carried out in which participants performed aesthetic evaluations of the experimental materials in different contexts (genuine, fake) while their EEG data were recorded, allowing analysis of the impact of the various feature classes on the aesthetic evaluation process. Finally, a Support Vector Machine (SVM) was used to model the feature data, Event-Related Potentials (ERPs), context data, and subjective aesthetic evaluations. Result: Behavioral data revealed higher aesthetic ratings in the genuine context. ERP data indicated that genuine contexts elicited more negative deflections in the prefrontal lobes between 200 and 1000 ms. Class II compositions produced more positive deflections in the parietal lobes at 50-120 ms, while Class I tones evoked more positive amplitudes in the occipital lobes at 200-300 ms. Gabor features showed significant variations in the parieto-occipital area at an early stage, and Class II LBP elicited a prefrontal negative wave with a larger amplitude. The SVM model incorporating aesthetic subject and context data (ACC = 0.76866) outperformed the model using only parameters of the aesthetic object (ACC = 0.68657). Conclusion: A positive context tends to provide participants with a more positive aesthetic experience, although abstract artworks may not respond to this positivity. During aesthetic evaluation, the ERP responses elicited by different features show a trend from global to local processing. The SVM model based on multimodal data fusion effectively predicts aesthetic evaluation, further demonstrating the feasibility of combining computational aesthetics and neuroaesthetics.
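Most of the physical quantities named above map onto standard scikit-image routines; a minimal sketch, assuming an 8-bit grayscale input (the blank-space measure and the paper's exact parameter choices are not reproduced):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern
from skimage.filters import gabor

def artwork_features(gray):
    """Texture quantities for an 8-bit grayscale artwork image:
    gray histogram, GLCM statistics, LBP histogram, and Gabor energy."""
    feats = {}
    feats["gray_hist"], _ = np.histogram(gray, bins=16, range=(0, 256),
                                         density=True)
    glcm = graycomatrix(gray, distances=[1], angles=[0, np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    feats["glcm_contrast"] = graycoprops(glcm, "contrast").ravel()
    feats["glcm_energy"] = graycoprops(glcm, "energy").ravel()
    lbp = local_binary_pattern(gray, 8, 1, method="uniform")
    feats["lbp_hist"], _ = np.histogram(lbp, bins=10, range=(0, 10),
                                        density=True)
    # Convert to float before filtering to avoid uint8 wraparound.
    real, imag = gabor(gray.astype(float), frequency=0.2)
    feats["gabor_energy"] = float(np.mean(real ** 2 + imag ** 2))
    return feats
```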
Figures:
Figure 1: The calculation of blank space in Suprematist Composition: Airplane Flying (image processed by the authors as fair use from wikiart.org; https://www.wikiart.org/en/kazimir-malevich/aeroplane-flying-1915, accessed on 4 March 2024).
Figure 2: Kernels of different wavelengths λ and angles θ.
Figure 3: Example images of features.
Figure 4: Illustration of the stimulus paradigm applied.
Figure 5: Grand-average event-related brain potentials and isopotential contour plot (200-1000 ms) for the genuine and fake contexts. N = 12.
Figure 6: Grand-average event-related brain potentials and isopotential contour plot (50-120 ms) for context (genuine, fake) × composition (Class I, Class II). N = 12.
Figure 7: Grand-average event-related brain potentials and isopotential contour plot (200-300 ms) for context (genuine, fake) × tone (Class I, Class II). N = 12.
Figure 8: Grand-average event-related brain potentials and isopotential contour plot (70-130 ms) for context (genuine, fake) × Gabor-Mean (Class I, Class II). N = 12.
Figure 9: Grand-average event-related brain potentials and isopotential contour plot (70-130 ms) for context (genuine, fake) × Gabor-Variance (Class I, Class II). N = 12.
Figure 10: Grand-average event-related brain potentials and isopotential contour plot (70-130 ms and 200-300 ms) for context (genuine, fake) × Gabor-Energy (Class I, Class II). N = 12.
Figure 11: Grand-average event-related brain potentials and isopotential contour plot (500-1000 ms) for context (genuine, fake) × horizontal GLCM (Class I, Class II). N = 12.
Figure 12: Grand-average event-related brain potentials and isopotential contour plot (70-140 ms and 500-1000 ms) for context (genuine, fake) × diagonal GLCM (Class I, Class II). N = 12.
Figure 13: Grand-average event-related brain potentials and isopotential contour plot (300-1000 ms) for context (genuine, fake) × LBP (Class I, Class II). N = 12.
Figure 14: Performance of SVM models with varying C and γ values: (a) ACC and (b) AUC for different C and γ combinations (colors closer to red indicate higher values; closer to blue, lower values).
16 pages, 4067 KiB  
Article
TriCAFFNet: A Tri-Cross-Attention Transformer with a Multi-Feature Fusion Network for Facial Expression Recognition
by Yuan Tian, Zhao Wang, Di Chen and Huang Yao
Sensors 2024, 24(16), 5391; https://doi.org/10.3390/s24165391 - 21 Aug 2024
Viewed by 624
Abstract
In recent years, significant progress has been made in facial expression recognition methods. However, facial expression recognition in real-world environments still requires further research. This paper proposes a tri-cross-attention transformer with a multi-feature fusion network (TriCAFFNet) to improve facial expression recognition performance under challenging conditions. By combining LBP (Local Binary Pattern) features, HOG (Histogram of Oriented Gradients) features, landmark features, and CNN (convolutional neural network) features from facial images, the model is provided with rich input that improves its ability to discern subtle differences between images. Additionally, tri-cross-attention blocks are designed to facilitate information exchange between different features, enabling mutual guidance among them to capture salient attention. Extensive experiments on several widely used datasets show that TriCAFFNet achieves state-of-the-art (SOTA) performance, with 92.17% on RAF-DB, 67.40% on AffectNet (7 cls), and 63.49% on AffectNet (8 cls).
(This article belongs to the Section Intelligent Sensors)
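A minimal sketch of the handcrafted LBP/HOG branch described in the abstract (the landmark and CNN branches and the tri-cross-attention fusion are omitted; parameter values are illustrative):

```python
import numpy as np
from skimage.feature import hog, local_binary_pattern

def handcrafted_branch(face_gray):
    """LBP histogram + HOG descriptor of a grayscale face crop."""
    lbp = local_binary_pattern(face_gray, 8, 1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    hog_vec = hog(face_gray, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2))
    return np.concatenate([lbp_hist, hog_vec])
```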
Figures:
Figure 1: The architecture of the baseline.
Figure 2: The overall architecture of TriCAFFNet: a facial landmark detector (MobileFaceNet) provides the landmark features and the advanced LBP/HOG features, and an image backbone (IR50) extracts image features.
Figure 3: The architecture of the tri-cross-attention module.
Figure 4: Confusion matrices of TriCAFFNet on RAF-DB, AffectNet (7-cls), and AffectNet (8-cls).
Figure 5: t-SNE visualization of the high-dimensional feature space on RAF-DB and AffectNet.
18 pages, 5196 KiB  
Article
The Framework of Quantifying Biomarkers of OCT and OCTA Images in Retinal Diseases
by Xiaoli Liu, Haogang Zhu, Hanji Zhang and Shaoyan Xia
Sensors 2024, 24(16), 5227; https://doi.org/10.3390/s24165227 - 13 Aug 2024
Viewed by 957
Abstract
Despite the significant advancements facilitated by previous research in introducing a plethora of retinal biomarkers, there is a lack of research addressing the clinical need to quantify different biomarkers and prioritize their importance for guiding clinical decision making in retinal diseases. To address this issue, our study introduces a novel framework for quantifying biomarkers derived from optical coherence tomography (OCT) and optical coherence tomography angiography (OCTA) images in retinal diseases. We extract 452 feature parameters from five feature types, including local binary pattern (LBP) features of OCT and OCTA, capillary and large vessel features, and the foveal avascular zone (FAZ) feature. Leveraging this extensive feature set, we construct a classification model, using statistically relevant p values for feature selection, to predict retinal diseases. The framework achieves a high accuracy of 0.912 and an F1-score of 0.906 in the disease classification task. We find that the LBP features of OCT and OCTA contribute 77.12% of the biomarker significance in predicting retinal diseases, suggesting their potential as latent indicators for clinical diagnosis. This study employs a quantitative analysis framework to identify potential biomarkers for retinal diseases in OCT and OCTA images. Our findings suggest that the LBP parameters; the skewness and kurtosis of the capillary features; the maximum, mean, median, and standard deviation of the large-vessel features; and the eccentricity, compactness, flatness, and anisotropy index of the FAZ may serve as significant indicators of retinal conditions.
(This article belongs to the Section Biomedical Sensors)
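The 59 LBP feature parameters per image (see Figure 4 below) correspond to the 8-neighbor non-rotation-invariant uniform encoding, which has 8 × 7 + 3 = 59 distinct codes; a sketch assuming scikit-image:

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_59(image):
    """59-bin uniform LBP histogram (P = 8, 'nri_uniform' encoding:
    8 * 7 + 3 = 59 distinct codes)."""
    lbp = local_binary_pattern(image, 8, 1, method="nri_uniform")
    hist, _ = np.histogram(lbp, bins=59, range=(0, 59), density=True)
    return hist
```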
Figures:
Figure 1: The framework of quantified biomarkers in retinal diseases.
Figure 2: An example of the data types in the OCT500 dataset. Each patient includes OCT and OCTA images with three measurement structures (FULL, ILM-OPL, and OPL-BM); four binarized structures (capillary, artery, vein, and the FAZ region); and demographic information such as gender, age, left or right eye, and disease details.
Figure 3: LBP images of the FULL, ILM-OPL, and OPL-BM structures in OCT and OCTA images. Normal: healthy individuals without retinal disease; AMD: patients with macular degeneration; DR: patients with diabetic retinopathy.
Figure 4: An example of the 59 LBP feature parameters for the OCT FULL LBP image from Figure 3a.
Figure 5: Example of local indices for a capillary image: VAD (vessel area density), VSD (vessel skeleton density), VPI (vessel perimeter index), VDI (vessel diameter index), VCI (vessel compactness index), VCP (vessel complexity), and SP (vessel shape parameter by curvature).
Figure 6: Feature ranking results of the random forest model in binary classification of retinal diseases.
22 pages, 12904 KiB  
Article
Intelligent Classification and Segmentation of Sandstone Thin Section Image Using a Semi-Supervised Framework and GL-SLIC
by Yubo Han and Ye Liu
Minerals 2024, 14(8), 799; https://doi.org/10.3390/min14080799 - 5 Aug 2024
Viewed by 613
Abstract
This study presents the development and validation of a robust semi-supervised learning framework specifically designed for the automated segmentation and classification of sandstone thin section images from the Yanchang Formation in the Ordos Basin. Traditional geological image analysis methods encounter significant challenges due to the labor-intensive and error-prone nature of manual labeling, compounded by the diversity and complexity of rock thin sections. Our approach addresses these challenges by integrating the GL-SLIC algorithm, which combines Gabor filters and Local Binary Patterns for effective superpixel segmentation, laying the groundwork for advanced component identification. The primary innovation of this research is the semi-supervised learning model that utilizes a limited set of manually labeled samples to generate high-confidence pseudo labels, thereby significantly expanding the training dataset. This methodology effectively tackles the critical challenge of insufficient labeled data in geological image analysis, enhancing the model's generalization capability from minimal initial input. Our framework improves segmentation accuracy by closely aligning superpixels with the intricate boundaries of mineral grains and pores. Additionally, it achieves substantial improvements in classification accuracy across various rock types, reaching up to 96.3% in testing scenarios. This semi-supervised approach represents a significant advancement in computational geology, providing a scalable and efficient solution for detailed petrographic analysis. It not only enhances the accuracy and efficiency of geological interpretations but also supports broader hydrocarbon exploration efforts.
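A hedged sketch of a Gabor-plus-LBP pixel feature stack in the spirit of the GL-BP features (the paper's scales, orientations, and modified SLIC distance are assumptions here):

```python
import numpy as np
from skimage.feature import local_binary_pattern
from skimage.filters import gabor
from skimage.segmentation import slic

def gl_features(gray):
    """Pixel-wise feature stack: one LBP map plus Gabor responses at
    four orientations (the scale/orientation choices are illustrative)."""
    channels = [local_binary_pattern(gray, 8, 1, method="uniform")]
    gfloat = gray.astype(float)
    for theta in (0, np.pi / 4, np.pi / 2, 3 * np.pi / 4):
        real, _ = gabor(gfloat, frequency=0.15, theta=theta)
        channels.append(real)
    return np.stack(channels, axis=-1)

# Plain SLIC for comparison; GL-SLIC replaces the color distance with a
# distance over the feature stack above:
# labels = slic(rgb_image, n_segments=400, compactness=10)
```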
Figures:
Figure 1: Thin section image of sandstone under plane-polarized light; the main components are quartz, kaolinite, matrix, pores, and lithic fragments.
Figure 2: Workflow for recognizing minerals using GL-SLIC segmentation and semi-supervised training.
Figure 3: GL-BP feature extraction workflow integrating the LBP operator and Gabor filters for sandstone thin section images.
Figure 4: Feature extraction visualization using Gabor filters at various scales and orientations for sandstone thin section images.
Figure 5: Mean feature comparison charts at scales 1-6 (panels a-f).
Figure 6: LBP feature extraction: mean feature charts at scales 1-6 (panels a-f).
Figure 7: Semi-supervised self-training process.
Figure 8: Modified VGG16 classifier architecture.
Figure 9: Discriminator model architecture.
Figure 10: Comparison of superpixel segmentation algorithms on sandstone images: (a) original sandstone image; (b) FH; (c) QS; (d) SEEDS; (e) Watershed; (f) LSC; (g) SLIC; (h) GL-SLIC.
Figure 11: Comparison of pre-segmentation results between (a) the SLIC algorithm and (b) the GL-SLIC algorithm.
Figure 12: Detailed comparison between SLIC (detail areas a1-a3) and GL-SLIC (detail areas b1-b3).
Figure 13: Superpixel merging in medium-coarse-grained quartz sandstone: (a) input image; (b) pre-segmentation result; (c) result after superpixel merging.
Figure 14: Iterative model training and data augmentation using labeled and unlabeled rock data to mitigate overfitting and enhance classification accuracy: (a) primary model; (b) discriminator model.
Figure 15: Training and testing accuracy versus epochs for the primary model.
Figure 16: Classification accuracy analysis: (a) training set confusion matrix; (b) test set confusion matrix.
Figure 17: Improved model accuracy after dataset cleansing and enhancement.
Figure 18: Final confusion matrices for model evaluation: (a) training data; (b) testing data.
Figure 19: Component identification results: (a) original petrographic thin section images; (b) proposed method; (c) UNet-based semantic segmentation.
15 pages, 9009 KiB  
Article
Fusing Ground-Penetrating Radar Images for Improving Image Characteristics Fidelity
by Styliani Tassiopoulou and Georgia Koukiou
Appl. Sci. 2024, 14(15), 6808; https://doi.org/10.3390/app14156808 - 4 Aug 2024
Cited by 1 | Viewed by 762
Abstract
The analysis of ground-penetrating radar (GPR) data is of vital importance for detecting subsurface features, which may manifest as hyperbolic peaks indicating a buried object, or as grayscale variation where the soil texture changes. The method focuses on identifying such exaggerated patterns through a series of image-processing steps. Two GPR images are first read and preprocessed by extracting channels, flipping, and resizing. Specific regions of interest (ROIs) are then cropped, and the Fourier transform is applied to bring them into the frequency domain. Using their frequency signatures, the patterns of interest are extracted from the images, and binary masks are constructed to isolate them. The masked images are reconstructed and merged to make the hyperbolic features visible. Finally, Local Binary Pattern (LBP) analysis is used to emphasize the hyperbolic peaks, facilitating their recognition across the whole image. The proposed approach improves the detection performance for subsurface features in GPR data, making it an important tool for geophysical surveys and related applications. The results demonstrate the high performance of the proposed procedure in improving GPR image characteristics.
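The FFT masking, fusion-by-summation, and LBP steps can be sketched as follows (a circular low-pass mask is an assumption; the paper selects regions by their frequency signatures rather than a fixed radius):

```python
import numpy as np
from skimage.feature import local_binary_pattern

def frequency_mask(image, keep_radius=30):
    """Keep low-frequency content of a GPR scan via a circular FFT mask."""
    F = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    mask = (y - rows // 2) ** 2 + (x - cols // 2) ** 2 <= keep_radius ** 2
    return np.abs(np.fft.ifft2(np.fft.ifftshift(F * mask)))

def fuse_and_lbp(scan_fwd, scan_bwd):
    """Sum the two masked scans, rescale to 8 bits, then apply LBP."""
    fused = frequency_mask(scan_fwd) + frequency_mask(scan_bwd)
    fused = (255 * (fused - fused.min())
             / (np.ptp(fused) + 1e-12)).astype(np.uint8)
    return local_binary_pattern(fused, 8, 1, method="uniform")
```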
Figures:
Figure 1: The two craters at Didyma in the Argolis region (map data: Google Earth).
Figure 2: Panagopoula area in the Achaia region (map data: Google Earth).
Figure 3: (a) Forward-direction and (b) backward-direction image scans at the Didyma region.
Figure 4: GPR systems are equipped with two antennas, a transmitter and a receiver. The transmitter emits electromagnetic waves into the ground, which reflect off subsurface structures; the receiver captures these reflections, generating detailed images of hyperbolic responses.
Figure 5: (a) Pixels of part of an exaggerated response, representing one of the neighborhoods where the local binary pattern is computed. (b) The 3 × 3 pixel neighborhood.
Figure 6: Block diagram of the pre-processing and GPR image analysis procedures.
Figure 7: Forward-direction (left) and reverse-direction (right) scans of the Panagopoula area, revealing subsurface structures; comparing both scan directions helps validate the consistency and accuracy of the detected features.
Figure 8: Regions of interest of the two scans: (a) ROI of the forward scan and (b) ROI of the backward scan, used to extract patterns.
Figure 9: Binary pattern extraction from a hyperbolic response, serving as the initial input for the subsequent LBP analysis, for (a) the forward scan and (b) the backward scan.
Figure 10: Reconstructed images after masking and pattern recognition, highlighting the most powerful hyperbolic responses: (a) first scan; (b) second scan.
Figure 11: Part of the summation of the two images, created after applying masking and fusion techniques, effectively highlighting the most powerful hyperbolic responses.
Figure 12: The full summation image created after applying masking and fusion techniques.
Figure 13: Result of LBP analysis on part of the summed image, providing a clearer and more detailed view of the hyperbolas.
Figure 14: LBP of the summation of all images. Red circles: patterns that could represent buried objects. Green rectangles: areas that could be voids, different material types, or other points of interest differing from the surrounding substrate.
18 pages, 4212 KiB  
Article
A Hybrid Model for Household Waste Sorting (HWS) Based on an Ensemble of Convolutional Neural Networks
by Nengkai Wu, Gui Wang and Dongyao Jia
Sustainability 2024, 16(15), 6500; https://doi.org/10.3390/su16156500 - 30 Jul 2024
Viewed by 581
Abstract
The exponential increase in waste generation is a significant global challenge with serious implications. Addressing this issue necessitates the enhancement of waste management processes. This study introduces a method that improves waste separation by integrating learning models at various levels. The method begins with the creation of image features as a new matrix using the Multi-Scale Local Binary Pattern (MLBP) technique. This technique optimally represents features and patterns across different scales. Following this, an ensemble model at the first level merges two Convolutional Neural Network (CNN) models, with each model performing the detection operation independently. A second-level CNN model is then employed to obtain the final output. This model uses the information from the first-level models and combines these features to perform a more accurate detection. The study's novelty lies in the use of a second-level CNN model in the proposed ensemble system for fusing the results obtained from the first level, replacing conventional methods such as voting and averaging. Additionally, the study employs an MLBP feature selection approach for a more accurate description of the HW image features. It uses the Simulated Annealing (SA) algorithm for fine-tuning the hyperparameters of the CNN models, thereby optimizing the system's performance. Based on the accuracy metric, the proposed method achieved an accuracy of 99.01% on the TrashNet dataset and 99.41% on the HGCD dataset. These results indicate a minimum improvement of 0.48% and 0.36%, respectively, compared to the other methods evaluated in this study.
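One plausible reading of the MLBP "new matrix" step, stacking uniform-LBP maps computed at several radii (the paper's exact construction may differ):

```python
import numpy as np
from skimage.feature import local_binary_pattern

def mlbp_matrix(image, radii=(1, 2, 3)):
    """Stack uniform-LBP maps computed at several radii into one
    multi-channel feature matrix."""
    maps = [local_binary_pattern(image, 8 * r, r, method="uniform")
            for r in radii]
    return np.stack(maps, axis=-1)
```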
Figures:
Figure 1: Samples from the TrashNet collection for each category.
Figure 2: Diagram of the proposed method for HWS.
Figure 3: The architecture of the proposed CNN models.
Figure 4: Mean accuracy of the presented model compared to other methods in HWS for (a) TrashNet and (b) HGCD.
Figure 5: Confusion matrices of the introduced model and other models in classifying samples from the TrashNet dataset.
Figure 6: Confusion matrices of (a) the introduced model and (b) the DSCAM method in classifying samples from the HGCD dataset.
Figure 7: Average precision, recall, and F-measure for (a) TrashNet and (b) HGCD.
Figure 8: Per-class performance comparison of different methods in HWS: accuracy (first row), recall (second row), and F-measure (third row) for TrashNet (left column) and HGCD (right column).
Figure 9: ROC curves resulting from HWS for (a) TrashNet and (b) HGCD.
21 pages, 8540 KiB  
Article
LBCNIN: Local Binary Convolution Network with Intra-Class Normalization for Texture Recognition with Applications in Tactile Internet
by Nikolay Neshov, Krasimir Tonchev and Agata Manolova
Electronics 2024, 13(15), 2942; https://doi.org/10.3390/electronics13152942 - 25 Jul 2024
Viewed by 589
Abstract
Texture recognition is a pivotal task in computer vision, crucial for applications in material sciences, medicine, and agriculture. Leveraging advancements in Deep Neural Networks (DNNs), researchers seek robust methods to discern intricate patterns in images. In the context of the burgeoning Tactile Internet (TI), efficient texture recognition algorithms are essential for real-time applications. This paper introduces a texture recognition method named Local Binary Convolution Network with Intra-class Normalization (LBCNIN). Operating on features from the last layer of a backbone, LBCNIN employs a non-trainable Local Binary Convolution (LBC) layer, inspired by Local Binary Patterns (LBP), without fine-tuning the backbone. The encoded feature vector is fed into a linear Support Vector Machine (SVM), the only trainable component, for classification. In the context of TI, the availability of images from multiple views, such as in 3D object semantic segmentation, provides more data per object. LBCNIN therefore processes batches in which every image belongs to the same material class, with batch normalization employed as an intra-class normalization method, aiming to produce better results than processing single images. Comprehensive evaluations across texture benchmarks demonstrate that LBCNIN achieves very good results under different resource constraints, owing to the variety of supported backbone architectures.
(This article belongs to the Section Electronic Multimedia)
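A sketch of a non-trainable Local Binary Convolution layer in PyTorch, following the general LBC recipe of fixed sparse {-1, 0, +1} filters (LBCNIN's exact filter counts and the placement of the intra-class batch normalization are not reproduced):

```python
import torch
import torch.nn as nn

class LocalBinaryConv(nn.Module):
    """Non-trainable Local Binary Convolution layer: fixed sparse
    {-1, 0, +1} filters, frozen at construction time."""

    def __init__(self, in_ch, out_ch, kernel_size=3, sparsity=0.5):
        super().__init__()
        conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                         padding=kernel_size // 2, bias=False)
        signs = torch.randint_like(conv.weight, 0, 2) * 2 - 1   # random +/-1
        keep = torch.bernoulli(torch.full_like(conv.weight, sparsity))
        conv.weight = nn.Parameter(signs * keep, requires_grad=False)
        self.conv = conv

    def forward(self, x):
        return torch.sigmoid(self.conv(x))
```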
Figures:
Figure 1: The proposed LBCNIN architecture for texture recognition. The feature tensor dimension is (M × N × C), where M and N are fixed at 28 for all tested backbones and the number of channels C varies with the backbone (see the Feature Extraction step).
Figure 2: Confusion matrices and accuracies using the ConvNeXt-XL (ImageNet-21K) backbone for each dataset. The folds yielding the lowest accuracy are shown for DTD, KTH-2-b, and GTOS; GTOS-Mobile is shown for its single fold.
Figure 3: 2D t-SNE visualization [47] of encoded texture features on samples from the GTOS-Mobile dataset using the ConvNeXt-XL (ImageNet-21K) backbone.
Figure 4: Sample images of different classes from the DTD dataset (top) and their GradCAM [48] visualizations (bottom) using the ConvNeXt-XL (ImageNet-21K) backbone.
Figure 5: Confusing cases on the GTOS-Mobile dataset using the ConvNeXt-XL (ImageNet-21K) backbone. The top section shows misclassified images with their true labels above them; the bottom section shows the corresponding incorrectly predicted labels, along with similar samples that really belong to the predicted class. For example, the top-right image belongs to the class 'small_limestone' but is mistakenly predicted as 'pebble'; the bottom-right image is very similar, but actually belongs to the class 'pebble'.
12 pages, 4467 KiB  
Article
Monitoring of the Weld Pool, Keyhole Morphology and Material Penetration State in Near-Infrared and Blue Composite Laser Welding of Magnesium Alloy
by Wei Wei, Yang Liu, Haolin Deng, Zhilin Wei, Tingshuang Wang and Guangxian Li
J. Manuf. Mater. Process. 2024, 8(4), 150; https://doi.org/10.3390/jmmp8040150 - 15 Jul 2024
Viewed by 838
Abstract
The laser welding of magnesium alloys presents challenges attributed to their low laser-absorption efficiency, resulting in instabilities during the welding process and substandard welding quality. Furthermore, the complexity of the signals produced during laser welding makes it difficult to accurately monitor the molten state of magnesium alloys. In this study, magnesium alloys were welded using combined near-infrared and blue lasers. By varying the power of the near-infrared laser, the energy absorption behavior of the magnesium alloy under the composite laser was investigated. A U-Net model was employed to segment the welding images and accurately extract the features of the melt pool and keyhole. The penetration states were then predicted using a convolutional neural network (CNN), and an approach employing Local Binary Pattern (LBP) features with a backpropagation (BP) neural network was applied for comparison. The segmentation achieved MPA and MIoU values of 89.54% and 81.81%, and the prediction accuracy of the models reaches up to 100%. The applicability of the two monitoring approaches in different scenarios is discussed, providing guidance for the quality of magnesium welding.
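Once U-Net has segmented a frame, simple geometric features of the melt pool and keyhole can feed the state classifier; a sketch with an assumed label convention:

```python
import numpy as np

def pool_keyhole_geometry(seg):
    """Geometric features from a segmented monitoring frame.

    seg: integer label map; the convention 1 = melt pool, 2 = keyhole
    is an assumption for this sketch.
    """
    pool_area = int((seg == 1).sum())
    keyhole_area = int((seg == 2).sum())
    return {
        "pool_area": pool_area,
        "keyhole_area": keyhole_area,
        "keyhole_to_pool_ratio": keyhole_area / (pool_area + 1e-12),
    }
```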
Figures:
Figure 1: Schematic of the laser welding platform and monitoring system: (a) welding and monitoring system; (b) installation locations of equipment and samples.
Figure 2: Cross-sectional images and monitoring images of the weld in the three penetration states I, II, and III, corresponding to the UP, FP, and EP states. Dashed lines mark the weld edge of the welded sample cross-section; green dotted lines mark the manually annotated weld pool and keyhole edges in the monitoring images.
Figure 3: Overview of the state monitoring method. In the segmented image, red marks the molten pool and green marks the keyhole.
Figure 4: Weld seam morphology at different laser powers; the red dashed lines correspond to the sections in Figure 2.
Figure 5: U-Net architecture [21].
Figure 6: Weld seam position distribution and the number of sample images in the dataset.
Figure 7: U-Net image processing flow; the green dotted lines are manually annotated weld pool and keyhole edges.
Figure 8: Segmented image of the fused monitoring image and the original monitoring image.
Figure 9: U-Net performance evaluation indicators: (a) MPA; (b) MIoU.
Figure 10: Training loss and test accuracy of three different models.
Figure 11: Confusion matrix of the BP neural network; the model predicts the FP state with 100% accuracy, and the errors stem from confusion between UP and EP.
22 pages, 3024 KiB  
Article
Augmenting Aquaculture Efficiency through Involutional Neural Networks and Self-Attention for Oplegnathus Punctatus Feeding Intensity Classification from Log Mel Spectrograms
by Usama Iqbal, Daoliang Li, Zhuangzhuang Du, Muhammad Akhter, Zohaib Mushtaq, Muhammad Farrukh Qureshi and Hafiz Abbad Ur Rehman
Animals 2024, 14(11), 1690; https://doi.org/10.3390/ani14111690 - 5 Jun 2024
Cited by 2 | Viewed by 786
Abstract
Understanding the feeding dynamics of aquatic animals is crucial for aquaculture optimization and ecosystem management. This paper proposes a novel framework for analyzing fish feeding behavior based on a fusion of spectrogram-extracted features and a deep learning architecture. Raw audio waveforms are first transformed into Log Mel Spectrograms, and a fusion of features (the Discrete Wavelet Transform, the Gabor filter, the Local Binary Pattern, and the Laplacian High-Pass Filter), followed by a well-adapted deep model, is proposed to capture the crucial spectral information that helps distinguish the various forms of fish feeding behavior. An Involutional Neural Network (INN)-based deep learning model is used for classification, achieving an accuracy of up to 97% across various temporal segments. The proposed methodology is shown to be effective in accurately classifying the feeding intensities of Oplegnathus punctatus, enabling insights pertinent to aquaculture enhancement and ecosystem management. Future work may include additional feature extraction modalities and multi-modal data integration to further our understanding and contribute towards the sustainable management of marine resources.
(This article belongs to the Special Issue Animal Health and Welfare in Aquaculture)
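The waveform-to-Log-Mel-Spectrogram step maps directly onto librosa; a minimal sketch (the sampling rate and mel-band count are illustrative, not the paper's settings):

```python
import numpy as np
import librosa

def log_mel_spectrogram(path, sr=22050, n_mels=128):
    """Raw audio waveform -> Log Mel Spectrogram (dB scale)."""
    y, sr = librosa.load(path, sr=sr)
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(S, ref=np.max)
```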
Figures:
Figure 1: Structure of the experimental recirculating aquaculture system.
Figure 2: Audio waveforms captured from aquaculture: (a) None class, (b) Medium class, and (c) Strong class.
Figure 3: Conversion of audio waveforms to Log Mel Spectrograms (magma colormap): (a) audio waveform, (b) corresponding short-time Fourier transform, (c) Mel spectrogram, and (d) Log Mel Spectrogram.
Figure 4: Segmentation of audio waveforms and their corresponding Log Mel Spectrograms (magma colormap): (a-c) audio of the None, Medium, and Strong classes; (d-f) the corresponding Log Mel Spectrograms.
Figure 5: Discrete wavelet transform extracted from the Log Mel Spectrogram images of each class: (a) None, (b) Medium, and (c) Strong.
Figure 6: Gabor filter applied to the Log Mel Spectrogram images of each class: (a) None, (b) Medium, and (c) Strong.
Figure 7: Local Binary Pattern extracted from the Log Mel Spectrogram images of each class: (a) None, (b) Medium, and (c) Strong.
Figure 8: Laplacian high-pass filter applied to the Log Mel Spectrogram images of each class: (a) None, (b) Medium, and (c) Strong.
Figure 9: Combined features extracted from the LMS of a 'Strong' class sample, as input to the model.
Figure 10: Involutional Neural Network: (a) involution layer; (b) involution layer with self-attention.
Figure A1: Confusion matrices: (a) Involutional Neural Network, (b) VGG16, (c) VGG19, (d) ResNet50, (e) Xception, (f) EfficientNet-b0, (g) InceptionNetV3, and (h) MobileNetV2.
15 pages, 1699 KiB  
Article
Enhancing Medical Image Classification with an Advanced Feature Selection Algorithm: A Novel Approach to Improving the Cuckoo Search Algorithm by Incorporating Caputo Fractional Order
by Abduljlil Abduljlil Ali Abduljlil Habeb, Mundher Mohammed Taresh, Jintang Li, Zhan Gao and Ningbo Zhu
Diagnostics 2024, 14(11), 1191; https://doi.org/10.3390/diagnostics14111191 - 5 Jun 2024
Viewed by 758
Abstract
Glaucoma is a chronic eye condition that seriously impairs vision and requires early diagnosis and treatment, making automated detection techniques essential for obtaining a timely diagnosis. In this paper, we propose a novel feature selection method that integrates the cuckoo search algorithm with the Caputo fractional order (CFO-CS) to enhance glaucoma classification. Because the Caputo definition, implemented as an infinite series, suffers from memory-length truncation issues, we adopt a fixed memory step and an adjustable term count for optimization. We conducted experiments integrating various feature extraction techniques, including histograms of oriented gradients (HOGs), local binary patterns (LBPs), and deep features from MobileNet and VGG19, into a unified vector. The informative features selected by the proposed method are evaluated with a k-nearest neighbor classifier, and data augmentation is used to enhance the diversity and quantity of the training set. The proposed method improves convergence speed and the attainment of optimal solutions during training. The results on the test set demonstrate superior performance: 92.62% accuracy, 94.70% precision, 93.52% F1-score, 92.98% specificity, 92.36% sensitivity, and 85.00% Matthews correlation coefficient. These results confirm the efficiency of the proposed method, rendering it a generalizable and applicable technique in ophthalmology.
(This article belongs to the Special Issue Classification of Diseases Using Machine Learning Algorithms)
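Wrapper-based feature selection of this kind typically scores a candidate binary mask with a classifier plus a sparsity term; a generic sketch (not the paper's exact CFO-CS objective, and the weighting alpha is an assumption):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def fitness(mask, X, y, alpha=0.99):
    """Score a candidate binary feature mask: cross-validated k-NN
    accuracy traded against the fraction of features kept."""
    if mask.sum() == 0:
        return 0.0
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=5),
                          X[:, mask.astype(bool)], y, cv=3).mean()
    return alpha * acc + (1 - alpha) * (1.0 - mask.mean())
```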
Figures:
Figure 1: Outline of the proposed model.
Figure 2: Sample images from the datasets: (a,b) glaucoma; (c,d) healthy.
Figure 3: Average training (a) accuracy and (b) loss for different α values at each M.
Figure 4: Convergence profiles of CFO-CS and CS for feature selection.
Figure 5: Confusion matrices of k-NN performance using (a) CS and (b) CFO-CS for feature selection.
Figure 6: Convergence profiles of CFO-CS and WOA for feature selection.
14 pages, 18445 KiB  
Article
A Zero-Watermarking Algorithm Based on Scale-Invariant Feature Reconstruction Transform
by Fan Li and Zhong-Xun Wang
Appl. Sci. 2024, 14(11), 4756; https://doi.org/10.3390/app14114756 - 31 May 2024
Viewed by 560
Abstract
To effectively protect and verify the copyright information of multimedia digital works, this paper proposes a zero-watermarking algorithm based on carrier image feature point descriptors. The constructed feature matrix consists of two parts: the feature descriptor vector calculated by the scale-invariant feature reconstruction transform (SIFRT) and the multi-radius local binary pattern (MrLBP) descriptor vector. The algorithm performs standardization, feature decomposition, and redundancy reduction on the traditional keypoint descriptor matrix, combines it with the texture feature matrix, and matches the dimensions of the copyright information. A key advantage of the algorithm is that it does not modify the original data. Compared with global features, the local features computed from a subset of key points reduce the attack interference introduced during copyright verification, thereby reducing the number of erroneous pixel values. The algorithm introduces a timestamp mechanism when uploading the generated zero-watermark image to a third-party copyright center, preventing subsequent tampering. Experimental analysis demonstrates that the algorithm exhibits good discriminability, security, and robustness.
(This article belongs to the Section Electrical, Electronics and Communications Engineering)
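Zero-watermarking schemes of this family typically bind a binarized feature matrix to the copyright bits with XOR, so the carrier image itself is never modified; a generic sketch (the binarization of the SIFRT + MrLBP feature matrix is assumed to happen upstream, and the paper's exact construction may differ):

```python
import numpy as np

def generate_zero_watermark(feature_bits, copyright_bits):
    """Zero-watermark = feature matrix XOR copyright bits; the carrier
    image is left untouched."""
    return np.bitwise_xor(feature_bits, copyright_bits)

def verify(zero_watermark, feature_bits_from_suspect_image):
    """XOR again to recover the copyright bits for comparison."""
    return np.bitwise_xor(zero_watermark, feature_bits_from_suspect_image)
```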
Figures:
Figure 1: The construction process of the SIFRT feature matrix.
Figure 2: Schematic diagram of bilinear interpolation.
Figure 3: The fractal structure of chaotic sequences.
Figure 4: Zero-watermark generation algorithm process.
Figure 5: Copyright information extraction algorithm process.
Figure 6: Experimental materials and names.
Figure 7: Comparison of feature matrix similarity between different images.
Figure 8: Comparison of feature similarity under approximate key conditions.
Figure 9: Details of feature extraction for similar images.
Figure 10: Comparison of feature similarity between similar images.
12 pages, 2469 KiB  
Article
Partial Discharge Pattern Recognition Based on an Ensembled Simple Convolutional Neural Network and a Quadratic Support Vector Machine
by Zhangjun Fei, Yiying Li and Shiyou Yang
Energies 2024, 17(11), 2443; https://doi.org/10.3390/en17112443 - 21 May 2024
Viewed by 668
Abstract
Partial discharge (PD) is a crucial and intricate electrical phenomenon observed in various types of electrical equipment, and identifying and characterizing PDs is essential for upholding the integrity and reliability of electrical assets. This paper proposes an ensemble methodology that aims to strike a balance between model complexity and predictive performance in PD pattern recognition. A simple convolutional neural network (SCNN) was constructed to keep the number of model parameters small. A quadratic support vector machine (QSVM) was built and ensembled with the SCNN model to improve PD recognition accuracy; the input to the QSVM is the circular local binary pattern (CLBP) extracted from the enhanced image. A testing prototype with three types of PD was constructed, and 3D phase-resolved pulse sequence (PRPS) spectrograms were measured and recorded by ultra-high-frequency (UHF) sensors. The proposed methodology was compared with three existing lightweight CNNs. The experimental results on the collected dataset emphasize the benefits of the proposed method: high recognition accuracy with relatively few model parameters, rendering it well suited for PD pattern recognition on resource-constrained devices.
(This article belongs to the Section F1: Electrical Power System)
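A sketch of the CLBP-histogram-plus-quadratic-SVM branch (a degree-2 polynomial kernel stands in for the quadratic SVM; the image enhancement step and the SCNN ensemble are omitted, and the P/R values are illustrative):

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def clbp_hist(image, P=8, R=2):
    """Circular LBP histogram of an enhanced PRPS image."""
    lbp = local_binary_pattern(image, P, R, method="uniform")
    h, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return h

# A quadratic SVM is an SVC with a degree-2 polynomial kernel:
# qsvm = SVC(kernel="poly", degree=2).fit(X_train, y_train)
```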
Figures:
Figure 1: Typical 3D-PRPS graphs of (a) suspended electrode discharge, (b) surface discharge, and (c) metal tip discharge.
Figure 2: Central features of MobileNet V2: (a) depthwise separable convolution consisting of depthwise and pointwise convolutions; (b) bottleneck residual block.
Figure 3: The prototype testing device for PD.
Figure 4: Image preprocessing for SVM training.
Figure 5: ROC curves and AUC values for (a) suspended electrode discharge, (b) surface discharge, and (c) metal tip discharge.
Figure 6: The procedure of the proposed PD pattern recognition methodology.
Figure 7: Confusion matrices of the testing data with (a) SCNN, (b) QSVM, (c) ENS-SCNN-QSVM, (d) RF, (e) XGBoost, (f) MobileNet V2, (g) EfficientNetB0, and (h) ShuffleNet.