Search Results (707)

Search Parameters:
Keywords = word embeddings

18 pages, 928 KiB  
Article
Fractal Analysis of GPT-2 Token Embedding Spaces: Stability and Evolution of Correlation Dimension
by Minhyeok Lee
Fractal Fract. 2024, 8(10), 603; https://doi.org/10.3390/fractalfract8100603 - 17 Oct 2024
Abstract
This paper explores the fractal properties of token embedding spaces in GPT-2 language models by analyzing the stability of the correlation dimension, a measure of geometric complexity. Token embeddings represent words or subwords as vectors in a high-dimensional space. We hypothesize that the correlation dimension D2 remains consistent across different vocabulary subsets, revealing fundamental structural characteristics of language representation in GPT-2. Our main objective is to quantify and analyze the stability of D2 in these embedding subspaces, addressing the challenges posed by their high dimensionality. We introduce a new theorem formalizing this stability, stating that for any two sufficiently large random subsets S1, S2 ⊆ E, the difference in their correlation dimensions is less than a small constant ε. We validate this theorem using the Grassberger–Procaccia algorithm for estimating D2, coupled with bootstrap sampling for statistical consistency. Our experiments on GPT-2 models of varying sizes demonstrate remarkable stability in D2 across different subsets, with consistent mean values and small standard errors. We further investigate how the model size, embedding dimension, and network depth impact D2. Our findings reveal distinct patterns of D2 progression through the network layers, contributing to a deeper understanding of the geometric properties of language model representations and informing new approaches in natural language processing. Full article
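The Grassberger–Procaccia estimator mentioned in the abstract computes the correlation integral C(r), the fraction of point pairs closer than r, and reads D2 off the slope of log C(r) versus log r. A minimal pure-Python sketch of that idea (the toy 2-D point cloud and the radii are illustrative, not the paper's embedding data):

```python
import math
import random

def correlation_integral(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is < r."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < r
    )
    return 2.0 * close / (n * (n - 1))

def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) versus log r: the D2 estimate."""
    xs, ys = [], []
    for r in radii:
        c = correlation_integral(points, r)
        if c > 0:  # log is undefined for empty shells
            xs.append(math.log(r))
            ys.append(math.log(c))
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )

# Sanity check: points uniform in a 2-D square should give D2 close to 2
# (slightly below, because of boundary effects at the larger radii).
random.seed(0)
pts = [(random.random(), random.random()) for _ in range(400)]
d2 = correlation_dimension(pts, radii=[0.05, 0.1, 0.2, 0.4])
```

The paper additionally wraps this estimate in bootstrap resampling over vocabulary subsets; the sketch shows only the core estimator.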
Figure 1: Impact of subset size on correlation dimension for different GPT-2 model sizes.
Figure 2: Layer-wise progression of correlation dimension in GPT-2 models.
21 pages, 1156 KiB  
Article
EDSCVD: Enhanced Dual-Channel Smart Contract Vulnerability Detection Method
by Huaiguang Wu, Yibo Peng, Yaqiong He and Siqi Lu
Symmetry 2024, 16(10), 1381; https://doi.org/10.3390/sym16101381 - 17 Oct 2024
Abstract
Ensuring the absence of vulnerabilities or flaws in smart contracts before their deployment is crucial for the smooth progress of subsequent work. Existing detection methods rely heavily on expert rules, resulting in low robustness and accuracy. Therefore, we propose EDSCVD, an enhanced deep learning vulnerability detection model based on dual-channel networks. Firstly, the contract fragments are preprocessed by BERT into the required word embeddings. Next, we apply adversarial training (FGM) to the word embeddings to generate perturbations, thereby producing symmetric adversarial samples and enhancing the robustness of the model. Then, a dual-channel model combining BiLSTM and CNN is used for feature training to obtain more comprehensive and symmetric information on temporal and local contract features. Finally, the combined output features are passed through a classifier to classify and detect contract vulnerabilities. Experimental results show that EDSCVD exhibits excellent detection performance on classical reentrancy vulnerabilities, timestamp dependencies, and integer overflow vulnerabilities. Full article
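FGM, the adversarial training step named in the abstract, perturbs each word embedding along the loss gradient, scaled to a fixed L2 norm. A minimal sketch of that single step (epsilon and the toy gradient are illustrative; the paper applies this to BERT embeddings during training):

```python
import math

def fgm_perturb(embedding, grad, epsilon=1.0):
    """FGM adversarial step: embedding + epsilon * grad / ||grad||_2.

    If the gradient is all zeros, the embedding is returned unchanged.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(embedding)
    return [e + epsilon * g / norm for e, g in zip(embedding, grad)]

# Toy example: the offset always has L2 norm epsilon.
# Here ||grad|| = 5, so the offset is [0.15, 0.0, 0.2].
adv = fgm_perturb([0.5, -0.2, 0.1], grad=[3.0, 0.0, 4.0], epsilon=0.25)
```

In full training the model is run once more on the perturbed embeddings and the two losses are combined; the sketch covers only the perturbation itself.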
Figure 1: Reentrancy source code.
Figure 2: Timestamp dependency source code.
Figure 3: Integer overflow source code (∗ denotes the multiplication operator; && is the logical operator combining two Boolean expressions).
Figure 4: The overall architecture of EDSCVD.
Figure 5: Contract fragment representation.
Figure 6: The structure of BERT (own elaboration based on [49]).
Figure 7: The adversarial training method FGM.
Figure 8: Dual-channel network architecture (own elaboration based on [49]).
Figure 9: Structure of a single LSTM module (own elaboration based on [22]).
Figure 10: Multi-head attention mechanisms.
Figure 11: Epochs and evaluation metrics in model training.
30 pages, 3530 KiB  
Article
Spotting Leaders in Organizations with Graph Convolutional Networks, Explainable Artificial Intelligence, and Automated Machine Learning
by Yunbo Xie, Jose D. Meisel, Carlos A. Meisel, Juan Jose Betancourt, Jianqi Yan and Roberto Bugiolacchi
Appl. Sci. 2024, 14(20), 9461; https://doi.org/10.3390/app14209461 - 16 Oct 2024
Abstract
Over the past few decades, the study of leadership theory has expanded across various disciplines, delving into the intricacies of human behavior and defining the roles of individuals within organizations. Its primary objective is to identify leaders who play significant roles in the communication flow. In addition, behavioral theory posits that leaders can be distinguished based on their daily conduct, while social network analysis provides valuable insights into behavioral patterns. Our study investigates five and six types of social networks frequently observed in two organizations, using datasets we collected from an IT company and public datasets from a manufacturing company for a thorough evaluation of prediction performance. We leverage PageRank and effective word embedding techniques to obtain novel features. State-of-the-art performance is obtained using various statistical machine learning methods, graph convolutional networks (GCNs), automated machine learning (AutoML), and explainable artificial intelligence (XAI). More specifically, our approach achieves accuracy close to 90% for leader identification on data from projects of different types. This investigation contributes to the establishment of sustainable leadership practices by aiding organizations in retaining their leadership talent. Full article
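PageRank, one of the feature-extraction steps named above, can be sketched with plain power iteration (the four-person interaction graph below is invented for illustration, not taken from the study's data):

```python
def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on an adjacency list {node: [out-neighbors]}."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = adj[v]
            if out:  # distribute rank along outgoing edges
                share = damping * rank[v] / len(out)
                for w in out:
                    new[w] += share
            else:  # dangling node: spread its rank uniformly
                for w in nodes:
                    new[w] += damping * rank[v] / n
        rank = new
    return rank

# Toy interaction graph: B, C, and D all direct communication at A,
# so A accumulates the highest score.
scores = pagerank({"A": ["B"], "B": ["A"], "C": ["A"], "D": ["A"]})
leader = max(scores, key=scores.get)
```

In the study the scores serve as node features fed into the downstream classifiers rather than as a ranking by themselves.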
(This article belongs to the Special Issue Data Mining and Machine Learning in Social Network Analysis)
Figure 1: Illustration of the different steps of our proposed procedures.
Figure 2: General idea of a graph convolutional network.
Figure 3: Importance and impact of features on prediction by SHAP; summary charts (a), (b), and (c) explain the entire datasets of Projects A, E, and F, respectively.
Figure 4: Illustration of the automated machine learning (AutoML) workflow on the H2O platform.
Figure 5: Interaction graphs based on the five types of inter-colleague relationships in Project A; graphs (a–e) show relationships (1) to (5), respectively.
Figure 6: Interaction graphs based on the six types of inter-colleague relationships in Project E; graphs (a–f) show relationships (1) to (6), respectively.
Figure 7: The workflow of the explainable artificial intelligence (XAI) method SHapley Additive exPlanations (SHAP).
18 pages, 1657 KiB  
Technical Note
Emitter Signal Deinterleaving Based on Single PDW with Modulation-Hypothesis-Augmented Transformer
by Huajun Liu, Longfei Wang and Gan Wang
Remote Sens. 2024, 16(20), 3830; https://doi.org/10.3390/rs16203830 - 15 Oct 2024
Abstract
Radar emitter signal deinterleaving based on pulse description words (PDWs) is a challenging task in the field of electronic warfare because of the parameter sparsity and uncertainty of PDWs. In this paper, a modulation-hypothesis-augmented Transformer model is proposed to identify emitters from a single PDW in an end-to-end manner. Firstly, the pulse features are enriched by the modulation hypothesis mechanism to generate I/Q complex signals from PDWs. Secondly, a multiple-parameter embedding method is proposed to expand the signal discriminative features and to enhance the identification capability of emitters. Moreover, a novel Transformer deep learning model, named PulseFormer and composed of spectral convolution, multi-layer perceptron, and self-attention-based basic blocks, is proposed for discriminative feature extraction, emitter identification, and signal deinterleaving. Experimental results on a synthesized PDW dataset show that the proposed method performs better on emitter signal deinterleaving in complex environments without relying on the pulse repetition interval (PRI). Compared with other deep learning methods, the PulseFormer performs better in noisy environments. Full article
Figure 1: Diagram of Transformer-based signal deinterleaving.
Figure 2: Architecture of the PulseFormer model.
Figure 3: Components of the PulseFormer model: (a) SP-Conv; (b) MHSA; (c) MLP.
Figure 4: Histogram visualization of different features in the first experiment: (a) PW; (b) CF; (c) PA; (d) DOA.
Figure 5: Histogram visualization of different features in the second experiment: (a) PW; (b) CF; (c) PA; (d) DOA.
Figure 6: Confusion matrices with and without multiple-parameter embedding: (a) modulation-hypothesis augmentation; (b) modulation-hypothesis augmentation with multiple-parameter embedding.
Figure 7: t-SNE visualization of the high-dimensional feature distribution predicted by the model under modulation-hypothesis augmentation: (a) without multiple-parameter embedding; (b) with multiple-parameter embedding.
Figure 8: Performance comparison under different noise conditions.
Figure 9: Performance comparison under pulse loss.
27 pages, 920 KiB  
Article
AI-Generated Spam Review Detection Framework with Deep Learning Algorithms and Natural Language Processing
by Mudasir Ahmad Wani, Mohammed ElAffendi and Kashish Ara Shakil
Computers 2024, 13(10), 264; https://doi.org/10.3390/computers13100264 - 12 Oct 2024
Abstract
Spam reviews pose a significant challenge to the integrity of online platforms, misleading consumers and undermining the credibility of genuine feedback. This paper introduces an innovative AI-generated spam review detection framework that leverages Deep Learning algorithms and Natural Language Processing (NLP) techniques to identify and mitigate spam reviews effectively. Our framework utilizes multiple Deep Learning models, including Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, Gated Recurrent Units (GRUs), and Bidirectional LSTMs (BiLSTMs), to capture intricate patterns in textual data. The system processes and analyzes large volumes of review content to detect deceptive patterns by utilizing advanced NLP and text embedding techniques such as One-Hot Encoding, Word2Vec, and Term Frequency–Inverse Document Frequency (TF-IDF). By combining the three embedding techniques with the four Deep Learning algorithms, a total of twelve exhaustive experiments were conducted to detect AI-generated spam reviews. The experimental results demonstrate that our approach outperforms traditional machine learning models, offering a robust solution for ensuring the authenticity of online reviews. Among the models evaluated, those employing Word2Vec embeddings, particularly the BiLSTM_Word2Vec model, exhibited the strongest performance. The BiLSTM model with Word2Vec achieved the highest performance, with an exceptional accuracy of 98.46%, a precision of 0.98, a recall of 0.97, and an F1-score of 0.98, reflecting a near-perfect balance between precision and recall. Its high F2-score (0.9810) and F0.5-score (0.9857) further highlight its effectiveness in accurately detecting AI-generated spam while minimizing false positives, making it the most reliable option for this task. Similarly, the Word2Vec-based LSTM model also performed exceptionally well, with an accuracy of 97.58%, a precision of 0.97, a recall of 0.96, and an F1-score of 0.97. The CNN model with Word2Vec likewise delivered strong results, achieving an accuracy of 97.61%, a precision of 0.97, a recall of 0.96, and an F1-score of 0.97. This study is unique in its focus on detecting spam reviews specifically generated by AI-based tools rather than solely detecting spam reviews or AI-generated text. This research contributes to the field of spam detection by offering a scalable, efficient, and accurate framework that can be integrated into various online platforms, enhancing user trust and decision-making processes. Full article
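TF-IDF, one of the three embedding techniques compared above, weights each term by its within-document frequency and its rarity across documents. A minimal sketch using one common smoothing variant (the toy review corpus is illustrative, not the paper's data):

```python
import math
from collections import Counter

def tfidf(corpus):
    """TF-IDF vectors (as dicts) for a list of tokenized documents.

    TF is the within-document relative frequency; IDF here uses the
    smoothed form log(N / (1 + df)) + 1, one of several common variants.
    """
    n = len(corpus)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in corpus for term in set(doc))
    vectors = []
    for doc in corpus:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log(n / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors

docs = [
    "great product works great".split(),
    "terrible product broke fast".split(),
    "works as described".split(),
]
vecs = tfidf(docs)
# "great" appears only in doc 0, so it outweighs the corpus-wide "product".
```

Libraries differ in their exact smoothing and normalization; the relative ordering of weights is what the models consume.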
Figure 1: Detailed data collection procedure.
Figure 2: Generating AI-based spam/fake reviews based on human-authored samples.
Figure 3: Check of the working of the GPT module.
Figure 4: Data preparation and preprocessing with the NLTK toolkit.
Figure 5: Experimental setup and configuration.
Figure 6: Performance of selected Deep Learning models on the TF-IDF representation.
Figure 7: Performance of selected Deep Learning models on the Word2Vec feature representation.
Figure 8: Performance of selected Deep Learning models on One-Hot Encoding.
Figure 9: Radar plot of the proposed approaches; the Word2Vec-based BiLSTM outperformed the existing methods.
Figure 10: Heptagon: seven ways to prevent abuse and ensure ethical use of AI-generated reviews.
15 pages, 4255 KiB  
Article
Enhancing Neural Machine Translation Quality for Kannada–Tulu Language Pairs through Transformer Architecture: A Linguistic Feature Integration
by Musica Supriya, U Dinesh Acharya and Ashalatha Nayak
Designs 2024, 8(5), 100; https://doi.org/10.3390/designs8050100 - 12 Oct 2024
Abstract
The rise of intelligent systems demands good machine translation models that are less data-hungry and more efficient, especially for low- and extremely-low-resource languages with few or no data available. By integrating a linguistic feature to enhance the quality of translation, we have developed a generic Neural Machine Translation (NMT) model for Kannada–Tulu language pairs. The NMT model uses a Transformer architecture, a state-of-the-art model for translating text from Kannada to Tulu, and learns from parallel data. Kannada and Tulu are both low-resource Dravidian languages, with Tulu recognised as extremely low resource. Dravidian languages are morphologically rich and highly agglutinative in nature, and only a few NMT models exist for Kannada–Tulu language pairs; these exhibit poor translation scores as they fail to capture the linguistic features of the languages. The proposed generic approach can benefit other low-resource Indic languages that have smaller parallel corpora for NMT tasks. Evaluation metrics like Bilingual Evaluation Understudy (BLEU), character-level F-score (chrF) and Word Error Rate (WER) are considered to obtain the improved translation scores for the linguistic-feature-embedded NMT model. These results hold promise for further experimentation with other low- and extremely-low-resource language pairs. Full article
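WER, one of the evaluation metrics listed above, is the word-level Levenshtein distance between hypothesis and reference, normalized by reference length. A minimal dynamic-programming sketch (the English example pair is illustrative, standing in for Tulu output):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j]: edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[-1][-1] / len(ref)

# One substitution ("the" -> "a") over six reference words: WER = 1/6.
wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
```

Lower is better; unlike BLEU and chrF, WER can exceed 1.0 when the hypothesis needs more edits than the reference has words.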
Figure 1: Steps involved in developing the POS-integrated Kannada-to-Tulu machine translation system.
Figure 2: BLEU score and chrF obtained for the generated Tulu translations with the inclusion of POS on the source side.
Figure 3: BLEU score and chrF obtained for the generated Tulu translations without the inclusion of POS on the source side.
18 pages, 10227 KiB  
Article
Revamping Image-Recipe Cross-Modal Retrieval with Dual Cross Attention Encoders
by Wenhao Liu, Simiao Yuan, Zhen Wang, Xinyi Chang, Limeng Gao and Zhenrui Zhang
Mathematics 2024, 12(20), 3181; https://doi.org/10.3390/math12203181 - 11 Oct 2024
Abstract
The image-recipe cross-modal retrieval task, which retrieves the relevant recipes according to food images and vice versa, is now attracting widespread attention. There are two main challenges. Firstly, a recipe's different components (words in a sentence, sentences in an entity, and entities in a recipe) have different weight values; if all components share the same weight, the recipe embeddings cannot pay more attention to the important components, which then contribute less to the retrieval task. Secondly, food images have obvious properties of locality, and only the local food regions matter; it remains difficult to enhance the discriminative local region features in food images. To address these two problems, we propose a novel framework named Dual Cross Attention Encoders for Cross-modal Food Retrieval (DCA-Food). The proposed framework consists of a hierarchical cross attention recipe encoder (HCARE) and a cross attention image encoder (CAIE). HCARE consists of three types of cross attention modules to capture the important words in a sentence, the important sentences in an entity, and the important entities in a recipe, respectively. CAIE extracts global and local region features, then calculates cross attention between them to enhance the discriminative local features in the food images. We conduct ablation studies to validate our design choices. Our proposed approach outperforms the existing approaches by a large margin on the Recipe1M dataset. Specifically, we improve R@1 performance by +2.7 and +1.9 on the 1k and 10k testing sets, respectively. Full article
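The cross attention at the core of HCARE and CAIE is scaled dot-product attention in which the query comes from one stream and the keys/values from another (e.g., a global image feature attending over local region features). A minimal single-query sketch (the 2-D toy vectors are illustrative, not the framework's actual dimensions):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(query, keys, values):
    """Scaled dot-product attention: one query vector attends over a
    sequence of key/value vectors and returns the weighted value sum."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# Toy example: the query aligns with the first key, so the output
# leans heavily toward the first value vector.
out = cross_attention(
    query=[1.0, 0.0],
    keys=[[4.0, 0.0], [0.0, 4.0]],
    values=[[1.0, 0.0], [0.0, 1.0]],
)
```

The paper stacks such modules hierarchically (word, sentence, entity level) and with multiple heads; the sketch shows a single head and a single query.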
(This article belongs to the Section Mathematics and Computer Science)
Figure 1: An overview of general frameworks in image-recipe cross-modal retrieval tasks.
Figure 2: Food images from the Recipe1M dataset. Each image is mostly food with a small portion of noise: (1) a bowl; (2) a plate; (3) a fork; (4) a spoon.
Figure 3: Overview of the DCA-Food framework. HCARE and CAIE extract recipe and image features, respectively; both are projected into a common embedding space, and the model is trained by minimizing the triplet loss and semantic loss.
Figure 4: The composition of a recipe: three entities (title, ingredients, and instructions), each consisting of one or more sentences.
Figure 5: The structure of HCARE. WCAE transforms the title entity r_ttl into the enhanced sentence-level embedding se_ttl; since the title is a single sentence, se_ttl also serves as the title's entity-level embedding e_ttl. For the multi-sentence ingredients and instructions, lists of enhanced sentence-level embeddings (r'_ing and r'_ins) are generated and processed by SCAE to obtain the enhanced entity-level embeddings en'_ing and en'_ins. The three entity-level embeddings (en'_ing, en'_ins, and e_ttl) are input to ECAE, producing the further enhanced recipe-level embedding e''_R, which is projected into the same dimensional space as the food image embedding to yield the recipe embedding e_R.
Figure 6: The cross attention enhancement module, applied at word, sentence, and entity level; each level has its own TR and CAD modules with different parameters, inputs, and outputs. TR computes self attention over an input embedding sequence x = {x_0, …, x_K}; the average of y = {y_0, …, y_K} at the last layer gives the intermediate representation e; CAD then computes cross attention with e as Q and the input sequence as K and V to obtain E.
Figure 7: The multi-head self-attention module.
Figure 8: Cross-Attention Image Encoder. Given a food image, PAR-Net extracts d discriminative local region features l_i, while a ViT captures attention-based global features g. The Cross Attention Decoder computes cross attention with the global features as Q and the local features as K and V, producing the enhanced image embedding e'_I, which is projected into the same dimensional space as the recipe embedding e_R to yield the final food image embedding e_I.
Figure 9: Qualitative results. Each row shows one query (left) and the top five retrieved items, with the true target highlighted in red: (a) image-to-recipe retrieval; (b) recipe-to-image retrieval.
16 pages, 2121 KiB  
Article
Enhancement of Named Entity Recognition in Low-Resource Languages with Data Augmentation and BERT Models: A Case Study on Urdu
by Fida Ullah, Alexander Gelbukh, Muhammad Tayyab Zamir, Edgardo Manuel Felipe Riverόn and Grigori Sidorov
Computers 2024, 13(10), 258; https://doi.org/10.3390/computers13100258 - 10 Oct 2024
Abstract
Identifying and categorizing proper nouns in text, known as named entity recognition (NER), is crucial for various natural language processing tasks. However, developing effective NER techniques for low-resource languages like Urdu poses challenges due to limited training data, particularly in the Nastaliq script. To address this, our study introduces a novel data augmentation method, "contextual word embeddings augmentation" (CWEA), for Urdu, aiming to enrich existing datasets. The extended dataset, comprising 160,132 tokens and 114,912 labeled entities, significantly enhances the coverage of named entities compared to previous datasets. We evaluated several transformer models on this augmented dataset, including BERT-multilingual, RoBERTa-Urdu-small, BERT-base-cased, and BERT-large-cased. Notably, the BERT-multilingual model outperformed the others, achieving the highest macro F1 score of 0.982, surpassing the macro F1 scores of RoBERTa-Urdu-small (0.884), BERT-large-cased (0.916), and BERT-base-cased (0.908). Additionally, our neural network model achieved a micro F1 score of 96%, the RNN model achieved 97%, and the BiLSTM model achieved a macro F1 score of 96% on augmented data. Our findings underscore the efficacy of data augmentation techniques in enhancing NER performance for low-resource languages like Urdu. Full article
Figure 1: Main steps of the proposed methodology.
Figure 2: Example of original and augmented sentences with English translation.
Figure 3: The architecture of the proposed BERT model.
Figure 4: Confusion matrix for the best model.
Figure 5: Comparison of the results with and without augmentation [24].
15 pages, 2357 KiB  
Article
Dynamic Multi-Granularity Translation System: DAG-Structured Multi-Granularity Representation and Self-Attention
by Shenrong Lv, Bo Yang, Ruiyang Wang, Siyu Lu, Jiawei Tian, Wenfeng Zheng, Xiaobing Chen and Lirong Yin
Systems 2024, 12(10), 420; https://doi.org/10.3390/systems12100420 - 9 Oct 2024
Abstract
In neural machine translation (NMT), the sophistication of word embeddings plays a pivotal role in the model’s ability to render accurate and contextually relevant translations. However, conventional models with single granularity of word segmentation cannot fully embed complex languages like Chinese, where the granularity of segmentation significantly impacts understanding and translation fidelity. Addressing these challenges, our study introduces the Dynamic Multi-Granularity Translation System (DMGTS), an innovative approach that enhances the Transformer model by incorporating multi-granularity position encoding and multi-granularity self-attention mechanisms. Leveraging a Directed Acyclic Graph (DAG), the DMGTS utilizes four levels of word segmentation for multi-granularity position encoding. Dynamic word embeddings are also introduced to enhance the lexical representation by incorporating multi-granularity features. Multi-granularity self-attention mechanisms are applied to replace the conventional self-attention layers. We evaluate the DMGTS on multiple datasets, where our system demonstrates marked improvements. Notably, it achieves significant enhancements in translation quality, evidenced by increases of 1.16 and 1.55 in Bilingual Evaluation Understudy (BLEU) scores over traditional static embedding methods. These results underscore the efficacy of the DMGTS in refining NMT performance. Full article
(This article belongs to the Section Artificial Intelligence and Digital Systems Engineering)
Show Figures
Figure 1: Architecture of the proposed DMGTS.
Figure 2: Performance comparison of different word segmentation methods.
Figure 3: Comparison of output encoding procedures with and without adding relative positions. (a) SA; (b) the output encoding of the first “I”; (c) the output encoding of the second “I”.
Figure 4: Multi-granularity position encoding using DAG.
Figure 5: Multi-granularity relative distance matrix.
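The DAG-based multi-granularity representation described in the abstract above can be sketched as a segmentation lattice: nodes are candidate words at different granularities, and each node's character span doubles as its position encoding. This is a minimal illustration, not the authors' implementation; the sentence and vocabulary are invented for the example.

```python
# Sketch: a segmentation lattice (DAG) over one sentence. Nodes are candidate
# words at different granularities; each carries the (start, end) character
# span that a multi-granularity position encoding could be built from.

def build_lattice(sentence, vocab):
    """Return all in-vocabulary spans as DAG nodes: (start, end, word)."""
    nodes = []
    n = len(sentence)
    for i in range(n):
        for j in range(i + 1, n + 1):
            piece = sentence[i:j]
            if j - i == 1 or piece in vocab:  # single characters are always nodes
                nodes.append((i, j, piece))
    return nodes

sentence = "南京市长江大桥"
vocab = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
lattice = build_lattice(sentence, vocab)
# Words of different granularities covering the same characters share span
# positions, which is what lets coarse and fine segmentations coexist.
for start, end, word in lattice:
    print(start, end, word)
```

Note how overlapping candidates such as 南京市 and 市长 both appear in the lattice; a single fixed segmentation would have to discard one of them.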
20 pages, 1853 KiB  
Article
Chinese Named Entity Recognition Based on Multi-Level Representation Learning
by Weijun Li, Jianping Ding, Shixia Liu, Xueyang Liu, Yilei Su and Ziyi Wang
Appl. Sci. 2024, 14(19), 9083; https://doi.org/10.3390/app14199083 - 8 Oct 2024
Viewed by 532
Abstract
Named Entity Recognition (NER) is a crucial component of Natural Language Processing (NLP). When dealing with the high diversity and complexity of the Chinese language, existing Chinese NER models face challenges in addressing word sense ambiguity, capturing long-range dependencies, and maintaining robustness, which hinders the accuracy of entity recognition. To this end, a Chinese NER model based on multi-level representation learning is proposed. The model leverages a pre-trained word-based embedding to capture contextual information. A linear layer adjusts dimensions to fit an Extended Long Short-Term Memory (XLSTM) network, enabling the capture of long-range dependencies and contextual information, and providing deeper representations. An adaptive multi-head attention mechanism is proposed to enhance the ability to capture global dependencies and comprehend deep semantic context. Additionally, GlobalPointer with rotational position encoding integrates global information for entity category prediction. Projected Gradient Descent (PGD) is incorporated, introducing perturbations in the embedding layer of the pre-trained model to enhance stability in noisy environments. The proposed model achieves F1-scores of 96.89%, 74.89%, 72.19%, and 80.96% on the Resume, Weibo, CMeEE, and CLUENER2020 datasets, respectively, demonstrating improvements over baseline and comparison models. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
Show Figures
Figure 1: WP-XAG Model.
Figure 2: mLSTM block structure diagram.
Figure 3: Adaptive multi-head attention structure diagram.
Figure 4: GlobalPointer for entity tagging.
Figure 5: Model iteration comparison.
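The PGD step mentioned in the abstract above (perturbing the embedding layer to improve robustness) can be sketched in a few lines of NumPy: take gradient steps on the input, then project the accumulated perturbation back onto an epsilon-ball. The quadratic toy loss and all hyperparameter values are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

# Sketch of Projected Gradient Descent (PGD) perturbation on an embedding:
# repeatedly step in the gradient direction of the loss, then project the
# accumulated perturbation back onto an L2 ball of radius eps.

def pgd_perturb(embedding, grad_fn, eps=0.1, alpha=0.03, steps=5):
    delta = np.zeros_like(embedding)
    for _ in range(steps):
        g = grad_fn(embedding + delta)         # gradient of loss w.r.t. input
        delta += alpha * g / (np.linalg.norm(g) + 1e-12)
        norm = np.linalg.norm(delta)
        if norm > eps:                         # projection onto the eps-ball
            delta *= eps / norm
    return embedding + delta

emb = np.array([0.5, -0.2, 0.1])
grad = lambda x: 2 * x                         # gradient of the toy loss ||x||^2
adv = pgd_perturb(emb, grad)
print(np.linalg.norm(adv - emb))               # perturbation stays within eps
```

In training, the model would then be optimized on the perturbed embeddings so that small adversarial shifts in the input no longer flip entity predictions.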
21 pages, 2103 KiB  
Article
On the Utilization of Emoji Encoding and Data Preprocessing with a Combined CNN-LSTM Framework for Arabic Sentiment Analysis
by Hussam Alawneh, Ahmad Hasasneh and Mohammed Maree
Modelling 2024, 5(4), 1469-1489; https://doi.org/10.3390/modelling5040076 - 7 Oct 2024
Viewed by 512
Abstract
Social media users often express their emotions through text in posts and tweets, and these can be used for sentiment analysis, identifying text as positive or negative. Sentiment analysis is critical for different fields such as politics, tourism, e-commerce, education, and health. However, sentiment analysis approaches that perform well on English text encounter challenges with Arabic text due to its morphological complexity. Effective data preprocessing and machine learning techniques are essential to overcome these challenges and provide insightful sentiment predictions for Arabic text. This paper evaluates a combined CNN-LSTM framework with emoji encoding for Arabic Sentiment Analysis, using the Arabic Sentiment Twitter Corpus (ASTC) dataset. Three experiments were conducted with eight-parameter fusion approaches to evaluate the effect of data preprocessing, specifically the effect of encoding emojis according to their literal and emotional meanings. Emoji meanings were collected from four websites specialized in finding the meaning of emojis in social media. Furthermore, the Keras tuner optimized the CNN-LSTM parameters during the 5-fold cross-validation process. The highest accuracy rate (91.85%) was achieved by keeping non-Arabic words and removing punctuation, using the Snowball stemmer after encoding emojis into Arabic text, and applying Keras embedding. This approach is competitive with other state-of-the-art approaches, showing that emoji encoding enriches text by accurately reflecting emotions. It also enabled investigation of the effect of data preprocessing, allowing the hybrid model to achieve results comparable to a prior study on the same ASTC dataset and thereby improve sentiment analysis accuracy. Full article
Show Figures
Figure 1: The workflow of the proposed model for Arabic sentiment analysis.
Figure 2: The number of positive and negative tweets.
Figure 3: The proposed CNN-LSTM model architecture for Arabic sentiment analysis.
Figure 4: Confusion matrix of experiment 2 R3.
Figure 5: ROC curve of experiment 2 R3.
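The emoji-encoding step described in the abstract above (replacing each emoji with an Arabic word expressing its meaning before tokenization) can be sketched as a simple lookup-and-replace. The three-entry mapping below is a tiny illustrative dictionary, not the lexicon the authors compiled from the four emoji websites.

```python
# Sketch of the emoji-encoding preprocessing step: before tokenization, each
# emoji is replaced with an Arabic word expressing its meaning.

EMOJI_TO_ARABIC = {
    "😂": "ضحك",   # laughter
    "❤️": "حب",    # love
    "😢": "حزن",   # sadness
}

def encode_emojis(text, mapping=EMOJI_TO_ARABIC):
    for emoji, meaning in mapping.items():
        text = text.replace(emoji, " " + meaning + " ")
    return " ".join(text.split())  # normalize whitespace

tweet = "الفيلم رائع 😂❤️"
print(encode_emojis(tweet))
```

After this substitution the emotional content of the emoji survives stemming and embedding, which is what the paper credits for the accuracy gain.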
14 pages, 600 KiB  
Article
The Influence of the L1 on L2 Collocation Processing in Tamil-English Bilingual Children
by Roopa Leonard, Holly Joseph and Michael Daller
Languages 2024, 9(10), 319; https://doi.org/10.3390/languages9100319 - 3 Oct 2024
Viewed by 352
Abstract
This study examines the influence of Tamil (L1) on the processing of English (L2) collocations during reading for Tamil-English bilingual children. Building on existing research in formulaic language, we used an online processing tool to investigate whether cross-linguistic transfer can be extended beyond single lexical items to collocations in bilingual children, a population that is underrepresented in this research area. Fifty-eight children aged 9–10 years from a school in Chennai, India, took part. Using self-paced reading, children’s reading times were measured for both congruent (with an equivalent in the L1) and incongruent (without an equivalent in the L1) English collocations embedded in short passages. There were two reading modes (single and chunk), which allowed reading times for the whole collocations and the individual words of the collocations to be examined. Results showed that children read congruent collocations more quickly than incongruent collocations in both modes. For congruent collocations, children read the second word more quickly than the first word, but the reverse was true for incongruent collocations. These results suggest that the L1 (Tamil) is activated during the processing stage of reading English collocations for Tamil-English bilingual children in this context. Full article
Show Figures
Figure 1: Reading times (in ms) for Word 1 and Word 2 in the congruent and incongruent conditions in single mode.
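The self-paced reading analysis described above boils down to comparing mean reading times by condition and word position. The sketch below uses invented reading times whose pattern merely mirrors the reported result (congruent: Word 2 faster; incongruent: Word 1 faster); none of these numbers come from the study.

```python
# Sketch of the self-paced reading comparison: mean reading times (ms) for
# Word 1 vs. Word 2 of congruent and incongruent collocations.
from statistics import mean

trials = [
    # (condition, word_position, reading_time_ms) -- illustrative values only
    ("congruent", 1, 520), ("congruent", 2, 460),
    ("congruent", 1, 540), ("congruent", 2, 470),
    ("incongruent", 1, 550), ("incongruent", 2, 610),
    ("incongruent", 1, 560), ("incongruent", 2, 620),
]

def mean_rt(condition, position):
    return mean(t for c, p, t in trials if c == condition and p == position)

for cond in ("congruent", "incongruent"):
    print(cond, mean_rt(cond, 1), mean_rt(cond, 2))
```

A speed-up on Word 2 of congruent collocations is the signature of the L1 pre-activating the expected second word.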
24 pages, 10896 KiB  
Article
Enhanced TextNetTopics for Text Classification Using the G-S-M Approach with Filtered fastText-Based LDA Topics and RF-Based Topic Scoring: fasTNT
by Daniel Voskergian, Rashid Jayousi and Malik Yousef
Appl. Sci. 2024, 14(19), 8914; https://doi.org/10.3390/app14198914 - 3 Oct 2024
Viewed by 529
Abstract
TextNetTopics is a novel topic modeling-based topic selection approach that finds highly ranked discriminative topics for training text classification models, where a topic is a set of semantically related words. However, it suffers from several limitations, including the retention of redundant or irrelevant features within topics, a computationally intensive topic-scoring mechanism, and a lack of explicit semantic modeling. In order to address these shortcomings, this paper proposes fasTNT, an enhanced version of TextNetTopics grounded in the Grouping–Scoring–Modeling approach. FasTNT aims to improve the topic selection process by preserving only informative features within topics, reforming LDA topics using fastText word embeddings, and introducing an efficient scoring method that considers topic interactions using Random Forest feature importance. Experimental results on four diverse datasets demonstrate that fasTNT outperforms the original TextNetTopics method in classification performance and feature reduction. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
Show Figures
Figure 1: The general framework of the fasTNT approach, illustrating the working mechanisms of the T, F, and G components (Part 1).
Figure 2: The general framework of the fasTNT approach, illustrating the working mechanisms of the S and M components (Part 2). The bullet is intended to indicate the scoring of the remaining topic clusters.
Figure 3: The working mechanism of the T component.
Figure 4: The working mechanism of the F component.
Figure 5: The working mechanism of the G component.
Figure 6: The working mechanism of the S component (feature importance-based topic scoring).
Figure 7: The working mechanism of the M component (first iteration). The red border covers the topic clusters and their corresponding word lists utilized for the training and testing processes in the specified iteration.
Figure 8: The working mechanism of the M component (second iteration). The red border covers the topic clusters and their corresponding word lists utilized for the training and testing processes in the specified iteration.
Figure 9: F1-score performance of fasTNT across various feature-discarding percentages (v%) when utilizing the WOS-5736 dataset. The circles on the line represent the number of accumulated topic clusters.
Figure 10: F1-score performance of fasTNT across various feature-discarding percentages (v%) when utilizing the LitCovid dataset. The circles on the line represent the number of accumulated topic clusters.
Figure 11: F1-score performance of fasTNT across various feature-discarding percentages (v%) when utilizing the MultiLabel dataset. The circles on the line represent the number of accumulated topic clusters.
Figure 12: F1-score performance of fasTNT across various feature-discarding percentages (v%) when utilizing the arXiv dataset. The circles on the line represent the number of accumulated topic clusters.
Figure 13: F1-score performance comparison of fasTNT and TextNetTopics when utilizing the WOS-5736 dataset. The maximum and the minimum F1-scores attained by each algorithm are highlighted. The circles on the line represent the number of accumulated topic clusters.
Figure 14: F1-score performance comparison of fasTNT and TextNetTopics when utilizing the LitCovid dataset. The maximum and the minimum F1-scores attained by each algorithm are highlighted. The circles on the line represent the number of accumulated topic clusters.
Figure 15: F1-score performance comparison of fasTNT and TextNetTopics when utilizing the MultiLabel dataset. The maximum and the minimum F1-scores attained by each algorithm are highlighted. The circles on the line represent the number of accumulated topic clusters.
Figure 16: F1-score performance comparison of fasTNT and TextNetTopics when utilizing the arXiv dataset. The maximum and the minimum F1-scores attained by each algorithm are highlighted. The circles on the line represent the number of accumulated topic clusters.
Figure 17: Percentage of feature reduction achieved by fasTNT over TextNetTopics for the WOS-5736 and LitCovid datasets when reaching specific F1-score performance.
Figure 18: Percentage of feature reduction achieved by fasTNT over TextNetTopics for the MultiLabel and arXiv datasets when reaching specific F1-score performance.
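The Random Forest-based topic scoring in the abstract above reduces to a simple aggregation once the forest has been fit: each topic's score is the sum of the importances of the word features it contains, and topics are ranked by that score. The importance vector and topic-to-word-index mapping below are made up for illustration; in practice the importances would come from a fitted forest's `feature_importances_`.

```python
# Sketch of the S component: score each topic by summing the Random Forest
# feature importances of the words it contains, then rank topics.

feature_importance = [0.05, 0.20, 0.02, 0.30, 0.10, 0.08, 0.25]  # one per word

topics = {                       # topic id -> indices of its member words
    "topic_a": [0, 2, 5],
    "topic_b": [1, 3],
    "topic_c": [4, 6],
}

def score_topics(topics, importance):
    """Return (score, topic_id) pairs sorted from highest to lowest score."""
    return sorted(
        ((sum(importance[i] for i in idxs), tid) for tid, idxs in topics.items()),
        reverse=True,
    )

ranking = score_topics(topics, feature_importance)
print(ranking)  # highest-scoring topics first
```

Because the forest is fit once over all features, interactions between words in different topics are reflected in the importances, which is cheaper than re-scoring each topic with its own classifier.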
14 pages, 880 KiB  
Review
Embeddings for Efficient Literature Screening: A Primer for Life Science Investigators
by Carlo Galli, Claudio Cusano, Stefano Guizzardi, Nikolaos Donos and Elena Calciolari
Metrics 2025, 1(1), 1; https://doi.org/10.3390/metrics1010001 - 30 Sep 2024
Viewed by 441
Abstract
As the number of publications is quickly growing in any area of science, the need to efficiently find relevant information amidst a large number of similarly themed articles becomes very important. Semantic searching through text documents has the potential to overcome the limits of keyword-based searches, especially since the introduction of attention-based transformers, which can capture contextual nuances of meaning in single words, sentences, or whole documents. The deployment of these computational tools has been made simpler and accessible to investigators in every field of research thanks to a growing number of dedicated libraries, but knowledge of how meaning representation strategies work is crucial to making the most out of these instruments. The present work aims to introduce the technical evolution of meaning representation systems, from vectors to embeddings and transformers, to life science investigators with no previous knowledge of natural language processing. Full article
Show Figures
Figure 1: Diagram representing the architecture of the Word2Vec shallow neural network.
Figure 2: Word2Vec creates word embeddings by discarding the input one-hot encoded vector and retaining the weights.
Figure 3: Principal Component Analysis is one algorithm for dimensionality reduction, which works by picking the components (i.e., the dimensions) along which the data show the greatest degree of variation.
Figure 4: Embeddings can be reduced to two or three dimensions and used as Cartesian coordinates within a semantic space in (A) 2D or (B) 3D scatterplots. The closer the points in the scatterplot, the closer the semantics of the words.
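The dimensionality-reduction step illustrated in the primer's figures (projecting high-dimensional embeddings onto the top principal components for a semantic scatterplot) can be sketched with plain NumPy, using SVD on mean-centered vectors. The toy five-dimensional "embeddings" are random placeholders, not real word vectors.

```python
import numpy as np

# Sketch of reducing word embeddings to 2D for a semantic scatterplot:
# PCA via SVD on mean-centered vectors, keeping the top two components.

def pca_2d(embeddings):
    X = embeddings - embeddings.mean(axis=0)      # center the data
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T                           # project on top-2 components

rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 5))                     # 6 "words", 5-dim embeddings
coords = pca_2d(emb)
print(coords.shape)                               # one (x, y) point per word
```

The first output column captures at least as much variance as the second, which is why the 2D plot preserves as much of the embedding geometry as any linear projection can.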
24 pages, 2069 KiB  
Article
Automated Detection of Misinformation: A Hybrid Approach for Fake News Detection
by Fadi Mohsen, Bedir Chaushi, Hamed Abdelhaq, Dimka Karastoyanova and Kevin Wang
Future Internet 2024, 16(10), 352; https://doi.org/10.3390/fi16100352 - 27 Sep 2024
Viewed by 479
Abstract
The rise of social media has transformed the landscape of news dissemination, presenting new challenges in combating the spread of fake news. This study addresses the automated detection of misinformation within written content, a task that has prompted extensive research efforts across various methodologies. We evaluate existing benchmarks, introduce a novel hybrid word embedding model, and implement a web framework for text classification. Our approach integrates traditional term frequency–inverse document frequency (TF–IDF) methods with sophisticated feature extraction techniques, considering linguistic, psychological, morphological, and grammatical aspects of the text. Through a series of experiments on diverse datasets, applying transfer and incremental learning techniques, we demonstrate the effectiveness of our hybrid model in surpassing benchmarks and outperforming alternative experimental setups. Furthermore, our findings emphasize the importance of dataset alignment and balance in transfer learning, as well as the utility of incremental learning in maintaining high detection performance while reducing runtime. This research offers promising avenues for further advancements in fake news detection methodologies, with implications for future research and development in this critical domain. Full article
(This article belongs to the Special Issue Embracing Artificial Intelligence (AI) for Network and Service)
Show Figures
Figure 1: Overview of our fake news detection framework.
Figure 2: The first page of our framework, in which users are prompted to select the database, the machine learning algorithms, and other parameters.
Figure A1: Snapshot of a PDF report produced by our framework.
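The hybrid representation described in the abstract above (TF–IDF combined with handcrafted text features) can be sketched from scratch in a few lines. The two handcrafted features below, token count and exclamation-mark ratio, are simple stand-ins chosen for illustration; the paper's actual linguistic, psychological, morphological, and grammatical features are richer.

```python
import math

# Sketch of a hybrid representation: TF-IDF scores computed from scratch,
# concatenated with simple handcrafted features.

docs = [
    "shocking news you will not believe this",
    "the committee published its annual report",
]

def tf_idf(doc_tokens, all_docs):
    """Smoothed TF-IDF for one tokenized document against a corpus."""
    n = len(all_docs)
    vec = {}
    for term in set(doc_tokens):
        tf = doc_tokens.count(term) / len(doc_tokens)
        df = sum(term in d.split() for d in all_docs)
        vec[term] = tf * math.log((1 + n) / (1 + df))
    return vec

def hybrid_features(doc, all_docs):
    tokens = doc.split()
    handcrafted = [len(tokens), doc.count("!") / max(len(doc), 1)]
    return tf_idf(tokens, all_docs), handcrafted

tfidf, extra = hybrid_features(docs[0], docs)
print(extra)
```

A classifier then consumes the concatenation of both feature groups, so lexical evidence and stylistic signals contribute jointly to the fake/real decision.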