MDPI - Publisher of Open Access Journals

31 pages, 2279 KiB

Open Access Article

Achieving High Accuracy in Android Malware Detection through Genetic Programming Symbolic Classifier

by Nikola Anđelić and Sandi Baressi Šegota

Computers 2024, 13(8), 197; https://doi.org/10.3390/computers13080197 (registering DOI) - 15 Aug 2024

Viewed by 45

The detection of Android malware is of paramount importance for safeguarding users’ personal and financial data from theft and misuse. It plays a critical role in ensuring the security and privacy of sensitive information on mobile devices, thereby preventing unauthorized access and potential damage. Moreover, effective malware detection is essential for maintaining device performance and reliability by mitigating the risks posed by malicious software. This paper introduces a novel approach to Android malware detection, leveraging a publicly available dataset in conjunction with a Genetic Programming Symbolic Classifier (GPSC). The primary objective is to generate symbolic expressions (SEs) that can accurately identify malware with high precision. To address the challenge of imbalanced class distribution within the dataset, various oversampling techniques are employed. Optimal hyperparameter configurations for GPSC are determined through a random hyperparameter values search (RHVS) method developed in this research. The GPSC model is trained using a 10-fold cross-validation (10FCV) technique, producing a set of 10 SEs for each dataset variation. Subsequently, the most effective SEs are integrated into a threshold-based voting ensemble (TBVE) system, which is then evaluated on the original dataset. The proposed methodology achieves a maximum accuracy of 0.956, thereby demonstrating its effectiveness for Android malware detection.
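
A threshold-based voting ensemble can be illustrated with a minimal sketch. The abstract does not define the voting rule, so the version below assumes that each symbolic expression casts a binary vote and that a sample is labelled as malware once the number of positive votes reaches a chosen threshold. The function name `threshold_vote` and the toy votes are hypothetical.

```python
from typing import List


def threshold_vote(votes: List[List[int]], threshold: int) -> List[int]:
    """Combine binary votes from several classifiers.

    votes[i][j] is the 0/1 vote of classifier i for sample j.
    A sample is labelled 1 when at least `threshold` classifiers vote 1.
    (Assumed rule; the abstract does not specify the exact scheme.)
    """
    n_samples = len(votes[0])
    labels = []
    for j in range(n_samples):
        positive = sum(member[j] for member in votes)
        labels.append(1 if positive >= threshold else 0)
    return labels


# Toy votes from three hypothetical symbolic expressions on four samples.
votes = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
print(threshold_vote(votes, threshold=2))
```

Raising the threshold makes the ensemble more conservative about flagging a sample, which trades recall for precision.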

26 pages, 2739 KiB

Open Access Article

Diverse but Relevant Recommendations with Continuous Ant Colony Optimization

by Hakan Yılmazer and Selma Ayşe Özel

Mathematics 2024, 12(16), 2497; https://doi.org/10.3390/math12162497 - 13 Aug 2024

Viewed by 260

Abstract

This paper introduces a novel method called AcoRec, which employs an enhanced version of Continuous Ant Colony Optimization for hyper-parameter adjustment and integrates a non-deterministic model to generate diverse recommendation lists. AcoRec is designed for cold-start users and long-tail item recommendations by leveraging implicit data from collaborative filtering techniques. Continuous Ant Colony Optimization is revisited with the convenience and flexibility of solid deep learning methods and extended within the AcoRec model. The approach computes stochastic variations of item probability values based on the initial predictions derived from a selected item-similarity model. The structure of the AcoRec model enables efficient handling of high-dimensional data while maintaining an effective balance between diversity and high recall, leading to recommendation lists that are both varied and highly relevant to user tastes. Our results demonstrate that AcoRec outperforms existing state-of-the-art methods, including two random-walk models, a graph-based approach, a well-known vanilla autoencoder model, an ACO-based model, and baseline models with related similarity measures, across various evaluation scenarios. These evaluations employ well-known metrics to assess the quality of top-N recommendation lists, using popular datasets including MovieLens, Pinterest, and Netflix.

(This article belongs to the Section Mathematics and Computer Science)

22 pages, 44198 KiB

Open Access Article

Real-Time Simulation of Tube Hydroforming by Integrating Finite-Element Method and Machine Learning

by Liang Cheng, Haijing Guo, Lingyan Sun, Chao Yang, Feng Sun and Jinshan Li

J. Manuf. Mater. Process. 2024, 8(4), 175; https://doi.org/10.3390/jmmp8040175 - 12 Aug 2024

Viewed by 305

Abstract

The real-time, full-field simulation of the tube hydroforming process is crucial for deformation monitoring and the timely prediction of defects. However, this is rather difficult for finite-element simulation due to its time-consuming nature. To overcome this drawback, this paper proposes a surrogate model framework that integrates the finite-element method (FEM) and machine learning (ML), in which the basic methodology involves interrupting the computational workflow of the FEM and reassembling it with ML. Specifically, the displacement field, as the primary unknown quantity to be solved using the FEM, was mapped onto the displacement boundary conditions of the tube component with ML. To this end, the titanium tube material and the hydroforming process were investigated, and a fairly accurate FEM model was developed based on the CPB06 yield criterion coupled with a simplified Kim–Tuan hardening model. Numerous FEM simulations were performed by varying the loading conditions to generate the training database for ML. Then, a random forest algorithm was applied and trained to develop the surrogate model, in which the grid search method was employed to obtain the optimal combination of the hyperparameters. Subsequently, the principal strain, the effective strain/stress, and the wall thickness were derived according to continuum mechanics theories. Although further improvements are still required in certain aspects, the developed FEM-ML surrogate model delivered extraordinary accuracy and instantaneity in reproducing multi-physical fields, especially the displacement field and wall-thickness distribution, demonstrating its feasibility for the real-time, full-field simulation and monitoring of deformation states.
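
The grid search step mentioned above can be sketched in a few lines. This is a generic exhaustive search, not the authors' implementation: the hyperparameter names `n_trees` and `max_depth`, the candidate values, and the scoring function `toy_score` are all hypothetical stand-ins for a cross-validated score of a random forest.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple


def grid_search(
    grid: Dict[str, List[int]],
    score: Callable[[Dict[str, int]], float],
) -> Tuple[Optional[Dict[str, int]], float]:
    """Evaluate every combination in `grid` and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def toy_score(params: Dict[str, int]) -> float:
    # Hypothetical stand-in for a validation score; it peaks at
    # n_trees=200 and max_depth=10 by construction.
    return -abs(params["n_trees"] - 200) / 100 - abs(params["max_depth"] - 10) / 10


grid = {"n_trees": [50, 100, 200, 400], "max_depth": [5, 10, 20]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

The cost grows with the product of the candidate list lengths (12 evaluations here), which is why grid search is usually limited to a handful of hyperparameters.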

22 pages, 9056 KiB

Open Access Article

Classification of Rock Mass Quality in Underground Rock Engineering with Incomplete Data Using XGBoost Model and Zebra Optimization Algorithm

by Bo Yang, Yongping Liu, Zida Liu, Quanqi Zhu and Diyuan Li

Appl. Sci. 2024, 14(16), 7074; https://doi.org/10.3390/app14167074 - 12 Aug 2024

Viewed by 313

Abstract

Accurate rock mass quality classification is crucial for the design and construction of underground projects. Traditional methods often rely on expert experience, introducing subjectivity, and struggle with complex geological conditions. Machine learning algorithms have mitigated this issue, but obtaining complete rock mass quality datasets is often difficult due to high cost and complex procedures. This study proposed a hybrid XGBoost model for predicting rock mass quality using incomplete datasets. The zebra optimization algorithm (ZOA) and Bayesian optimization (BO) were used to optimize the hyperparameters of the model. Data from various regions and types of underground engineering projects were utilized. Adaptive synthetic (ADASYN) oversampling addressed class imbalance. The model was evaluated using metrics including accuracy, Kappa, precision, recall, and F1-score. The ZOA-XGBoost model achieved an accuracy of 0.923 on the test set, demonstrating the best overall performance. Feature importance analysis and individual conditional expectation (ICE) plots highlighted the roles of RQD and UCS in predicting rock mass quality. The model’s robustness with incomplete data was verified by comparing its performance with other machine learning models on a dataset with missing values. The ZOA-XGBoost model outperformed other models, proving its reliability and effectiveness. This study provides an efficient and objective method for rock mass quality classification, offering significant value for engineering applications.

(This article belongs to the Special Issue Computational Mechanics and Digital Applications in the Mineral Resources Sector)

31 pages, 6880 KiB

Open Access Article

Multi-Dimensional Global Temporal Predictive Model for Multi-State Prediction of Marine Diesel Engines

by Liyong Ma, Siqi Chen, Shuli Jia, Yong Zhang and Hai Du

J. Mar. Sci. Eng. 2024, 12(8), 1370; https://doi.org/10.3390/jmse12081370 - 11 Aug 2024

Viewed by 382

Abstract

The reliability and stability of marine diesel engines are pivotal to the safety and economy of maritime operations. Accurate and efficient prediction of the states of these engines is essential for performance evaluation and operational continuity. This paper introduces a novel hybrid deep learning model, the multi-dimensional global temporal predictive (MDGTP) model, designed for synchronous multi-state prediction of marine diesel engines. The model incorporates parallel multi-head attention mechanisms, an enhanced long short-term memory (LSTM) with interleaved residual connections, and gated recurrent units (GRUs). Additionally, we propose a dynamic arithmetic tuna optimization algorithm, which combines tuna swarm optimization (TSO) and the arithmetic optimization algorithm (AOA) for hyperparameter optimization, thereby enhancing prediction accuracy. Comparative experiments using actual marine diesel engine data demonstrate that our model outperforms the LSTM, GRU, LSTM–GRU, support vector regression (SVR), random forest (RF), Gaussian process regression (GPR), and back propagation (BP) models, achieving the lowest root mean squared error (RMSE) and mean absolute error (MAE), as well as the highest Pearson correlation coefficient across three sampling periods. Ablation studies confirm the significance of each component in improving prediction accuracy. Our findings validate the efficacy of the proposed MDGTP model for predicting the multi-dimensional operating states of marine diesel engines.

(This article belongs to the Special Issue Advanced Condition Monitoring and Intelligent Operation & Maintenance Technologies in Ships and Offshore Facilities)

17 pages, 718 KiB

Open Access Article

MédicoBERT: A Medical Language Model for Spanish Natural Language Processing Tasks with a Question-Answering Application Using Hyperparameter Optimization

by Josué Padilla Cuevas, José A. Reyes-Ortiz, Alma D. Cuevas-Rasgado, Román A. Mora-Gutiérrez and Maricela Bravo

Appl. Sci. 2024, 14(16), 7031; https://doi.org/10.3390/app14167031 - 10 Aug 2024

Viewed by 424

Abstract

The increasing volume of medical information available in digital format presents a significant challenge for researchers seeking to extract relevant information. Manually analyzing voluminous data is a time-consuming process that constrains researchers’ productivity. In this context, innovative and intelligent computational approaches to information search, such as large language models (LLMs), offer a promising solution. LLMs understand natural language questions and respond accurately to complex queries, even in the specialized domain of medicine. This paper presents MédicoBERT, a medical language model in Spanish developed by adapting a general domain language model (BERT) to medical terminology and vocabulary related to diseases, treatments, symptoms, and medications. The model was pre-trained with 3 M medical texts containing 1.1 B words. MédicoBERT was then adapted and evaluated for answering medical questions in Spanish, with promising results. The model was fine-tuned for the question-answering (QA) task using a Spanish corpus of over 34,000 medical questions and answers. A search was then conducted to identify the optimal hyperparameter configuration using heuristic methods and nonlinear regression models. MédicoBERT was evaluated using perplexity, which measures the adaptation of the language model to the medical vocabulary in Spanish and for which it obtained a value of 4.28, and the average F1 metric for the task of answering medical questions, for which it obtained a value of 62.35%. The objective of MédicoBERT is to provide support for research in the field of natural language processing (NLP) in Spanish, with a particular emphasis on applications within the medical domain.

(This article belongs to the Special Issue Techniques and Applications of Natural Language Processing)

15 pages, 2698 KiB

Open Access Article

Salinity Prediction Based on Improved LSTM Model in the Qiantang Estuary, China

by Rong Zheng, Zhilin Sun, Jiange Jiao, Qianqian Ma and Liqin Zhao

J. Mar. Sci. Eng. 2024, 12(8), 1339; https://doi.org/10.3390/jmse12081339 - 7 Aug 2024

Viewed by 488

Abstract

Accurate prediction of estuarine salinity can effectively mitigate the adverse effects of saltwater intrusion and help ensure the safety of water resources in estuarine regions. Presently, diverse data-driven models, mainly neural network models, have been employed to predict tidal estuarine salinity with considerable success. To handle the nonlinear and nonstationary features of estuarine salinity sequences, this paper proposes a multi-factor salinity prediction model using an enhanced Long Short-Term Memory (LSTM) network. To improve prediction accuracy, input variables of the model were determined through Grey Relational Analysis (GRA) combined with estuarine dynamic analysis, and hyperparameters for the LSTM model were optimized using a multi-strategy Improved Sparrow Search Algorithm (ISSA). The proposed ISSA-LSTM model was applied to predict salinity at the Cangqian and Qibao stations in the Qiantang Estuary of China, based on measured data from 2011–2012. The model performance is evaluated by mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), and Nash-Sutcliffe efficiency (NSE). The results show that, compared to other models including the Back Propagation neural network (BP), Gated Recurrent Unit (GRU), and LSTM models, the new model has smaller errors and higher prediction accuracy, with NSE improved by 8–32% and other metrics (MAE, MAPE, RMSE) improved by 15–67%. Meanwhile, compared with LSTM optimized with the original SSA (SSA-LSTM), the MAE, MAPE, and RMSE values of the new model decreased by 13–16%, 15–16%, and 11–13%, respectively, and the NSE value increased by 5–6%, indicating that the ISSA has a better hyperparameter optimization ability than the original SSA. Thus, the model provides a practical solution for the rapid and precise prediction of estuarine salinity.
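
The four evaluation metrics named above have standard textbook definitions, sketched below. The abstract does not state the exact conventions used (for example, whether MAPE is reported as a percentage), so this is an assumption, and the observed and predicted values are made-up toy numbers rather than salinity data.

```python
import math
from typing import Sequence


def mae(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (observations must be nonzero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


# Toy values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(mae(observed, predicted), mape(observed, predicted))
print(rmse(observed, predicted), nse(observed, predicted))
```

An NSE of 1 indicates a perfect fit, while an NSE of 0 means the model is no better than predicting the observed mean.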

(This article belongs to the Topic Sustainable River and Lake Restoration: From Challenges to Solutions)

Figure 1. A map of the study area. (a,b) Monitoring stations along the Qiantang Estuary. The discharge data is provided by Fuchunjiang hydrological station (FCJ), the water level data is provided by Ganpu station (GP), the salinity data is provided by the CQ and QB stations, and the wind speed data is provided by Hangzhou station (HZ).
Figure 2. Overall framework and flowchart of the ISSA-LSTM model. Part 1 is data preprocessing and feature selection; Part 2 is hyperparameter optimization by ISSA; Part 3 is the LSTM model.
Figure 3. Flowchart of the SSA and ISSA. (a) SSA. (b) ISSA.
Figure 4. Prediction results of different models. (a) CQ station. (b) QB station. The gray color block represents observed values of daily maximum salinity. The green, brown, purple, blue, and red lines represent the predicted results of the BP, GRU, LSTM, SSA-LSTM, and ISSA-LSTM models. The light orange region is zoomed in and shown in the small window in the subgraph (7/15–8/15, 10/15–11/15).
Figure 5. Scatterplot of observed values and predicted values of different models. (a) CQ station. (b) QB station. The green, brown, purple, blue, and red lines represent the results of the BP, GRU, LSTM, SSA-LSTM, and ISSA-LSTM models.
Figure 6. Comparison of prediction results under different discharge conditions. (a) CQ station. (b) QB station. The black solid line represents the salinity prediction result for the original discharge, and the yellow and blue dashed lines represent the salinity prediction results for discharge decreased or increased by 50%.

19 pages, 5027 KiB

Open Access Article

Brain Tumor Detection and Classification Using an Optimized Convolutional Neural Network

by Muhammad Aamir, Abdallah Namoun, Sehrish Munir, Nasser Aljohani, Meshari Huwaytim Alanazi, Yaser Alsahafi and Faris Alotibi

Diagnostics 2024, 14(16), 1714; https://doi.org/10.3390/diagnostics14161714 - 7 Aug 2024

Viewed by 1073

Abstract

Brain tumors are a leading cause of death globally, with numerous types varying in malignancy, and only 12% of adults diagnosed with brain cancer survive beyond five years. This research introduces a hyperparametric convolutional neural network (CNN) model to identify brain tumors, with significant practical implications. By fine-tuning the hyperparameters of the CNN model, we optimize feature extraction and systematically reduce model complexity, thereby enhancing the accuracy of brain tumor diagnosis. The critical hyperparameters include batch size, layer counts, learning rate, activation functions, pooling strategies, padding, and filter size. The hyperparameter-tuned CNN model was trained on three different brain MRI datasets available at Kaggle, producing outstanding performance scores, with an average value of 97% for accuracy, precision, recall, and F1-score. Our optimized model is effective, as demonstrated by our methodical comparisons with state-of-the-art approaches. Our hyperparameter modifications enhanced the model performance and strengthened its capacity for generalization, giving medical practitioners a more accurate and effective tool for making crucial judgments regarding brain tumor diagnosis. Our model is a significant step in the right direction toward trustworthy and accurate medical diagnosis, with practical implications for improving patient outcomes.

(This article belongs to the Topic AI in Medical Imaging and Image Processing)

32 pages, 11355 KiB

Open Access Article

Joint Optimization of Relay Communication Rates in Clustered Drones under Interference Conditions

by Xinglong Gu, Guifen Chen, Guowei Wu and Chenghua Wen

Drones 2024, 8(8), 381; https://doi.org/10.3390/drones8080381 - 7 Aug 2024

Viewed by 362

Abstract

To address the issues of communication failure and inefficiency in clustered drone relay communication due to external malicious interference, this paper proposes a joint optimization method for relay communication rates under interference conditions for clustered drones. This method employs the following two-step processing framework: First, the Discrete Soft Actor-Critic (DSAC) algorithm is used to train the relay drones for dynamic channel selection, effectively avoiding various types of interference. Simultaneously, the Bayesian optimization algorithm is applied to optimize the hyperparameters of the DSAC algorithm, further enhancing its performance. Subsequently, the modulation order, transmission power, trajectory of the relay drones, and power allocation factors of the clustered drones are jointly optimized. This complex problem is transformed into a convex subproblem for determining a solution, aiming to maximize the communication rate of the clustered drones. The simulation results demonstrate that the proposed algorithm exhibits excellent performance in terms of anti-interference capability, solution convergence, and stability. It effectively improves the mission efficiency of clustered drones under interference conditions and enhances their adaptability to dynamic environments.

19 pages, 9956 KiB

Open Access Article

Optimized Radio Frequency Footprint Identification Based on UAV Telemetry Radios

by Yuan Tian, Hong Wen, Jiaxin Zhou, Zhiqiang Duan and Tao Li

Sensors 2024, 24(16), 5099; https://doi.org/10.3390/s24165099 - 6 Aug 2024

Viewed by 362

Abstract

With the widespread use of unmanned aerial vehicles (UAVs), the detection and identification of UAVs is a vital security issue for the safety of airspace and ground facilities in no-fly zones. Telemetry radios are important wireless communication devices for UAVs, especially for UAVs operating beyond visual line of sight (BVLOS). This work focuses on a UAV identification approach that uses transient signals from UAV telemetry radios instead of the signals from UAV controllers on which previous research relied. In our novel UAV Radio Frequency (RF) identification system framework based on telemetry radio signals, the EC-α algorithm is optimized to detect the starting point of the UAV transient signal, and the detection accuracy at different signal-to-noise ratios (SNRs) is evaluated. In the training stage, the Convolutional Neural Network (CNN) model is trained to extract features from raw I/Q data of the transient signals with different waveforms. Its architecture and hyperparameters are analyzed and optimized. In the identification stage, the extracted transient signals are clustered through the Self-Organizing Map (SOM) algorithm, and the Clustering Signals Joint Identification (CSJI) algorithm is proposed to improve the accuracy of RF fingerprint identification. To evaluate the performance of our proposed approach, we design a testbed, including two UAVs as the flight platform, a Universal Software Radio Peripheral (USRP) as the receiver, and 20 telemetry radios of the same model as targets for identification. Indoor test results show that the optimized identification approach achieves an average accuracy of 92.3% at 30 dB. In comparison, the identification accuracy of SVM and KNN is 69.7% and 74.5%, respectively, under the same SNR condition. Extensive experiments are conducted outdoors to demonstrate the feasibility of this approach.

(This article belongs to the Section Remote Sensors)

17 pages, 3514 KiB

Open Access Article

Optimizing CNN-LSTM for the Localization of False Data Injection Attacks in Power Systems

by Zhuo Li, Yaobin Xie, Rongkuan Ma and Zihan Wei

Appl. Sci. 2024, 14(16), 6865; https://doi.org/10.3390/app14166865 - 6 Aug 2024

Viewed by 1092

Abstract

As the informatization of power systems advances, the secure operation of power systems faces various potential network attacks and threats. The false data injection attack (FDIA) is a common attack mode that can lead to abnormal system operations and serious economic losses by injecting abnormal data into terminal links or devices. The current research on FDIA primarily focuses on detecting its existence, but there is relatively little research on the localization of the attacks. To address this challenge, this study proposes a novel FDIA localization method (GA-CNN-LSTM) that combines convolutional neural networks (CNNs), long short-term memory (LSTM), and a genetic algorithm (GA) and can accurately locate the attacked bus or line. This method utilizes a CNN to extract local features and combines LSTM with time series information to extract global features. It integrates a CNN and LSTM to deeply explore complex patterns and dynamic changes in the data, effectively extract FDIA features in the data, and optimize the hyperparameters of the neural network using the GA to ensure optimal model performance. Simulation experiments were conducted on the IEEE 14-bus and 118-bus test systems. The results indicate that the GA-CNN-LSTM method achieved F1 scores for location identification of 99.71% and 99.10%, respectively, demonstrating superior localization performance compared to other methods.

(This article belongs to the Special Issue Machine Learning and Deep Learning-Based Fault Detection and Diagnosis)

20 pages, 11655 KiB

Open Access Article

Daily Runoff Prediction Based on FA-LSTM Model

by Qihui Chai, Shuting Zhang, Qingqing Tian, Chaoqiang Yang and Lei Guo

Water 2024, 16(16), 2216; https://doi.org/10.3390/w16162216 - 6 Aug 2024

Viewed by 605

Abstract

Accurate and reliable short-term runoff prediction plays a pivotal role in water resource management, agriculture, and flood control, enabling decision-makers to implement timely and effective measures to enhance water use efficiency and minimize losses. To further enhance the accuracy of runoff prediction, this study proposes an FA-LSTM model that integrates the Firefly algorithm (FA) with the long short-term memory neural network (LSTM). The research focuses on historical daily runoff data from the Dahuangjiangkou and Wuzhou Hydrology Stations in the Xijiang River Basin. The FA-LSTM model is compared with RNN, LSTM, GRU, SVM, and RF models. The FA-LSTM model was also used to carry out a generalization experiment at the Qianjiang, Wuxuan, and Guigang hydrology stations. Additionally, the study analyzes the performance of the FA-LSTM model across different forecasting horizons (1–5 days). Four quantitative evaluation metrics are used: mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R²), and Kling–Gupta efficiency coefficient (KGE). The results indicate that: (1) Compared to the RNN, LSTM, GRU, SVM, and RF models, the FA-LSTM model exhibits the best prediction performance, with daily runoff R² values reaching 0.966 and 0.971 at the Dahuangjiangkou and Wuzhou Stations, respectively, and KGE values reaching 0.965 and 0.960, respectively. (2) In the generalization tests at the Qianjiang, Wuxuan, and Guigang hydrology stations, the R² and KGE of the FA-LSTM model are 0.96 or above, indicating that the model adapts well to different hydrology stations and is robust. (3) As the prediction period extends, the R² and KGE of the FA-LSTM model show a decreasing trend, but the model as a whole still shows feasible forecasting ability. The FA-LSTM model introduced in this study presents an effective new approach for daily runoff prediction.
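
Of the four metrics, the Kling–Gupta efficiency is the least widely known, so a minimal sketch may help. It assumes the original 2009 formulation, KGE = 1 - sqrt((r - 1)² + (α - 1)² + (β - 1)²), where r is the Pearson correlation, α the ratio of simulated to observed standard deviation, and β the ratio of simulated to observed mean. The abstract does not say which KGE variant the authors used, and the runoff values below are made-up toy numbers.

```python
import math
from statistics import mean, pstdev
from typing import Sequence


def kge(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Kling-Gupta efficiency (2009 formulation, assumed here)."""
    mean_obs, mean_sim = mean(obs), mean(sim)
    std_obs, std_sim = pstdev(obs), pstdev(sim)
    covariance = sum(
        (o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)
    ) / len(obs)
    r = covariance / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs              # variability ratio
    beta = mean_sim / mean_obs             # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Toy values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
doubled = [2.0, 4.0, 6.0, 8.0]
print(kge(observed, observed))
print(kge(observed, doubled))
```

A perfect simulation scores 1. The doubled series is perfectly correlated with the observations but still scores poorly, because KGE also penalizes errors in variability and bias.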

(This article belongs to the Section Hydrology)

24 pages, 7013 KiB

Open Access Article

Comparative Analysis of Nature-Inspired Metaheuristic Techniques for Optimizing Phishing Website Detection

by Thomas Nagunwa

Analytics 2024, 3(3), 344-367; https://doi.org/10.3390/analytics3030019 - 6 Aug 2024

Viewed by 426

Abstract

The increasing number, frequency, and sophistication of phishing website-based attacks necessitate the development of robust solutions for detecting phishing websites to enhance the overall security of cyberspace. Drawing inspiration from natural processes, nature-inspired metaheuristic techniques have been proven to be efficient in solving complex optimization problems in diverse domains. Following these successes, this research paper aims to investigate the effectiveness of metaheuristic techniques, particularly Genetic Algorithms (GAs), Differential Evolution (DE), and Particle Swarm Optimization (PSO), in optimizing the hyperparameters of machine learning (ML) algorithms for detecting phishing websites. Using multiple datasets, six ensemble classifiers were trained on each dataset and their hyperparameters were optimized using each metaheuristic technique. As a baseline for assessing performance improvement, the classifiers were also trained with the default hyperparameters. To validate the genuine impact of the techniques over the use of default hyperparameters, we conducted statistical tests on the accuracy scores of all the optimized classifiers. The results show that the GA is the most effective technique, improving the accuracy scores of all the classifiers, followed by DE, which improved four of the six classifiers. PSO was the least effective, improving only one classifier. It was also found that GA-optimized Gradient Boosting, LGBM, and XGBoost were the best classifiers across all the metrics in predicting phishing websites, achieving peak accuracy scores of 98.98%, 99.24%, and 99.47%, respectively.

17 pages, 2589 KiB

Open AccessArticle

Adaptive Evolutionary Computing Ensemble Learning Model for Sentiment Analysis

by Xiao-Yang Liu, Kang-Qi Zhang, Giacomo Fiumara, Pasquale De Meo and Annamaria Ficara

Appl. Sci. 2024, 14(15), 6802; https://doi.org/10.3390/app14156802 - 4 Aug 2024

Viewed by 429

Abstract

Standard machine learning and deep learning architectures have been widely used in the field of sentiment analysis, but their performance is unsatisfactory if the input texts are short (e.g., social media posts). Specifically, the accuracy of standard machine learning methods crucially depends on the richness and completeness of the features used to represent the texts, and in the case of short messages, it is often difficult to obtain high-quality features. Conversely, methods based on deep learning can achieve better expressiveness, but these methods are computationally demanding and often suffer from over-fitting. This paper proposes a new adaptive evolutionary computational integrated learning model (AdaECELM) to overcome the problems encountered by traditional machine learning and deep learning models in sentiment analysis for short texts. AdaECELM consists of three phases: feature selection, sub-classifier training, and global integration learning. First, a grid search is used for feature extraction and selection of term frequency-inverse document frequency (TF-IDF). Second, cuckoo search (CS) is introduced to optimize the combined hyperparameters in the sub-classifier support vector machine (SVM). Finally, the training set is divided into different feature subsets for sub-classifier training, and then the trained sub-classifiers are integrated and learned using the AdaBoost integrated soft voting method. Extensive experiments were conducted on six real polar sentiment analysis data sets. The results show that the AdaECELM model outperforms the traditional ML comparison methods according to evaluation metrics such as accuracy, precision, recall, and F1-score in all cases, and we report an improvement in accuracy exceeding 4.5% over the second-best competitor. Full article
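The soft-voting step mentioned in the abstract can be illustrated with a short sketch. This is a generic weighted soft vote, not the AdaECELM procedure itself: the abstract does not specify how the sub-classifier weights are derived, so the weights and probabilities below are hypothetical.

```python
def soft_vote(probas, weights=None):
    """Combine per-classifier class-probability vectors by weighted averaging.

    probas  -- list of probability vectors, one per sub-classifier
    weights -- optional list of non-negative weights (assumed, e.g. derived
               from each sub-classifier's validation error)
    Returns (predicted_class_index, averaged_probabilities).
    """
    if weights is None:
        weights = [1.0] * len(probas)
    total = sum(weights)
    n_classes = len(probas[0])
    avg = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg


# Three hypothetical sub-classifiers scoring one short text as
# [P(negative), P(positive)]; the second is trusted twice as much.
label, avg = soft_vote(
    [[0.60, 0.40], [0.30, 0.70], [0.45, 0.55]],
    weights=[1.0, 2.0, 1.0],
)
```

Unlike hard (majority) voting, soft voting lets a confident sub-classifier outweigh several uncertain ones, which is one reason it is often preferred when the sub-classifiers output calibrated probabilities.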

(This article belongs to the Special Issue Artificial Intelligence in Complex Networks (2nd Edition))

20 pages, 4456 KiB

Open AccessArticle

Predicting the Characteristics of High-Speed Serial Links Based on a Deep Neural Network (DNN)—Transformer Cascaded Model

by Liyin Wu, Jingyang Zhou, Haining Jiang, Xi Yang, Yongzheng Zhan and Yinhang Zhang

Electronics 2024, 13(15), 3064; https://doi.org/10.3390/electronics13153064 - 2 Aug 2024

Viewed by 422

Abstract

The design level of channel physical characteristics has a crucial influence on the transmission quality of high-speed serial links. However, channel design requires a complex simulation and verification process. In this paper, a cascade neural network model constructed of a Deep Neural Network (DNN) and a Transformer is proposed. This model takes physical features as inputs and imports a Single-Bit Response (SBR) as a connection, which is enhanced through predicting frequency characteristics and equalizer parameters. At the same time, signal integrity (SI) analysis and link optimization are achieved by predicting eye diagrams and channel operating margins (COMs). Additionally, Bayesian optimization based on the Gaussian process (GP) is employed for hyperparameter optimization (HPO). The results show that the DNN–Transformer cascaded model achieves high-precision predictions of multiple metrics in performance prediction and optimization: the maximum relative error of the test-set results is less than 2% under an equalizer architecture of a 3-tap TX FFE, an RX CTLE with dual DC gain, and a 12-tap RX DFE. Its prediction ability exceeds that of other deep learning models. Full article

Search Results (1,562)
