An Optimized Deep Learning Approach for Detecting Fraudulent Transactions
Figure 1: Card payment authorization process.
Figure 2: Workflow of deep learning architectures.
Figure 3: Recurrent neural network architectures.
Figure 4: Long short-term memory architecture.
Figure 5: Target variable distribution per fraud and non-fraud transactions.
Figure 6: Hyperparameter optimization process workflow.
Figure 7: Proposed solution.
Figure 8: Results of 50 iterations of optimizing the hyperparameters for three deep learning architectures.
Figure 9: Results of 70 iterations of optimizing the hyperparameters for three deep learning architectures.
Figure 10: Results of 100 iterations of optimizing the hyperparameters for three deep learning architectures.
Figure 11: Results of the comparison of hyperparameter optimization execution times.
Figure 12: The ROC curve constructed after 50 iterations of optimizing hyperparameters using the Bayesian algorithm.
Figure 13: The ROC curve constructed after 70 iterations of optimizing hyperparameters using the Bayesian algorithm.
Figure 14: The ROC curve constructed after 100 iterations of optimizing hyperparameters using the Bayesian algorithm.
Figure 15: The precision–recall curve constructed after 50 iterations of optimizing hyperparameters using the Bayesian algorithm.
Figure 16: The precision–recall curve constructed after 70 iterations of optimizing hyperparameters using the Bayesian algorithm.
Figure 17: The precision–recall curve constructed after 100 iterations of optimizing hyperparameters using the Bayesian algorithm.
Abstract
1. Introduction
- A hyperparameter tuning technique based on Bayesian optimization is proposed for selecting the best-performing deep learning architecture.
- Bayesian optimization is implemented for designing the best-performing deep learning architectures, including RNN, LSTM, and ANN.
- Several experiments are performed on the European credit card dataset, and the results show that, with Bayesian hyperparameter optimization, the RNN outperforms the LSTM and ANN architectures.
2. Related Works
3. Background
3.1. Deep Learning Technique
Algorithm 1 Adam optimization algorithm
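The Adam update rule referenced by Algorithm 1 can be sketched in NumPy as follows. This is the standard Adam algorithm with its usual default hyperparameters (learning rate 0.001, beta1 = 0.9, beta2 = 0.999); the toy objective and variable names are illustrative assumptions, not the paper's implementation.

```python
# Minimal NumPy sketch of one Adam step (standard formulation).
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Update biased first- and second-moment estimates of the gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias-correct the moment estimates (t is the 1-based step counter).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Parameter update scaled by the adaptive step size.
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = theta^2 (gradient 2*theta) from theta = 1.
theta, m, v = np.array(1.0), 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(float(theta))  # converges toward 0
```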
3.2. Recurrent Neural Network (RNN)
3.3. Long Short-Term Memory (LSTM)
3.4. Bayesian Optimization
Algorithm 2 Bayesian Optimization
- Initialization: Choose a set of initial hyperparameters to sample the objective function.
- Model construction: Fit the Gaussian process to the observed data to capture the underlying structure of the function.
- Acquisition function: Use the expected improvement to balance exploration and exploitation and determine the next best point to sample.
- Sampling: Evaluate the objective function at the next best point determined by the acquisition function.
- Updating: Update the Gaussian Process model with the newly obtained sample.
- Repeat steps 3–5 until the maximum number of iterations is reached.
- Return the best hyperparameter values found by the optimization process as the optimum.
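The steps above can be sketched as a short loop over a toy one-dimensional objective. This is a hedged illustration of the generic GP-plus-expected-improvement procedure, not the paper's code; the objective function, bounds, and iteration counts are all illustrative assumptions.

```python
# Minimal Bayesian optimization loop: GP surrogate + expected improvement.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def objective(x):
    # Toy function standing in for a validation loss; minimum at x = 2.
    return (x - 2.0) ** 2

rng = np.random.default_rng(0)
bounds = (0.0, 4.0)

# Step 1: initialization -- sample a few initial points.
X = rng.uniform(*bounds, size=(3, 1))
y = objective(X).ravel()

gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
candidates = np.linspace(*bounds, 200).reshape(-1, 1)

for _ in range(15):
    # Step 2: fit the Gaussian process to the observed data.
    gp.fit(X, y)
    # Step 3: expected improvement (for minimization) over the candidates.
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    best = y.min()
    z = (best - mu) / sigma
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    # Step 4: evaluate the objective at the most promising candidate.
    x_next = candidates[np.argmax(ei)]
    # Step 5: update the observation set with the new sample.
    X = np.vstack([X, x_next.reshape(1, 1)])
    y = np.append(y, objective(x_next)[0])

x_best = float(X[np.argmin(y)][0])
print(round(x_best, 2))  # should land near the true minimum at 2.0
```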
3.4.1. Gaussian Process
- We denote by D = {(x_1, f(x_1)), ..., (x_t, f(x_t))} the training set. The function values f = [f(x_1), ..., f(x_t)] are assumed to be drawn from a multivariate normal distribution N(0, K), where K is the kernel (covariance) matrix with entries K_ij = k(x_i, x_j).
- The kernel function k(x_i, x_j) measures the degree to which two sample points are similar. Furthermore, without considering the effect of noise, the diagonal elements of K are equal to 1.
- Based on the function f, we calculate the function value f(x_{t+1}) at a new point x_{t+1}. Under the Gaussian process assumption, the joint vector [f, f(x_{t+1})] follows a (t+1)-dimensional normal distribution.
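In standard Gaussian process regression notation, the joint distribution of the observed values and the new value, together with the resulting predictive mean and variance, can be written as follows (these are the textbook GP formulas, supplied as a sketch of the notation rather than copied from the paper):

```latex
% Joint distribution of the observed values f and the new value f(x_{t+1})
\begin{bmatrix} \mathbf{f} \\ f(x_{t+1}) \end{bmatrix}
\sim \mathcal{N}\!\left( \mathbf{0},
\begin{bmatrix} \mathbf{K} & \mathbf{k} \\ \mathbf{k}^{\top} & k(x_{t+1}, x_{t+1}) \end{bmatrix} \right),
\qquad
\mathbf{k} = \left[\, k(x_{t+1}, x_1), \ldots, k(x_{t+1}, x_t) \,\right]^{\top}

% Predictive (posterior) mean and variance at the new point x_{t+1}
\mu(x_{t+1}) = \mathbf{k}^{\top}\mathbf{K}^{-1}\mathbf{f},
\qquad
\sigma^{2}(x_{t+1}) = k(x_{t+1}, x_{t+1}) - \mathbf{k}^{\top}\mathbf{K}^{-1}\mathbf{k}
```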
3.4.2. Acquisition Function
4. Methodology and Materials
4.1. Dataset
4.2. Data Preprocessing
4.3. Imbalanced Learning
Algorithm 3 RUS technique
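The random undersampling (RUS) step of Algorithm 3 can be sketched as follows: the majority (legitimate) class is downsampled, without replacement, to the size of the minority (fraud) class. Variable names and the toy data are illustrative.

```python
# Minimal random undersampling (RUS) sketch.
import numpy as np

def random_undersample(X, y, majority_label=0, seed=42):
    rng = np.random.default_rng(seed)
    maj_idx = np.flatnonzero(y == majority_label)
    min_idx = np.flatnonzero(y != majority_label)
    # Keep as many majority samples as there are minority samples.
    keep = rng.choice(maj_idx, size=min_idx.size, replace=False)
    idx = np.concatenate([keep, min_idx])
    rng.shuffle(idx)
    return X[idx], y[idx]

# Toy imbalanced set: 8 legitimate (0) vs. 2 fraud (1) transactions.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
Xb, yb = random_undersample(X, y)
print(int(yb.sum()), len(yb))  # 2 fraud out of 4 -> balanced
```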
4.4. Hyperparameter Optimization Processes
Algorithm 4 Pseudo-code of hyperparameter optimization with 3-fold cross-validation
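The objective function evaluated inside Algorithm 4 can be sketched as below: each candidate hyperparameter setting is scored by 3-fold cross-validation and the mean validation score is returned to the optimizer. For illustration only, a logistic regression (with its regularization strength C as the tuned hyperparameter) stands in for the deep learning model; the dataset is synthetic.

```python
# Sketch of a 3-fold cross-validated objective for hyperparameter tuning.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic imbalanced data standing in for the credit card dataset.
X, y = make_classification(n_samples=300, weights=[0.9, 0.1], random_state=0)

def objective(C):
    model = LogisticRegression(C=C, max_iter=1000)
    # 3-fold cross-validation, as in the paper's optimization loop.
    scores = cross_val_score(model, X, y, cv=3, scoring="roc_auc")
    return scores.mean()

score = objective(1.0)
print(round(score, 3))
```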
- Dropout is a regularization technique used to prevent overfitting in deep learning architectures. Its mechanism is to randomly remove some neurons from a layer during training; the fraction of removed neurons is set by a user-defined dropout rate. As a side effect, this approach makes the training process noisy.
- The learning rate determines how much the model adapts its parameters in response to the estimated error. A low learning rate leads to a lengthy training process, while a large value can cause the model to learn a suboptimal set of weights and makes the learning process unstable.
- The batch size is a hyperparameter of gradient descent, the iterative learning procedure that uses the training dataset to update the model parameters. It determines the number of training samples to work through before the model parameters are updated.
- The epoch is a hyperparameter of gradient descent that determines the number of complete passes through the training dataset.
- The activation function is implemented in an artificial neural network to learn and uncover intricate hidden patterns in the input data. It can be seen as a controller for what information should be passed to the next neuron: it takes the inputs from the previous neurons and converts them into a form that can serve as input to the next neuron. In this study, the activation hyperparameter takes three functions, namely ReLU (Equation (44)), Sigmoid (Equation (45)), and Tanh (Equation (46)). These functions are described below.
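The three candidate activation functions named above can be written out in NumPy; the equation numbers in the comments refer to the paper's numbering.

```python
# The three candidate activation functions searched over in this study.
import numpy as np

def relu(x):
    # Equation (44): ReLU(x) = max(0, x)
    return np.maximum(0.0, x)

def sigmoid(x):
    # Equation (45): sigma(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Equation (46): hyperbolic tangent
    return np.tanh(x)

x = np.array([-1.0, 0.0, 1.0])
print(relu(x), sigmoid(0.0), tanh(0.0))
```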
5. Experimental Design
5.1. Model Architecture Design
Algorithm 5 Deep Learning-Based Fraud Detection
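The overall shape of Algorithm 5 (train a neural classifier on labeled transactions, then flag fraud on held-out data) can be sketched as below. scikit-learn's MLPClassifier and a synthetic imbalanced dataset stand in for the paper's tuned deep architectures and the real credit card data; none of this reproduces the paper's actual configuration.

```python
# Stand-in sketch for deep-learning-based fraud detection.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic imbalanced "transactions": ~5% fraud (class 1).
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Small two-hidden-layer network, mirroring the two-layer search space.
clf = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(round(acc, 2))
```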
5.2. Statistical Measure
- The accuracy score refers to the ratio of correctly classified transactions (normal or fraudulent) to the total number of credit card transaction samples. It can be formulated mathematically as:
- Precision score: The ratio of correctly classified fraudulent transactions to the total number of transactions identified as fraud. It can be calculated as:
- AUC: The Receiver Operating Characteristic (ROC) curve plots the false positive rate (FPR) on the x-axis against the true positive rate (TPR) on the y-axis. The Area Under the ROC Curve (AUC) is formulated as:
- G-Mean: This metric measures the balance in classification performance between fraudulent and legitimate transactions; a low G-Mean indicates poor performance. It is important for avoiding a model that overfits the normal transactions and underfits the abnormal ones.
- Sensitivity (recall) or the “true positive rate” refers to the number of fraudulent transactions that are correctly predicted. Its formula is as follows:
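The scores above can be computed directly from a confusion matrix. The toy labels below are illustrative (fraud = 1, legitimate = 0); G-Mean is taken, as is standard, to be the geometric mean of sensitivity and specificity.

```python
# Metric computation from a confusion matrix on toy predictions.
import numpy as np

y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 0])

tp = np.sum((y_true == 1) & (y_pred == 1))  # fraud caught
tn = np.sum((y_true == 0) & (y_pred == 0))  # legitimate passed
fp = np.sum((y_true == 0) & (y_pred == 1))  # false alarms
fn = np.sum((y_true == 1) & (y_pred == 0))  # fraud missed

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
sensitivity = tp / (tp + fn)           # recall / true positive rate
specificity = tn / (tn + fp)           # true negative rate
g_mean = np.sqrt(sensitivity * specificity)

print(accuracy, precision, round(float(g_mean), 3))
```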
6. Results and Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Variable | Definition | Type |
---|---|---|
Class | Target feature in this dataset. Takes two values: 0: Legitimate; 1: Fraud | Categorical |
Amount | The amount of the transaction sample | Numeric |
Time | The difference in time between the first and the current transactions, in seconds | Numeric |
V1 to V28 | Features transformed using PCA technique to protect cardholders’ privacy and confidentiality | Numeric |
Hyperparameter | Type
---|---
Activation function | Categorical
Learning rate | Continuous
Dropout rate of layer 1 | Continuous
Dropout rate of layer 2 | Continuous
Batch size | Discrete
Epochs | Discrete
Number of neurons in layer 1 | Discrete
Number of neurons in layer 2 | Discrete
Number of LSTM units in layer 1 | Discrete
Number of LSTM units in layer 2 | Discrete
Number of RNN units in layer 1 | Discrete
Number of RNN units in layer 2 | Discrete
Model | ACC | PER | GM | SEN | AUC |
---|---|---|---|---|---|
ANN | 0.8939 | 1 | 0.8024 | 0.6439 | 0.8356 |
LSTM | 0.7562 | 1 | 0.4264 | 0.1818 | 0.9291 |
RNN | 0.9593 | 0.9596 | 0.9418 | 0.9015 | 0.9767 |
Model | ACC | PER | GM | SEN | AUC |
---|---|---|---|---|---|
ANN | 0.9525 | 1 | 0.9170 | 0.8409 | 0.9207 |
LSTM | 0.9548 | 0.9745 | 0.9288 | 0.8712 | 0.9372 |
RNN | 0.9593 | 1 | 0.9293 | 0.8636 | 0.9752 |
Model | ACC | PER | GM | SEN | AUC |
---|---|---|---|---|---|
ANN | 0.9548 | 0.9827 | 0.9263 | 0.8636 | 0.9323 |
LSTM | 0.9571 | 1 | 0.9252 | 0.8560 | 0.9641 |
RNN | 0.9593 | 1 | 0.9293 | 0.8636 | 0.9752 |
Model | 50 Iterations | 70 Iterations | 100 Iterations |
---|---|---|---|
ANN | 604.66 | 737.03 | 1045.28 |
LSTM | 1395.57 | 2761.79 | 2603.52 |
RNN | 1039.28 | 1934.66 | 1934.66 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
El Kafhali, S.; Tayebi, M.; Sulimani, H. An Optimized Deep Learning Approach for Detecting Fraudulent Transactions. Information 2024, 15, 227. https://doi.org/10.3390/info15040227