Abstract
In this paper, we propose an extension to Banksealer, one of the most recent and effective banking fraud detection systems. Until now, Banksealer could not exploit analyst feedback to self-tune and improve its performance. It also depended on a complex set of parameters that had to be tuned by hand before going into operation.
To overcome both limitations, we propose a supervised evolutionary wrapper approach that uses analyst feedback on fraudulent transactions to automatically tune feature weights and improve Banksealer’s detection performance. We do so by means of a multi-objective genetic algorithm.
We deployed our solution in a real-world setting at a large national banking group and conducted an in-depth experimental evaluation. We show that the proposed system was able to detect sophisticated frauds, improving Banksealer’s performance by up to 35% in some cases.
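The wrapper idea summarized above can be illustrated with a minimal sketch. It is not Banksealer's implementation: the weighted-sum score, the two objectives, the Gaussian mutation, and every function name below are assumptions made for illustration, and a plain Pareto filter stands in for whichever multi-objective genetic algorithm the authors actually use.

```python
import random

def weighted_score(feature_scores, weights):
    """Combine per-feature anomaly scores into one score (assumed: weighted sum)."""
    return sum(w * s for w, s in zip(weights, feature_scores))

def objectives(weights, transactions, fraud_ids, k):
    """Two illustrative objectives, both to be maximized:
    number of analyst-confirmed frauds in the top-k, and negated mean fraud rank.
    Assumes fraud_ids is non-empty and every id appears in transactions."""
    ranked = sorted(transactions,
                    key=lambda t: weighted_score(t[1], weights),
                    reverse=True)
    ranks = [i for i, (tid, _) in enumerate(ranked) if tid in fraud_ids]
    hits = sum(1 for r in ranks if r < k)
    return (hits, -sum(ranks) / len(ranks))

def dominates(a, b):
    """True if objective vector a Pareto-dominates b."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(population, fitness):
    """Keep the weight vectors that no other candidate dominates."""
    return [p for p in population
            if not any(dominates(fitness[q], fitness[p])
                       for q in population if q != p)]

def evolve(transactions, fraud_ids, n_features, k,
           generations=30, pop_size=20, seed=0):
    """Simplified elitist loop: mutate, evaluate, keep non-dominated first."""
    rng = random.Random(seed)
    population = [tuple(rng.random() for _ in range(n_features))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Gaussian mutation, clipped so weights stay in [0, 1].
        children = [tuple(min(1.0, max(0.0, w + rng.gauss(0, 0.1))) for w in p)
                    for p in population]
        candidates = list(dict.fromkeys(population + children))
        fitness = {c: objectives(c, transactions, fraud_ids, k)
                   for c in candidates}
        front = pareto_front(candidates, fitness)
        # Non-dominated candidates survive first; the rest fill remaining slots.
        population = list(dict.fromkeys(front + candidates))[:pop_size]
    fitness = {p: objectives(p, transactions, fraud_ids, k) for p in population}
    return pareto_front(population, fitness)

# Toy data (hypothetical): (transaction id, per-feature anomaly scores).
toy = [("t1", (0.9, 0.1)), ("t2", (0.1, 0.8)), ("t3", (0.2, 0.2))]
best_weights = evolve(toy, fraud_ids={"t2"}, n_features=2, k=1)
```

Here the detector is treated as a black box that is re-scored for every candidate weight vector, which is what makes the approach a wrapper: the analyst labels enter only through the objective values.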
Acknowledgment
This work has received funding from the European Union’s Horizon 2020 Programme, under grant agreement 700326 “RAMSES”, as well as from projects co-funded by the Lombardy region and Secure Network S.r.l.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Carminati, M., Valentini, L., Zanero, S. (2017). A Supervised Auto-Tuning Approach for a Banking Fraud Detection System. In: Dolev, S., Lodha, S. (eds) Cyber Security Cryptography and Machine Learning. CSCML 2017. Lecture Notes in Computer Science, vol 10332. Springer, Cham. https://doi.org/10.1007/978-3-319-60080-2_17
DOI: https://doi.org/10.1007/978-3-319-60080-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60079-6
Online ISBN: 978-3-319-60080-2
eBook Packages: Computer Science (R0)