Image-Based Malware Detection Using α-Cuts and Binary Visualisation
Figure 1. A 2D representation of a malicious .rtf file (159.71 KB) with the `msword/cve20103333` Trojan.
Figure 2. A 2D representation of a clean .rtf file (175 KB).
Figure 3. The H-curve, a Hamiltonian curve, was discovered by [74] in 1997. This is the sequential order for a 16 × 16 representation.
Figure 4. The fuzzy set for the value *colour*. Here, we display the colour *Red*, although the same shape applies for *Green* and *Blue*. We also demonstrate the crisp set $^{0.3}A$.
Figure 5. We considered each R, G, and B colour array of an image to be a fuzzy set (left). We then reduced each set to an α-cut (right) according to a value $\alpha$. In this example, $\alpha = 0.3$. The newly created $^{\alpha}A$ sets were populated as follows: any element of each fuzzy set that was greater than or equal to 0.3 was also an element of the crisp set $^{0.3}A$ and therefore took a value of 1, with the rest taking a value of 0.
Figure 6. A malicious image in (a) and reduced images of it, for $\alpha = [0.1, \cdots, 0.9]$ in (b–j), respectively.
Figure 7. A natural image in (a) and reduced images of it, for $\alpha = [0.1, \cdots, 0.9]$ in (b–j), respectively.
Figure 8. The learning curves of mean training and validation accuracies (a) and mean training and validation losses (b) during training of the best ResNet50 model. The model was trained on 10,000 images of the proposed colouring scheme that had been processed with an α-cut of 0.6.
Abstract
1. Introduction
2. Literature Review
2.1. Overview
2.2. Malware/Benign File Detection
2.3. Malware Detection and Family Classification
2.4. Network Traffic Malware Detection/Classification
2.5. Limitations of Existing Methods
- Limited dataset or limited diversity in the dataset used for training and testing the models, which may limit the generalisability of the models to real-world scenarios.
- Some studies have reported high accuracy rates, but the models may have overfitted the training data and may not perform as well on unseen data.
- Use of only static images of malware samples, which may not capture the dynamic behaviour of malware.
- Vulnerability to adversarial attacks that can evade the detection of the models.
- The computational complexity of some models may make them unsuitable for deployment in real-world scenarios involving low-power, resource-constrained devices.
- Lack of interpretability of some models, which may make it difficult to understand how they arrive at their classifications and limit their adoption in security-critical applications.
- Collecting and labelling malware samples is a labour-intensive and time-consuming process, and some studies have proposed techniques to reduce labelling efforts. However, the effectiveness of these techniques remains to be evaluated.
3. Materials and Methods
3.1. Overview
3.2. Clean and Malicious File Collection
3.3. Creation of Image Datasets
3.3.1. File Conversion to Image Format
3.3.2. Colour Assignment per ASCII Character Class
The RGB Colour Model
ASCII Colour Classes
3.3.3. Colour Tone Adjustment through Fuzzy Set Theory
Fuzzy Sets
3.3.4. The Mapping
The H-Index Space-Filling Curve
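The construction of the H-curve is not reproduced here, so the following is only a rough illustration of how a space-filling curve can lay a byte sequence onto a 16 × 16 grid. It deliberately swaps in the better-known Hilbert curve as a stand-in, not the H-curve used in the paper; the function names `hilbert_d2xy` and `layout_bytes` are hypothetical.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order square grid.

    Stand-in for the paper's H-curve: both visit every cell exactly once
    while keeping consecutive indices in neighbouring cells.
    """
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves join up.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def layout_bytes(data, order=4):
    """Place up to 4**order byte values on a grid, following the curve."""
    n = 1 << order
    grid = [[None] * n for _ in range(n)]
    for d, value in enumerate(data[: n * n]):
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = value
    return grid


if __name__ == "__main__":
    grid = layout_bytes(bytes(range(256)))
    print(grid[0][:4])
```

With `order=4` the grid has 256 cells, matching the 16 × 16 layout mentioned in the H-curve figure caption. Bytes that are adjacent in the file end up in adjacent pixels, which is the locality property that motivates space-filling curves in binary visualisation.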
3.4. Generating More Image Datasets Based on α-Cuts
3.4.1. Creating Crisp Sets with an α-Cut
The α-Cut
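The figure caption that describes the α-cut states the rule directly: every channel value greater than or equal to α becomes 1 and every other value becomes 0. A minimal sketch of that rule follows, assuming an image is held as rows of `(r, g, b)` tuples with values in [0, 1]; that data layout and the name `alpha_cut_image` are illustrative assumptions, not the authors' code.

```python
def alpha_cut_image(image, alpha):
    """Reduce each R, G, B membership value to a crisp 0/1 value.

    A value is kept (set to 1) when it is >= alpha, otherwise it is 0.
    """
    return [
        [tuple(1 if channel >= alpha else 0 for channel in pixel) for pixel in row]
        for row in image
    ]


if __name__ == "__main__":
    # Hypothetical 1 x 2 image with channel values in [0, 1].
    image = [[(0.2, 0.3, 0.9), (0.5, 0.0, 0.29)]]
    print(alpha_cut_image(image, 0.3))
```

Because the comparison is inclusive, a channel value exactly equal to α survives the cut, in line with the "greater than or equal to" wording of the caption.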
3.5. Deep Learning for Image Recognition
Convolutional Neural Networks
3.6. Performance Evaluation Metrics
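- Accuracy is the percentage of images (malicious and benign) that were correctly predicted, where TP, TN, FP and FN denote the numbers of true positives (malicious images predicted as malicious), true negatives, false positives and false negatives: $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
- Precision is the percentage of correctly predicted malicious images over the total number of images predicted as malicious: $\text{Precision} = \frac{TP}{TP + FP}$
- Recall is the percentage of correctly predicted malicious images over the total number of malicious images: $\text{Recall} = \frac{TP}{TP + FN}$
- F-score is the harmonic mean of Precision and Recall, and demonstrates the robustness of a model: $\text{F-score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$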
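As a worked example of the four metrics, the sketch below computes them from confusion-matrix counts, treating "malicious" as the positive class. The counts in the usage example are hypothetical and chosen only for illustration; they are not taken from the experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F-score as fractions in [0, 1].

    tp/fn count malicious images predicted malicious/benign;
    fp/tn count benign images predicted malicious/benign.
    Assumes every denominator is non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in classification_metrics(tp=90, fp=10, fn=20, tn=80).items():
        print(f"{name}: {100 * value:.2f}%")
```

Multiplying each value by 100 gives percentages in the same form as the result tables.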
3.7. Programming Language
3.8. GPU Resources
4. Results
4.1. Overview
4.2. Experimental Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Sahin, M.; Bahtiyar, S. A Survey on Malware Detection with Deep Learning. In Proceedings of the 13th International Conference on Security of Information and Networks, Merkez, Turkey, 4–7 November 2020; pp. 1–6.
- Son, T.T.; Lee, C.; Le-Minh, H.; Aslam, N.; Dat, V.C. An enhancement for image-based malware classification using machine learning with low dimension normalized input images. J. Inf. Secur. Appl. 2022, 69, 103308.
- Vasan, D.; Alazab, M.; Wassan, S.; Safaei, B.; Zheng, Q. Image-Based malware classification using ensemble of CNN architectures (IMCEC). Comput. Secur. 2020, 92, 101748.
- Stupka, V.; Horák, M.; Husák, M. Protection of personal data in security alert sharing platforms. In Proceedings of the 12th International Conference on Availability, Reliability and Security, Reggio Calabria, Italy, 29 August–1 September 2017; pp. 1–8.
- Pawlicka, A.; Jaroszewska-Choras, D.; Choras, M.; Pawlicki, M. Guidelines for stego/malware detection tools: Achieving GDPR compliance. IEEE Technol. Soc. Mag. 2020, 39, 60–70.
- Yoo, I. Visualizing windows executable viruses using self-organizing maps. In Proceedings of the 2004 ACM Workshop on Visualization and Data mining For Computer Security, Washington, DC, USA, 29 October 2004; pp. 82–89.
- Conti, G.; Dean, E.; Sinda, M.; Sangster, B. Visual reverse engineering of binary and data files. In Proceedings of the International Workshop on Visualization for Computer Security; Springer: Berlin/Heidelberg, Germany, 2008; pp. 1–17.
- Nataraj, L.; Karthikeyan, S.; Jacob, G.; Manjunath, B.S. Malware images: Visualization and automatic classification. In Proceedings of the 8th International Symposium on Visualization for Cyber Security, Pittsburgh, PA, USA, 20 July 2011; pp. 1–7.
- Nataraj, L.; Yegneswaran, V.; Porras, P.; Zhang, J. A comparative assessment of malware classification using binary texture analysis and dynamic analysis. In Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence, Chicago, IL, USA, 21 October 2011; pp. 21–30.
- Nataraj, L.; Manjunath, B. Spam: Signal processing to analyze malware [applications corner]. IEEE Signal Process. Mag. 2016, 33, 105–117.
- Ni, S.; Qian, Q.; Zhang, R. Malware identification using visualization images and deep learning. Comput. Secur. 2018, 77, 871–885.
- Le, Q.; Boydell, O.; Mac Namee, B.; Scanlon, M. Deep learning at the shallow end: Malware classification for non-domain experts. Digit. Investig. 2018, 26, S118–S126.
- Baptista, I.; Shiaeles, S.; Kolokotronis, N. A novel malware detection system based on machine learning and binary visualization. In Proceedings of the 2019 IEEE International Conference on Communications Workshops (ICC Workshops), Shanghai, China, 20–24 May 2019; pp. 1–6.
- O’Shaughnessy, S. Image-based malware classification: A space filling curve approach. In Proceedings of the 2019 IEEE Symposium on Visualization for Cyber Security (VizSec), Vancouver, BC, Canada, 23 October 2019; pp. 1–10.
- O’Shaughnessy, S.; Sheridan, S. Image-based malware classification hybrid framework based on space-filling curves. Comput. Secur. 2022, 116, 102660.
- Shire, R.; Shiaeles, S.; Bendiab, K.; Ghita, B.; Kolokotronis, N. Malware squid: A novel iot malware traffic analysis framework using convolutional neural network and binary visualisation. In Internet of Things, Smart Spaces, and Next Generation Networks and Systems; Springer: Berlin/Heidelberg, Germany, 2019; pp. 65–76.
- Saridou, B.; Rose, J.R.; Shiaeles, S.; Papadopoulos, B. SAGMAD—A Signature Agnostic Malware Detection System Based on Binary Visualisation and Fuzzy Sets. Electronics 2022, 11, 1044.
- Cortesi, A. Available online: binvis.io (accessed on 3 February 2023).
- Khattab, D.; Ebied, H.M.; Hussein, A.S.; Tolba, M.F. Color image segmentation based on different color space models using automatic GrabCut. Sci. World J. 2014, 2014, 126025.
- Jungmann, A.; Jatzkowski, J.; Kleinjohann, B. Evaluation of color spaces for robust image segmentation. In Proceedings of the 2014 International Conference on Computer Vision Theory and Applications (VISAPP), Lisbon, Portugal, 5–8 January 2014; Volume 1, pp. 648–655.
- Balaji, T.; Sumathi, D.M. Effective features of remote sensing image classification using interactive adaptive thresholding method. arXiv 2014, arXiv:1401.7743.
- Srinivas, B.; Prasad, J.R. Enhanced Segmentation Algorithm for Hyper-spectral Imaging (HSI). Available online: https://www.jcreview.com/admin/Uploads/Files/61a8692a1af917.81078695.pdf (accessed on 18 January 2023).
- Randive, K.; Mohan, R.; Sivakrishna, A.M. An efficient pattern-based approach for insider threat classification using the image-based feature representation. J. Inf. Secur. Appl. 2023, 73, 103434.
- Sai Adhinesh Reddy, T.; Varma Vadlamudi, V.Y.; Acharya, S.; Rawat, U.; Bhatnagar, R. Windows Malware Detection Using CNN and AlexNet Learning Models. In Proceedings of the 8th International Conference on Advanced Intelligent Systems and Informatics, Cairo, Egypt, 20–22 November 2022; Springer International Publishing: Berlin/Heidelberg, Germany, 2022; pp. 271–283.
- Shaukat, K.; Luo, S.; Varadharajan, V. A novel deep learning-based approach for malware detection. Eng. Appl. Artif. Intell. 2023, 122, 106030.
- Marais, B.; Quertier, T.; Chesneau, C. Malware analysis with artificial intelligence and a particular attention on results interpretability. In Distributed Computing and Artificial Intelligence, Volume 1: 18th International Conference 18; Springer International Publishing: Berlin/Heidelberg, Germany, 2022; pp. 43–55.
- Ma, Z.; Zhang, Z.; Liu, C.; Hu, T.; Li, H.; Ren, B. Visualizable Malware Detection based on Multi-dimension Dynamic Behaviors. In Proceedings of the 2022 International Conference on Networking and Network Applications (NaNA), Urumqi, China, 3–5 December 2022; pp. 247–252.
- Mane, D.T.; Kumbharkar, P.B.; Javheri, S.B.; Moorthy, R. An Adaptable Ensemble Architecture for Malware Detection. In International Conference on Innovative Computing and Communications: Proceedings of ICICC; Springer: Singapore, 2022; Volume 3, pp. 647–659.
- Malani, H.; Bhat, A.; Palriwala, S.; Aditya, J.; Chaturvedi, A. A Unique Approach to Malware Detection Using Deep Convolutional Neural Networks. In Proceedings of the 2022 4th International Conference on Electrical, Control and Instrumentation Engineering (ICECIE), Kuala Lumpur, Malaysia, 26 November 2022; pp. 1–6.
- Lin, C.J.; Huang, M.S.; Lee, C.L. Malware Classification Using Convolutional Fuzzy Neural Networks Based on Feature Fusion and the Taguchi Method. Appl. Sci. 2022, 12, 12937.
- Lin, C.J.; Lin, X.Y.; Jhang, J.Y. Malware classification using a Taguchi-based deep learning network. Sens. Mater. 2022, 34, 3569–3580.
- Wang, S.; Wang, J.; Song, Y.; Li, S.; Huang, W. Malware Variants Detection Model Based on MFF–HDBA. Appl. Sci. 2022, 12, 9593.
- Chong, X.; Gao, Y.; Zhang, R.; Liu, J.; Huang, X.; Zhao, J. Classification of Malware Families Based on Efficient-Net and 1D-CNN Fusion. Electronics 2022, 11, 3064.
- Parihar, A.S.; Kumar, S.; Khosla, S. S-DCNN: Stacked deep convolutional neural networks for malware classification. Multimed. Tools Appl. 2022, 81, 30997–31015.
- Park, K.W.; Bu, S.J.; Cho, S.B. Evolutionary Triplet Network of Learning Disentangled Malware Space for Malware Classification. In Proceedings of the Hybrid Artificial Intelligent Systems: 17th International Conference, HAIS 2022, Salamanca, Spain, 5–7 September 2022; Springer International Publishing: Berlin/Heidelberg, Germany, 2022; pp. 311–322.
- Shukla, S.; Dhavlle, A.; PD, S.M.; Homayoun, H.; Rafatirad, S. Iron-Dome: Securing IoT Networked Systems at Runtime by Network and Device Characteristics to Confine Malware Epidemics. In Proceedings of the 2022 IEEE 40th International Conference on Computer Design (ICCD), Olympic Valley, CA, USA, 23–26 October 2022; pp. 259–262.
- Kwan, L.M. Markov Image with Transfer Learning for Malware Detection and Classification. In Proceedings of the TENCON 2022—2022 IEEE Region 10 Conference (TENCON), Hong Kong, China, 1–4 November 2022; pp. 1–6.
- Kiger, J.; Ho, S.S.; Heydari, V. Malware Binary Image Classification Using Convolutional Neural Networks. In Proceedings of the International Conference on Cyber Warfare and Security, Islamabad, Pakistan, 7–8 December 2022; Volume 17, pp. 469–478.
- Dharmalaksana, P.S.; Mantoro, T.; Khakim, L.; Nurseno, M. Improved Malware Detection Results using Visualization-Based Detection Techniques and Convolutional Neural Network. In Proceedings of the 2022 IEEE 8th International Conference on Computing, Engineering and Design (ICCED), Sukabumi, Indonesia, 28–29 July 2022; pp. 1–5.
- AlGarni, M.D.; AlRoobaea, R.; Almotiri, J.; Ullah, S.S.; Hussain, S.; Umar, F. An efficient convolutional neural network with transfer learning for malware classification. Wirel. Commun. Mob. Comput. 2022, 2022, 4841741.
- Cher, G.; Liu, S. Reducing Malware labeling Efforts Through Efficient Prototype Selection. In Proceedings of the 2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS), Hiroshima, Japan, 26–30 March 2022; pp. 17–22.
- Omar, M. New Approach to Malware Detection Using Optimized Convolutional Neural Network. In Proceedings of the Machine Learning for Cybersecurity: Innovative Deep Learning Solutions; Springer International Publishing: Berlin/Heidelberg, Germany, 2022; pp. 13–35.
- Ahmed, I.; Anisetti, M.; Ahmad, A.; Jeon, G. A Multilayer Deep Learning Approach for Malware Classification in 5G-Enabled IIoT. IEEE Trans. Ind. Inform. 2022, 19, 1495–1503.
- Onoja, M.; Aimufua, G.; Jegede, A.; Oyedele, A.; Mazadu, J.; Olibodum, K. Exploring the Effectiveness and Efficiency of LightGBM Algorithm for Windows Malware Detection. Available online: https://www.researchgate.net/profile/Abayomi-Jegede/publication/366167472_2022_5th_Information_Technology_for_Education_and_Development_ITED/links/63945b6311e9f00cda32f6fb/2022-5th-Information-Technology-for-Education-and-Development-ITED.pdf (accessed on 27 January 2023).
- Chauhan, D.; Singh, H.; Hooda, H.; Gupta, R. Classification of malware using visualization techniques. In International Conference on Innovative Computing and Communications: Proceedings of ICICC; Springer: Singapore, 2022; Volume 3, pp. 739–750.
- Sern, L.J.; Keng, T.K.; Fu, C.Z. BinImg2Vec: Augmenting Malware Binary Image Classification with Data2Vec. In Proceedings of the 2022 1st International Conference on AI in Cybersecurity (ICAIC), Victoria, TX, USA, 24–26 May 2022; pp. 1–6.
- Kavitha, P.M.; Muruganantham, B. Mal_CNN: An Enhancement for Malicious Image Classification Based on Neural Network. Cybern. Syst. 2022, 1–14.
- Belguendouz, H.; Guerid, H.; Kaddour, M. Static Classification of IoT Malware using Grayscale Image Representation and Lightweight Convolutional Neural Networks. In Proceedings of the 2022 5th International Conference on Advanced Communication Technologies and Networking (CommNet), Marrakech, Morocco, 12–14 December 2022; pp. 1–8.
- Agarwal, R.; Patel, S.; Katiyar, S.; Nailwal, S. Malware classification using automated transmutation and CNN. In Advanced Computing and Intelligent Technologies: Proceedings of ICACIT; Springer: Singapore, 2022; pp. 73–81.
- Fathurrahman, A.; Bejo, A.; Ardiyanto, I. Lightweight Convolution Neural Network for Image-Based Malware Classification on Embedded Systems. In Proceedings of the 2021 International Seminar on Machine Learning, Optimization, and Data Science (ISMODE), Jakarta, Indonesia, 29–30 January 2022; pp. 12–16.
- Ben Abdel Ouahab, I.; Elaachak, L.; Bouhorma, M. Image-Based Malware Classification Using Multi-layer Perceptron. In Networking, Intelligent Systems and Security: Proceedings of NISS; Springer: Singapore, 2022; pp. 453–464.
- Qiu, L.; Wang, S.; Wang, J.; Wang, Y.; Huang, W. Malware Classification based on a Light-weight Architecture of CNN: MalShuffleNet. In Proceedings of the 2022 3rd International Conference on Computer Vision, Image and Deep Learning & International Conference on Computer Engineering and Applications (CVIDL & ICCEA), Changchun, China, 20–22 May 2022; pp. 1047–1050.
- Nguyen, H.; Di Troia, F.; Ishigaki, G.; Stamp, M. Generative adversarial networks and image-based malware classification. J. Comput. Virol. Hacking Tech. 2023, 1–17.
- Nagaraju, R.; Stamp, M. Auxiliary-classifier GAN for malware analysis. In Artificial Intelligence for Cybersecurity; Springer: Berlin/Heidelberg, Germany, 2022; pp. 27–68.
- Tekerek, A.; Yapici, M.M. A novel malware classification and augmentation model based on convolutional neural network. Comput. Secur. 2022, 112, 102515.
- Kuo, W.C.; Chen, Y.T.; Huang, Y.C.; Wang, C.C. Malware Detection Based on Image Conversion. In Proceedings of the 2021 International Conference on Security and Information Technologies with AI, Internet Computing and Big-data Applications, Taichung, Taiwan, 18–20 November 2021; Springer International Publishing: Berlin/Heidelberg, Germany, 2022; pp. 180–190.
- Tran, K.; Di Troia, F.; Stamp, M. Robustness of Image-Based Malware Analysis. In Proceedings of the Silicon Valley Cybersecurity Conference: Third Conference, SVCC 2022, Virtual Event, 17–19 August 2022; Revised Selected Papers. Springer: Berlin/Heidelberg, Germany, 2023; pp. 3–21.
- Agrafiotis, G.; Makri, E.; Flionis, I.; Lalas, A.; Votis, K.; Tzovaras, D. Image-based Neural Network Models for Malware Traffic Classification using PCAP to Picture Conversion. In Proceedings of the 17th International Conference on Availability, Reliability and Security, Vienna, Austria, 23–26 August 2022; pp. 1–7.
- Kim, H.M.; Lee, K.H. IIoT Malware Detection Using Edge Computing and Deep Learning for Cybersecurity in Smart Factories. Appl. Sci. 2022, 12, 7679.
- Rose, J.R.; Swann, M.; Grammatikakis, K.P.; Koufos, I.; Bendiab, G.; Shiaeles, S.; Kolokotronis, N. IDERES: Intrusion detection and response system using machine learning and attack graphs. J. Syst. Archit. 2022, 131, 102722.
- Toldinas, J.; Venčkauskas, A.; Liutkevičius, A.; Morkevičius, N. Framing Network Flow for Anomaly Detection Using Image Recognition and Federated Learning. Electronics 2022, 11, 3138.
- Parkour, M. 16,800 Clean and 11,960 Malicious Files for Signature Testing and Research. Available online: https://contagiodump.blogspot.com/2013/03/16800-clean-and-11960-malicious-files.html (accessed on 6 October 2022).
- Palus, H. Representations of colour images in different colour spaces. In The Colour Image Processing Handbook; Springer: Berlin/Heidelberg, Germany, 1998; pp. 67–90.
- Chavolla, E.; Zaldivar, D.; Cuevas, E.; Perez, M.A. Color spaces advantages and disadvantages in image color clustering segmentation. In Advances in Soft Computing and Machine Learning in Image Processing; Springer: Berlin/Heidelberg, Germany, 2018; pp. 3–22.
- Maxwell, J.C. XVIII.—Experiments on Colour, as perceived by the Eye, with Remarks on Colour-Blindness. Earth Environ. Sci. Trans. R. Soc. Edinb. 1857, 21, 275–298.
- Maxwell, J.C. IV. On the theory of compound colours, and the relations of the colours of the spectrum. Philos. Trans. R. Soc. Lond. 1860, 10, 404–409.
- Klir, G.; Yuan, B. Fuzzy Sets and Fuzzy Logic; Prentice Hall: Upper Saddle River, NJ, USA, 1995; Volume 4.
- Wattenberg, M. A note on space-filling visualizations and space-filling curves. In Proceedings of the IEEE Symposium on Information Visualization, 2005. INFOVIS 2005, Minneapolis, MN, USA, 23–25 October 2005; pp. 181–186.
- Mandelbrot, B. Fractals; Freeman: San Francisco, CA, USA, 1977.
- He, T.; Tai, J.; Shan, Y.; Wang, X.; Liu, X. A fast acoustic emission beamforming localization method based on Hilbert curve. Mech. Syst. Signal Process. 2019, 133, 106291.
- Keller, A.; Wächter, C.; Binder, N. Rendering Along the Hilbert Curve. In Advances in Modeling and Simulation; Springer: Berlin/Heidelberg, Germany, 2022; pp. 319–332.
- Wang, X.; Sun, Y.; Sun, Q.; Lin, W.; Wang, J.Z.; Li, W. HCIndex: A Hilbert-Curve-based clustering index for efficient multi-dimensional queries for cloud storage systems. Clust. Comput. 2022, 1–15.
- Hilbert, D. Ueber die reellen Züge algebraischer Curven. Math. Ann. 1891, 38, 115–138.
- Niedermeier, R.; Reinhardt, K.; Sanders, P. Towards optimal locality in mesh-indexings. Discret. Appl. Math. 2002, 117, 211–237.
- Ross, T.J. Fuzzy Logic with Engineering Applications; John Wiley & Sons: Hoboken, NJ, USA, 2009.
- Fukushima, K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 1980, 36, 193–202.
- Fukushima, K.; Miyake, S. Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition. In Competition and Cooperation in Neural Nets; Springer: Berlin/Heidelberg, Germany, 1982; pp. 267–285.
- Fukushima, K.; Miyake, S. Neocognitron: A new algorithm for pattern recognition tolerant of deformations and shifts in position. Pattern Recognit. 1982, 15, 455–469.
- Fukushima, K.; Miyake, S.; Ito, T. Neocognitron: A neural network model for a mechanism of visual pattern recognition. IEEE Trans. Syst. Man Cybern. 1983, SMC-13, 826–834.
- Fukushima, K. A neural network model for selective attention in visual pattern recognition. Biol. Cybern. 1986, 55, 5–15.
- Fukushima, K. Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Netw. 1988, 1, 119–130.
- Fukushima, K. Analysis of the process of visual pattern recognition by the neocognitron. Neural Netw. 1989, 2, 413–420.
- LeCun, Y.; Boser, B.; Denker, J.S.; Henderson, D.; Howard, R.E.; Hubbard, W.; Jackel, L.D. Backpropagation applied to handwritten zip code recognition. Neural Comput. 1989, 1, 541–551.
- LeCun, Y.; Boser, B.; Denker, J.; Henderson, D.; Howard, R.; Hubbard, W.; Jackel, L. Handwritten digit recognition with a back-propagation network. Adv. Neural Inf. Process. Syst. 1989, 2, 396–404.
- Bromley, J.; Guyon, I.; LeCun, Y.; Säckinger, E.; Shah, R. Signature verification using a “siamese” time delay neural network. Adv. Neural Inf. Process. Syst. 1993, 6, 737–744.
- LeCun, Y.; Bengio, Y. Convolutional networks for images, speech, and time series. Handb. Brain Theory Neural Netw. 1995, 3361, 1995.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- Rawat, W.; Wang, Z. Deep convolutional neural networks for image classification: A comprehensive review. Neural Comput. 2017, 29, 2352–2449.
- Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 1–74.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Bendiab, G.; Shiaeles, S.; Alruban, A.; Kolokotronis, N. IoT malware network traffic classification using visual representation and deep learning. In Proceedings of the 2020 6th IEEE Conference on Network Softwarization (NetSoft), Ghent, Belgium, 29 June–3 July 2020; pp. 444–449.
- Van Rossum, G.; Drake, F.L., Jr. Python Tutorial; Centrum voor Wiskunde en Informatica: Amsterdam, The Netherlands, 1995.
- Warner, J.; Sexauer, J.; Unnikrishnan, A.; Castelão, G.; Pontes, F.A.; Uelwer, T.; Batista, F. JDWarner/Scikit-Fuzzy: Scikit-Fuzzy, Version 0.4.2; 2019. Available online: https://zenodo.org/record/3541386 (accessed on 7 June 2022).
- Google Colaboratory. Available online: https://colab.research.google.com/ (accessed on 11 August 2022).
- Kluyver, T.; Ragan-Kelley, B.; Pérez, F.; Granger, B.; Bussonnier, M.; Frederic, J.; Kelley, K.; Hamrick, J.; Grout, J.; Corlay, S.; et al. Jupyter Notebooks—A publishing format for reproducible computational workflows. In Proceedings of the Positioning and Power in Academic Publishing: Players, Agents and Agendas; Loizides, F., Schmidt, B., Eds.; IOS Press: Amsterdam, The Netherlands, 2016; pp. 87–90.
- Hoefler, T.; Alistarh, D.; Ben-Nun, T.; Dryden, N.; Peste, A. Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks. J. Mach. Learn. Res. 2021, 22, 10882–11005.
- Pichel, J.C.; Pateiro-López, B. A new approach for sparse matrix classification based on deep learning techniques. In Proceedings of the 2018 IEEE International Conference on Cluster Computing (CLUSTER), Belfast, UK, 10–13 September 2018; pp. 46–54.
- Ankner, Z.; Renda, A.; Dziugaite, G.K.; Frankle, J.; Jin, T. The Effect of Data Dimensionality on Neural Network Prunability. arXiv 2022, arXiv:2212.00291.
- Goled, S. Future Is Sparse: Prof Nir Shavit, Neural Magic. 2022. Available online: https://analyticsindiamag.com/future-is-sparse-prof-nir-shavit-neural-magic/ (accessed on 19 December 2022).
- Hammad, B.T.; Jamil, N.; Ahmed, I.T.; Zain, Z.M.; Basheer, S. Robust Malware Family Classification Using Effective Features and Classifiers. Appl. Sci. 2022, 12, 7877.
- Aboaoja, F.A.; Zainal, A.; Ghaleb, F.A.; Al-rimy, B.A.S.; Eisa, T.A.E.; Elnour, A.A.H. Malware detection issues, challenges, and future directions: A survey. Appl. Sci. 2022, 12, 8482.
- Banko, M.; Brill, E. Scaling to very very large corpora for natural language disambiguation. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France, 6–11 July 2001; pp. 26–33.
- Halevy, A.; Norvig, P.; Pereira, F. The unreasonable effectiveness of data. IEEE Intell. Syst. 2009, 24, 8–12.
- Loesdau, M.; Chabrier, S.; Gabillon, A. Hue and saturation in the RGB color space. In Proceedings of the International Conference on Image and Signal Processing; Springer: Berlin/Heidelberg, Germany, 2014; pp. 203–212.
- Chang, Y.C.; Reid, J.F. RGB calibration for color image analysis in machine vision. IEEE Trans. Image Process. 1996, 5, 1414–1422.
Character Type | ASCII Decimal Value | Colour Class | RGB Values |
---|---|---|---|
control | 1–31, 127 | green | (0, 1, 0) |
extended | 128–254 | red | (1, 0, 0) |
the NULL character | 0 | grey | (0.5, 0.5, 0.5) |
the non-breaking space | 255 | yellow | (1, 1, 0) |
numbers (part of printable) | 48–57 | orange | (1, 0.5, 0) |
letters (part of printable) | 65–90, 97–122 | magenta | (1, 0, 1) |
space (part of printable) | 32 | blue | (0, 0, 1) |
remaining printable | 33–47, 58–64, 91–96, 123–126 | cyan | (0, 1, 1) |
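
The colour-class table above can be read as a lookup from a byte value (0–255) to an RGB triple. A minimal sketch of that lookup follows; the ranges and RGB values are copied from the table, while the class keys and the function names `colour_class` and `bytes_to_rgb` are hypothetical names introduced for illustration.

```python
# RGB triple per colour class, copied from the table above.
CLASS_RGB = {
    "null": (0.5, 0.5, 0.5),       # grey
    "control": (0.0, 1.0, 0.0),    # green
    "space": (0.0, 0.0, 1.0),      # blue
    "number": (1.0, 0.5, 0.0),     # orange
    "letter": (1.0, 0.0, 1.0),     # magenta
    "printable": (0.0, 1.0, 1.0),  # cyan (remaining printable characters)
    "extended": (1.0, 0.0, 0.0),   # red
    "nbsp": (1.0, 1.0, 0.0),       # yellow (non-breaking space, 255)
}


def colour_class(byte):
    """Return the colour-class key for a byte value in 0..255."""
    if not 0 <= byte <= 255:
        raise ValueError("byte must be in 0..255")
    if byte == 0:
        return "null"
    if byte <= 31 or byte == 127:
        return "control"
    if byte == 32:
        return "space"
    if 48 <= byte <= 57:
        return "number"
    if 65 <= byte <= 90 or 97 <= byte <= 122:
        return "letter"
    if byte <= 126:
        return "printable"  # 33-47, 58-64, 91-96, 123-126
    if byte <= 254:
        return "extended"
    return "nbsp"


def bytes_to_rgb(data):
    """Map every byte of a file to the RGB triple of its colour class."""
    return [CLASS_RGB[colour_class(b)] for b in data]


if __name__ == "__main__":
    print(bytes_to_rgb(b"a 1\x00"))
```

This covers only the base colour per class; the tone adjustment via the fuzzy inference system is a separate step.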
FIS | |
---|---|
Type | Mamdani |
Inputs | Left Similarity, Right Similarity |
Output | Colour Tone |
Implication | min |
Aggregation | max |
Defuzzification | centroid |
Rule | Left Similarity | Right Similarity | Colour Tone |
---|---|---|---|
1 | Different | - | Light |
2 | Similar | - | Medium |
3 | Same | - | Dark |
4 | - | Different | Light |
5 | - | Similar | Medium |
6 | - | Same | Dark |
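
To make the FIS settings and the six rules above concrete, here is a small Mamdani-style sketch in plain Python (min implication, max aggregation, centroid defuzzification). The membership functions are not given in this section, so the triangular shapes on a [0, 1] universe below are purely assumed, as are the names `tri` and `colour_tone`. The references list scikit-fuzzy, but this sketch avoids it to stay self-contained.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b (a <= b <= c)."""
    if x <= a or x >= c:
        return 1.0 if x == b else 0.0
    if x < b:
        return (x - a) / (b - a)
    if x > b:
        return (c - x) / (c - b)
    return 1.0


# ASSUMED membership functions on [0, 1]; the paper's shapes may differ.
INPUT_TERMS = {"Different": (0.0, 0.0, 0.5), "Similar": (0.0, 0.5, 1.0), "Same": (0.5, 1.0, 1.0)}
OUTPUT_TERMS = {"Light": (0.0, 0.0, 0.5), "Medium": (0.0, 0.5, 1.0), "Dark": (0.5, 1.0, 1.0)}
# Rules 1-3 apply this map to Left Similarity, rules 4-6 to Right Similarity.
RULE_MAP = {"Different": "Light", "Similar": "Medium", "Same": "Dark"}


def colour_tone(left, right, steps=101):
    """Defuzzified colour tone in [0, 1] for two similarity inputs in [0, 1]."""
    # Firing strength of each of the six single-antecedent rules.
    fired = [
        (tri(value, *INPUT_TERMS[term]), tone)
        for value in (left, right)
        for term, tone in RULE_MAP.items()
    ]
    num = den = 0.0
    for i in range(steps):
        z = i / (steps - 1)
        # min implication clips each output set; max aggregates the rules.
        mu = max(min(strength, tri(z, *OUTPUT_TERMS[tone])) for strength, tone in fired)
        num += z * mu
        den += mu
    return num / den if den else 0.5  # centroid of the aggregated set


if __name__ == "__main__":
    for pair in [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]:
        print(pair, round(colour_tone(*pair), 3))
```

Under these assumed shapes, two "Same" neighbours push the tone towards the dark end and two "Different" neighbours towards the light end, which is the behaviour the rule table encodes.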

Colouring Method: [18], No Processing with α-Cut

Dataset Size | Accuracy (%) | Precision (%) | Recall (%) | F-Score (%) |
---|---|---|---|---|
1250 | 80.80 | 83.05 | 77.77 | 80.32 |
2500 | 89.60 | 87.21 | 92.80 | 89.92 |
5000 | 88.40 | 84.78 | 93.60 | 88.97 |
10,000 | 89.80 | 92.70 | 86.40 | 89.44 |

Colouring Method: [17], No Processing with α-Cut

Dataset Size | Accuracy (%) | Precision (%) | Recall (%) | F-Score (%) |
---|---|---|---|---|
1250 | 84.80 | 95.83 | 73.01 | 82.88 |
2500 | 88.40 | 91.37 | 84.80 | 87.96 |
5000 | 92.60 | 95.31 | 89.60 | 92.37 |
10,000 | 94.50 | 97.23 | 91.60 | 94.33 |

Colouring Method: [17], Reduced for

Dataset Size | Accuracy (%) | Precision (%) | Recall (%) | F-Score (%) |
---|---|---|---|---|
1250 | 92.00 | 90.76 | 93.65 | 92.18 |
2500 | 89.60 | 93.04 | 85.60 | 89.16 |
5000 | 91.80 | 95.63 | 87.60 | 91.44 |
10,000 | 93.89 | 95.82 | 91.80 | 93.76 |

Colouring Method: [17], Reduced for

Dataset Size | Accuracy (%) | Precision (%) | Recall (%) | F-Score (%) |
---|---|---|---|---|
1250 | 87.20 | 94.33 | 79.36 | 86.20 |
2500 | 88.80 | 91.45 | 85.60 | 88.42 |
5000 | 89.60 | 93.04 | 85.60 | 89.16 |
10,000 | 92.80 | 95.72 | 89.60 | 92.56 |

Colouring Method: Our Method, No Processing with α-Cut

Dataset Size | Accuracy (%) | Precision (%) | Recall (%) | F-Score (%) |
---|---|---|---|---|
1250 | 86.40 | 92.59 | 79.36 | 85.47 |
2500 | 90.40 | 94.69 | 85.60 | 89.91 |
5000 | 93.40 | 97.79 | 88.80 | 93.08 |
10,000 | 93.89 | 96.40 | 91.20 | 93.73 |

Colouring Method: Our Method, Reduced for

Dataset Size | Accuracy (%) | Precision (%) | Recall (%) | F-Score (%) |
---|---|---|---|---|
1250 | 87.20 | 89.83 | 84.12 | 86.88 |
2500 | 89.20 | 95.37 | 82.39 | 88.41 |
5000 | 93.40 | 96.56 | 90.00 | 93.16 |
10,000 | 92.90 | 93.86 | 91.80 | 92.82 |

Colouring Method: Our Method, Reduced for

Dataset Size | Accuracy (%) | Precision (%) | Recall (%) | F-Score (%) |
---|---|---|---|---|
1250 | 83.20 | 88.88 | 76.19 | 82.05 |
2500 | 91.20 | 94.78 | 87.20 | 90.83 |
5000 | 93.20 | 97.36 | 88.80 | 92.88 |
10,000 | 93.60 | 94.48 | 92.60 | 93.53 |
Hyperparameters | |
---|---|
Batch | 32 |
Optimiser | Stochastic Gradient Descent |
Learning Rate | 0.1 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Saridou, B.; Moulas, I.; Shiaeles, S.; Papadopoulos, B. Image-Based Malware Detection Using α-Cuts and Binary Visualisation. Appl. Sci. 2023, 13, 4624. https://doi.org/10.3390/app13074624