Article
Ensembling Supervised and Unsupervised Machine Learning
Algorithms for Detecting Distributed Denial of Service Attacks
Saikat Das 1,†, Mohammad Ashrafuzzaman 2,*,†, Frederick T. Sheldon 3,* and Sajjan Shiva 4
Abstract: The distributed denial of service (DDoS) attack is one of the most pernicious threats
in cyberspace. Attacks over the past two decades have resulted in catastrophic and
costly disruption of services across all sectors and critical infrastructure. Machine-learning-based
approaches have shown promise in developing intrusion detection systems (IDSs) for detecting
cyber-attacks, such as DDoS. Herein, we present a solution to detect DDoS attacks through an
ensemble-based machine learning approach that combines supervised and unsupervised machine
learning ensemble frameworks. This combination yields higher performance in detecting known
DDoS attacks using the supervised ensemble and zero-day DDoS attacks using the unsupervised
ensemble. The unsupervised ensemble, which employs novelty and outlier detection, is effective
in identifying previously unseen attacks. The ensemble framework is tested using three well-known
benchmark datasets, NSL-KDD, UNSW-NB15, and CICIDS2017. The results show that ensemble
classifiers significantly outperform single-classifier-based approaches. Our combined supervised
and unsupervised ensemble model correctly detects up to 99.1% of the DDoS attacks,
with a negligible rate of false alarms.
Keywords: network security; DDoS attack detection; machine learning; ensemble

1. Introduction

Denial of service (DoS) is a cyber-attack that an adversary launches on a network service by sending a huge number of service requests, in various shapes and forms, beyond what the server can handle. The attack disrupts the normal functioning of the server and, depending on the severity, can result in a slowdown of performance or, ultimately, a complete shutdown. Distributed denial of service (DDoS) attacks, a more dangerous variant of DoS, are often carried out by utilizing botnets of many different compromised devices that send simultaneous and coordinated service requests to the targeted server. DDoS, which was initially a tool used by hacktivists, has effectively been weaponized by cyber criminals to disrupt and extort organizations in ways that shut down operations and reduce accessibility to existing networking infrastructure.

The first reported DoS attack was perpetrated against Panix, the third-oldest internet service provider (ISP), in September of 1996 [1]. The first reported DDoS attack brought down e-commerce sites of large companies like Amazon, Yahoo!, eBay, Dell, and others in February of 2000 [2]. Despite numerous technological innovations devised to prevent and counter DDoS attacks, according to NETSCOUT’s DDoS Threat Intelligence Report, there were approximately 7.9 million DDoS attacks in the first half of 2023, a 31% year-over-year increase.

Extensive research has been carried out to develop methods for detecting and preventing such attacks [3]. Typically, intrusion detection systems (IDSs) are based on known attack signatures.
1.2. Contributions
The contributions of this work are enumerated below:
• Five standalone supervised and five unsupervised classifiers were used, and their
performance in detecting DDoS attacks was recorded.
• The classification results of individual supervised models were combined to create
six ensemble models using (i) majority voting, (ii) logistic regression, (iii) naïve Bayes,
(iv) neural network, (v) decision tree, and (vi) support vector machine.
• Similarly, the classification results of individual unsupervised models were combined
to create six ensemble models.
• Comparative analyses demonstrated that, for both the supervised and unsupervised
cases, the ensemble models outperformed the corresponding individual models.
• These supervised and unsupervised ensemble frameworks were then combined, pro-
viding results that detect both seen and unseen DDoS attacks with higher accuracy.
• Three well-known IDS benchmark datasets were used for experimentation and to
validate the scheme. The detailed performance evaluation results with these datasets
proved the robustness and effectiveness of the scheme.
• The scheme was further verified using a verification dataset consisting of ground truths.
1.3. Organization
The remainder of this article is organized as follows: In Section 2, we review machine-
learning-based approaches to network intrusion detection from the literature. The threat
model is given in Section 3. Section 4 describes the machine learning algorithms, including
the ensemble mechanism, used in this study. The machine learning ensemble-based frame-
work of the proposed solution is presented in Section 5. Experiments with the framework
using the NSL-KDD, UNSW-NB15, and CICIDS2017 datasets are described in Section 6.
The results of the experiments are discussed in Section 7. In Section 8, we present the
conclusions and future work. Additional graphs and tables showing results for all three
datasets are given in the Appendix A.
2. Related Work
Herein, we summarize the studies that focused on the detection of DDoS attacks using
different machine learning methods and techniques. We also discuss how our method is
different from those attempts and in what ways our method improves on the current state
of this research area.
Belavagi and Muniyal [9] used supervised learning and Ashfaq et al. [10] used semisu-
pervised learning to build training models to detect DDoS attacks. Meera Gandhi [11]
investigated the performance of four supervised classifiers, namely, J48, IBk, multilayer
perceptron (MLP), and naïve Bayes (NB), using the DARPA-Lincoln dataset [12] in de-
tecting intrusion, including DoS. Perez et al. [13] used both supervised and unsupervised
classifiers, including artificial neural network (ANN), support vector machine (SVM), k-
nearest neighbor (kNN), principal component analysis (PCA), and generalized random
forest (GRF). Villalobos et al. [14] proposed a distributed and collaborative architecture for
a network intrusion detection system using an in-memory distributed graph, employing
the unsupervised k-means method using a synthetic dataset composed of mainly DNS
traffic. Jabez and Muthukumar [15] used the neighborhood outlier factor (NOF) for in-
trusion detection with a dataset that consisted of network data collected using snort, an
open-source packet sniffer. Bindra and Sood [16] used only supervised methods, such
as linear SVM, Gaussian NB, kNN, random forest (RF), logistic regression (LR), and lin-
ear discriminant analysis (LDA), using four datasets. However, none of these datasets
included recent DDoS attacks. The best results obtained included a 96% accuracy with LR.
Similarly, Lima-Filho et al. [17] used individual supervised ML models, namely, RF, LR,
AdaBoost, stochastic gradient descent (SGD), decision tree (DT), and MLP, on the CIC-DoS,
CICIDS2017, and CSE-CIC-IDS2018 datasets, producing similar accuracy.
Idhammad et al. [18] used sequential semisupervised learning and entropy estima-
tion, coclustering, and information gain ratio for reducing the feature space and then
used various extra-trees semisupervised methods to detect DDoS attacks on three public
datasets, namely, NSL-KDD, UNB ISCX 12, and UNSW-NB15, obtaining accuracies of
98.23%, 99.88%, and 93.71% on those datasets, respectively. Suresh et al. [19] used chi-
square and the information gain ratio for feature selection and then used NB, C4.5, SVM,
kNN, K-means, and fuzzy c-means clustering methods, with fuzzy c-means clustering
giving higher accuracy in identifying DDoS attacks. Usha et al. [20] used different machine
learning algorithms (extreme gradient boosting (XGBoost), kNN, SGD, and NB) and a deep learning
architecture (convolutional neural network (CNN)) to identify and classify attacks. The
result showed that XGBoost achieved the highest accuracy, while CNN and kNN also gave
comparable figures. They used the CICDDoS2019 dataset for validating their approach.
Zhang et al. [21] presented a detection algorithm that combines the power spectral density
(PSD) entropy function and SVM to detect low-rate DoS (LDoS) traffic from normal traffic.
The decision algorithm uses the detection rate and efficiency as adjustable parameters.
Yuan et al. [22] used a recurrent neural network (RNN) to learn patterns from sequences of
network traffic and trace network attack activities. They reported an error rate of 2.103%
in their validation of the approach using the UNB ISCX Intrusion Detection Evaluation
2012 dataset. Hou et al. [23] proposed a hybrid scheme whereby the features in the traffic
data are selected using NetFlow, and the attack detection is determined using a number of
supervised machine learning methods, including SVM, C4.5 DT, Adaboost, and RF. They
reported a 99% accuracy with a 0.5% false-positive rate on the CICIDS 2017 dataset.
Ensemble learning [5] is a way of combining multiple classifiers to achieve better performance
compared to that of a single classifier, and many classification problems have benefited
from this idea. There are generally two types of ensembles: homogeneous and heteroge-
neous. In homogeneous ensembles (e.g., bagging and boosting), a number of similar types
of classifiers are combined to build the training model. In heterogeneous ensembles (e.g.,
stacking) [24], different types of classifiers are combined to construct the training model.
Aburomman and Reaz [3] provided a detailed survey of ensembles with homogeneous,
heterogeneous, and hybrid classifiers and the usage and shortfalls of machine learning
ensembles in network security. Smyth et al. [25] used stacked density estimation and
showed that this outperforms the best single-classifier models. Hybrid supervised models
were used by Hosseini and Azizi [26] for DDoS detection. Das et al. [6,7] used supervised
and unsupervised ensembles for detecting DDoS attacks using the NSL-KDD dataset. Ao
et al. [27] improved the supervised model's accuracy by combining supervised and
unsupervised models, using MAP estimation with the quasi-Newton method to map objects
and groups from the object–group co-occurrence matrix into an embedded coordinate space.
Mittal et al. [28] surveyed deep-learning-based DDoS attack detection proposals in
the literature. They compared the different works based on ‘accuracy’, which is not a
useful performance metric to determine the attack detection capability of the techniques.
Nevertheless, they reported that the best accuracy rating among the methods they surveyed
was 99.1% on the NSL-KDD dataset and 99.9% on the CICIDS2017 dataset. They also
observed that there was a lack of work on unseen data and zero-day attacks.
Our literature review shows that a number of researchers used only supervised meth-
ods or unsupervised methods for DDoS detection; however, none of them used ensemble-
based approaches that combine both supervised and unsupervised methods. Unlike these
previous studies, we devised an effective scheme that forms two sets of stacking ensembles,
one consisting of supervised models and another of unsupervised models. The detection of
DDoS attacks is achieved by the best-performing models from each set. The framework
supports feature selection, hyperparameter tuning, and data curation. Table 6 in Section 6
gives a comparative performance summary of our work with some existing works.
3. Threat Model
In a DDoS attack, an adversary uses a network of bots to overload the targeted server
with a flood of service requests. This overwhelms the request-handling capacity of the
server, rendering it unable to process legitimate service requests properly or at all, hence
effectively denying the service.
The threat model used in this study assumes the following:
1. The DDoS traffic originates from many different compromised hosts or botnets.
2. The attacks can be caused by IP spoofing or flooding requests.
3. The attacks can target the different OSI layers of the target network.
4. Either high-volume traffic (including DNS amplification) or low-volume traffic can
cause a DDoS attack.
5. The attackers can initiate various attack vectors, including Back Attack, Land Attack,
Ping of Death, Smurf Attack, HTTP Flood, Heartbleed, etc. (more information about
these DDoS attacks can be found in Balaban [29]).
Our proposed method, combining supervised and unsupervised ensembles, is capable
of detecting these DDoS attacks while maintaining a very low false-positive rate. We applied an
empirical validation approach (i.e., experiment-based) to this threat model by analyzing
the results of the experiments. The validation process also compared the results with those
of existing methods.
Figure 1. Illustration of how the individual classifications are combined using different ensemble
models.
In this paper, we use the terms metaclassifiers and ensemble classifiers interchange-
ably. The ensemble process we used is detailed in Section 5. Figure 1 illustrates how the
individual classifiers are combined in the ensemble models.
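As a concrete illustration of this stacking arrangement, the following is a minimal sketch in Python with scikit-learn [31]. The base learners and metaclassifier shown here are illustrative stand-ins, not the exact models of this study (those are described in Section 5); the `X_train`, `y_train`, and `X_test` arrays are assumed to be preprocessed data:

```python
# Minimal stacking sketch: each base classifier's predictions become one
# column of a new dataset, on which a metaclassifier is then trained.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

def prediction_matrix(models, X):
    """Stack each model's class predictions as columns of a new feature matrix."""
    return np.column_stack([m.predict(X) for m in models])

# Three base models shown for brevity; the study uses five (0 = benign, 1 = DDoS).
base_models = [SVC(), DecisionTreeClassifier(), GaussianNB()]
for m in base_models:
    m.fit(X_train, y_train)

meta = LogisticRegression()  # e.g., an Ens_LR-style metaclassifier
meta.fit(prediction_matrix(base_models, X_train), y_train)
y_pred = meta.predict(prediction_matrix(base_models, X_test))
```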
5. Proposed Method
This section describes the proposed ensemble-based DDoS attack detection scheme.
Figure 2 depicts the detailed architectural diagram of the process flow. The individual
processing phases are (1) data collection, (2) data preprocessing, (3) feature selection,
(4) data classification, and (5) DDoS detection.
Figure 2. Process flow of the proposed ensemble approach to detect DDoS attacks.
After comparing the performance of the five individual and six ensemble models, the best-performing supervised
model is retained. The operations in lines 21–27 are similar to those in lines 6–12, but in this round
the unsupervised classifiers are invoked. Similarly, the prediction matrices obtained from
these individual unsupervised models are accumulated, and a new dataset is created for
ensemble classification. Lines 30–34 again perform the same operations as lines 15–19,
but with this new dataset. Finally, the algorithm produces the two best-performing
models: m1 from the supervised ensemble framework and m2 from the unsupervised
ensemble framework, which are subsequently used for DDoS detection.
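The unsupervised side can be sketched analogously. The detectors below are illustrative novelty/outlier models available in scikit-learn (an elliptic envelope, EE, appears in our results; the others are stand-ins), and the F1-based selection mirrors the best-model retention in lines 30–34 of the algorithm. The `X_benign`, `X_train`, `y_train`, `X_val`, and `y_val` splits are assumed:

```python
# Sketch of the unsupervised ensemble and best-model selection (illustrative).
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import f1_score

# Novelty detectors are fitted on benign traffic only, so previously
# unseen attacks surface as outliers.
detectors = [EllipticEnvelope(), IsolationForest(), OneClassSVM()]
for d in detectors:
    d.fit(X_benign)

def attack_matrix(detectors, X):
    """predict() returns +1 (inlier) / -1 (outlier); map outliers to label 1 (DDoS)."""
    return np.column_stack([(d.predict(X) == -1).astype(int) for d in detectors])

# Train candidate metaclassifiers on the accumulated prediction matrix and
# keep the one with the best F1 score on a labeled validation split.
candidates = {"Ens_LR": LogisticRegression(), "Ens_NB": GaussianNB()}
P_train, P_val = attack_matrix(detectors, X_train), attack_matrix(detectors, X_val)
scores = {}
for name, clf in candidates.items():
    clf.fit(P_train, y_train)
    scores[name] = f1_score(y_val, clf.predict(P_val))
m2 = candidates[max(scores, key=scores.get)]  # best unsupervised ensemble model
```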
Figure 3. DDoS detection using the combined best performing supervised and unsupervised models.
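The combination step in Figure 3 can be expressed compactly. The sketch below assumes, consistent with the "OR'ed" rows of Table 5, that a traffic instance is flagged as DDoS when either of the two best models flags it; the function and parameter names are illustrative:

```python
import numpy as np

def detect_ddos(m1, m2, P_sup, P_unsup):
    """Flag an instance as DDoS (1) if either the best supervised ensemble (m1)
    or the best unsupervised ensemble (m2) predicts an attack on its own
    prediction matrix; otherwise benign (0)."""
    return np.logical_or(m1.predict(P_sup) == 1,
                         m2.predict(P_unsup) == 1).astype(int)
```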
6. Experiments
In this section, we present the set of experiments performed with the ensemble frame-
work using three benchmark datasets, which have been thoroughly studied.
6.2. Datasets
In this study, the three datasets mentioned above were used for experimentation, as
summarized in Table 1, which includes the number of instances used for training, testing,
and verification across both the supervised and unsupervised techniques. Next, we briefly
describe these datasets.
Table 2. Reduced feature sets for the NSL-KDD dataset that were used in experiments (adopted from [6]).

| Feature Set | Selection Method | No. of Features | Selected Features |
|---|---|---|---|
| FS-1 | Degree of dependency and dependency ratio | 24 | 2, 3, 4, 5, 7, 8, 10, 13, 23, 24, 25, 26, 27, 28, 29, 30, 33, 34, 35, 36, 38, 39, 40, 41 |
| FS-2 | FS-3 ∩ FS-4 ∩ FS-5 ∩ FS-6 | 13 | 3, 4, 29, 33, 34, 12, 39, 5, 30, 38, 25, 23, 6 |
| FS-3 | Information gain | 14 | 5, 3, 6, 4, 30, 29, 33, 34, 35, 38, 12, 39, 25, 23 |
| FS-4 | Gain ratio | 14 | 12, 26, 4, 25, 39, 6, 30, 38, 5, 29, 3, 37, 34, 33 |
| FS-5 | Chi-squared | 14 | 5, 3, 6, 4, 29, 30, 33, 34, 35, 12, 23, 38, 25, 39 |
| FS-6 | ReliefF | 14 | 3, 29, 4, 32, 38, 33, 39, 12, 36, 23, 26, 34, 40, 31 |
| FS-7 | Mutual information gain | 12 | 23, 5, 3, 6, 32, 24, 12, 2, 37, 36, 8, 31 |
| FS-8 | Domain knowledge | 16 | 2, 4, 10, 14, 17, 19, 21, 24, 25, 26, 27, 30, 31, 34, 35, 39 |
| FS-9 | Gain ratio | 35 | 9, 26, 25, 4, 12, 39, 30, 38, 6, 29, 5, 37, 11, 3, 22, 35, 34, 14, 33, 23, 8, 10, 31, 27, 28, 32, 1, 36, 2, 41, 40, 17, 13, 16, 19 |
| FS-10 | Information gain | 19 | 3, 4, 5, 6, 12, 23, 24, 25, 26, 29, 30, 32, 33, 34, 35, 36, 37, 38, 39 |
| FS-11 | Genetic algorithm | 15 | 4, 5, 6, 8, 10, 12, 17, 23, 26, 29, 30, 32, 37, 38, 39 |
| FS-12 | Full set | 41 | All features |
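As an illustration of how such reduced feature sets can be derived, the following sketch ranks features by mutual information (cf. information gain, FS-3) and by the chi-squared statistic (cf. FS-5) using scikit-learn [31]. The function name and parameters are illustrative, and it assumes the features have been scaled to be non-negative, as chi-squared requires:

```python
import numpy as np
from sklearn.feature_selection import chi2, mutual_info_classif

def top_k_features(X, y, k=14, method="info_gain"):
    """Return the 1-based indices of the k highest-scoring features,
    matching the indexing convention of Table 2."""
    if method == "info_gain":
        scores = mutual_info_classif(X, y)
    else:  # chi-squared assumes non-negative features (e.g., min-max scaled)
        scores, _ = chi2(X, y)
    return np.argsort(scores)[::-1][:k] + 1

fs3 = top_k_features(X, y, k=14, method="info_gain")  # cf. FS-3
fs5 = top_k_features(X, y, k=14, method="chi2")       # cf. FS-5
```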
|               | Predicted Normal | Predicted Attack |
|---|---|---|
| Actual Normal | TN | FP |
| Actual Attack | FN | TP |
This study used the following standard evaluation metrics to measure the performance
of the models [36]:
1. Accuracy [(TP + TN)/(TP + TN + FP + FN)]: the proportion of correctly classified
instances (both DDoS and normal) over the total number of data instances.
2. Precision [TP/(TP + FP)]: how often an instance the model identifies as a DDoS
attack actually is one.
3. False-positive rate (FPR) [FP/(FP + TN)]: how often the model raises a false alarm
by identifying a normal data instance as a DDoS attack.
4. Recall [TP/(TP + FN)]: how many of the actual DDoS attacks the model correctly
identifies. Recall is also known as the true-positive rate, sensitivity, or the DDoS
detection rate.
5. F1 score [2 × Precision × Recall/(Precision + Recall)]: the harmonic mean of precision
and recall.
We also used the ROC AUC score, which quantifies the diagnostic ability of a binary
classifier. The ROC curve is a plot of the true-positive rate (TPR) against the false-positive
rate (FPR). In DDoS detection, the higher the ROC AUC, the better the model is at
discriminating DDoS from benign traffic. We also measured the elapsed times for training
and testing for each model.
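For reference, all of these metrics follow directly from the confusion matrix; a minimal sketch (e.g., with scikit-learn [31], assuming binary labels 0 = benign, 1 = DDoS):

```python
from sklearn.metrics import confusion_matrix, roc_auc_score

def evaluate(y_true, y_pred, y_score=None):
    """Compute the metrics listed above from a binary confusion matrix."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    m = {
        "accuracy":  (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "fpr":       fp / (fp + tn),
        "recall":    tp / (tp + fn),
    }
    m["f1"] = 2 * m["precision"] * m["recall"] / (m["precision"] + m["recall"])
    if y_score is not None:  # e.g., predicted probabilities, for ROC AUC
        m["roc_auc"] = roc_auc_score(y_true, y_score)
    return m
```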
Figure 4. Back-to-back performance comparison of the individual (LHS) vs. ensemble (RHS) models
on all three datasets.
Tables A1–A6 in Appendix A show that most ensemble models performed better
than the corresponding individual models. We also observed that the supervised models
performed better than the unsupervised models. This was not unexpected, as supervised
models are trained with labeled data and therefore can classify with consistent accuracy.
On the other hand, if a DDoS attack data instance did not match one of the learned patterns,
i.e., it was an unseen attack, then the unsupervised models performed better in detecting it
than the supervised models. Moreover, in practice, raw data are unlabeled, and additional data
engineering is required to label these. For these reasons, we included both the supervised
and unsupervised models in our framework.
The last column in these tables, the elapsed time taken to train and test the models,
shows that the ensemble models required more time. This is because in an ensemble
mechanism, all the constituent individual models run first, and then the ensemble classifier
runs. Therefore, the elapsed time for ensemble models includes the time taken by the
individual models, plus its own time. However, in a real-world machine learning applica-
tion, model training is performed offline, and these pre-trained models are deployed for
real-time testing, rendering the issue of comparatively higher elapsed time insignificant.
High-performance computing platforms with graphical processing units (GPUs) can also
be used to speed up offline training.
Table 5 shows the results of the experiment with the verification dataset (as described
in Section 5.1) that was used to verify the DDoS detection ability of the framework. This
experiment demonstrated the final detection performance of our approach. The detection
results were compared with the ground truths associated with the verification dataset and
that gave us the performance metric values. A comparison of the performance metric values
in Table 5 with the corresponding values in Table 4 demonstrates that the metrics associated
with the verification sets were similar to those of the test datasets, which indicated that the
our ensemble approach correctly detected at most 99.1% of the DDoS attacks and incorrectly
detected at least 0.01% of benign data instances as DDoS.
Table 6 compares the performance of our proposed method with that of existing
ensemble-based works in the literature, demonstrating that our proposed method outper-
formed all of them.
Table 4. Performance of the best individual and ensemble models for each dataset and learning type.

| Dataset | Learning Type | Classifier Category | Model | F1 Score | Accuracy | Precision | Recall | FPR | ROC AUC | Elapsed Time (s) |
|---|---|---|---|---|---|---|---|---|---|---|
| NSL-KDD | Supervised | Individual | SVM | 0.967 | 0.972 | 0.984 | 0.949 | 0.012 | 0.959 | 468.64 |
| | | Ensemble | Ens_DT | 0.971 | 0.975 | 0.991 | 0.952 | 0.006 | 0.977 | 468.68 |
| | Unsupervised | Individual | EE | 0.897 | 0.910 | 0.904 | 0.891 | 0.075 | 0.908 | 49.59 |
| | | Ensemble | Ens_NB | 0.951 | 0.955 | 0.926 | 0.978 | 0.063 | 0.957 | 1122.49 |
| UNSW-NB15 | Supervised | Individual | DT | 0.978 | 0.985 | 0.977 | 0.979 | 0.012 | 0.984 | 0.282 |
| | | Ensemble | Ens_LR | 0.979 | 0.986 | 0.982 | 0.976 | 0.009 | 0.983 | 26.29 |
| | Unsupervised | Individual | EE | 0.855 | 0.885 | 0.747 | 1.000 | 0.173 | 0.913 | 3.890 |
| | | Ensemble | Ens_LR | 0.896 | 0.925 | 0.851 | 0.945 | 0.085 | 0.930 | 313.56 |
| CICIDS2017 | Supervised | Individual | DT | 0.999 | 0.999 | 0.999 | 0.999 | 0.001 | 0.999 | 60.46 |
| | | Ensemble | Ens_LR | 0.999 | 0.999 | 0.999 | 0.999 | 0.001 | 1.000 | 112,599.96 |
| | Unsupervised | Individual | EE | 0.648 | 0.783 | 0.637 | 0.659 | 0.163 | 0.748 | 203.87 |
| | | Ensemble | Ens_DT | 0.703 | 0.834 | 0.773 | 0.645 | 0.083 | 0.781 | 54,780.97 |
Table 5. DDoS detection results on the verification dataset (900 DDoS and 100 benign instances per dataset).

| Dataset | Method | Best Model | Instances (DDoS) | Instances (Benign) | TP | FP | TN | FN | F1 Score | Accuracy | Precision | Recall | FPR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NSL-KDD | Sup En | Ens_DT | 900 | 100 | 861 | 1 | 99 | 39 | 0.977 | 0.960 | 0.999 | 0.957 | 0.010 |
| | Unsup En | Ens_NB | 900 | 100 | 859 | 5 | 95 | 41 | 0.974 | 0.954 | 0.994 | 0.954 | 0.050 |
| | OR'ed | N/A | 900 | 100 | 861 | 2 | 98 | 39 | 0.977 | 0.959 | 0.998 | 0.957 | 0.020 |
| UNSW-NB15 | Sup En | Ens_LR | 900 | 100 | 863 | 2 | 98 | 37 | 0.978 | 0.961 | 0.998 | 0.959 | 0.020 |
| | Unsup En | Ens_LR | 900 | 100 | 739 | 7 | 93 | 161 | 0.898 | 0.832 | 0.991 | 0.821 | 0.070 |
| | OR'ed | N/A | 900 | 100 | 863 | 4 | 96 | 37 | 0.977 | 0.959 | 0.995 | 0.959 | 0.040 |
| CICIDS2017 | Sup En | Ens_LR | 900 | 100 | 892 | 1 | 99 | 8 | 0.995 | 0.991 | 0.999 | 0.991 | 0.010 |
| | Unsup En | Ens_DT | 900 | 100 | 723 | 9 | 91 | 177 | 0.886 | 0.814 | 0.988 | 0.803 | 0.090 |
| | OR'ed | N/A | 900 | 100 | 892 | 5 | 95 | 8 | 0.993 | 0.987 | 0.994 | 0.991 | 0.050 |
Table 6. A performance comparison of our method with that of existing ensemble methods.
8. Conclusions
Detecting and preventing DDoS attacks is a very active field of research. Machine-
learning-based detection approaches are showing promise in detecting DDoS attacks,
although this line of research is not yet mature. This paper described a machine-learning-based
approach involving a unique kind of ensemble algorithm for detecting DDoS attacks. In this algorithm,
the outputs of five different supervised classifiers are fed as input to six metaclassifiers.
Separately, the outputs of five different unsupervised classifiers are fed as input to six
metaclassifiers. Then, these two ensembles are combined to obtain the best results.
We experimented with the algorithm using three benchmark datasets. The results
clearly showed the effectiveness of the algorithm on all three datasets in correctly detecting
DDoS attacks without many false alarms.
Future Work
Considering this work as a baseline, we plan to extend our method to detect DDoS
attacks at an early stage by identifying the crucial features using interpretable machine
learning techniques. Since the benchmark datasets that we used in these experiments
are offline, we plan to employ our comprehensive solution as a DDoS attack discriminator
with real-time traffic.
Author Contributions: Conceptualization, S.D.; methodology, S.D. and M.A.; software, S.D. and M.A.;
validation, S.D. and M.A.; formal analysis, S.D., M.A., F.T.S. and S.S.; investigation, S.D., M.A., F.T.S.
and S.S.; writing—original draft preparation, S.D. and M.A.; writing—review and editing, S.D., M.A.,
F.T.S. and S.S.; supervision, S.S. All authors have read and agreed to the published version of the
manuscript.
Funding: This research received no external funding.
Data Availability Statement: We used publicly available datasets in this work, all of which are cited
in the paper.
Conflicts of Interest: The authors declare no conflicts of interest.
Table A1. Performance metrics for NSL-KDD dataset using supervised ensemble framework.
Table A2. Performance metrics for NSL-KDD test dataset using unsupervised ensemble framework.
Figure A1. Performance metrics for the models on the NSL-KDD dataset.
Figure A2. ROC curves for all the models running on the NSL-KDD dataset.
Table A3. Performance metrics for UNSW-NB15 dataset using supervised ensemble framework.
Table A4. Performance metrics for UNSW-NB15 test dataset using unsupervised ensemble framework.
Figure A3. Performance metrics for the models running on the UNSW-NB15 dataset.
Figure A4. ROC curves for all the models running on the UNSW-NB15 dataset.
Table A5. Performance metrics for CICIDS2017 dataset using supervised ensemble framework.
Table A6. Performance metrics for CICIDS2017 test dataset using unsupervised ensemble framework.
Figure A5. Performance metrics for the models running on the CICIDS2017 dataset.
Figure A6. ROC curves for all the models running on the CICIDS2017 dataset.
References
1. Calem, R.E. New York’s Panix Service is Crippled by Hacker Attack. The New York Times, 14 September 1996; pp. 1–3.
2. Famous DDoS Attacks: The Largest DDoS Attacks of All Time. Cloudflare 2020. Available online: https://www.cloudflare.com/
learning/ddos/famous-ddos-attacks/ (accessed on 14 February 2024).
3. Aburomman, A.A.; Reaz, M.B.I. A survey of intrusion detection systems based on ensemble and hybrid classifiers. Comput. Secur.
2017, 65, 135–152. [CrossRef]
4. Gogoi, P.; Bhattacharyya, D.; Borah, B.; Kalita, J.K. A survey of outlier detection methods in network anomaly identification.
Comput. J. 2011, 54, 570–588. [CrossRef]
5. Dietterich, T.G. Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems; Springer:
Berlin/Heidelberg, Germany, 2000; pp. 1–15.
6. Das, S.; Venugopal, D.; Shiva, S. A Holistic Approach for Detecting DDoS Attacks by Using Ensemble Unsupervised Machine
Learning. In Proceedings of the Future of Information and Communication Conference, San Francisco, CA, USA, 5–6 March 2020;
pp. 721–738.
7. Das, S.; Mahfouz, A.M.; Venugopal, D.; Shiva, S. DDoS Intrusion Detection Through Machine Learning Ensemble. In Proceedings
of the 2019 IEEE 19th International Conference on Software Quality, Reliability and Security Companion (QRS-C), Sofia, Bulgaria,
22–26 July 2019; pp. 471–477.
8. Ashrafuzzaman, M.; Das, S.; Chakhchoukh, Y.; Shiva, S.; Sheldon, F.T. Detecting stealthy false data injection attacks in the smart
grid using ensemble-based machine learning. Comput. Secur. 2020, 97, 101994. [CrossRef]
9. Belavagi, M.C.; Muniyal, B. Performance evaluation of supervised machine learning algorithms for intrusion detection. Procedia
Comput. Sci. 2016, 89, 117–123. [CrossRef]
10. Ashfaq, R.A.R.; Wang, X.Z.; Huang, J.Z.; Abbas, H.; He, Y.L. Fuzziness based semi-supervised learning approach for intrusion
detection system. Inf. Sci. 2017, 378, 484–497. [CrossRef]
11. MeeraGandhi, G. Machine learning approach for attack prediction and classification using supervised learning algorithms. Int. J.
Comput. Sci. Commun. 2010, 1, 11465–11484.
12. Lippmann, R.; Haines, J.W.; Fried, D.J.; Korba, J.; Das, K. The 1999 DARPA off-line intrusion detection evaluation. Comput. Netw.
2000, 34, 579–595. [CrossRef]
13. Perez, D.; Astor, M.A.; Abreu, D.P.; Scalise, E. Intrusion detection in computer networks using hybrid machine learning techniques.
In Proceedings of the 2017 XLIII Latin American Computer Conference (CLEI), Cordoba, Argentina, 4–8 September 2017; pp. 1–10.
14. Villalobos, J.J.; Rodero, I.; Parashar, M. An unsupervised approach for online detection and mitigation of high-rate DDoS
attacks based on an in-memory distributed graph using streaming data and analytics. In Proceedings of the Fourth IEEE/ACM
International Conference on Big Data Computing, Applications and Technologies, Austin, TX, USA, 5–8 December 2017;
pp. 103–112.
15. Jabez, J.; Muthukumar, B. Intrusion detection system (IDS): Anomaly detection using outlier detection approach. Procedia Comput.
Sci. 2015, 48, 338–346. [CrossRef]
16. Bindra, N.; Sood, M. Detecting DDoS attacks using machine learning techniques and contemporary intrusion detection dataset.
Autom. Control. Comput. Sci. 2019, 53, 419–428. [CrossRef]
17. Lima Filho, F.S.d.; Silveira, F.A.; de Medeiros Brito Junior, A.; Vargas-Solar, G.; Silveira, L.F. Smart detection: An online approach
for DoS/DDoS attack detection using machine learning. Secur. Commun. Netw. 2019, 2019. [CrossRef]
18. Idhammad, M.; Afdel, K.; Belouch, M. Semi-supervised machine learning approach for DDoS detection. Appl. Intell. 2018,
48, 3193–3208. [CrossRef]
19. Suresh, M.; Anitha, R. Evaluating machine learning algorithms for detecting DDoS attacks. In Proceedings of the International
Conference on Network Security and Applications, Chennai, India, 15–17 July 2011; pp. 441–452.
20. Usha, G.; Narang, M.; Kumar, A. Detection and Classification of Distributed DoS Attacks Using Machine Learning. In Computer
Networks and Inventive Communication Technologies; Springer: Berlin/Heidelberg, Germany, 2021; pp. 985–1000.
21. Zhang, N.; Jaafar, F.; Malik, Y. Low-rate DoS attack detection using PSD based entropy and machine learning. In Proceedings of
the 2019 6th IEEE International Conference on Cyber Security and Cloud Computing (CSCloud)/2019 5th IEEE International
Conference on Edge Computing and Scalable Cloud (EdgeCom), Paris, France, 21–23 June 2019; pp. 59–62.
22. Yuan, X.; Li, C.; Li, X. DeepDefense: Identifying DDoS attack via deep learning. In Proceedings of the 2017 IEEE International
Conference on Smart Computing (SMARTCOMP), Hong Kong, China, 29–31 May 2017; pp. 1–8.
23. Hou, J.; Fu, P.; Cao, Z.; Xu, A. Machine learning based DDoS detection through netflow analysis. In Proceedings of the MILCOM
2018-2018 IEEE Military Communications Conference (MILCOM), Los Angeles, CA, USA, 29–31 October 2018; pp. 1–6.
24. Wolpert, D.H. Stacked generalization. Neural Netw. 1992, 5, 241–259. [CrossRef]
25. Smyth, P.; Wolpert, D. Stacked density estimation. In Proceedings of the Advances in Neural Information Processing Systems,
Denver, CO, USA, 30 November–5 December 1998; pp. 668–674.
26. Hosseini, S.; Azizi, M. The hybrid technique for DDoS detection with supervised learning algorithms. Comput. Netw. 2019,
158, 35–45. [CrossRef]
27. Ao, X.; Luo, P.; Ma, X.; Zhuang, F.; He, Q.; Shi, Z.; Shen, Z. Combining supervised and unsupervised models via unconstrained
probabilistic embedding. Inf. Sci. 2014, 257, 101–114. [CrossRef]
28. Mittal, M.; Kumar, K.; Behal, S. Deep learning approaches for detecting DDoS attacks: A systematic review. Soft Comput. 2023,
27, 13039–13075. [CrossRef]
29. Balaban, D. Are you Ready for These 26 Different Types of DDoS Attacks? Secur. Mag. 2020. Available online: https://www.
securitymagazine.com/articles/92327-are-you-ready-for-these-26-different-types-of-ddos-attacks (accessed on 14 February 2024).
30. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer:
New York, NY, USA, 2008.
31. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.;
et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
32. Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the 2009
IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada, 8–10 July 2009;
pp. 1–6.
33. Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems. In Proceedings of the
2015 Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, Australia, 10–12 November 2015;
pp. 1–6.
34. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic
characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP 2018),
Funchal, Madeira, Portugal, 22–24 January 2018; pp. 108–116. [CrossRef]
35. Das, S.; Venugopal, D.; Shiva, S.; Sheldon, F.T. Empirical evaluation of the ensemble framework for feature selection in DDoS
attack. In Proceedings of the 2020 7th IEEE International Conference on Cyber Security and Cloud Computing (CSCloud)/2020
6th IEEE International Conference on Edge Computing and Scalable Cloud (EdgeCom), New York, NY, USA, 1–3 August 2020;
pp. 56–61.
36. Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009,
45, 427–437. [CrossRef]