Enhancing IoT Security With CNN and LSTM-Based Int
Abstract—Protecting Internet of Things (IoT) devices against cyber attacks is imperative owing to their inherent security vulnerabilities. These vulnerabilities expose the devices to a spectrum of sophisticated attacks that can cause significant damage to both individuals and organizations. Employing robust security measures such as intrusion detection systems (IDSs) is essential to address these problems and protect IoT systems from such attacks. In this context, our proposed IDS model consists of a combination of convolutional neural network (CNN) and long short-term memory (LSTM) deep learning (DL) models. This fusion classifies IoT traffic into two categories, benign and malicious, by leveraging the spatial feature extraction capabilities of the CNN for pattern recognition and the sequential memory retention of the LSTM for discerning complex temporal dependencies, thereby achieving enhanced accuracy and efficiency. To assess the performance of the proposed model, we employed the new CICIoT2023 dataset for both training and final testing, and further validated the model through a conclusive testing phase on the CICIDS2017 dataset. The proposed model achieves an accuracy of 98.42% with a loss of 0.0275. The false positive rate (FPR), which is equally important, reaches 9.17%, with an F1-score of 98.57%. These results demonstrate the effectiveness of the proposed CNN-LSTM IDS model in fortifying IoT environments against potential cyber threats.
Index Terms—Intrusion detection system, deep learning, internet of things, CNN, LSTM, cyber security, CICIoT2023.
I. INTRODUCTION
The IoT has grown notably and now spans the whole world; it involves billions of devices connected to each other without any human interaction. The IoT generates large volumes of data through sensors, actuators, and control devices. These data are leveraged for diverse tasks and objectives across different fields, including healthcare, industry, agriculture, military, and other sectors. The expansion of the IoT is matched by its exposure to a myriad of threats and cyber attacks that can compromise the integrity and security of connected devices and networks. Hence, it is imperative to find effective solutions for countering such attacks. IDSs play a pivotal role in identifying and mitigating cyber attacks in any network [1–3].
An IDS functions as a monitoring tool: it identifies any form of potentially malicious network traffic, such as intrusion attempts, viral attacks, and suspicious traffic, that threatens the security of information systems. Detection is based on standards: when activities deviate from these standards or from a baseline, the IDS raises an alert at an early stage. IDSs come in two distinct forms: (1) host-based (HIDS), which monitors and analyzes activities on a server, and (2) network-based (NIDS), which observes network activities and communications [4]. Numerous organizations opt for a hybrid approach incorporating both HIDS and NIDS [5]. Based on the nature of the analysis performed, IDSs are categorized as either signature-based or anomaly-based [6]. Signature-based schemes, also referred to as misuse-based, aim to identify predefined patterns, or signatures, within the analyzed data. These systems identify specified and well-known attacks but may fail to detect novel and unfamiliar intrusions [4, 6–8]. Anomaly-based IDSs, in contrast, observe the behavior of a normal network and establish a threshold for detecting deviations from the norm [4]. Their main benefit is the ability to detect previously unseen and unknown intrusion activities [9].
To address security threats, IDS software is frequently developed using artificial intelligence (AI) algorithms, including techniques such as machine learning (ML) and data mining (DM). These methods have proven highly effective in identifying intrusions [7]. DL is a subfield of ML; its architecture comprises an input layer followed by a series of hidden layers, which propagate inputs to the output layer [1]. The CNN is a DL model extensively applied in different domains, for example in [10] for image classification, in [11–14] for speech processing and security, and in [15] for cyber attacks. LSTM is a special class of recurrent neural network (RNN) whose strength lies in its capability to be applied directly to raw data without requiring any feature selection method [1]. Nevertheless, LSTM entails a longer training time and demands more computational resources than CNN [9]. Hence, this study introduces a unified model, CNN-LSTM, which combines the strengths of the CNN and LSTM models. The steps below constitute the procedural flow and the contributions of this paper.
B. Performance metrics
The performance of our proposed model for the detection
of diverse types of attacks is quantified using standard
metrics, including accuracy, precision, recall, F1-score and
FPR, which are defined in [13, 19, 20]. The corresponding
equations are presented below:
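\[ \mathrm{Acc} = \frac{TP + TN}{TP + FP + TN + FN} \tag{1} \]
\[ \mathrm{Rc} = \frac{TP}{TP + FN}, \qquad \mathrm{Pr} = \frac{TP}{TP + FP} \tag{2} \]
\[ \mathrm{F1\text{-}Score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{3} \]
\[ \mathrm{FPR} = \frac{FP}{FP + TN} \tag{4} \]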
Here, "true positive" (TP) denotes instances in which the IDS correctly identifies an intrusion, while "true negative" (TN) denotes the correct identification of normal traffic. Conversely, "false positives" (FP) are instances where benign traffic is mistakenly flagged as malicious, and "false negatives" (FN) are failures of the IDS to detect actual intrusions. A high F1-score, which combines precision and recall, is indicative of effective IDS performance, particularly when it reflects low rates of FP and FN [19].
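As a rough illustration of Eqs. (1)–(4), the following sketch computes the five metrics from raw confusion counts. The function name `ids_metrics` and the counts in the usage example are hypothetical and are not taken from the experiments reported here; attacks are assumed to be the positive class.

```python
def ids_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, recall, precision, F1-score, and FPR from confusion counts.

    Assumption: "positive" means attack and "negative" means normal traffic.
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0            # Eq. (1)
    recall = tp / (tp + fn) if (tp + fn) else 0.0             # Eq. (2), Rc
    precision = tp / (tp + fp) if (tp + fp) else 0.0          # Eq. (2), Pr
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0   # Eq. (3)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0                # Eq. (4)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": fpr,
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = ids_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Note that the FPR depends only on how normal traffic is handled (FP and TN), so it can remain high even when accuracy is high on a dataset dominated by attacks.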
C. Results
This subsection presents the results of our proposed model, which combines CNN and LSTM for network security. The results were obtained by splitting the dataset into 80% for training and 20% for validation. In the experiment, the model was trained on the CIC-IoT2023 dataset, which contains both benign and malicious network traffic. Training was executed on Google Colab for 25 epochs with the Adam optimizer.
• Accuracy and loss graph: Figure 3 shows the accuracy and loss for both training and validation over the 25 epochs. In Figure 3 (b), the model increases as the number of training
Fig. 3: Accuracy and loss of the model during the training phase. (a) Training and validation losses. (b) Training and validation accuracies.
• Classification report: Table III presents a detailed evaluation of the binary classification performance of our system using precision, recall, F1-score, and support. The model's performance clearly differs between the two classes (normal traffic and attacks). For the first class, the precision is 90%, lower than for the other class, which implies that the model could face difficulties in precisely classifying instances of the normal category. For the same class, the recall is 61%, indicating that the model has difficulty identifying all positive instances (a higher recall indicates fewer false negatives). For this class, the F1-score is 73%, which is determined by both precision and recall. These difficulties may stem from the inherent resemblance between certain features of benign network traffic and malicious attacks. For the other class (attacks), the F1-score reaches 99%.
Overall, the model exhibits commendable performance in accuracy, precision, recall, and F1-score. Nevertheless, while accuracy provides valuable insight, it alone may not suffice for a final judgment of the system's performance.
TABLE III: Classification report.
                 Precision  Recall  F1-score  Support
Normal traffic   90%        61%     73%       8321
Attacks          99%        100%    99%       229932
Accuracy                            98%       238253
Macro avg        95%        80%     86%       238253
Weighted avg     98%        98%     98%       238253
• FPR: The FPR, determined from the confusion matrix, is 9.17%, which is generally considered an acceptable result: only 9.17% of the instances representing normal traffic were erroneously categorized as attacks. The classifier thus correctly recognizes the majority of normal traffic instances, a critical factor in mitigating false alarms and enhancing the overall efficacy of the system.
• Confusion matrix: Figure 4 shows that the classification performance is robust, with an accuracy of 90% for the first class (normal traffic) and an even higher accuracy of 99% for the second class (cyber threats). Regarding misclassifications, 10% are instances where the first class is erroneously classified as the second, whereas only 1% of attacks are misclassified as normal traffic, which corroborates the classification report.
Fig. 4: Confusion matrix during the training phase.
• Receiver operating characteristic (ROC): The ROC curve in Figure 5 indicates the commendable performance of our CNN-LSTM model in classifying attacks, with high true positive rate (TPR) values and low FPR values. The model has a high capability to discriminate between normal network traffic and attacks. This is reflected in the ROC curve being positioned near the upper-left corner, which suggests that the model's predictions are both accurate and precise.
Fig. 5: ROC curve.
• Generalization verification: We also evaluated the model on additional subsets of the CICIoT2023 dataset. The accuracy from this evaluation is 98.43%, with precision, recall, and F1-score values of 98.85%, 98.43%, and 98.57%, respectively. The loss remains close to the value observed during the training phase, at 0.02%, and the FPR is 9.17%. All these results closely align with those obtained during training. In addition, we conducted experiments on an alternative dataset, CICIDS2017, to assess the model's performance across diverse datasets and ascertain its generalization capabilities. Using the same metrics as in the previous tests, the model achieves an accuracy of 97.45%, a loss of 0.06, a precision of 97.17%, a recall of 97.15%, an F1-score of 97.07%, and an FPR of 2.08%. This evaluation on a distinct dataset reinforces the robustness and reliability of the model's predictions and contributes to a more nuanced understanding of its potential applications in real-world scenarios. The confusion matrices of the final tests on the CIC-IoT2023 and CICIDS2017 datasets are shown in Figure 6 (a) and (b), respectively. The results are similar to those of the previous tests. Regarding the confusion matrix for CICIDS2017, the classifier correctly identified 98% of the normal-traffic instances and misclassified 0.02% of them as "attacks". Of the instances that are true "attacks", the classifier correctly identified 94% and misclassified 0.06% as "normal traffic", which shows that the classifier performs well in identifying "normal traffic" but has some error rate in detecting "attacks".
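The macro and weighted averages in Table III can be cross-checked with a few lines of arithmetic. The sketch below uses the rounded per-class percentages and supports from Table III, so its output only approximates the reported averages. The variable and function names are illustrative, and the averaging definitions follow the usual classification-report convention, which is assumed to be the one used here.

```python
# Rounded per-class values from Table III: (precision %, recall %, F1 %, support).
report = {
    "Normal traffic": (90.0, 61.0, 73.0, 8321),
    "Attacks": (99.0, 100.0, 99.0, 229932),
}
METRICS = ("precision", "recall", "f1")


def macro_average(rows):
    """Unweighted mean of each metric over the classes."""
    n = len(rows)
    return {m: sum(r[i] for r in rows.values()) / n
            for i, m in enumerate(METRICS)}


def weighted_average(rows):
    """Mean of each metric weighted by class support (last tuple entry)."""
    total = sum(r[3] for r in rows.values())
    return {m: sum(r[i] * r[3] for r in rows.values()) / total
            for i, m in enumerate(METRICS)}


macro = macro_average(report)
weighted = weighted_average(report)
print("macro:", {m: round(v, 2) for m, v in macro.items()})
print("weighted:", {m: round(v, 2) for m, v in weighted.items()})
```

Because attacks make up about 96.5% of the 238253 samples, the weighted averages stay close to the attack-class scores, while the macro averages expose the weaker normal-traffic class.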
TABLE IV: Performance metrics of the proposed model CNN-LSTM compared to state-of-the-art for binary classification.
Work year Model datasets Accuracy (%) Loss Precision (%) Recall (%) F1-score (%) FPR (%)