1. Introduction
The Internet of Things (IoT) is considered to be a worldwide network of uniquely addressable interconnected objects, using sensing features, employing communication protocols, exploiting computational capability, and providing services and capacity to analyze data. IoT objects can be doorbells, sensors, Digital Video Recorders (DVRs), light bulbs, electric switches, and home assistant devices. Juniper Research estimates there will be over 46 billion IoT-connected objects by 2021, including devices, sensors and actuators, which represents an increase of 200% compared to 2016 (
https://www.i-scoop.eu/internet-of-things-guide/connected-devices-2021/). Near-Field Communications (NFC) and Wireless Sensor and Actuator Networks (WSAN) associated with Radio-Frequency IDentification (RFID) make up the core of the IoT network [
1]. The convergence of the Internet and sensor networks is fruitful, and is leading to a new paradigm called machine-to-machine (M2M) communication over the Internet by enabling a very large number of autonomous and self-organized devices [
2]. The core concept of IoT is that every object in the network has many capabilities, such as identifying, sensing, and processing data, therefore enabling communication with a wide variety of other devices and services through the Internet to provide services to humanity.
IoT application domains fall into several categories, including utilities, transport and supply chain, environment and agriculture, health, personal home, and manufacturing and industry [
3]. Industry 4.0 is a new trend, introducing new technologies to the manufacturing field, such as IoT, cyber-physical systems, big data, cloud computing, the semantic web, and virtualization [
4]. As with any trend, many cyber-physical attacks target manufacturers that use Industry 4.0 systems [
5], such as the Maroochy water services attack in Australia [
6], the steel mill attack in Germany [
7], the New York Dam attack [
8], and the Norwegian Hydro aluminum attack (
www.bbc.com/news/technology-47624207) in 2000, 2014, 2016, and 2019, respectively.
IoT, being an emerging technology with a huge number of devices deployed and connected to the Internet, is a fertile field for attackers, and new cyber-security issues related to IoT have therefore appeared. Many threats to IoT devices have been defined, including network, physical, environment, cryptanalysis, and software attacks [
9]. Network attacks include man-in-the-middle (MITM), replay, masquerade, and distributed denial of service (DDoS) attacks [
10]. To overcome these risks to IoT systems, communication protocols should be secure, lightweight encryption algorithms should be implemented, IoT platform security features should be enforced, and advanced techniques should be applied to filter and predict different security threats.
Security in IoT is of extreme importance, as any successful attack may paralyze a whole sector, such as manufacturing, transport, or health. IoT is a combination of devices, network protocols, and technologies that each have their own vulnerabilities, which increases the attack surface across the whole IoT network. In other words, several attacks against IoT have been inherited from its underlying technologies.
Contributions—There has been no standard until now for IoT architecture. However, different architectures have been proposed for IoT, such as three-layer [
11], middle-ware-based architecture [
12], service-oriented architecture (SOA) [
13,
14], four-layer [
15], and five-layer [
12].
Architecture previously proposed in the literature is highlighted in this paragraph. The basic model is called three-layer architecture, and it is composed of perception, network, and application layers [
11,
12,
16]. Four-layer architecture covers perception, network, middleware, and application layers [
13,
15,
16]. The role of the middleware layer involves service management, data storage, and service composition [
15]. A proposed five-layer architecture includes objects, object abstraction, service management, application, and business layers [
12].
To add advanced features to IoT such as IoT data, machine-learning algorithms, and lightweight encryption algorithms, we propose in this paper a new IoT architecture, as shown in
Figure 1. The proposed IoT architecture is based on five layers: a perception (physical sensing) layer, a network/protocol layer, a transport layer, an application layer, and a data and cloud services layer. As shown in
Figure 1, the perception layer involves different sensors and IoT devices such as a Wireless Sensor Network (WSN), QR codes, a Wireless Body Area Network (WBAN), Radio-Frequency IDentification (RFID) devices, etc.
The network and protocol layer covers different wired and wireless network protocols involved in an IoT system, such as Wi-Fi, ZigBee, Ethernet, Bluetooth, LTE, 5G, etc. The transport layer involves TCP/IP, UDP/IP, and the Transport Layer Security (TLS)/Secure Sockets Layer (SSL) protocol suite. For the application layer, we cover the various application protocols developed to meet IoT requirements in terms of low power consumption and small device capacity, such as Advanced Message Queuing Protocol (AMQP), Constrained Application Protocol (CoAP), and Message Queuing Telemetry Transport (MQTT). Finally, the data and cloud services layer presents the main cloud-based IoT frameworks.
In
Table 1, common IoT attacks are highlighted. We also provide security control suggestions to mitigate the harm to IoT devices caused by these attacks.
The paper focuses on analyzing security issues inherited by each layer component, while presenting deployed security measures and mechanisms to defeat prominent attacks.
As shown in
Table 1, common IoT attacks can be classified into 5 classes:
Data and cloud services layer attacks include poisoning, evasion, impersonation, and inversion.
Application layer attacks include Mirai malware, IPCTelnet malware, DDoS, and injection.
Transport layer attacks include resource exhaustion, flooding, replay, DDoS, and amplification attacks.
Network and protocol layer attacks include man-in-the-middle, DDoS, and replay attacks.
Physical sensing layer attacks include eavesdropping, cyber-physical, and tracking attacks.
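The five classes above can also be read as a simple lookup structure. The following is a minimal sketch, assuming a plain dictionary keyed by layer name; the variable and function names are illustrative choices and the attack labels are lightly normalized from the list above.

```python
# Layer-to-attack mapping transcribed from the five classes listed in the text.
# ATTACKS_BY_LAYER and layers_exposed_to are hypothetical names used only here.
ATTACKS_BY_LAYER = {
    "data and cloud services": ["poisoning", "evasion", "impersonation", "inversion"],
    "application": ["Mirai malware", "IPCTelnet malware", "DDoS", "injection"],
    "transport": ["resource exhaustion", "flooding", "replay", "DDoS", "amplification"],
    "network and protocol": ["man-in-the-middle", "DDoS", "replay"],
    "physical sensing": ["eavesdropping", "cyber-physical", "tracking"],
}


def layers_exposed_to(attack):
    """Return the layers (in listing order) whose class contains the attack."""
    return [layer for layer, attacks in ATTACKS_BY_LAYER.items() if attack in attacks]


replay_layers = layers_exposed_to("replay")
ddos_layers = layers_exposed_to("DDoS")
print(replay_layers)
print(ddos_layers)
```

A reverse lookup of this kind makes it visible that some attacks, such as DDoS and replay, recur across several layers, which is consistent with the observation that attacks are inherited from underlying technologies.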
A realistic use of the proposed architecture could be an e-health application, in which the perception layer captures a physical parameter via a sensor implanted in a patient’s body. Then, the job of the network and transport layers is to send the data to the application layer by selecting the suitable communication and lightweight encryption protocol based on the processing power and energy consumption of the IoT device. The application layer will select the appropriate application protocol (e.g., MQTT, CoAP, or other) to communicate the data to the right user (e.g., a doctor or medical staff). Finally, the data will be stored in the cloud layer and will be useful for future data analysis and prediction by using the appropriate machine-learning algorithm.
Existent Surveys—Internet of things security issues have attracted a lot of research, in which several published survey papers have studied IoT architecture, applications, and security issues. The survey authored by Al-Fuqaha et al. [
12] covers the main IoT element-enabling technologies and the principal common IoT standards. In [
11], the authors address the security of IoT frameworks such as AWS, Azure, and Calvin architecture. The authors in [
16] provide a survey of the most common architectures proposed for IoT e-health applications, smart society applications, and cloud service and management solutions. Moreover, [
4] addresses IoT in terms of the requirements of smart factories to enable standard Industry 4.0 protocols in the next industrial revolution. Key IoT applications in industries are presented in [
13] including the food supply chain, the iDrive system provided by the BMW car company, and an environment monitoring system for firefighting based on RFID tags. Buton et al. [
17] introduced a security analysis of IoT based on an in-depth analysis of the use of WSNs, their vulnerabilities and their major security threats. Recently, Hussain et al. [
18] presented a review of machine-learning techniques applied in IoT, along with their main advantages and limitations.
Position of our paper—In this survey paper, we combine different aspects related to IoT technologies in one compact IoT architecture, covering IoT physical devices and sensors, communication and network protocols, a transport layer, an application layer, and data and cloud services. This architecture is based on a modification of OSI architecture, considering the security vulnerabilities and threats. In addition to existent OSI layers, we define a cloud and data layer, which involves several publicly available IoT frameworks providing IoT data storage, processing, and analysis. This architecture is extended to involve machine-learning applications that process data and protect IoT components. Furthermore, we present a discussion of current challenges facing IoT security solutions, such as the lack of standard encryption algorithms adapted for IoT devices. We also explore the application of novel techniques to secure IoT, such as the use of Blockchain in IoT and machine-learning models, as well as reviewing the potential of 5G network applications, and their reliance on IoT.
Paper Organization—This survey is organized as follows.
Section 2 presents the main components of the physical sensing layer, and the related security threats and countermeasures. The IoT network and communication protocols and their related security issues and solutions are reviewed in
Section 3.
Section 4 introduces an overview of the transport layer protocol and its main security countermeasures. The application layer protocols are studied in
Section 5, detailing their main security features.
Section 6 reviews the well-known cloud-based IoT frameworks, while reviewing the main security measures they are implementing. Finally, a discussion of open issues and research opportunities is conducted in
Section 7, before the survey paper is concluded in
Section 8.
6. Data and Cloud Services Layer
The development of applications for IoT faces many challenges due to the complexity of distributed computing, the involvement of different programming languages, and the variety of communication protocols. Therefore, the development of IoT applications requires the management of both hardware and software components, along with the handling of the full infrastructure and the delivery of functional and non-functional requirements. These challenges have led to the emergence of cloud-based IoT programming frameworks, launched by the major IoT stakeholders, that provide an environment for developing and deploying ready-to-use IoT applications.
The cloud-based IoT frameworks introduce a set of rules and protocols aimed at organizing data management and message exchange between the parties involved in the IoT network, such as devices, the cloud system, and users. These frameworks enable a simplified high-level deployment of IoT applications while hiding the complexity of the underlying protocols.
In this section, we review five main IoT frameworks based on public clouds, namely Amazon AWS IoT, CISCO IoT Cloud Connect, Google Cloud IoT, Oracle IoT Ecosystem, and Bosch IoT Suite. In the absence of a standardized framework, we have chosen these because they are the best known. We focus on reviewing the security features provided by these frameworks as well as the security threats inherited from the use of public cloud architecture.
The cloud-based IoT frameworks are built on three main components: smart devices such as sensors, tags, etc., the cloud servers providing storage and processing of IoT data, and the users represented by the applications that access cloud-stored data and communicate with the devices. The frameworks also include the protocols that are needed to communicate between all the entities.
In
Table 5, we compare the security features provided by the selected IoT frameworks. Providing a secure framework relies mainly on ensuring confidentiality, integrity, availability, authentication, and access control [
55].
To ensure secure communication while transferring and accessing IoT data, various protocols are used by the aforementioned IoT frameworks, including Hypertext Transfer Protocol Secure (HTTPS), IPsec, transport layer security (TLS), datagram transport layer security (DTLS), and MQTT over TLS. SSL is used by AWS, Google Cloud, and Oracle IoT Ecosystem.
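As an illustration of the client side of such a protected channel, the sketch below builds a TLS context with Python's standard `ssl` module. It is not taken from any of the reviewed frameworks: the broker host name is a hypothetical placeholder, the choice of TLS 1.2 as the minimum version is an assumption made for the example, and 8883 is the port conventionally registered for MQTT over TLS.

```python
import ssl

# Hypothetical placeholder endpoint; not an address of any framework reviewed here.
BROKER_HOST = "broker.example.com"
# Port conventionally used for MQTT over TLS.
MQTT_TLS_PORT = 8883


def make_client_tls_context():
    """Build a client-side TLS context that verifies the server certificate."""
    # The default context enables certificate and host name verification.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_tls_context()
print(context.verify_mode, context.check_hostname, context.minimum_version)
```

A device client would then wrap its TCP socket with `context.wrap_socket(sock, server_hostname=BROKER_HOST)` before speaking MQTT or HTTP over it; that step is omitted here because it requires a reachable server.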
AWS IoT is composed of four components, namely the device gateway, the rules engine, the registry, and the device shadows (
https://docs.aws.amazon.com/iot-device-management/index.html). The device gateway is an intermediate component enabling communication between devices and cloud services via the MQTT protocol. The rules engine is responsible for processing the exchanged messages and forwarding them to AWS services, the subscribed devices, or a non-AWS service. The registry unit assigns an identifier to every connected device, while storing metadata to enable their tracking. The device shadow is a virtual device image created and stored in the cloud, which saves the last online state of the device and enforces future changes to the state once the device goes online again. In a nutshell, the framework enables the management of an IoT device through its shadow even when the device is not connected to the network.
To ensure confidentiality, integrity and availability, AWS proposes SSL-protected API endpoints (
https://docs.aws.amazon.com/iot-device-management/index.html). AWS security modules ensure authentication and authorization. AWS authentication is based on X.509 certificates. On the other hand, AWS authorization is based on identity and access management (users, groups, and roles). Additionally, AWS Cognito identity modules are used to create unique user identities [
11].
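The device shadow idea described above can be sketched in a few lines. The following is a toy in-memory model, not the AWS API: the class, method names, and the thermostat example are hypothetical, and the only behavior it illustrates is keeping the last reported state, queuing requested changes while the device is offline, and applying them on reconnection.

```python
class DeviceShadow:
    """Toy cloud-side image of a device (illustrative only, not the AWS API)."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.reported = {}  # last state the device reported while online
        self.desired = {}   # changes requested while the device may be offline

    def report(self, state):
        """Record the state published by the device while it is online."""
        self.reported.update(state)

    def request_change(self, change):
        """Store a change requested by an application, even if the device is offline."""
        self.desired.update(change)

    def delta(self):
        """Return the settings whose desired value differs from the reported one."""
        return {k: v for k, v in self.desired.items() if self.reported.get(k) != v}

    def sync(self):
        """On reconnection, apply the pending changes and clear the request queue."""
        pending = self.delta()
        self.reported.update(pending)
        self.desired.clear()
        return pending


shadow = DeviceShadow("thermostat-01")
shadow.report({"power": "on", "target_temp": 20})
shadow.request_change({"target_temp": 22})  # the device is offline at this point
pending = shadow.sync()                       # the device comes back online
print(pending, shadow.reported)
```

In this sketch the application never talks to the device directly; it only edits the shadow, which is what allows management to continue while the device is disconnected.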
Google Cloud uses three kinds of encryption mechanisms to ensure the protection of data at the application layer: AES, TLS, and Secure/Multipurpose Internet Mail Extensions (S/MIME) (
http://cloud.google.com/security/encryption-in-transit). Likewise, Google Cloud uses Application Layer Transport Security (ALTS) to guarantee confidentiality, integrity, and authentication among different services. Google Cloud also offers various access control options, such as Cloud Identity and Access Management as well as access control lists (ACLs).
The Oracle IoT solution is based on transparent sensitive data protection (TSDP) to ensure confidentiality and integrity. In addition, to improve data security, Oracle employs data masking and sub-setting to comply with the payment card industry data security standard (PCI-DSS) (
www.oracle.com/technetwork/database/security/security-compliance).
The CISCO IoT platform architecture is composed of four layers: an embedded systems and sensors layer, a multi-service edge layer, a core layer, and a data center cloud layer. The core layer includes IP/MPLS, security management, and network service. CISCO proposes an IoT/M2M security framework. Strong authentication is provided by using AES and RSA for digital signatures and key transport (
www.cisco.com/secure-iot-proposed-framework, CISCO Kinetic Security Technical Paper). To ensure secure data traffic and data management, the CISCO Cloud solution employs HTTPS over IPsec and SNMP over IPsec, respectively. Likewise, authorization and access control in CISCO IoT Cloud Connect rely on segmenting data based on destination.
The architecture of the Bosch IoT suite expects an identity management module for users, roles, relations, and permissions. Regarding Bosch cross-domain applications (e.g., the case of XDL120), confidentiality and integrity are based on Wi-Fi Protected Access 2 (WPA2), provided by the IEEE 802.11i/e/g standards, and on white-listing of MAC addresses (
https://www.digikey.co.uk/en/supplier-centers/b/bosch-cds). Furthermore, XDL120 employs DTLS to ensure a secure communication of transmitted sensor parameters and lightweight M2M (LWM2M) communication protocols.
In addition, cloud-based IoT frameworks provide access to machine-learning functions, enabling the processing of collected IoT data.
Research has identified multiple applications of machine learning in IoT contexts. The taxonomy of ML in IoT contexts for big data analysis is presented in
Table 6. These ML models fall into three categories: classification, regression, and clustering [
62]. The ML classification family includes K-Nearest Neighbors (KNN), Naive Bayes (NB), and Support Vector Machine (SVM). The ML clustering family involves K-means, density-based spatial clustering of applications with noise (DBSCAN), and the Feed Forward Neural Network (FFNN). The ML regression family covers Linear Regression (LR) and Support Vector Regression (SVR).
One important application of KNN is to enable smart tourism and tourist pattern tracking. The main advantage of KNN is that it is easy to update in online settings; however, KNN does not scale to large datasets. NB is applicable in many fields, such as spam filtering, text categorization, and automatic medical diagnosis [
63]. Because it applies Bayes’ theorem with the “naive” assumption of independence between the features, Naive Bayes classification is fast and highly scalable. The most important application of SVM is real-time prediction, which makes it suitable for real-time intrusion and attack detection. In addition, SVM can deal with high-dimensional datasets. Nonetheless, SVM suffers from a lack of transparency of results. LR can process data at a high rate [
64], and this algorithm is useful in many applications, such as economics, market analysis, and energy usage (to analyze and predict the energy usage of buildings, for example). However, LR is very sensitive to outliers. SVR uses the same basic idea as SVM, a classification algorithm, but applies it to predict real values rather than a class. SVR accounts for non-linearity in the data and provides a prediction model. Additionally, SVR is a useful and flexible technique, helping the user to deal with limitations pertaining to the distributional properties of underlying variables (
https://rpubs.com/linkonabe/SLSvsSVR). The applications of SVR include the forecasting of financial markets, prediction of electricity prices, estimation of power consumption, and intelligent transportation systems [
65]. The K-means clustering algorithm is present in many IoT applications, such as smart city, smart home, smart citizen, and air traffic control [
66]. The most important benefits of K-means include high scalability and speed. However, K-means presents various disadvantages, such as difficulty in predicting the number of clusters (K-value) and sensitivity to scale. DBSCAN is an effective ML clustering algorithm, especially for large datasets. In addition, DBSCAN is very suitable for smart cities and for anomaly detection in temperature data applications [
67]. Nonetheless, in the case of a dataset with large differences in densities, the clustering process is not efficient. Likewise, the performance of the model is sensitive to the distance metric used for determining whether a region is dense [
68]. FFNN is a neural network trained with a back-propagation learning algorithm. The major advantages of FFNN are its adaptability without user support, non-linearity, and robustness. However, FFNN suffers from having a high number of weights in the neural network and requiring a longer training time. The application fields of FFNN include smart health and chemistry (e.g., for the prediction of multi-state secondary structures).
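To make the KNN trade-off mentioned above concrete, the following is a minimal brute-force sketch in plain Python. The sensor readings, labels, and function name are made up for illustration and are not drawn from any cited study; the point is only that adding a training sample is trivial, while every prediction scans the whole training set.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs. Each call computes the
    distance to every stored point, so the cost grows with the dataset size.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D readings (temperature, humidity) with made-up labels.
train = [
    ((20.0, 30.0), "normal"),
    ((21.0, 32.0), "normal"),
    ((19.5, 29.0), "normal"),
    ((35.0, 80.0), "alert"),
    ((36.0, 82.0), "alert"),
    ((34.0, 78.0), "alert"),
]

first = knn_predict(train, (20.5, 31.0))
second = knn_predict(train, (35.5, 79.0))

# Updating the model online amounts to appending a new labelled sample.
train.append(((50.0, 10.0), "fault"))
third = knn_predict(train, (50.0, 11.0), k=1)
print(first, second, third)
```

Because there is no training phase, the update is immediate, but the full scan per query is the reason KNN is usually paired with indexing structures or sampling when the dataset becomes large.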
The generative adversarial network (GAN) is a type of machine-learning model that is receiving increased attention from researchers. It is based on two networks: a generative network and a discriminative network. The first network generates new candidates from a known dataset, while the second evaluates them. Emerging applications of GANs span various fields, such as semi-supervised salient object detection in cloud-fog IoT devices [
69] and high-resolution image generation [
70]. On the other hand, the Floor of Log algorithm associated with KNN and SVM is a promising supervised technique based on compressed features for power reduction of mobile devices running face-recognition applications [
71].