INTERNET OF THINGS TECHNOLOGY
(18CS81) MODULE 4
1. Differentiate between structured and unstructured data
Answer:
Structured Data
* Structured data means that the data follows a model or schema that defines how the
data is represented or organized.
* It fits well with a traditional relational database management system (RDBMS).
* Structured data is easily formatted, stored, queried, and processed.
* It has been the core type of data used for making business decisions.
* Structured data is more easily managed and processed due to its well-defined
organization.
Unstructured Data
* Unstructured data lacks a logical schema for understanding and decoding the data
through traditional programming means.
* Data types include text, speech, images, and video.
* Unstructured data is not easily formatted, stored, queried, and processed.
Unstructured data can
2. Explain with a block diagram the types of Data analysis
Answer:
Types of Data Analysis Results
* Descriptive: Descriptive data analysis tells you what is happening, either now or in
the past. For example, a thermometer in a truck engine reports temperature values
every second. From a descriptive analysis perspective, you can pull this data at any
moment to gain insight into the current operating condition of the truck engine. If the
temperature value is too high, then there may be a cooling problem or the engine may
be experiencing too much load.
* Diagnostic: When you are interested in the “why,” diagnostic data analysis can
provide the answer. Continuing with the example of the temperature sensor in the
truck engine, you might wonder why the truck engine failed. Diagnostic analysis
might show that the temperature of the engine was too high, and the engine
overheated. Applying diagnostic analysis across the data generated by a wide range of
smart objects can provide a clear picture of why a problem or an event occurred.
Shrisha HS, Assistant Professor, Canara Engineering college
* Predictive: Predictive analysis aims to foretell problems or issues before they occur.
For example, with historical values of temperatures for the truck engine, predictive
analysis could provide an estimate on the remaining life of certain components in the
engine. These components could then be proactively replaced before failure occurs.
Or perhaps if temperature values of the truck engine start to rise slowly over time, this
could indicate the need for an oil change or some other sort of engine cooling
maintenance.
* Prescriptive: Prescriptive analysis goes a step beyond predictive and recommends
solutions for upcoming problems. A prescriptive analysis of the temperature data from
a truck engine might calculate various alternatives to cost-effectively maintain the
truck. These calculations could range from the cost necessary for more frequent oil
changes and cooling maintenance to installing new cooling equipment on the engine
or upgrading to a lease on a model with a more powerful engine. Prescriptive analysis
looks at a variety of factors and makes the appropriate recommendation.
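The four result types can be sketched on the truck-engine temperature example. This is only a minimal illustration: the function names, the temperature limit, and the maintenance horizon are hypothetical choices, and the predictive step is a plain least-squares trend line rather than any particular production method.

```python
from statistics import mean

def descriptive(readings):
    """What is happening: summarize the latest and recent temperatures."""
    return {"latest": readings[-1], "mean": mean(readings), "max": max(readings)}

def diagnostic(readings, limit):
    """Why it happened: indexes of the samples that exceeded the limit."""
    return [i for i, t in enumerate(readings) if t > limit]

def predictive(readings, limit):
    """What may happen: fit a straight line to the readings and estimate how
    many more samples pass before the trend reaches the limit.
    Returns None when the trend is not rising."""
    n = len(readings)
    x_mean = (n - 1) / 2
    y_mean = mean(readings)
    slope = sum((i - x_mean) * (t - y_mean) for i, t in enumerate(readings)) / sum(
        (i - x_mean) ** 2 for i in range(n)
    )
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - (n - 1))

def prescriptive(samples_left, horizon=10):
    """What to do: recommend an action when the limit is close (hypothetical rule)."""
    if samples_left is not None and samples_left <= horizon:
        return "schedule cooling maintenance"
    return "no action"

temps = [90.0, 92.0, 94.0, 96.0, 98.0]  # made-up engine temperatures
print(descriptive(temps))
print(diagnostic(temps, limit=95.0))
print(prescriptive(predictive(temps, limit=110.0)))
```

Each function answers one of the four questions, so the same readings can be passed through all of them in turn.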
3. List and explain IoT data analytics challenges.
Answer:
Changing it later can slow or stop the database from
operating. Due to a lack of flexibility, revisions to the schema must be kept at a
minimum. A dynamic schema is often required so that data model
changes can be made daily or even hourly.
fre of its data and with managing data at the network level. It is
is possible to analyze and respond to it in real-time. Real-time
problem or a situation that needs some kind of immediate response.
ata or Network Analytics: With the large numbers of smart objects in
IoT networks that are communicating and streaming data, it can be challenging to
ensure that these data flows are effectively managed, monitored, and secure. Network
analytics tools provide the capability to detect irregular patterns or other problems in
the flow of IoT data through a network.
4. Explain the domains which revolve around the common
applications of ML for IOT.
Answer:
* Monitoring: Smart objects monitor the environment where they operate. Data is
processed to better understand the conditions of operations. These conditions can refer
to external factors, such as air temperature, humidity, or presence of carbon dioxide in
a mine, or to operational internal factors, such as the pressure of a pump or the viscosity
of oil flowing in a pipe. ML can be used with monitoring to detect early failure
conditions.
* Behaviour control: Monitoring commonly works in conjunction with behaviour
control. When a given set of parameters reaches a target threshold
or learned dynamically through deviation from mean values
more advanced system would trigger a corrective action,
of fresh air in the mine tunnel, turning the robot arm,
the pipe.
taking corrective
actions lead to changes that
improve the overall process. For example, a water purification plant in a smart city
can implement a system to monitor the efficiency of the purification process based on
ate the
can help
improve the efficiency of these operations.
* Self-healing, fast-developing aspect of deep learning
closed loop. monitoring triggers changes in machine behaviour and
operations
parameters and automatically deduce and implement new
the results demonstrate a possible gain. The system becomes self-
izing. It also detects new K-means deviations that result in pre-
The healing is not
* Velocity: Velocity refers to how quickly data is being collected and analyzed.
Hadoop Distributed File System is designed to ingest and process data very quickly.
Smart objects can generate machine and sensor data at a very fast rate and require
database or file systems capable of equally fast ingest functions.
* Variety: Variety refers to different types of data. Often you see data categorized as
structured, semi-structured, or unstructured. Different database technologies may only
be capable of accepting one of these types. Hadoop is able to collect and store all
three types. This can be beneficial when combining machine data from IoT devices
that is very structured in nature with data from other sources, such as social media or
multimedia that is unstructured.
* Volume: Volume refers to the scale of the data. Typically, this is measured from
gigabytes on the very low end to petabytes or even exabytes of data on the other
extreme. Generally, big data implementations scale beyond what is available on
locally attached storage disks on a single node. It is common to see clusters of servers
that consist of dozens, hundreds, or even thousands of nodes for some large
deployments.
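6. Write a short note on Apache Kafka.
* Apache Kafka is a distributed publisher-subscriber messaging system that is built to
be scalable and fast. It is composed of topics, where producers write data and
consumers read data from these topics.
* Due to the distributed nature of Kafka, it can run in a clustered configuration that can
handle many producers and consumers simultaneously and exchanges information
between nodes, allowing topics to be distributed over multiple nodes.
* The goal of Kafka is to provide a simple way to connect to data sources and allow
consumers to connect to that data in the way they would like.
[Figure: Apache Kafka Data Flow]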
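The publisher-subscriber pattern, in which consumers read data from topics, can be sketched with a small in-memory stand-in. This is not the Kafka API: the class `ToyBroker` and its methods are hypothetical names used only to show topics as append-only logs with a read position kept per consumer.

```python
from collections import defaultdict

class ToyBroker:
    """In-memory stand-in for a topic-based broker (illustration only)."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic name -> append-only message log
        self._offsets = defaultdict(int)   # (consumer, topic) -> next index to read

    def publish(self, topic, message):
        """Producer side: append a message to the topic's log."""
        self._topics[topic].append(message)

    def poll(self, consumer, topic):
        """Consumer side: return messages this consumer has not yet read."""
        log = self._topics[topic]
        start = self._offsets[(consumer, topic)]
        self._offsets[(consumer, topic)] = len(log)
        return log[start:]

broker = ToyBroker()
broker.publish("engine-temp", 91.5)
broker.publish("engine-temp", 93.0)
first_read = broker.poll("dashboard", "engine-temp")
second_read = broker.poll("dashboard", "engine-temp")
archive_read = broker.poll("archiver", "engine-temp")
print(first_read, second_read, archive_read)
```

Because each consumer keeps its own position, the dashboard and the archiver both receive every message without interfering with each other.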
* Apache Spark is an in-memory distributed data analytics platform designed to
accelerate processes in the Hadoop ecosystem.
The “in-memory” characteristic of Spark is what enables it to run jobs very quickly.
At each stage of a Map Reduce operation, the data is read and written back to the
disk, which means latency is introduced through each disk operation. Spark instead
keeps the working data in memory, which has much lower latency.
* Real-time processing is done by a component of the Apache Spark project called
Spark Streaming. Spark Streaming is an extension of Spark Core that is responsible
for taking live streamed data from a messaging system, like Kafka, and dividing it
into smaller micro batches. These micro batches are called discretized streams.
* Apache Storm and Apache Flink are other Hadoop ecosystem projects designed for
distributed stream processing and are commonly deployed for IoT use cases. Storm
can pull data from Kafka and process it in a near-real-time fashion, and so can Apache
Flink.
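The idea of dividing a live stream into micro batches can be sketched in plain Python. This does not use the Spark Streaming API; `micro_batches` is a hypothetical helper that groups timestamped events into fixed-length intervals, which is the general idea behind a discretized stream.

```python
def micro_batches(events, interval):
    """Group (timestamp, value) events into consecutive windows of
    `interval` seconds and return the batches in time order."""
    batches = {}
    for timestamp, value in events:
        window_start = (timestamp // interval) * interval
        batches.setdefault(window_start, []).append(value)
    return [batches[start] for start in sorted(batches)]

# Made-up events: (seconds since start, sensor value)
stream = [(0, "a"), (1, "b"), (2, "c"), (5, "d")]
result = micro_batches(stream, interval=2)
print(result)
```

Each inner list would then be handed to the batch engine as one small job.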
7. Write a short note on Lambda architecture.
Answer:
[Figure: Lambda Architecture, showing the Stream layer (Spark Streaming, Storm, Flink), merged views, and the Serving layer]
Lambda is a data management system that consists of two layers for ingesting data
(Batch and Stream) and one layer for providing the combined data (Serving).
These layers allow packages like Spark and Map Reduce to operate on the
data independently, focusing on the key attributes for which they are designed and
optimized. Data is taken from a message broker, commonly Kafka, and processed by each layer
in parallel, and the resulting data is delivered to a data store where additional
processing or queries can be run.
* Stream layer: This layer is responsible for near-real-time processing of events.
Technologies such as Spark Streaming, Storm, or Flink are used to quickly
ingest, process, and analyze data on this layer.
* Batch layer: The Batch layer consists of a batch-processing engine and data
store. If an organization is using other parts of the Hadoop ecosystem for the
other layers, Map Reduce and HDFS can easily fit.
* Serving layer: The Serving layer is a data store and mediator that decides
which of the ingest layers to query based on the expected result or view into the
data. The Serving layer is often used by the data consumers to access both
stream and batch layers simultaneously.
The Lambda Architecture can provide a robust system for collecting and processing
massive amounts of data and the flexibility of being able to analyze that data at
different rates.
One limitation of this type of architecture is its place in the network. Due to the
processing and storage requirements of many of these pieces, the vast majority of
these deployments are either in data centres or in the cloud. This could limit the
effectiveness of the analytics to respond rapidly enough if the processing systems are
milliseconds or seconds away from the device generating the data.
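How a Serving layer can combine the two ingest layers may be easier to see in a small sketch. The function names and the per-device totals are hypothetical; a real deployment would use a batch engine and a stream processor rather than plain dictionaries.

```python
def batch_view(history):
    """Batch layer: recompute per-device totals from the full history."""
    totals = {}
    for device, value in history:
        totals[device] = totals.get(device, 0) + value
    return totals

def serve(batch_totals, recent_events):
    """Serving layer: merge the batch view with events the stream layer
    has seen since the last batch run."""
    merged = dict(batch_totals)
    for device, value in recent_events:
        merged[device] = merged.get(device, 0) + value
    return merged

history = [("pump", 3), ("pump", 4), ("valve", 1)]   # already processed in batch
recent = [("pump", 2), ("fan", 5)]                   # arrived after the batch run
view = serve(batch_view(history), recent)
print(view)
```

The batch view is complete but stale, the stream increment is fresh but partial, and the merged result gives a consumer both at once.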
8. List and explain edge analytics core functions and illustrate
edge analytics processing unit.
Answer:
* Raw input data: This is the raw data coming from the sensors into the analytics
processing unit.
* Analytics processing unit (APU): The APU processes the incoming data streams and
organizes them by time.
[Figure: Edge Analytics Processing Unit, with output sent on for storage and deeper analysis]
APU needs to perform the following functions:
a. Filter: The streaming data generated by IoT endpoints is likely to be very large, and
most of it is irrelevant. The filtering function identifies the information that is
considered important.
b. Transform: In the data warehousing world, Extract, Transform, and Load (ETL)
operations are used to manipulate the data structure into a form that can be used for
other purposes. Analogous to data warehouse ETL operations, in streaming analytics,
once the data is filtered, it needs to be formatted for processing.
c. Time: As the real-time streaming data flows, a timing context needs to be established.
This could be to correlate average temperature readings from sensors on a
minute-by-minute basis. The APU is programmed to report the average temperature every
minute from the sensors, based on an average of the past two minutes.
d. Correlate: Streaming data analytics becomes most useful when multiple data streams
are combined from different types of sensors. Different types of data come from
different instruments, but when this data is combined and analyzed, it provides an
invaluable picture of the situation. Another key aspect is combining
real-time measurements with pre-existing, or historical, data.
e. Match patterns: Once the data streams are properly cleaned, transformed, and
correlated, pattern matching is used to gain deeper insights to the data. The patterns can be simple
relationships, or they may be complex, based on the criteria defined by the
application. Machine learning may be leveraged to identify these patterns.
f. Improve business intelligence: Ultimately, the value of edge analytics is in the
improvements to business intelligence that were not previously available.
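The filter, transform, time, and correlate functions can be sketched as a small pipeline. The record layout, function names, and tolerance below are assumptions made for illustration; the time step follows the example in the notes of reporting each minute an average over the past two minutes.

```python
from statistics import mean

def apu_filter(readings, sensor_type):
    """Filter: keep only readings of the wanted type that carry a value."""
    return [r for r in readings if r["type"] == sensor_type and r["value"] is not None]

def apu_transform(readings):
    """Transform: normalize each record to an integer time and a float value."""
    return [{"t": int(r["t"]), "value": float(r["value"])} for r in readings]

def two_minute_average(readings, report_minute):
    """Time: average over the two minutes ending at the end of report_minute."""
    end = (report_minute + 1) * 60
    start = end - 120
    window = [r["value"] for r in readings if start <= r["t"] < end]
    return mean(window) if window else None

def deviates_from_history(live_avg, historical_avg, tolerance):
    """Correlate: compare the live average with a historical baseline."""
    return abs(live_avg - historical_avg) > tolerance

raw = [  # made-up sensor records, t in seconds
    {"t": 10, "type": "temp", "value": 20},
    {"t": 40, "type": "humidity", "value": 50},
    {"t": 70, "type": "temp", "value": 22},
    {"t": 80, "type": "temp", "value": None},
    {"t": 130, "type": "temp", "value": 30},
]
clean = apu_transform(apu_filter(raw, "temp"))
avg_minute_1 = two_minute_average(clean, report_minute=1)
avg_minute_2 = two_minute_average(clean, report_minute=2)
print(avg_minute_1, avg_minute_2, deviates_from_history(avg_minute_2, 21.0, 3.0))
```

Pattern matching and business intelligence would sit downstream of these steps, working on the averaged and correlated values rather than on raw readings.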
9. Distributed analytics throughout the IoT system.
[Figure: Distributed Analytics throughout the IoT System, showing the edge location (sensors and edge processing), processing platforms, and the regional data center, with MQTT communication indicated]
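* Streaming analytics may be performed directly at the edge, in the fog, or in the cloud
data centre. There are no hard-and-fast rules dictating where analysis should be done,
but there are a few guiding principles.
* Fog analytics allows you to see beyond one device, giving you visibility into an
aggregation of edge nodes and allowing you to correlate data from a wider set.
* Sensors communicate via MQTT through a message broker to the fog analytics node,
giving it a broader data set.
* The fog node is located on the same oil rig and performs streaming analytics from
several edge devices, giving it better insights due to the expanded data set. It may
not be able to respond to an event as quickly as analytics performed directly on the
edge device, but it is still close to responding in real-time as events occur.
* Once the fog node is finished with the data, it communicates the results to the
cloud for deeper analysis.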
10. Demonstrate smart grid FAN analytics with Net-Flow
example.
Answer:
[Figure: Smart Grid FAN Analytics with NetFlow. Labels: multi-services FAN, headend, network operations management, public key infrastructure, network management, and potential Flexible NetFlow collector points (IPv4 and IPv6 traffic, IP addresses, TCP/UDP port numbers, etc.)]
* The figure shows field area network (FAN) traffic analytics performed on the aggregation
router in a smart grid. IoT endpoints, contrary to generic computing platforms, are designed to directly
communicate with a very small number of specific application servers, such as an IoT
message or data broker, or specific application servers and network management
systems. Therefore, it could be said that IoT solutions and use cases tightly couple
devices and applications.
* Network analytics has the power to analyze details of communications patterns made
by protocols and correlate this across the network. It allows you to understand what
should be considered normal behavior in a network and to quickly identify anomalies
that suggest network problems due to suboptimal paths, intrusive malware, or
excessive congestion.
* Network analytics offer capabilities to cope with capacity planning for scalable IoT
deployment as well as security monitoring in order to detect abnormal traffic volume
and patterns such as an unusual traffic spike for a normally quiet protocol for both
centralized or distributed architectures, such as fog computing.
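Detecting an unusual traffic spike for a normally quiet protocol can be sketched as a simple comparison against a per-protocol baseline. The function name, the three-standard-deviation threshold, and the traffic figures are assumptions for illustration; real network analytics tools use richer models.

```python
from statistics import mean, pstdev

def spikes(baseline, observed, threshold=3.0):
    """Return protocols whose observed volume exceeds the baseline mean
    by more than `threshold` standard deviations."""
    flagged = []
    for protocol, history in baseline.items():
        mu = mean(history)
        # Floor the deviation at 1.0 so a perfectly flat history
        # does not flag every tiny change.
        sigma = max(pstdev(history), 1.0)
        if observed.get(protocol, 0) > mu + threshold * sigma:
            flagged.append(protocol)
    return flagged

# Made-up packets-per-minute history per protocol
baseline = {"mqtt": [100, 110, 90, 105], "dnp3": [5, 6, 4, 5]}
observed = {"mqtt": 108, "dnp3": 60}
alerts = spikes(baseline, observed)
print(alerts)
```

Here MQTT stays within its normal range while the normally quiet DNP3 traffic is flagged.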
11. Explain the benefits of flow analytics or network
analytics.
Answer:
The benefits of flow analytics, in addition to other network management services, are as
follows:
* Network traffic monitoring and profiling: Flow collection from the network layer provides
global and distributed near-real-time monitoring capabilities. IPv4 and IPv6 network-wide
traffic volume and pattern analysis helps administrators proactively detect problems and
quickly troubleshoot and resolve problems when they occur.
* Application traffic monitoring and profiling: Monitoring and profiling can be used to gain
a detailed time-based view of IoT access services, such as the application-layer protocols,
including MQTT, CoAP, and DNP3, as well as the associated applications that are being used
over the network.
* Capacity planning: Flow analytics can be used to track and anticipate IoT traffic growth and
help in the planning of upgrades when deploying new locations or services by analyzing
captured data over a long period of time.
* Security analysis: Because most IoT devices typically generate a low volume of traffic and
always send their data to the same server(s), any change in network traffic behaviour may
indicate a cyber security event, such as a denial of service (DoS) attack. Security can be
enforced by ensuring that no traffic is sent outside the scope of the IoT domain.
* Routers or gateways are often physically isolated and use VPNs for backhaul.
Deployments may have thousands of gateways connecting the last-mile IoT infrastructure
over a cellular network.
* Data warehousing and data mining: Flow data can be warehoused for later retrieval and
analysis in support of proactive analysis of multiservice IoT infrastructures and applications.
12. Write a short note on flexible Net-Flow Architecture.
Answer:
FNF is a flow technology developed by Cisco Systems that is widely deployed all over the
world. Key advantages of FNF are as follows:
* Flexibility, scalability, and aggregation of flow data
* Ability to monitor a wide range of packet information and produce new information
about network behaviour
* Enhanced network anomaly and security detection
* User-configurable flow information for performing customized traffic identification
and ability to focus and monitor specific network behaviour
* Convergence of multiple accounting technologies into one accounting mechanism
FNF Components:
* FNF Flow Monitor (Net Flow cache): The FNF Flow Monitor describes the Net
Flow cache or information stored in the cache. The Flow Monitor contains the flow
record definitions with key fields (used to create a flow, unique per flow record:
match statement) and non-key fields (collected with the flow as attributes or
characteristics of a flow).
* There are two primary methods for accessing Net Flow data: using
the command-line interface (CLI), and using an application reporting tool.
* Flow export timers: Timers indicate how often flows should be exported to the
collection and reporting server.
* Net Flow export format: This simply indicates the type of flow reporting format.
* Net Flow server for collection and reporting: This is the destination of the flow
export. It is often done with an analytics tool that looks for anomalies in the traffic
patterns.
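The distinction between key fields and collected flow attributes can be illustrated with a toy flow cache. This is not Cisco's FNF configuration or export format: the five-field key, the class name `FlowCache`, and the packet and byte counters are assumptions chosen to show how packets sharing the same key values are aggregated into one flow record.

```python
from collections import namedtuple

# Key fields: packets with identical values here belong to the same flow.
FlowKey = namedtuple("FlowKey", "src dst proto sport dport")

class FlowCache:
    """Toy flow cache that aggregates packets into flow records."""

    def __init__(self):
        self.flows = {}  # FlowKey -> non-key counters collected for the flow

    def observe(self, src, dst, proto, sport, dport, size):
        """Account one packet against its flow, creating the record if new."""
        key = FlowKey(src, dst, proto, sport, dport)
        record = self.flows.setdefault(key, {"packets": 0, "bytes": 0})
        record["packets"] += 1
        record["bytes"] += size

    def export(self):
        """Hand all records to a collector and empty the cache,
        as an export timer would trigger periodically."""
        exported = list(self.flows.items())
        self.flows = {}
        return exported

cache = FlowCache()
cache.observe("10.0.0.5", "10.0.1.1", "tcp", 40000, 1883, 100)
cache.observe("10.0.0.5", "10.0.1.1", "tcp", 40000, 1883, 200)
cache.observe("10.0.0.6", "10.0.1.1", "udp", 40001, 5683, 60)
records = cache.export()
print(records)
```

The two packets with identical key values collapse into a single record with two packets and 300 bytes, while the third packet starts a separate flow.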
* Flexible Net Flow in Multiservice IoT Networks:
It is recommended that FNF be configured on the routers that aggregate connections from
the last mile’s routers. This gives a global view of all services flowing between the core
network in the cloud and the IoT last-mile network.
Challenges with deploying flow analytics tools in an IoT network include the following:
a. The distributed nature of fog and edge computing may mean that traffic flows are
processed in places that might not support flow analytics, and visibility is thus lost.
b. IPv4 and IPv6 native interfaces sometimes need to inspect inside VPN tunnels, which
may impact the router’s performance.
c. Additional network management traffic is generated by FNF reporting devices. The
added cost of increasing bandwidth thus needs to be reviewed, especially if the
backhaul network uses cellular or satellite communications.
13.
Answer:
* Erosion of Network Architecture: Two of the major challenges in securing
industrial environments have been initial design and ongoing maintenance. The initial
design challenges arose from the concept that networks were safe due to physical
separation from the enterprise with minimal or no connectivity to the outside world,
and the assumption that attackers lacked sufficient knowledge to carry out security
attacks. Over time, what may have been a solid design to begin with is eroded through
ad hoc updates and individual changes to hardware and machinery without
consideration for the broader network impact. This led to miscalculations of
expanding networks and the introduction of wireless communication in a standalone
fashion, without consideration of the impact to the original security design. These
changes have led to weak or inadequate network and systems security.
* Pervasive Legacy Systems: Due to the static nature and long lifecycles of equipment
in industrial environments, many operational systems may be deemed legacy systems.
Legacy components are not restricted to isolated network segments but have now
been consolidated into the IT operational environment. From a security perspective,
this is potentially dangerous as many devices may have historical vulnerabilities or
weaknesses that have not been patched and updated.
* Insecure Operational Protocols: Many industrial control protocols, particularly
those that are serial based, were designed without inherent strong security
requirements. Their operation was often within an assumed secure network. Common
industrial protocols and their respective security concerns are as follows:
  * Modbus: Authentication of communicating endpoints was not a default
operation, which would allow an inappropriate source to send improper
commands to the recipient. Some older and serial-based versions of Modbus
communicate via broadcast. The ability to curb the broadcast function does not
exist in some versions. There is potential for a recipient to act on a command
that was not specifically targeting it.
  * DNP3: Participants allow for unsolicited responses, which could trigger an
undesired response. The missing security element here is the ability to
establish trust in the system’s state and thus the ability to trust the veracity of
the information being presented.
  * ICCP (Inter-Control Center Communications Protocol): The protocol did not
require authentication for communication. Second, encryption across the
protocol was not enabled as a default condition, exposing connections to
man-in-the-middle (MITM) and replay attacks.
  * OPC (OLE for Process Control): Dependence
to clearly understand the many vulnerabilities
second requires you to identify the
specific network.
MMS (Manufacturing Message Specification),
GOOSE (Generic Object Oriented Substation Event), and SV (Sampled
Values). Authentication is embedded in MMS, but it is based on clear-text
means there is no way to verify its authenticity or
have limited message integrity, which makes it
installation base of legacy systems, control and communication
elements themselves have a history of vulnerabilities. Many of the systems utilize
software packages that can be easily downloaded and worked against. They operate
on common hardware and standard operating systems. Components used within those
applications are well known to traditionally IT-focused security researchers. There
is little need to develop new tools or techniques when those that have long been in place
are sufficiently adequate to breach the target’s defences.
* Dependence on External Vendors: Direct and on-demand access to critical systems
on the plant floor or in the field is sometimes written directly into contracts or
required for valid product warranties. This has clear benefits in many industries as it
allows vendors to remotely manage and monitor equipment and to proactively alert
the customer if problems are beginning to creep in. While contracts may be written to
describe equipment monitoring and management requirements with explicit
statements of what type of access is required and under what conditions, they
generally fail to address questions of shared liability for security breaches or
processes to ensure communication security.
* Security Knowledge: In the industrial operations space, the technical investment is
primarily in connectivity and compute. It has seen far less investment in security
relative to its IT counterpart. Due to the importance of security in the industrial space,
all likely attack surfaces are treated as unsafe.
14. Write a short note on Purdue Model for control hierarchy
with diagram.
Answer:
[Figure: The Logical Framework Based on the Purdue Model for Control Hierarchy, from the enterprise zone to the safety zone]
This model identifies levels of operations and defines each level:
* Enterprise zone:
  * Business planning and logistics network: The IT services exist at
this level and may include scheduling systems, material flow applications,
optimization and planning systems, and local IT services such as phone, email,
printing, and security monitoring.
* Industrial demilitarized zone
  * DMZ: The DMZ provides a buffer zone where services and data can be
shared between the operational and enterprise zones. It also allows for easy
segmentation of organizational control. By default, no traffic should traverse
the DMZ; everything should originate from or terminate on this area.
* Operational zone:
  * Level 3: Operations and control: This level includes the functions involved
in managing the workflows to produce the desired end products and for
monitoring and controlling the entire operational system.
  * Level 2: Supervisory control: This level includes zone control rooms,
controller status, control system network/application administration, and other
control related applications.
  * Level 1: Basic control: At this level, controllers and IEDs, dedicated HMIs
(human-machine interface), and other applications may talk to each other to
run part or all of the control function.
  * Level 0: Process: This is where devices such as sensors and actuators and
machines such as drives, motors, and robots communicate with controllers or
IEDs.
* Safety zone:
  * Safety-critical: This level includes devices, sensors, and other equipment
used to manage the safety functions of the control system.
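The DMZ rule above, that by default no traffic traverses the DMZ and everything originates from or terminates in it, can be expressed as a small policy check. The zone names and the function `allowed` are hypothetical, and the safety zone is left out to keep the sketch short.

```python
ZONES = {"enterprise", "dmz", "operational"}

def allowed(src_zone, dst_zone):
    """Permit a flow only if it stays inside one zone or has the DMZ as
    one of its endpoints; enterprise and operational zones never talk directly."""
    if src_zone not in ZONES or dst_zone not in ZONES:
        raise ValueError("unknown zone")
    if src_zone == dst_zone:
        return True
    return "dmz" in (src_zone, dst_zone)

print(allowed("enterprise", "dmz"))          # terminates in the DMZ
print(allowed("dmz", "operational"))         # originates from the DMZ
print(allowed("enterprise", "operational"))  # would traverse the DMZ
```

Data shared between the two sides therefore needs two separate flows, each of which ends or starts at a service in the DMZ.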
15. Compare the nature of traffic flows across IT and OT
Networks.
Answer:
IT networks: In an IT environment, there are many diverse data flows. The communication
data flows that emanate from a typical IT endpoint travel relatively far. They frequently
traverse the network through layers of switches before the data is
responded to, or triggers actions in more local services, such as a printer. In the case of email
or web browsing, the endpoint initiates actions that leave the confines of the enterprise
network and potentially travel around the earth.
OT networks: In an OT environment (Levels 0-3), there are typically two
types of operational traffic. The first is local traffic that may be contained within a specific
package or area to provide local monitoring and closed-loop control. This is the traffic that is
used for real-time processes and does not need to leave the process control
levels. The second type of traffic is used for monitoring and control of areas or zones or the
overall system. SCADA traffic is a good example of this, where information about remote
devices or summary information from a function is shared at a system level so that operators
can understand how the overall system, or parts of it, are operating.
16. Illustrate Formal Risk Analysis structures OCTAVE and FAIR.
Answer:
OCTAVE:
[Figure: OCTAVE steps, including Step 1: Establish Risk Measurement Criteria; Step 5: Identify Threat Scenarios; Step 6: Identify Risks; Step 7: Analyze Risks; grouped into phases such as Identify Threats and Identify and Mitigate Risk]
1 methodology is to establish a risk
ying a risk measurement criterion is that at any
can take place against the reference model.
is profile is populated
sociated with each asset, including
is and possible locations where the information might reside, The
‘the container level rather than the asset level. The value is to reduce
ibitors within the container for information operation.
* The fourth step is to identify areas of concern. Judgments are made through a
mapping of security-related attributes to more business-focused use cases. The analyst
looks to risk profiles and delves into the previously mentioned risk analysis.
* The fifth step is where threat scenarios are identified. Threats are broadly (and
properly) identified as potential undesirable events. This definition means that results
from both malevolent and accidental causes are viable threats.
* At the sixth step risks are identified. Within OCTAVE, risk is the possibility of an
undesired outcome. This is extended to focus on how the organization is impacted.
* The seventh step is risk analysis, with the effort placed on qualitative evaluation of
the impacts of the risk. Here the risk measurement criteria defined in the first step are
explicitly brought into the process.
* Mitigation is applied at the eighth step. There are three outputs or decisions to be
taken at this stage. One may be to accept a risk and do nothing, other than document
the situation, potential outcomes, and reasons for accepting the risk. The second is to
mitigate the risk with whatever control effort is required. The final possible action is
to defer a decision, meaning risk is neither accepted nor mitigated.
FAIR:
of emphasis.
operational da
loss. A clear hierarchy of sub-elements emerges, with one side
focused on frequency and the other on magnitude.
* Loss event frequency is the result of a threat agent acting on an asset with a resulting
loss. This happens with a given frequency called the threat event
frequency. Vulnerability here is not necessarily some compute asset weakness, but is more
broadly defined as the probability that the targeted asset will fail as a result of the
actions applied.
* FAIR defines six forms of loss, four of them externally focused and two internally
focused. Of particular value for operational teams are productivity and replacement
loss.
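The frequency side and the magnitude side of FAIR can be combined in a short calculation. This sketch assumes the commonly described decomposition in which loss event frequency is the threat event frequency multiplied by vulnerability (treated as a probability), and risk is that frequency multiplied by a loss magnitude. The function names and all figures are hypothetical.

```python
def loss_event_frequency(threat_event_frequency, vulnerability):
    """Expected loss events per year: how often threats act on the asset,
    times the probability that the asset fails when they do."""
    if not 0.0 <= vulnerability <= 1.0:
        raise ValueError("vulnerability must be a probability between 0 and 1")
    return threat_event_frequency * vulnerability

def annual_risk(threat_event_frequency, vulnerability, loss_magnitude):
    """Expected yearly loss: loss event frequency times loss per event."""
    return loss_event_frequency(threat_event_frequency, vulnerability) * loss_magnitude

# Made-up inputs: 12 threat events a year, 25% succeed, 10,000 lost per event
lef = loss_event_frequency(12, 0.25)
risk = annual_risk(12, 0.25, 10_000)
print(lef, risk)
```

In practice the loss magnitude would itself be broken down by form of loss, such as the productivity and replacement losses mentioned above.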
3. Explain in detail how IT and OT security practices and systems vary in real time.
(July 2019)
4. Discuss OCTAVE and FAIR formal risk analysis. (July 2019)
Exercise Questions
1. Explain i) NoSQL databases, ii) Hadoop, iii) YARN
2. Explain i) Supervised Learning, ii) Unsupervised Learning, iii) Neural Network
3. The Phased Application of Security in an Operational Environment.