ABSTRACT:
Industrial control systems (ICSs) are an essential part of every nation’s critical infrastructure
and have been utilized for a long time to supervise industrial machines and processes. Today’s
ICSs are substantially different from the information technology (IT) devices of a decade ago.
The integration of Internet of Things (IoT) technology has made them more efficient,
improved automation, and increased quality and compliance. They now form a subdomain
of IoT (arguably the most critical one) called industrial IoT (IIoT).
In the past, ICSs were isolated from the outside world to protect them from malicious
external attacks. However, recent advances, increased connectivity with corporate networks,
and the use of internet communications to transmit information more conveniently
have introduced the possibility of cyber-attacks against these systems. Due to the sensitive
nature of industrial applications, security is the foremost concern.
We discuss why, despite the exceptional performance of artificial intelligence (AI) and machine
learning (ML), industry leaders still have a hard time utilizing these models in practice as
standalone units. The goal of this dissertation is to address some of these challenges to
help pave the way for smarter and more modern security solutions in these systems.
Specifically, we focus on three challenges: data scarcity, the black-box nature of AI
models, and their high computational load.
Industrial companies almost never release their network data, because they are obligated to
follow confidentiality laws and user privacy restrictions. Hence, real-world IIoT datasets are
not available to security researchers, and the IIoT security research community faces a
data scarcity challenge. Researchers in this domain usually have to resort to commercial
or public datasets that are not specific to it. In our work, we have developed a
real-world testbed that resembles an actual industrial plant, emulating a popular
industrial system used in water treatment processes. This allowed us to collect datasets
containing realistic traffic for our research.
IIoT networks have several characteristics that are unique to them. We have conducted
an extensive study to identify these characteristics and incorporate them in the design. We
have gathered information on relevant cyber-attacks in IIoT systems and run them against
the system to gather realistic datasets containing both normal and attack traffic analogous
to real industrial network traffic. The communication protocols of these networks are also
specific to them. We have implemented one of the most popular of these protocols in our
dataset. Another attribute that distinguishes the security of these systems from others is
imbalanced data. The number of attack samples is significantly lower than the enormous
volume of normal traffic that flows through the system daily. We have made sure that our
datasets comply with all of these IIoT-specific attributes.
IoT technology, and industrial IoT in particular, involves a massive amount of data
streaming to and from IoT devices. In addition, the availability and
reliability constraints of industrial systems require them to operate at a fast pace and avoid
creating any bottleneck in the system. The high computational load of complex AI models can
become a burden: they must process a large amount of data and may not produce results
as fast as required. In this dissertation, we utilize distributed computing in the form
of an edge/cloud structure to address these problems. We propose Anomaly Detection using
Distributed AI (ADDAI), which can easily spread out geographically to cover a large number
of IoT sources. Due to its distributed nature, it guarantees critical IIoT requirements such
as high speed, robustness against a single point of failure, low communication overhead,
privacy, and scalability. We formulate both the communication cost, which is minimized,
and the improvement in performance.