Science and Technology Publishing (SCI & TECH)

ISSN: 2632-1017
Vol. 4 Issue 9, September - 2020

Some Applications of Machine Learning in Cryptography
Jollanda Shara
Dept. of Mathematics & Computer Science
University “Eqrem Cabej”
Gjirokaster, Albania
e-mail: jokrisha@yahoo.com

Abstract— In the 1940s and 1950s, computer science made great progress, relying on theoretical developments of the 1930s. From the very beginning, cryptography and machine learning were tightly related to this new technology. Cryptography, on the one hand, played an important part during World War II, when some computers of that time were built to accomplish cryptanalytic tasks. On the other hand, many authors, such as Turing and Samuel, examined the possibility that computers could “learn” to perform tasks.

Machine learning techniques have had a long list of applications in recent years. However, the use of machine learning in information and network security is not new. Machine learning and cryptography have many things in common. The most apparent is the processing of large amounts of data and large search spaces.

In its varying techniques, machine learning has been an interesting field of study with massive potential for application. In general, machine learning and cryptanalysis have more in common than machine learning and cryptography. This is because they share a common target: searching in large search spaces. A cryptanalyst’s target is to find the right key for decryption, while machine learning’s target is to find a suitable solution in a large space of possible solutions. In addition to cryptography and cryptanalysis, machine learning has a wide range of applications in relation to information and network security. In these notes, we underline some of them.

Keywords—cryptography; application; machine learning; data.

I. INTRODUCTION

Cryptography is the process that involves encryption and decryption of text using various mechanisms or algorithms. A cryptographic algorithm is a mathematical function that can be used in the process of encryption and decryption.

Cryptography is a technique used today to hide confidential information from the attack of an intruder. Today data communication mainly depends upon digital data communication, where the prior requirement is data security, so that data reaches the intended user. The protection of multimedia data and of sensitive information such as credit cards, banking transactions and social security numbers is becoming very important. These confidential data can be protected from unauthorized access with many encryption techniques. So, to provide data security, many cryptographic techniques are employed, such as symmetric and asymmetric techniques. [8] Cryptography enables the user to transmit confidential information across any insecure network so that it cannot be used by an intruder.

Machine Learning (ML) is a branch of AI and is closely related to (and often overlaps with) computational statistics, which also focuses on prediction making using computers. It has strong ties to mathematical optimization, which delivers methods, theory and application domains to the field. ML is occasionally conflated with data mining, but the latter subfield focuses more on exploratory data analysis and is known as unsupervised learning. ML can also be unsupervised and be used to learn and establish baseline behavioral profiles for various entities, which are then used to find meaningful anomalies. The pioneer of ML, Arthur Samuel, defined ML as a “field of study that gives computers the ability to learn without being explicitly programmed.” ML primarily focuses on classification and regression based on known features previously learned from the training data. [4]

As pointed out in [3], the area of Machine Learning deals with the design of programs that can learn rules from data, adapt to changes, and improve performance with experience. In addition to being one of the initial dreams of Computer Science, Machine Learning has become crucial as computers are expected to solve increasingly complex problems and become more integrated into our daily lives.

Machine Learning Theory, also known as Computational Learning Theory, aims to understand the fundamental principles of learning as a computational process. This field seeks to understand at a precise mathematical level what capabilities and information are fundamentally needed to learn different kinds of tasks successfully, and to understand the basic algorithmic principles involved in getting computers to learn from data and to improve

www.scitechpub.org
SCITECHP420106 492

performance with feedback. The goals of this theory are both to aid in the design of better automated learning methods and to understand fundamental issues in the learning process itself.

Machine Learning is a field of research that focuses on extracting information from datasets. If the dataset is very large, it is also often referred to as Big Data or Data Mining. There are countless algorithms in Machine Learning, with inputs ranging from numeric through categorical to text-based. The applications today seem endless: we have the first self-driving cars, which have learned to drive via Neural Networks; we have smartphone keyboards that predict the next word based on your individual writing style; researchers are working on algorithms that can predict illness from a set of measured attributes or even a person's genome; and many more. However, many of these application scenarios involve sensitive data: people do not feel safe sending, e.g., their medical data to a service provider, because they either do not trust the provider or are worried about a data breach even if they do trust the provider. This has led to Machine Learning being a popular topic in the context of privacy-preserving computations in general, and Fully Homomorphic Encryption in particular.

Generally, Machine Learning can be divided into two categories: supervised and unsupervised learning.

II. SOME APPLICATIONS OF MACHINE LEARNING

As we mentioned above, Machine Learning can be of great help in producing useful information from extremely large amounts of data.

Classification is one of the most widely used applications of machine learning… (E. Alpaydin, 2014). A common example of classification is the one that banks use for loans: low-risk and high-risk.

Another example of machine learning applications is regression. Regression is the type of problem that produces a number based on multiple inputs… The output would be a specific number derived from the inputs. However, there has to be training that would enable the system to become gradually more accurate and to learn the impact of a change in each of the input elements.

Learning associations is also one of the applications of machine learning. For example, analysis of shopping basket data can produce useful information that supermarkets can use to improve their sales.

Unsupervised learning can also be used in machine learning. When there is no reference output to compare to, the learning is done through input data only. This type of learning or training is called unsupervised.

Reinforcement learning can also be used in machine learning applications. In certain applications, the output of the system is a sequence of actions. A single action by itself is not important; what matters is the sequence of correct actions. [2]

Machine Learning happens when we need a machine (computer) to learn to solve a problem based on data, usually in large amounts, previously fed into the machine. Machine Learning can be a good solution finder for problems for which we do not have a clear algorithm. For example, when we want the computer to be able to detect spam emails, there is no clear algorithm that is 100% accurate in finding spam. Hence, Machine Learning can come closer to an optimum solution when we feed the machine hundreds or thousands of spam and non-spam examples. Gradually, the machine will learn to be more and more accurate in detecting spam emails. The larger the data used for learning becomes, the more accurate the classification becomes. (E. Alpaydin, 2014)

With the increasingly in-depth integration of the Internet and social life, the Internet is changing how people learn and work, but it also exposes us to increasingly serious security threats. A key issue, which must be solved immediately, is how to identify various network attacks, particularly attacks not previously seen.

Cybersecurity is a set of technologies and processes designed to protect computers, networks, programs and data from attacks and unauthorized access, alteration, or destruction… (S. Aftergood, 2017). A network security system consists of a network security system and a computer security system. Each of these systems includes firewalls, antivirus software, and intrusion detection systems (IDS). IDSs help discover, determine and identify unauthorized system behavior such as use, copying, modification and destruction [9].

In the intervening forty years, the field of computer and network security has come to encompass an enormous range of threats and domains: intrusion detection, web application security, malware analysis, social network security, advanced persistent threats, and applied cryptography, and these are only a few of them. But even today spam remains a major focus for those in the email or messaging space, and for the general public spam is probably the aspect of computer security that most directly touches their own lives. [6]

Machine learning was not invented by spam fighters, but it was quickly adopted by statistically inclined technologists who saw its potential in dealing with a constantly evolving source of abuse. Email providers and Internet service providers (ISPs) have access to a wealth of email content, metadata, and user behavior. Leveraging email data, content-based models can be built to create a generalizable approach to recognizing spam. Metadata and entity reputations can be extracted from email to predict the likelihood that an email is spam without even looking at its content. By instantiating a user behavior feedback loop, the system can build a collective intelligence and improve over time with the help of its users.
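The idea of a content-based spam model learned from labelled examples can be illustrated with a minimal sketch. It assumes a multinomial naive Bayes classifier with add-one smoothing, which is one common choice and not necessarily the method used by the providers discussed in [6]. The class name NaiveBayesSpamFilter, the toy messages and their labels are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Hypothetical multinomial naive Bayes filter with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}
        self.vocabulary = set()

    def train(self, text, label):
        """Update the word and document counts with one labelled message."""
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def _log_score(self, words, label):
        """Log prior plus the sum of smoothed log likelihoods for one class."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
        for word in words:
            score += math.log((self.word_counts[label][word] + 1) / denom)
        return score

    def classify(self, text):
        """Return the class ("spam" or "ham") with the higher log score."""
        words = tokenize(text)
        return max(("spam", "ham"), key=lambda lbl: self._log_score(words, lbl))


# Toy training set: made-up messages, assumed only for this illustration.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("cheap pills free offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the project team", "ham"),
    ("please review the project report", "ham"),
]

spam_filter = NaiveBayesSpamFilter()
for text, label in TRAINING:
    spam_filter.train(text, label)

print(spam_filter.classify("claim your free money"))      # spam
print(spam_filter.classify("project meeting on monday"))  # ham
```

With only six training messages the sketch is far from a usable filter, but it shows the mechanism described above: the more labelled spam and non-spam examples are fed in, the better the word statistics separate the two classes.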


Email filters have thus gradually evolved to deal with the growing diversity of circumvention methods that spammers have thrown at them. Even though 86% of all emails sent today are spam (according to one study, see [10]), the best spam filters today block more than 99.9% of all spam (see [11]), and it is a rarity for users of major email services to see unfiltered and undetected spam in their inboxes. These results demonstrate an enormous advance over the simplistic spam filtering techniques developed in the early days of the Internet, which made use of simple word filtering and email metadata reputation to achieve modest results (see [12]).

Computer systems and web services have become increasingly centralized, and many applications have evolved to serve millions or even billions of users. Entities that become arbiters of information are bigger targets for exploitation, but are also in the perfect position to make use of the data and their user base to achieve better security. Coupled with the advent of powerful data crunching hardware and the development of more powerful data analysis and Machine Learning algorithms, there has never been a better time for exploiting the potential of Machine Learning in security (see [6]).

III. ML AND CRYPTOGRAPHY

“…Machine Learning and cryptanalysis can be viewed as “sister fields”, since they share many of the same notions and concerns. In a typical cryptanalytic situation, the cryptanalyst wishes to “break” some cryptosystem. Typically this means he wishes to find the secret key used by the users of the cryptosystem, where the general system is already known. The decryption function thus comes from a known family of such functions (indexed by the key), and the goal of the cryptanalyst is to exactly identify which such function is being used. He may typically have available a large quantity of ciphertext and plaintext to use in his analysis. This problem can also be described as the problem of “learning an unknown function” (that is, the decryption function) from examples of its input/output behavior and prior knowledge about the class of possible functions…” [1]

1. “…The notion of “secret key” in cryptography corresponds to the notion of “target function” in machine learning theory, and more generally, the notion of “key space” in cryptography corresponds to the notion of the “class of possible target functions”.
2. “…A critical aspect of any cryptanalytic or learning scenario is the specification of how the cryptanalyst (learner) may gather information about the unknown target function… Even if information is gathered from random examples, cryptanalytic/learning scenarios may also vary in the prior knowledge available to the attacker/learner about the distribution of those examples…” [1]

Prior work at the intersection of machine learning and cryptography has focused on the generation and establishment of cryptographic keys (Ruttor, 2006; Kinzel & Kanter, 2002), and on corresponding attacks (Klimov et al., 2002). [5]

Machine Learning Theory also has a number of fundamental connections to other disciplines. In cryptography, one of the key goals is to enable users to communicate so that an eavesdropper cannot acquire any information about what is being said. Machine Learning can be viewed in this setting as developing algorithms for the eavesdropper. In particular, provably good cryptosystems can be converted to problems one cannot hope to learn, and hard learning problems can be converted into proposed cryptosystems. Moreover, at the technical level, there are strong connections between important techniques in Machine Learning and techniques developed in Cryptography. For example, Boosting, a Machine Learning method designed to extract as much power as possible out of a given learning algorithm, has close connections to methods for amplifying cryptosystems developed in cryptography. [3]

Machine Learning and Cryptography have many things in common, for instance the amount of data to be handled and large search spaces. The application of Machine Learning in Cryptography is not new, but with over 3 quintillion bytes of data being generated every day, it is now more relevant than ever before to apply Machine Learning techniques in cryptography. Machine Learning generally automates analytical model building to continuously learn and adapt to the large amount of data being fed as input. Machine Learning techniques can be used to indicate the relationship between the input and output data created by cryptosystems. Machine Learning techniques such as Boosting and Mutual Learning can be used to create a private cryptographic key over a public and insecure channel. Methods such as Naive Bayes, support vector machines, and AdaBoost, which come under the category of classification, can be used to classify encrypted traffic and to classify objects as steganograms, as used in steganography. Besides their application in cryptography, which is the art of creating secure systems for encrypting and decrypting confidential data, Machine Learning techniques can also be applied in cryptanalysis, which is the art of breaking cryptosystems, for example to perform certain side-channel attacks.

Another arena in which cryptography and machine learning relate is that of data compression. It has been shown by Blumer et al. that PAC-learning and data compression are essentially equivalent notions. Furthermore, the security of an encryption scheme is often enhanced by compressing the message before encrypting it. Learning theory may conceivably aid cryptographers by enabling ever more effective compression algorithms (see [1]).

IV. CRYPTOGRAPHY AND NEURAL NETWORKS

Cryptanalysis has been an area of great research interest in the past decade owing to advancements in


Machine Learning algorithms, particularly in neural networks. The process of discovering the plaintext from a ciphertext without knowing any information about the system or the key that was used to encrypt the plaintext is called cryptanalysis. Any mode of communication is secure only as long as the cryptographic system that encrypts the messages between the sender and the receiver is strong. Once a third party listening in on the communication channel is able to decipher the encrypted texts, the cipher system is said to have flaws and to be broken. All ciphers are vulnerable to brute-force attacks, in which the attackers try to break the cipher system by exploring its key space. Though this takes a lot of time and computational power, it is possible to break the system.

B. Chandra and P. P. Varghese (2007) used neural networks to classify ciphertexts according to the algorithm that was used to encrypt them. They used a Cascade Correlation Neural Network and a Back Propagation Network to identify the cipher systems. For training, they used ciphertexts obtained from Enhanced RC6, a block cipher, and from SEAL, a stream cipher. They used different types of datasets with the same keys, different keys, the same sets of plaintexts, different sets of plaintexts, etc., and concluded that cascade correlation worked better than the back propagation method.

Another author (Alani MM, 2012) came up with an idea to break the Data Encryption Standard (DES) cipher using a neural network. The author used a known-plaintext attack to arrive at the plaintext. The algorithm does not seem to attempt to find the key, but rather tries to find the plaintext directly. Though this approach is not considered to be a cryptographic attack, the work is commendable, as the author designed a neural network that identifies the plaintext from plaintext and ciphertext pairs produced under the same key.

In the paper written by Albassal and Wahdan (2004), the authors describe how they were able to use neural networks to break a hypothetical Feistel cipher, called HypCipher. The round function for HypCipher was chosen from the Advanced Encryption Standard (AES). The back propagation technique was used, showing success with 2 and 3 rounds of the cipher. An additional hidden layer was added for 4 rounds. The model was successful in that it used a simple neural network with a simple activation function such as the sigmoid function. The authors proposed using a distributed system to attack ciphers with more rounds. [7]

Let us consider a neural network with several components, and suppose that we wish to guarantee that one of the components does not rely on some aspect of the input data, perhaps because of concerns about privacy or discrimination. Neural networks are notoriously difficult to explain, so it may be hard to characterize how the component functions. A simple solution is to treat the component as an adversary, and to apply encryption so that it does not have access to the information that it should not use. Classical cryptography may be able to support some applications along these lines. In particular, homomorphic encryption enables inference on encrypted data (Xie et al., 2014; Gilad-Bachrach et al., 2016). On the other hand, classical cryptographic functions are generally not differentiable, so they are at odds with training by stochastic gradient descent (SGD), the main optimization technique for deep neural networks. Therefore, we would have trouble learning what to encrypt, even if we know how to encrypt. Integrating classical cryptographic functions and, more generally, other known functions and relations (e.g., Neelakantan et al., 2015) into neural networks remains a fascinating problem. [5]

V. CONCLUSIONS

In this paper, we have briefly described the relationship between ML and Cryptography. It is known that there exists a wide range of applications of ML in Cryptography, and this range keeps growing. We have singled out some of them here, trying to shed some light on this multitude of applications.

ACKNOWLEDGMENT

I want to express my gratitude to the many authors, such as those of [1], [2] and [3], for their excellent help and the inspiration they offered me in this research.

REFERENCES

[1] Ronald L. Rivest, “Cryptography and Machine Learning”.
[2] Mohammed M. Alani, “Applications of Machine Learning in Cryptography: A Survey”, 2019.
[3] Avrim Blum, “Machine Learning Theory”, Carnegie Mellon University, Department of Computer Science.
[4] Yang Xin, Lingshuang Kong, Zhi Liu, Yuling Chen, Yanmiao Li, Hongliang Zhu, Mingcheng Gao, Haixia Hou, Chunhua Wang, “Machine Learning and Deep Learning Methods for Cybersecurity”, 2018.
[5] Martin Abadi and David G. Andersen, “Learning to Protect Communications with Adversarial Neural Cryptography”.
[6] Clarence Chio and David Freeman, “Machine Learning and Security”, 2017.
[7] Kowsic Jayachandiran, “A Machine Learning Approach for Cryptanalysis”, RIT Computer Science.
[8] Ravi K. Sheth, Sarika P. Patel, “Analysis of Cryptography Techniques”, International Journal of Research in Advance Engineering, Volume-1, Issue-2, 2015.


[9] A. Milenkoski, M. Vieira, S. Kounev, A. Avritzer and B. D. Payne, “Evaluating computer intrusion detection systems: A survey of common practices”, ACM Comput. Surv., vol. 48, no. 1, pp. 1-41, 2015.
[10] https://www.bloomberg.com/news/articles/2016-01-19/e-mail-spam-goesartisanal
[11] https://www.wired.com/2015/07/google-says-ai-catches-99-9-percentgmail-spam/
[12] http://www.paulgraham.com/spam.html
