
Optimisation of Quantum Machine Learning Circuit for Classification, Using Variational Quantum Classifier

By

Shakeel ur Rahman

(Registration No: 00000365071)

A thesis submitted to the National University of Sciences and Technology, Islamabad,

in partial fulfillment of the requirements for the degree of

Master of Science in

Physics

Supervisor: Dr. Aeysha Khalique

School of Natural Sciences

National University of Sciences and Technology (NUST)

Islamabad, Pakistan

(2024)
DEDICATION

I Dedicate this Thesis To

My Parents
Whose relentless efforts paved the way to my success.

To my
Grandfather
Whose absence I miss the most.

To my
Teachers
Who enabled me to be what I wished to be.

And to my
friends
Whose company made every moment worth living.

ACKNOWLEDGEMENTS

All praises belong to Allah Almighty, the creator of all beings and the most merciful
and benevolent. The blessings of the Almighty have accompanied me throughout the writing
of this piece. He helped me to be creative and empowered me with the strength to cope with
hard consequences and obstacles along the way. Tributes and countless blessings to
the Holy Prophet (PBUH), the ultimate source of inspiration.

I submit my loyalty and respect to my parents, whose magnanimous, selfless and
unconditional love and affection carved out the way to the present.

I owe a great debt of gratitude to my devoted supervisor, Dr. Aeysha Khalique. I
feel honoured and lucky to have worked under her mentorship, benefiting from her expertise
in the field. I extend my gratitude to GCE members Dr. Ali Piracha and Dr.
Faisal Munir, who guided me throughout the journey and provided their valuable
suggestions.

Finally, a special thanks to my fellows, whose uninterrupted support and encouragement
fuelled the fire of the academic journey. Words cannot express my deepest gratitude to
Major Aruza for her unwavering kindness, support, and encouragement throughout
this endeavor.

Contents

LIST OF TABLES V

LIST OF FIGURES VIII

ABSTRACT IX

1 Introduction 1
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Thesis structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

2 Introduction to Quantum Computing 4


2.1 Bits vs Qubits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Representation of Qubits . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.1 Dirac Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.2 Vector Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.3 Bloch sphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Multi Qubit System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.4 Quantum Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.4.1 Quantum Superposition . . . . . . . . . . . . . . . . . . . . . . 7
2.4.2 Quantum Entanglement . . . . . . . . . . . . . . . . . . . . . . 7
2.5 Models of Quantum Computation . . . . . . . . . . . . . . . . . . . . . 8
2.5.1 Adiabatic Quantum Computing (AQC) . . . . . . . . . . . . . . 8
2.5.2 Gate Model Quantum Computing . . . . . . . . . . . . . . . . . 9
2.5.3 Universal Quantum Gates . . . . . . . . . . . . . . . . . . . . . 12
2.6 Efficacy of Quantum Computer . . . . . . . . . . . . . . . . . . . . . . 12
2.6.1 Complexity classes . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.6.1.1 P Class . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.6.1.2 NP Class . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.6.1.3 NP Hard Class . . . . . . . . . . . . . . . . . . . . . . 14
2.6.1.4 NP Complete . . . . . . . . . . . . . . . . . . . . . . . 14
2.6.1.5 BQP Class . . . . . . . . . . . . . . . . . . . . . . . . 14
2.7 Evolution of Quantum Computing . . . . . . . . . . . . . . . . . . . . . 14

3 Quantum Machine Learning 16


3.1 Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 Types of Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2.1 Supervised Learning . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2.2 Unsupervised Learning . . . . . . . . . . . . . . . . . . . . . . . 17
3.2.3 Reinforcement Learning . . . . . . . . . . . . . . . . . . . . . . 18
3.3 Quantum Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.1 Classical Data on Classical Device (CC) . . . . . . . . . . . . . 19
3.3.2 Quantum Data on Classical Device (QC) . . . . . . . . . . . . . 19
3.3.3 Classical Data on Quantum Device (CQ) . . . . . . . . . . . . . 19
3.3.4 Quantum Data on Quantum Device (QQ) . . . . . . . . . . . . 19
3.4 Important QML Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 19
3.4.1 Quantum Support Vector Machines and Kernels (qSVM) . . . . 20
3.4.2 Quantum Generative Adversarial Networks (QGANs) . . . . . . 21
3.4.3 Variational Quantum Eigen-Solver (VQE) . . . . . . . . . . . . 22
3.5 Variational Quantum Classifier (VQC) . . . . . . . . . . . . . . . . . . 23
3.5.1 Feature Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.5.1.1 Basis encoding . . . . . . . . . . . . . . . . . . . . . . 24
3.5.1.2 Amplitude Encoding . . . . . . . . . . . . . . . . . . . 24
3.5.1.3 Angle Encoding . . . . . . . . . . . . . . . . . . . . . . 25
3.5.2 Ansatz (Parameterised Circuit) . . . . . . . . . . . . . . . . . . 25
3.5.2.1 Expressibility . . . . . . . . . . . . . . . . . . . . . . . 26
3.5.2.2 Entangling capability . . . . . . . . . . . . . . . . . . . 27
3.5.2.3 Hardware efficiency . . . . . . . . . . . . . . . . . . . . 28
3.5.3 Cost Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.5.4 Optimisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.5.5 Gradient Based optimisation . . . . . . . . . . . . . . . . . . . . 28
3.5.5.1 Gradient Free Optimisation . . . . . . . . . . . . . . . 29

4 VQC for Penguins and Titanic Data sets 30
4.1 Penguins Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.2 QML Classification for Penguin Dataset . . . . . . . . . . . . . . . . . 30
4.2.1 Importing and Displaying Dataset . . . . . . . . . . . . . . . . . 31
4.2.2 Data Cleaning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2.3 Label Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2.4 Previewing Data . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.5 Displaying Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.6 Removal of Weak Variables . . . . . . . . . . . . . . . . . . . . 34
4.2.7 Ordering the Data . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2.8 Normalisation and Test/Train Split . . . . . . . . . . . . . . . . 35
4.2.9 Encoding Classical Data into Quantum States . . . . . . . . . . 36
4.2.10 Application of Suitable Ansatz . . . . . . . . . . . . . . . . . . . 37
4.2.11 Optimisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.2.12 Calling VQC Function . . . . . . . . . . . . . . . . . . . . . . . 38
4.2.13 Displaying Optimisation Path . . . . . . . . . . . . . . . . . . . 39
4.2.14 Displaying Accuracy . . . . . . . . . . . . . . . . . . . . . . . . 40
4.3 Classification of Penguins Data Through Classical Support Vector Ma-
chine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.4 QML and CML Comparison for Titanic Data . . . . . . . . . . . . . . 41

5 Optimal Number of Reps For Feature-Map and Ansatz 43


5.1 Relation between Numbers of Reps and Accuracy For Penguins Data-set 43
5.1.1 Numbers of Reps vs Accuracy with Cobyla . . . . . . . . . . . . 43
5.1.2 Numbers of Reps vs Accuracy with SPSA . . . . . . . . . . . . 45
5.2 Relation between Numbers of Reps and Accuracy For Titanic Data-set 48
5.2.1 Numbers of Reps vs Accuracy with Cobyla . . . . . . . . . . . . 48
5.2.2 Numbers of Reps vs Accuracy with SPSA . . . . . . . . . . . . 50

6 Analysis of The Results 54


6.1 Training and Test Data accuracy for Penguins data . . . . . . . . . . . 54
6.1.1 Training and Test Data Accuracy Comparison Using Cobyla . . 54
6.1.2 Training and Test Data Accuracy Comparison Using SPSA . . . 56
6.1.3 Optimal Combination of Reps . . . . . . . . . . . . . . . . . . . 57
6.2 Training and Test Data accuracy for Titanic data . . . . . . . . . . . . 57
6.2.1 Training and Test Data Accuracy Comparison Using Cobyla . . 57
6.2.2 Training and Test Data Accuracy Comparison Using SPSA . . . 58
6.2.3 Optimal Combination of Reps . . . . . . . . . . . . . . . . . . . 60
6.3 Analysis of the results . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

7 SUMMARY OF RESEARCH WORK 62

List of Tables

6.1 Optimal combinations of reps of feature map and Ansatz, penguins data 57
6.2 Optimal combinations of reps of feature map and Ansatz, Titanic data 60

List of Figures

2.1 Bit vs Qubit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5


2.2 Various Representations of Qubits . . . . . . . . . . . . . . . . . . . . . 5
2.3 Bloch sphere Representing qubits . . . . . . . . . . . . . . . . . . . . . 6
2.4 Left: Example of a digital subtractor circuit. Right: Example of a
quantum gate circuit for generating a 4-qubit Bell state. . . . . . . . . 10
2.5 Complexity classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.6 Timeline of Evolution in Quantum Computation and QML . . . . . . . 15

3.1 Types of Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . 17


3.2 Approaches to QML . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3 Linearly and non Linearly separable data . . . . . . . . . . . . . . . . . 20
3.4 Kernel trick to separate nonlinearly separable data . . . . . . . . . . 21
3.5 Working of Generative Adversarial Networks . . . . . . . . . . . . . . . 21
3.6 Variational Quantum Classifier Design . . . . . . . . . . . . . . . . . . 23
3.7 Ansatz: EfficientSU2 with features = 3, Reps = 1, Entangling Pattern
= Circular . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.8 Expressibility of various PQC models shown on Bloch sphere . . . . . . 27
3.9 Entanglement capability of two different PQCs . . . . . . . . . . . . . . 27
3.10 Variance decreases with the increase of number of qubits, Barren Plateau 29

4.1 Importing the necessary libraries and functions . . . . . . . . . . . . . . 31


4.2 Importing and displaying the penguins dataset . . . . . . . . . . . . . . 31
4.3 Displaying all the missing values . . . . . . . . . . . . . . . . . . . . . . 32
4.4 Displaying data after cleaning . . . . . . . . . . . . . . . . . . . . . . . 32
4.5 Label encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.6 Previewing the data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.7 displaying data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.8 Removing weak variables . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.9 Ordering the data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.10 Selecting target and labels . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.11 Splitting data into train and test data . . . . . . . . . . . . . . . . . . . 36
4.12 Data encoding through ZZ feature map . . . . . . . . . . . . . . . . . . 36
4.13 RealAmplitudes ansatz with Reps = 3 . . . . . . . . . . . . . . . . . . 37
4.14 COBYLA Optimizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.15 Sampler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.16 Variational quantum classifier . . . . . . . . . . . . . . . . . . . . . . . 39
4.17 Objective function optimization against the number of iterations . . . . 39
4.18 Accuracy of training and test data . . . . . . . . . . . . . . . . . . . . . 40
4.19 Classification through classical support vector machine algorithm . . . . 40
4.20 Accuracy of results using SVM . . . . . . . . . . . . . . . . . . . . . . . 41
4.21 VQC result for Titanic data . . . . . . . . . . . . . . . . . . . . . . . . 41
4.22 SVM result for Titanic data . . . . . . . . . . . . . . . . . . . . . . . . 42

5.1 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 3, optimizer = cobyla. . . . . . . 44
5.2 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 5, optimizer = cobyla. . . . . . . 44
5.3 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 5, optimizer = cobyla. . . . . . . 45
5.4 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 6, optimizer = cobyla. . . . . . . 45
5.5 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 3, optimizer = SPSA. . . . . . . 46
5.6 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 4, optimizer = SPSA. . . . . . . 46
5.7 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 5, optimizer = SPSA. . . . . . . 47
5.8 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 6, optimizer = SPSA. . . . . . . 47
5.9 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 3, optimizer = Cobyla. . . . . . 48
5.10 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 4, optimizer = Cobyla. . . . . . 49
5.11 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 5, optimizer = Cobyla. . . . . . 49

5.12 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 6, optimizer = Cobyla. . . . . . 50
5.13 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 7, optimizer = Cobyla. . . . . . 50
5.14 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 3, optimizer = SPSA. . . . . . . 51
5.15 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 4, optimizer = SPSA. . . . . . . 51
5.16 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 5, optimizer = SPSA. . . . . . . 52
5.17 Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 6, optimizer = SPSA. . . . . . . 52
5.18 Training and test data accuracy against different reps of feature-map
and ansatz, number of features = 7, optimizer = SPSA. . . . . . . . . . 53

6.1 Training data accuracy against different No. of reps for various No. of
features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
6.2 Test data accuracy against different No. of reps for various No. of fea-
tures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
6.3 Training data accuracy against different No. of reps for various No. of
features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
6.4 Test data accuracy against different No. of reps for various No. of fea-
tures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
6.5 Training data accuracy against different No. of reps for various No. of
features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.6 Test data accuracy against different No. of reps for various No. of fea-
tures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.7 Training data accuracy against different No. of reps for various No. of
features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.8 Test data accuracy against different No. of reps for various No. of fea-
tures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

Abstract

Quantum machine learning (QML) is an emerging area of research in the domain of
quantum computing, leveraging the unique properties of entanglement and superposition.
Although QML is in the early stages of development, there are many areas where
its potential advantage is highly probable, ranging from finance to drug discovery
and molecular simulations. Quantum computers of the NISQ (Noisy Intermediate-Scale
Quantum) era are prone to decoherence and errors. It is expected that
the novelty of quantum computing in general, and QML in particular, will become more
evident with fault-tolerant quantum computers in the future.
The Variational Quantum Eigensolver (VQE) is a QML algorithm that replicates
a classical neural network, employing both quantum and classical resources. We took a
supervised machine learning approach, employing an instance of VQE called the
Variational Quantum Classifier (VQC), a classification algorithm. The feature map and ansatz
are important components of the VQC: the former maps classical data into quantum states,
while the latter serves as a parameterised circuit used to search for the ground state of the
Hamiltonian of the system. The number of repetitions of both the feature map and the ansatz
applied in a circuit significantly affects the accuracy of the model. This thesis looks for
the optimal number of repetitions.
For the coding implementation, the Qiskit library is used within the Python framework.
Two datasets, the Penguins dataset and the Titanic dataset, are used, with the ZZFeatureMap
for feature mapping and RealAmplitudes as the ansatz. The number of repetitions of the
feature map has been varied from 1 to 3, and that of the ansatz from 1 through
5. The various combinations help determine the optimal number of repetitions, and our
findings establish that fewer repetitions of the feature map and more repetitions of the
ansatz give better accuracy.

Chapter 1

Introduction

This chapter is a general introduction to approaches and goals, which will follow in the
next chapters. It sets the objectives of the research work and briefly introduces the
tools that have been used.

1.1 Overview
Quantum computing technology is on the verge of a revolution, with improvements being
announced every day. The realisation of what was once a dream is imminent. The
utility and potential advantage that we can leverage through quantum computers have
led us to pluck the chords of our imagination and tread a path that has never been
trodden before. Today we have purely quantum and hybrid models ready to overcome
the challenges of the modern era. From cryptography to sensing and from AI to machine
learning, quantum computing is emerging as the torchbearer of the next technological revolution.
Quantum computing employs the peculiar laws of quantum mechanics and puts them
to use in its construction and operation. Entanglement and superposition are two important
non-classical phenomena on which, under the hood, all the anticipated quantum advantage
rests. The 0s and 1s known as bits in classical computing, with deterministic
values, are in quantum computing replaced by qubits, which carry simultaneous probabilistic
values of both 0 and 1. Quantum annealing and the gate model are two
independent paradigms of quantum computation. Resting on different mathematical
principles in their construction, both are now developing in competition.
Quantum algorithms with mathematically known advantages are yet to be
put to the test on a quantum computer with enough qubits and computational resources.
Quantum hardware is a major obstacle in the way. At the moment, quantum
computers of several hundred qubits are in use, but we would require millions of qubits
to demonstrate the quantum advantage of Shor's algorithm, for instance. Several types
of qubits are currently in use, such as superconducting qubits, trapped-ion qubits and photonic
qubits; some of them need ultra-low temperatures and others are difficult to scale
up. Qubits are extremely sensitive to noise and decoherence, which cause them to give
faulty outputs, altering the results. Despite these challenges, active research is making
progress by leaps and bounds to make qubits immune to noise and go beyond
the NISQ (noisy intermediate-scale quantum) era. The latest achievements
announced by IBM and Google are a silver lining.

1.2 Goals
Machine learning has become a pertinent part of the modern tech industry. The AI revolution is
marching forward, resting on the pillars of machine learning and data science. With the
advent of quantum computing technology, a race has begun to achieve quantum
advantage. Just as with other algorithms, a parallel of classical machine learning,
namely quantum machine learning, is emerging. From quantum support vector machines to
quantum generative adversarial networks and the variational quantum eigensolver, numerous successful
algorithms are underway and many have stood the test of time.

With all this on the horizon, the goal of the endeavour is to look for quantum
advantage and possibly expand it beyond merely solving a few mathematical problems.
Quantum computing has theoretically shown a clear advantage through its potential
to solve some NP problems in polynomial time. QML is expected to outperform its
classical counterpart by employing the unique features of entanglement and superposition,
which enable it to represent complex data structures and map intricate relationships in
feature space. In the NISQ era, it is all about building an algorithm that can run efficiently
on existing devices with the minimum number of qubits and gates, and the least possible
circuit depth. Looking for the optimal number of gates to map the data efficiently onto
quantum states is mandatory for an efficient model, and this is the main goal of the
following work.

1.3 Thesis structure


Chapter 2 is an introduction to quantum computing: key concepts and important
terminology are introduced at apt length, and quantum computing applications and various
models are discussed. Chapter 3 is an overview of quantum machine learning, its
various frameworks, and several algorithms that are popular in QML. Important
segments of the Variational Quantum Classifier (VQC), the classification algorithm used
for generating the results in the following chapters, are discussed in detail.
Chapter 4 begins with the coding excerpts for the implementation of the VQC for the
Penguins dataset, along with the same task performed with a classical machine learning
model for accuracy comparison. A similar comparison for the Titanic data is presented in
the same chapter. Chapter 5 contains the training and test data results for different
numbers of features for both datasets. Graphs of accuracy against different combinations
of reps are presented for easy comparison. Chapter 6 presents combined graphs for
various numbers of features and ends with an analysis of the results. It concludes with the
final commentary on the optimal combination of reps. Chapter 7 summarises the complete
work, and the thesis ends with the Bibliography.

Chapter 2

Introduction to Quantum Computing

Quantum computing leverages the peculiar laws of quantum mechanics to bring
computation to life. The concept took birth in the time of Feynman [1]; nevertheless, it
has been promulgated quite recently. Just as bits are the basic units of computation
in classical computing, qubits (or Q-bits) are the fundamental units of computation
in quantum computing. A qubit is a mathematical entity in which information can be
encoded. It also has a physical existence: there are many ways to prepare
qubits, and we are still in the phase of hunting for the most efficient way to prepare
them.

2.1 Bits vs Qubits


Bits in a classical computer take the value 0 or 1. All the computation and circuit operations
that take place at the machine level in the CPU are in the form of 0s, 1s, and their
complex combinations. Qubits differ from bits in two ways [2]:

1. A bit exists as 0 or 1, whereas a qubit can exist as 0, 1, or a superposition of both.

2. Measurement of a bit yields a definite result, either 0 or 1, whereas measurement of a

qubit yields a probabilistic result.

Figure 2.1: Bit vs Qubit
[3]

2.2 Representation of Qubits


There are three ways to represent a qubit: Dirac notation, state-vector notation,
and graphically on the Bloch sphere.

Figure 2.2: Various Representations of Qubits


[4]

2.2.1 Dirac Notation


Dirac notation refers to the representation of a qubit in the form of bras and kets. This
is the most common and convenient way to represent qubits in quantum information
and computation science. A special kind of bracket, '| ⟩', called a ket vector, is used to
enclose the '0' or '1' state. The zero state is represented as |0⟩ and the one state
as |1⟩. A more general representation of a qubit is |ψ⟩ = α|0⟩ + β|1⟩. Here, α and
β are complex probability amplitudes, whereas |α|² and |β|² are the probabilities
associated with |0⟩ and |1⟩ respectively. Also, by the Born rule we have |α|² + |β|² = 1
[5].

2.2.2 Vector Notation


In vector notation, a qubit is represented by a two-dimensional column vector with
complex numbers as entries. The first entry represents the probability amplitude of
measuring the qubit in the |0〉 state, and the second entry represents the probability
amplitude of measuring the qubit in the |1〉 state. These probabilities must satisfy the
normalization condition, which means that the sum of the squares of the magnitudes
of the amplitudes must be equal to 1. [6]
" # " #
1 0
|0⟩ = , |1⟩ = (2.1)
0 1
. " # " #
1 0
|ψ⟩ = α +β (2.2)
0 1
.
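As a quick numerical illustration (not part of the original derivation), the column-vector form of Eq. (2.2) can be reproduced with NumPy; the amplitudes chosen below are arbitrary example values.

    import numpy as np

    # Basis states |0> and |1> as column vectors, as in Eq. (2.1)
    ket0 = np.array([1, 0], dtype=complex)
    ket1 = np.array([0, 1], dtype=complex)

    # Example amplitudes (arbitrary choice) satisfying |alpha|^2 + |beta|^2 = 1
    alpha, beta = 1 / np.sqrt(2), 1j / np.sqrt(2)
    psi = alpha * ket0 + beta * ket1            # Eq. (2.2)

    print(abs(alpha) ** 2 + abs(beta) ** 2)     # Born rule: probabilities sum to 1.0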

2.2.3 Bloch sphere


The Bloch sphere is a graphical representation of a qubit. It is a unit sphere on the surface of
which all possible pure states exist, while all mixed states lie inside it. |0⟩ and |1⟩
occupy the north and south poles of the sphere respectively, and the |+⟩ and |−⟩ states lie
on the equator, as shown in Figure 2.3.

Figure 2.3: Bloch sphere Representing qubits


[7]

2.3 Multi Qubit System
The representation of qubits can be extended to express multi-qubit systems. This can
be achieved by taking the tensor product of individual qubit states. For instance, a
two-qubit state can be represented by the tensor product of two single-qubit states.
This approach provides a powerful formalism for describing the quantum states of
multi-qubit systems.
       
|00⟩ = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad |01⟩ = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad |10⟩ = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad |11⟩ = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}   (2.3)
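For illustration, a two-qubit basis state such as |01⟩ in Eq. (2.3) is simply the Kronecker product of the single-qubit column vectors; a minimal NumPy check:

    import numpy as np

    ket0 = np.array([1, 0])
    ket1 = np.array([0, 1])

    # |01> = |0> tensor |1>  ->  (0, 1, 0, 0)^T
    ket01 = np.kron(ket0, ket1)
    print(ket01)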

2.4 Quantum Parallelism


The anticipated quantum advantage over classical computation stems from the parallel
computing that the former offers. Quantum parallelism is a fundamental concept in quantum
computation: a quantum computer can perform computations on multiple inputs simultaneously,
a remarkable ability that can give exponential speed-up and offer solutions to
classically intractable problems [8].

2.4.1 Quantum Superposition


Quantum superposition is a counter-intuitive phenomenon associated with quantum
mechanics which has no counterpart in classical computing. It simply means
that a qubit can exist in multiple states simultaneously. For example, an electron can be
in both the up and down spin states with equal probability at the same time. The superposition
breaks and the system collapses into one of the states when it is measured [9].

2.4.2 Quantum Entanglement


Entanglement is another counter-intuitive quantum phenomenon, referring to
non-classical correlations between two quantum particles over vast distances. If
two entangled electrons are separated by cosmic distances and the spin of one of
them is measured, then the other one will certainly be found with the opposite spin. Einstein
famously called this behaviour 'spooky action at a distance'. Entanglement, because
of the strong correlations it provides, can potentially contribute to an exponential speed-up
over classical computers. Mathematically, one of the
four maximally entangled states (Bell states) can be written as [10]:
|ψ⟩ = \frac{1}{\sqrt{2}} |00⟩ + \frac{1}{\sqrt{2}} |11⟩   (2.4)

2.5 Models of Quantum Computation
There are two main architectures of quantum computation. Both are independent and
have ample resources. One is adiabatic computation also known as quantum annealing
and the other one is, gate model quantum computation.

2.5.1 Adiabatic Quantum Computing (AQC)


Adiabatic quantum computing, also known as quantum annealing, invokes the adiabatic
theorem of quantum mechanics, which states that if a quantum system is initially in
the ground state of its Hamiltonian and the Hamiltonian is changed sufficiently slowly, the
system remains in the instantaneous ground state throughout the evolution [11]. The Hamiltonian
is the energy operator, and the ground state refers to the lowest-energy state of a system.

In the implementation, the qubits are first prepared in the ground state of a problem-independent
Hamiltonian, and that Hamiltonian is adiabatically evolved towards a
Hamiltonian whose ground state corresponds to the problem at hand. There
is a whole class of problems, called QUBO (Quadratic Unconstrained Binary Optimization)
problems, that is considered for solution under the AQC framework. D-Wave quantum
computers are specially designed for this implementation [12]. A general
QUBO is of the form:

s^* = \operatorname{argmin}_{s \in \{-1,+1\}^n} \; s^\top Q s + q^\top s   (2.5)

Here s represents the possible global states of the system over which the objective is to be
minimized, Q ∈ R^{n×n} is the coupling matrix that models the interactions, and the vector q ∈ R^n
models additional constraints. The work of Farhi et al. [13] gives an approach
which makes quantum computing a fair choice for dealing with complicated QUBOs.
It starts with known values of Q and q; then a time-dependent system of n qubits is
considered:
ψ(t) = \sum_{i=0}^{2^n - 1} α_i(t) \, |ψ_i⟩   (2.6)

Then it is evolved under time dependent Hamiltonian H(t) so that its behaviour is
governed by the Schrodinger equation

iℏ \frac{dψ(t)}{dt} = H(t) \, ψ(t)   (2.7)
At the beginning, the qubits are prepared in the ground state of the initial Hamiltonian,
called H_B,
H_B = - \sum_{i=1}^{n} σ_x^i   (2.8)

which is slowly transformed into problem Hamiltonian HP which is given by


H_P = \sum_{i=1}^{n} \sum_{j=1}^{n} Q_{ij} \, σ_z^i σ_z^j + \sum_{i=1}^{n} q_i \, σ_z^i   (2.9)

where σ_z^i = I ⊗ … ⊗ I ⊗ σ_z ⊗ I ⊗ … ⊗ I is the Pauli spin matrix acting on the i-th
qubit. The Hamiltonian governing the time evolution is given by:

H(t) = \left(1 - \frac{t}{T}\right) H_B + \frac{t}{T} H_P   (2.10)

Finally, after the time evolution, at time T a measurement is performed on the qubits.
This causes the system to collapse onto one of its 2^n basis states, with probability
given by |α_i(T)|². Since the adiabatic evolution was steered towards the problem
Hamiltonian H_P, the basis states |ψ_i⟩ that correspond to the ground state of H_P
are more likely to be found.
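To make the QUBO objective of Eq. (2.5) concrete, the sketch below evaluates sᵀQs + qᵀs by brute force over all spin configurations for a small toy instance; the values of Q and q are arbitrary illustrations, and brute force is only feasible for small n (the annealer is meant to find the same minimiser for large instances).

    import numpy as np
    from itertools import product

    # Toy coupling matrix and bias vector (arbitrary example values)
    Q = np.array([[0.0, 1.0, -0.5],
                  [1.0, 0.0,  0.3],
                  [-0.5, 0.3, 0.0]])
    q = np.array([0.2, -0.4, 0.1])

    def qubo_cost(s):
        s = np.array(s)
        return s @ Q @ s + q @ s          # s^T Q s + q^T s

    # Exhaustive search over all 2^n spin strings s in {-1, +1}^n
    best = min(product([-1, 1], repeat=3), key=qubo_cost)
    print(best, qubo_cost(best))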

2.5.2 Gate Model Quantum Computing


Gate-model quantum computing is another well-established framework of quantum
computation. It employs the same kind of structure as a classical computer:
there are inputs in the form of 0s and 1s, and then there are logical operators
called gates. After the gates have been applied, the inputs are transformed into outputs.
Classical gates perform logical operations such as AND, OR, XOR and NOT. In quantum
computation, gates are unitary operations that rotate the input state on
the Bloch sphere to reach the desired output [2]. A visual comparison of classical
and quantum computation can be seen in Figure 2.4, where a digital subtractor circuit
built from AND and OR gates is shown on the left, and a quantum circuit for
creating a 4-qubit Bell state from Hadamard and controlled-NOT gates is shown on the right.

Figure 2.4: Left: Example of a digital subtractor circuit. Right: Example of a quan-
tum gate circuit for generating a 4-qubit Bell state.

Here are some frequently used gates in quantum computation.

The X gate performs the same bit-flip operation as the NOT gate in classical computing;
it switches |0⟩ and |1⟩:
" #
0 1
X= = |0⟩⟨1| + |1⟩⟨0| (2.11)
1 0

X |0⟩ = |1⟩ , X |1⟩ = |0⟩ (2.12)


The Y gate has the following action:

Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = -i|0⟩⟨1| + i|1⟩⟨0|   (2.13)

Y |0⟩ = i|1⟩, \qquad Y |1⟩ = -i|0⟩   (2.14)


The Z gate has the following action on the Z basis:

Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = |0⟩⟨0| - |1⟩⟨1|   (2.15)

Z |0⟩ = |0⟩, \qquad Z |1⟩ = -|1⟩   (2.16)


The Hadamard gate is a superposition gate: when applied to the |0⟩ and |1⟩ states it produces
the |+⟩ and |−⟩ states respectively. Its detailed operation is shown below:

H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \frac{1}{\sqrt{2}} \big( |0⟩⟨0| + |0⟩⟨1| + |1⟩⟨0| - |1⟩⟨1| \big)   (2.17)

H |0⟩ = \frac{1}{\sqrt{2}} \big( |0⟩ + |1⟩ \big), \qquad H |1⟩ = \frac{1}{\sqrt{2}} \big( |0⟩ - |1⟩ \big)   (2.18)
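The gate actions of Eqs. (2.11) to (2.18) can be checked numerically; the short sketch below simply applies the matrices to the |0⟩ column vector.

    import numpy as np

    X = np.array([[0, 1], [1, 0]])
    Z = np.array([[1, 0], [0, -1]])
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

    ket0 = np.array([1, 0])
    print(X @ ket0)   # [0 1]           -> |1>
    print(Z @ ket0)   # [1 0]           -> |0>, only |1> acquires a phase
    print(H @ ket0)   # [0.707 0.707]   -> (|0> + |1>)/sqrt(2)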
In addition to single-qubit gates we also have multi-qubit gates, which act
on more than one qubit. Two important examples are the CNOT gate and the Toffoli gate.
The CNOT gate is also known as an entangling gate, since it is used to entangle two or more
qubits. It has a control qubit and a target qubit: if the control qubit is in the 0
state the target qubit does not change, and if the control qubit is in the 1 state the target
qubit is flipped. Its operation is shown below:

 
CNOT = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} = |0⟩⟨0| ⊗ I + |1⟩⟨1| ⊗ X   (2.19)

The action of CNOT on the computational basis is given as:

CNOT |00⟩ = |00⟩, \quad CNOT |01⟩ = |01⟩, \quad CNOT |10⟩ = |11⟩, \quad CNOT |11⟩ = |10⟩   (2.20)
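As a small illustration of entangling with the CNOT gate (a sketch in Qiskit, assuming a recent Qiskit version), a Hadamard on the control qubit followed by a CNOT prepares the Bell state of Eq. (2.4):

    from qiskit import QuantumCircuit
    from qiskit.quantum_info import Statevector

    qc = QuantumCircuit(2)
    qc.h(0)        # put the control qubit into an equal superposition
    qc.cx(0, 1)    # CNOT entangles control and target

    # Resulting amplitudes: 1/sqrt(2) on |00> and |11>, zero elsewhere
    print(Statevector(qc))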
The Toffoli gate is a three-qubit gate, also known as the CCNOT gate. It flips the third
qubit only if the first and second qubits are both in the 1 state; otherwise the target remains
unaltered [14].

 
Toffoli = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}   (2.21)

The action of the Toffoli gate is given as:

T |000⟩ = |000⟩, \quad T |001⟩ = |001⟩, \quad T |010⟩ = |010⟩, \quad T |011⟩ = |011⟩,
T |100⟩ = |100⟩, \quad T |101⟩ = |101⟩, \quad T |110⟩ = |111⟩, \quad T |111⟩ = |110⟩   (2.22)

2.5.3 Universal Quantum Gates


The Hadamard and CNOT gates, together with the single-qubit rotation gates, form a universal
set of quantum gates, because the circuit of almost any algorithm can be constructed (or
approximated to arbitrary accuracy) using some combination of these gates. This property is
of great value since no new complicated gates are required: some combination of these gates
can reproduce the same structure and the desired output [15].

Both the quantum annealing and gate-model frameworks work equally well in their respective
domains, with great resources available.

2.6 Efficacy of Quantum Computer


Amid the tremendous advancement in classical computing, an important question one can
ask is about the utility of quantum computing. It is now obvious that quantum computers
will not render classical computers obsolete. Instead, there is an area of computation
where a quantum computer can outperform a classical computer. In the following
subsections we discuss the complexity classes where a quantum computer is expected
to show an advantage [16].

2.6.1 Complexity classes


Any computational problem can be termed easy or hard based on two important
factors: how much time it takes to be solved and verified, and how much memory
or computational resources it uses. All problems can be classified into one of the
classes discussed in the following sections. Figure 2.5 summarises these classes, with
BQP being the area where a quantum computer is expected to show some quantum
advantage, and we have strong evidence for this [17]. Mathematically, computational
complexity can be represented using big-O notation. For instance, a problem which
takes polynomial time can be represented as O(n^c) and one which takes quadratic time
as O(n²).

Figure 2.5: Complexity classes


[18]

2.6.1.1 P Class
This class consists of problems which can be solved and verified in deterministic
polynomial time. It contains decision problems with 'Yes' or 'No' as outputs.
Some problems can be hard but are nevertheless solvable in polynomial time. Common
examples of P-class problems are matrix multiplication, maximum flow, and linear
programming.

2.6.1.2 NP Class
The NP class contains problems that take non-deterministic polynomial time to solve. A
problem may take anywhere from polynomial to exponential time to solve, but only polynomial
time to verify. For example, a decision problem with a simple 'Yes' or 'No' answer
may take non-deterministic polynomial time to solve, but once the answer is known it
takes polynomial time to verify it. Common problems in the NP class are graph colouring,
graph isomorphism, and factoring [19].

2.6.1.3 NP Hard Class

The NP-hard class contains problems with the property that if any algorithm is found to solve
one of them in polynomial time, then many other problems in NP can be solved based on that
algorithm. These problems are considered at least as hard as the hardest problems in NP.
We do not know all the NP-hard problems exactly, but one common example is the travelling
salesman problem [20].

2.6.1.4 NP Complete

NP-complete problems lie in the overlapping region of the NP and NP-hard classes. A
problem is NP-complete if its solution can be verified in polynomial time and every
problem in NP can be reduced to it in polynomial time. Examples of NP-complete
problems are the Boolean satisfiability problem and the travelling salesman problem [21].

2.6.1.5 BQP Class

BQP stands for Bounded-error Quantum Polynomial time. It is the class of problems that
can be solved by a quantum computer in polynomial time with bounded error. When we talk
about quantum advantage, it is primarily this class of problems that is our target [22].
Bounded error refers to the fact that the quantum computer should be able to solve these
problems with an error probability of no more than 1/3 [23]. It is an open question whether
or not BQP is a subset of NP; we have no conclusive arguments to either support or refute
the idea hitherto [24].

2.7 Evolution of Quantum Computing


Quantum mechanics is a little over a century old, but its applications and peripheries,
such as quantum information, came onto the horizon around the 1970s. Before that, most of
the debate was theoretical; then Richard Feynman's proclamation about the
possibility of quantum computers simulating quantum reality opened the door to practical
applications of the knowledge. Protocols such as BB84 set the direction for quantum
cryptography, followed by Deutsch's algorithm (1985). During the 1990s the
most popular algorithms came out, such as Shor's factoring algorithm (1994) and Grover's
search algorithm (1996). In the last two decades we have seen the categorisation of the
subject into various useful subcategories, and new areas of research have emerged. Quantum
machine learning first became a reality somewhere in the last decade; since then, several
optimisation techniques and algorithms have made their way into the evolving
field. At present we have several quantum computers available online. The problem
of decoherence is the major obstacle today. Two main types of algorithms are being
developed: some for the existing near-term NISQ-era quantum computers, and others for
future fault-tolerant quantum computers. Hitherto, we have reached quantum computers of
several hundred qubits.

Figure 2.6: Timeline of Evolution in Quantum Computation and QML


[25]

We are quickly striding forward to look for areas and algorithms which can materialise
our dream of quantum supremacy. The extent of research and the surge of new techniques hint
that a breakthrough is imminent. The cornerstones of quantum computing technology are
efficient qubits and gates, and then algorithms that deliver quantum advantage using
these qubits and gates. In the next chapter, a brief introduction to quantum machine
learning and its various types will be given.

Chapter 3

Quantum Machine Learning

In the last chapter we presented the detailed architecture of quantum computing and
ended our discussion with two parallel frameworks: quantum annealing and gate-model
quantum computing. In this chapter, an overview of machine learning is given, followed by
a detailed explanation of quantum machine learning and its popular classes.

3.1 Machine Learning


Machine learning is an important sub-field of artificial intelligence which has recently
been one of the most popular topics under discussion in academia and industry because
of its wide-ranging applications, such as Large Language Models (LLMs). Machine learning
is a technique for exploring patterns in data using various features and, on
the basis of the results, automating a system as a means to the desired end. Machine learning
has automated a majority of human tasks in today's world, from finance to drug discovery
and from weather forecasting to Large Language Models such as ChatGPT [26].

3.2 Types of Machine Learning


On the basis of their functionality and the various purposes they serve, machine learning is
categorised into three major types. Although the basic structure of functioning is
similar, where data is given as input to the system and the model explores the patterns
and draws conclusions, each type has a different end goal to reach using the
data. The type of data fed in also contributes towards determining the type of machine
learning model that can be trained.

Figure 3.1: Types of Machine Learning

3.2.1 Supervised Learning


In a supervised machine learning model, labelled data is fed into the system and the model
has to look for patterns in order to recognise a given instance as belonging to one category or
the other. For example, if data about roses and tulips is fed into the system, then
after training, when it is asked to recognise a given sample, it should correctly mark it as a
rose or a tulip. Supervised machine learning can be used in self-driving car systems for recognising
the various signs on the road, or in medical diagnosis, such as reading a report or analysing an
X-ray [27]. Important algorithms in supervised machine learning include linear
regression, logistic regression, decision trees and support vector machines.

3.2.2 Unsupervised Learning


In unsupervised learning, unlabelled data is fed into the model; it captures the
patterns in the data and categorises it into various groups. A common example of
unsupervised learning is the functioning of social media platforms, where the algorithm keeps
a record of the activities of a user and, on that basis, remembers their choices, so that the
next time they log in they see a customised screen based on their recent activity. If
a user likes a specific genre of movies on a streaming platform, they are more likely to
get recommendations of that particular genre when they open their screen [28]. Important
algorithms are k-means clustering, principal component analysis and anomaly
detection.

3.2.3 Reinforcement Learning
Reinforcement learning works on a trial-and-error basis. The algorithm is rewarded with
a positive score when it makes a correct prediction towards the desired output and is awarded
a negative score if the prediction goes wrong. The system learns to maximise the total score,
and that is how it learns to reach the output with the minimum number of steps and correct
predictions. It is just like a chess master who carefully plans the next move based on the moves
played by the opponent. Important algorithms that employ reinforcement learning are
Q-learning and policy gradient methods [29].

3.3 Quantum Machine Learning


Quantum machine learning refers to employing the laws of quantum mechanics
for data exploration and manipulation, to draw meaningful information out of the data, and
then using this information to automate, simulate and design systems that assist
humans in complicated tasks.
Quantum machine learning (QML) is a nascent field of research with multiple constraints
and challenges. At the moment, while we are in the NISQ (Noisy Intermediate-Scale
Quantum) era, most QML models are hybrid, with quantum and classical
frameworks blended together to design the algorithms. Based on the type of data
and the framework used, there are four major approaches to QML.

Figure 3.2: Approaches to QML


[30]

3.3.1 Classical Data on Classical Device (CC)
In this approach classical data is processed on a classical device, which is conventional
machine learning as discussed in Sections 3.1 and 3.2. In the context of quantum machine
learning, it refers to machine learning based on methods borrowed from the framework
of quantum information. The application of tensor networks is an important example
of this approach. Wittek [31] describes it as quantum-like machine learning.

3.3.2 Quantum Data on Classical Device (QC)


This refers to the approach in which quantum data is processed using classical machine
learning algorithms. It is an active area of research: classical machine learning
algorithms are frequently being developed and employed in quantum computation, for example
in qubit characterisation, readout, and control. It also helps in understanding the internal
state of the quantum computer. There are two types of QC algorithms: those that use
qRAM and those that do not [32].

3.3.3 Classical Data on Quantum Device (CQ)


This is perhaps the most active area of research in the existing hybrid model of QML.
It refers to the use of quantum computational resources and algorithms to process
classical data. Most of the well-known quantum machine learning approaches in
existence today acquire classical data, encode it into quantum states and
train a quantum algorithm on this data. Most of the CQ algorithms being developed
are for near-term devices [33].

3.3.4 Quantum Data on Quantum Device (QQ)


This approach is still in the very early stages of development, and not many resources are
yet available. It takes quantum data and processes it on a quantum device using quantum
algorithms. Feeding quantum data directly into the system requires measurement
of a quantum system, which makes it complicated due to the limitations imposed by quantum
mechanical laws.

3.4 Important QML Algorithms


Since the inception of QML in the last decade, quite a few pertinent algorithms
have emerged, most of which are inspired by their classical counterparts. In both
supervised and unsupervised learning we generally have three main machine learning
tasks: classification and regression using supervised learning techniques,
and clustering using unsupervised learning techniques. On the basis of the type of
machine learning task at hand we can choose what kind of algorithm is aptly applicable.
Some run purely on quantum resources, but there are some, like the Variational Quantum
Classifier (VQC), which run on a hybrid model, using both classical and quantum
resources. Researchers are still looking for the algorithms which have the potential for
a vivid and promising quantum advantage [34].

3.4.1 Quantum Support Vector Machines and Kernels (qSVM)


The Support Vector Machine (SVM) is an important supervised machine learning algorithm
that performs classification tasks. Data is encoded in the form of vectors in a feature
space, where the SVM categorises the labelled data into one of two categories by
drawing a hyperplane between the two groups. The SVM maximises the margin between
the two groups; if they are not linearly separable, the vectors are mapped into a
higher-dimensional space using a kernel, where they can be separated by a hyperplane [35].

Figure 3.3: Linearly and non Linearly separable data


[36]

In quantum support vector machines and quantum kernel methods, quantum feature
maps such as the ZZFeatureMap are used to map the input vectors ⃗x to states
|ϕ(⃗x)⟩⟨ϕ(⃗x)| in a higher-dimensional Hilbert space using a unitary transformation, so that a
complex pattern such as the one shown in Fig. 3.4 can be separated by a linear hyperplane.
Constructing quantum feature maps employing parameterised quantum circuits, which are
classically hard to simulate, is expected to be an important step towards a possible advantage
of QML over classical machine learning approaches, and is an active area of research [38].

Figure 3.4: Kernel trick to separate nonlinearly separable data
[37]
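A hedged sketch of the quantum-kernel approach described above, using Qiskit Machine Learning together with scikit-learn (module paths can differ between Qiskit versions; X_train, y_train and X_test are assumed to be prepared classical data arrays):

    from qiskit.circuit.library import ZZFeatureMap
    from qiskit_machine_learning.kernels import FidelityQuantumKernel
    from sklearn.svm import SVC

    # Quantum feature map phi(x); the kernel entries are state overlaps |<phi(x)|phi(x')>|^2
    feature_map = ZZFeatureMap(feature_dimension=2, reps=1)
    qkernel = FidelityQuantumKernel(feature_map=feature_map)

    # Classical SVM consuming the quantum kernel (data arrays assumed to exist)
    # svc = SVC(kernel=qkernel.evaluate).fit(X_train, y_train)
    # predictions = svc.predict(X_test)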

3.4.2 Quantum Generative Adversarial Networks (QGANs)


The idea of Generative Adversarial Networks was first proposed by Goodfellow et al. [39] in
2014; since then it has spanned a wide range of applications. A GAN is an unsupervised
learning algorithm that works with a pair of neural networks, namely a generator and a
discriminator. During the first phase of training the model, the discriminator is trained
on real data; in the next phase the generator is tasked with generating fake data, which is
fed to the discriminator to mark as real or fake. If the discriminator marks the data as
fake, then the generator is updated to create more realistic fake data, and the process keeps
repeating until the generator is capable of consistently going undetected. This point
is called the Nash equilibrium. GANs are used for creating new images, improving
existing images, predicting future events from the patterns in existing data to guide plans
of action ahead of time, and much more.

Figure 3.5: Working of Generative Adversarial Networks


[40]

The idea of replicating GANs in the quantum realm was proposed by [41] and [42]
in 2018. The basic idea is to replace the classical structure of either the generator or the
discriminator, or both, with quantum parameterised circuits to enable them to process
quantum data. The generator produces a desired state |ψ⟩ using repetitive updates to
its parameters. In a QGAN, the landscape of the cost function evolves towards the goal of
improving the fake data so that it can dodge the discriminator. In a fully
quantum QGAN the discriminator is fed with both real and fake data; to that end, the first
step is to encode the classical data into quantum states, and for that the real data is encoded
using a unitary U_R.
|DataR ⟩ = UR |0⟩⊗n (3.1)
After encoding, the state is sent to the parameterised circuit U_{D(θ_D)}, which iteratively updates
the parameters θ_D with the aim of minimising the cost function. Similarly, the generator
creates a fake data state by applying a parameterised unitary U_{G(θ_G)} to the initial state |0⟩⊗n

|DataG ⟩ = UG(θG ) |0⟩⊗n (3.2)


At the end, the discriminator measures the expectation value of the system with respect
to the incoming state, which gives an output of either 1 or -1, marking it as real or fake.
If the data is correctly marked as fake, then the generator updates its parameters to create
more realistic data, and if the data is marked as real, then the discriminator's parameters are
updated to make it more sensitive in detection. The change in parameters is aimed at
minimising the cost functions of the discriminator and the generator, given below, to maximise
the corresponding probabilities.
Cost_D = \Pr\big( D(θ_D, G(θ_G)) = |real⟩ \big) - \Pr\big( D(θ_D, R) = |real⟩ \big)   (3.3)

Cost_G = - \Pr\big( D(θ_D, G(θ_G)) = |real⟩ \big)   (3.4)


QGANs are a nascent approach in QML, and it is too early to look for quantum advantages.
However, the inherent quantum mechanical laws that QGANs are built on are expected
to enable them to sample and manipulate classically intractable probability distributions,
which would potentially give useful applications in chemistry, drug discovery and
finance [41], [42].

3.4.3 Variational Quantum Eigen-Solver (VQE)


The Variational Quantum Eigensolver, first proposed by Peruzzo et al. in 2014 [43], is
an optimisation algorithm which closely resembles the neural networks of classical
machine learning. A notable thing about the VQE is that it can be tested on NISQ-era
quantum computers. The VQE consists of the following parts, which will be discussed in detail:

1. Feature Map

2. Ansatz

3. Cost function

4. Optimizer

The gist of the VQE is to look for a quantum state with the minimum eigenvalue of a
certain observable. It does so based on the mathematical insight of the variational
theorem of quantum mechanics, which states that: 'If the (normalized) state |ψ⟩ of a
quantum system depends on a parameter vector θ, then the optimal approximation of
the ground state (i.e. the eigenstate |ϕ_0⟩ with the minimum eigenvalue λ_0) is the one
that minimizes the expectation value of the Hamiltonian H' [44].

⟨Ĥ⟩ = ⟨ψ(θ⃗)| Ĥ |ψ(θ⃗)⟩ ≥ λ_0   (3.5)

3.5 Variational Quantum Classifier (VQC)


The VQC is an instance of the VQE, extensively used in QML for classification or regression
tasks. It consists of a parametrised circuit containing features and weights. Through
optimisation, it looks for the parameters that minimise the cost function, much like a
classical neural network. As stated above, the VQE has four main components, namely the
feature map, the ansatz, the cost function and the optimizer. In this section we will delve into
their details and how they are all integrated into a VQC structure. A general overview
of the whole design is shown in Figure 3.6.

Figure 3.6: Variational Quantum Classifier Design


[40]

Qiskit, a Python library by IBM, has been used for all the work from here onward
and in the next chapter as well, primarily due to the extensive resources available for it.
However, the same design can equally be built in other languages such as Q# or in
similar libraries such as Cirq and PennyLane [45].
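As a preview of the components detailed in the rest of this section, a minimal Qiskit sketch of how a VQC is assembled is given below; the hyperparameter values are placeholders, and the import paths may vary slightly between Qiskit versions.

    from qiskit.circuit.library import ZZFeatureMap, RealAmplitudes
    from qiskit.primitives import Sampler
    from qiskit_algorithms.optimizers import COBYLA
    from qiskit_machine_learning.algorithms.classifiers import VQC

    num_features = 3                                                     # one qubit per feature
    feature_map = ZZFeatureMap(feature_dimension=num_features, reps=1)   # data encoding
    ansatz = RealAmplitudes(num_qubits=num_features, reps=3)             # parameterised circuit
    optimizer = COBYLA(maxiter=100)                                      # classical optimiser

    vqc = VQC(feature_map=feature_map, ansatz=ansatz,
              optimizer=optimizer, sampler=Sampler())
    # vqc.fit(X_train, y_train); vqc.score(X_test, y_test)   # training data assumed prepared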

3.5.1 Feature Map


Feature mapping in QML refers to embedding classical data points into a Hilbert space in
the form of quantum states. It consists of a parameterised circuit whose parameters
are the input data points. Feature mapping is an important step in QML, for without
encoding the data into quantum states it is impossible to apply quantum mechanical
operations to the data. There are many different ways to encode classical data into
quantum states; three of them are discussed below. We will consider a classical dataset
X with M samples, each having N features:

X = \{ x^{(1)}, \ldots, x^{(m)}, \ldots, x^{(M)} \}   (3.6)

Here x^{(m)} is an N-dimensional vector with m = 1, …, M. We will use the following
techniques to encode this dataset into quantum states.

3.5.1.1 Basis encoding

In basis encoding, each value of the classical dataset is encoded in the Z basis (computational
basis) of 0s and 1s. For instance, x = 3 can be written as the 4-bit string 0011, and the same
x = 3 in a 4-qubit system can be written in Dirac notation as |0011⟩. More generally, a classical
data point x^{(m)} = (b_1, b_2, …, b_N) will be mapped into the quantum state
|x^{(m)}⟩ = |b_1, b_2, …, b_N⟩, where b_n ∈ {0, 1} for n = 1, …, N. The entire dataset can be
represented as a superposition of computational basis states as follows:

|X⟩ = \frac{1}{\sqrt{M}} \sum_{m=1}^{M} |x^{(m)}⟩   (3.7)
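A minimal sketch of basis encoding in Qiskit (the bit-to-qubit ordering convention is a free choice here): the integer x = 3 is written into the 4-qubit computational basis state |0011⟩ by flipping the corresponding qubits.

    from qiskit import QuantumCircuit

    x, n = 3, 4
    bits = format(x, f"0{n}b")       # '0011'

    qc = QuantumCircuit(n)
    for qubit, bit in enumerate(bits):
        if bit == "1":
            qc.x(qubit)              # flip this qubit from |0> to |1>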

3.5.1.2 Amplitude Encoding

In amplitude encoding, classical data is encoded in the amplitudes of a quantum state.
Suppose we have classical data values a = (a_1, a_2, …, a_N); to encode them as the
amplitudes of a normalised quantum state |ψ_a⟩ we carry out the following
mathematical operation [46]:

|ψ_a⟩ = \sum_{i=1}^{N} a_i |i⟩   (3.8)

where N = 2^n, a_i is the i-th element of a, and |i⟩ is the i-th computational basis
state. Of course, the quantum state must be normalised.
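A minimal Qiskit sketch of amplitude encoding: a toy data vector is normalised and loaded directly as the amplitudes of a 2-qubit state (the numbers are arbitrary example values).

    import numpy as np
    from qiskit import QuantumCircuit

    a = np.array([0.5, 1.5, 2.0, 1.0])
    a = a / np.linalg.norm(a)        # normalise so the squared amplitudes sum to 1

    qc = QuantumCircuit(2)           # N = 4 amplitudes require n = log2(4) = 2 qubits
    qc.initialize(a, [0, 1])         # state-preparation routine synthesised by Qiskit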

3.5.1.3 Angle Encoding

Angle encoding is the most common method in use for encoding classical data into quantum
states. In this method, each classical data value is encoded as a rotation angle of a qubit.
For instance, for the data points a = (a_1, …, a_N), each value a_i is encoded as

|a_i⟩ = cos(a_i) |0⟩ + sin(a_i) |1⟩   (3.9)

Angle encoding differs from basis and amplitude encoding in that it encodes one data value
at a time. In angle encoding, the values can also be encoded as the angles of a unitary
gate [47]:

S(a) = \sum_{i=1}^{N} U(a_i)   (3.10)

Here,

U(a_i) = \begin{pmatrix} \cos(a_i) & -\sin(a_i) \\ \sin(a_i) & \cos(a_i) \end{pmatrix}   (3.11)
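A minimal sketch of angle encoding in Qiskit: each feature value is written into its own qubit as a rotation angle, so that RY(2a_i)|0⟩ = cos(a_i)|0⟩ + sin(a_i)|1⟩ reproduces Eq. (3.9); the feature values below are arbitrary examples.

    from qiskit import QuantumCircuit

    features = [0.1, 0.7, 1.3]       # example data values, one per qubit

    qc = QuantumCircuit(len(features))
    for i, a in enumerate(features):
        qc.ry(2 * a, i)              # RY(2a) gives cos(a)|0> + sin(a)|1> on qubit i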

3.5.2 Ansatz (Parameterised Circuit)


In machine learning, the neural network is a quite popular framework; besides the input
data it has weights and biases, which are tweaked with each iteration to bring the output as
close as possible to the desired output. In QML, the job of the weights and biases is
taken up by a parameterised circuit, often called an ansatz. It is basically a quantum
circuit with gates that are defined through tunable parameters, placed right after
the feature map in the overall VQC circuit; the outputs from the feature map are fed into
it and taken as the angles of unitary gates placed with various reps and entangling
patterns. There are various built-in ansatz circuits available in Qiskit, the most prominent
of them being RealAmplitudes and EfficientSU2 [45].
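Both built-in ansatz circuits can be instantiated directly in Qiskit; a short sketch follows, where the reps and entanglement choices are purely illustrative.

    from qiskit.circuit.library import EfficientSU2, RealAmplitudes

    ansatz_a = EfficientSU2(num_qubits=3, reps=1, entanglement="circular")
    ansatz_b = RealAmplitudes(num_qubits=3, reps=3, entanglement="full")

    # More reps means more trainable parameters (and a deeper circuit)
    print(ansatz_a.num_parameters, ansatz_b.num_parameters)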

Figure 3.7: Ansatz: EfficientSU2 with features = 3, Reps = 1, Entangling Pattern =
Circular
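For instance, the ansatz shown in Figure 3.7 can be generated with the built-in EfficientSU2 circuit; a minimal sketch, with the parameter choices taken from the figure caption, is given below.

    from qiskit.circuit.library import EfficientSU2

    ansatz = EfficientSU2(num_qubits=3, reps=1, entanglement="circular")
    print(ansatz.decompose().draw())   # inspect the rotation layers and entangling pattern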

Based on the types of gates used, the number of reps, and the entangling pattern, we can
generate an ansatz of our own. The efficiency of a particular type of ansatz depends on
the input data features. However, three important properties, namely expressibility,
entangling capability, and hardware efficiency, determine the suitability of an ansatz;
these are discussed in detail below.

3.5.2.1 Expressibility

Expressibility refers to how efficiently a parameterized quantum circuit (PQC) can
cover the Hilbert space. It is the ability of a PQC to generate pure states that can be
represented all over the Bloch sphere. Figure 3.8 shows various circuits with different
structures. In the idle circuit the only gate is the identity, so no changes are made to the
input qubit and the Bloch sphere appears empty; in circuit A the space around the equator is
occupied; in circuit B the whole surface is occupied, but some spots are more concentrated
than others; whereas the last circuit represents the ideal construct, in which a uniform
spread over the whole surface is obtained. This kind of PQC is expected to find the optimal
parameters more quickly than the others.

Figure 3.8: Expressibility of various PQC models shown on Bloch sphere
[48]

3.5.2.2 Entangling capability

Entanglement is a non-classical correlation between qubits. A PQC able to create
highly entangled states is expected to express complex data features and represent
complicated states in the quantum landscape. Entanglement is quantified by the Meyer-Wallach
entanglement measure (Meyer and Wallach, 2002) [49]. A non-entangled product state
has a Meyer-Wallach measure of 0, and a maximally entangled state, such as a
Bell state, has a Meyer-Wallach measure of 1.
In Figure 3.9, circuit A has no entanglement among the qubits, so its Meyer-Wallach
measure is 0, whereas circuit B has a linear entanglement pattern, so its Meyer-Wallach
measure is greater than 0.

Figure 3.9: Entanglement capability of two different PQCs [40]

3.5.2.3 Hardware efficiency

Quantum computers of the NISQ (Noisy Intermediate-Scale Quantum) era suffer from
decoherence: the qubits are difficult to maintain in complete isolation, the environment
continuously interacts with them, causing their states to change, and this results in reduced
gate fidelities. With this problem at hand, the circuit depth and the number of gates applied
become important factors determining the accuracy of the results. Exploring PQCs that
reconcile expressive power with the hardware in existence today is an active area of
research [50].

3.5.3 Cost Function


After data encoding and the parameterised circuit, the next important step is the training of
the model. The essence of training lies in finding a function that represents the input data
and minimising it over a number of iterations, updating the parameters with each
iteration. This function is often called the cost function or objective function. In the PQC
framework, objective function minimisation usually refers to minimising the expectation
value of a Hamiltonian operator, ⟨ψ(θ⃗)| Ĥ |ψ(θ⃗)⟩. We can use both gradient-based and
gradient-independent algorithms to update the parameters for optimisation [51].
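A minimal sketch of evaluating such an expectation value with Qiskit's reference Estimator primitive is shown below; the two-qubit ansatz and the toy observable are assumptions chosen only for illustration, and the primitive import path can differ between Qiskit versions.

    from qiskit.circuit.library import RealAmplitudes
    from qiskit.primitives import Estimator
    from qiskit.quantum_info import SparsePauliOp

    ansatz = RealAmplitudes(num_qubits=2, reps=1)
    hamiltonian = SparsePauliOp.from_list([("ZZ", 1.0), ("XI", 0.5)])  # toy observable

    def cost(theta):
        """Return <psi(theta)| H |psi(theta)> for the given parameter vector."""
        job = Estimator().run(circuits=[ansatz], observables=[hamiltonian],
                              parameter_values=[theta])
        return job.result().values[0]

    print(cost([0.1] * ansatz.num_parameters))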

3.5.4 Optimisation
After evaluation of the expectation value of the cost function, the result is passed on to the
optimiser to look for parameters that minimise the output. This step is repeated many
times and the parameters are updated with each iteration until the minimum value is detected;
the optimiser then returns the parameter values θ⃗* that minimise the expectation
value, from which the solution state |ψ(θ⃗*)⟩ is constructed.
We have two main frameworks of optimisation.

3.5.5 Gradient Based optimisation


 
In gradient-based optimisation the gradient of the cost function f(θ⃗) is calculated at every
point, and the parameters are updated continuously in such a way that they follow the path of
steepest descent. The optimisation keeps going until we reach a local minimum f(θ⃗*).
Mathematically it looks like this:

θ⃗_{n+1} = θ⃗_n − η ∇⃗f(θ⃗_n)    (3.12)

Here ∇⃗f(θ⃗) is the gradient of the cost function and η is a small hyper-parameter
that determines the size of the parameter update. Gradient-based optimisation works fine
with a small number of qubits, but as we increase the number of qubits to express more
features of the data, it suffers from the serious problem of 'barren plateaus', which
refers to exponentially vanishing gradients.

Figure 3.10: Variance decreases with the increase in the number of qubits (barren plateau)
[40]

Figure 3.11 shows that the variance decreases with the increase in the depth of the
circuit. The optimisation gets stuck at some local minimum with the global minimum left
unexplored, and as a result we do not obtain the optimal parameters.
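A minimal classical sketch of the update rule of Eq. (3.12) is given below, using a finite-difference gradient estimate and a toy cost function; on quantum hardware the gradient would more commonly be obtained with the parameter-shift rule [51].

    import numpy as np

    def gradient_descent(cost, theta0, eta=0.1, steps=100, eps=1e-4):
        """Repeatedly move the parameters against the estimated gradient (Eq. 3.12)."""
        theta = np.array(theta0, dtype=float)
        for _ in range(steps):
            grad = np.zeros_like(theta)
            for k in range(len(theta)):
                shift = np.zeros_like(theta)
                shift[k] = eps
                grad[k] = (cost(theta + shift) - cost(theta - shift)) / (2 * eps)
            theta -= eta * grad
        return theta

    # toy quadratic cost, minimum at theta = (1, 1)
    print(gradient_descent(lambda t: np.sum((t - 1.0) ** 2), theta0=[0.0, 0.0]))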

3.5.5.1 Gradient Free Optimisation

Gradient-independent optimisation is an alternative way to optimise the parameters which
does not require gradient measurements. It is especially important where it is difficult to
compute the gradient. Although it escapes the barren plateaus, it comes with a trade-off
between speed and accuracy, and gradient-free optimisation may also require more
computational resources. In the hybrid QML framework we have built-in classical optimisers
in Python, such as SPSA, COBYLA and Adam, for gradient-independent optimisation [52].
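As a small illustration, a gradient-free optimiser such as COBYLA can be called directly through SciPy (Qiskit's COBYLA optimiser wraps the same routine); the cost function below is only a stand-in for the expectation value returned by a quantum circuit.

    import numpy as np
    from scipy.optimize import minimize

    def cost(theta):
        return np.sum((np.array(theta) - 0.5) ** 2)   # stand-in for a circuit evaluation

    result = minimize(cost, x0=np.zeros(4), method="COBYLA", options={"maxiter": 200})
    print(result.x, result.fun)   # parameters and minimum found without any gradients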

In this chapter a comprehensive explanation of quantum machine learning has been
presented; in the next chapter a detailed breakdown of the VQC for the penguins and
Titanic datasets is presented.

Chapter 4

VQC for Penguins and Titanic Datasets

In the last chapter an introduction to quantum machine learning, together with different
approaches and popular algorithms, was presented. Here we present a detailed
breakdown of the Variational Quantum Classifier (VQC) for the penguins and Titanic datasets
[53]. All the coding is done in Python using the Qiskit library by IBM.

4.1 Penguins Dataset


The penguins data consists of seven features in total, namely 'species', 'island', 'bill length',
'bill depth', 'flipper length', 'body mass', and 'sex'. Out of these, one feature, 'species',
is made the target, and the remaining six are the features on which the prediction is based.
For the QML classification below we used the ZZ feature map (with reps = 1) for mapping the
features, RealAmplitudes (with reps = 3) as the ansatz, and COBYLA as the optimiser.

4.2 QML Classification for Penguin Dataset


Code excerpts from the Jupyter notebook are shown below. Figure 4.1 imports the
necessary libraries and functions to prepare the environment required by the program.

Figure 4.1: Importing the necessary libraries and functions
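Since Figure 4.1 is reproduced as an image, a representative (not verbatim) set of imports for this kind of workflow is sketched below; exact module paths vary between Qiskit releases (for example, the optimisers live under qiskit.algorithms.optimizers in older versions and qiskit_algorithms.optimizers in newer ones).

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    from sklearn.preprocessing import LabelEncoder, MinMaxScaler
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    from qiskit.circuit.library import ZZFeatureMap, RealAmplitudes
    from qiskit.primitives import Sampler
    from qiskit_algorithms.optimizers import COBYLA, SPSA
    from qiskit_machine_learning.algorithms.classifiers import VQC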

4.2.1 Importing and Displaying Dataset


To start processing the data we first need to import the dataset. Here we imported
the data from a GitHub repository [53]. Figure 4.2 imports and displays the first five
entries of the penguins dataset.

Figure 4.2: Importing and displaying the penguins dataset

4.2.2 Data Cleaning


As can be seen in Figure 4.2, there are some missing values in the data. In
the next step we look for all the missing values, and then we get rid of them
in the following step. Figure 4.3 gives the details of all the missing values.

Figure 4.3: Displaying all the missing values

Now we clean the data by dropping all the missing values. This is done to make sure that
the predictive model does not receive false inputs and train on them, since that can
significantly affect the accuracy. Figure 4.4 displays the values after cleaning.

Figure 4.4: Displaying data after cleaning
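A minimal sketch of the import and cleaning steps, assuming the penguins CSV from the repository cited in [53], is given below.

    import pandas as pd

    url = ("https://raw.githubusercontent.com/michellechandraa/"
           "michellechandraa/main/penguins.csv")
    data = pd.read_csv(url)                       # import the dataset (Figure 4.2)
    print(data.isnull().sum())                    # missing values per column (Figure 4.3)

    data = data.dropna().reset_index(drop=True)   # drop incomplete rows (Figure 4.4)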

4.2.3 Label Encoding


Some of the columns in Figure 4.2 have string values which need to be converted to
numeric values, since predictive models in machine learning only recognise numeric
values. The label encoder automatically assigns numeric values to the string inputs.
Figure 4.5 displays the values after label encoding.

Figure 4.5: Label encoding
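Continuing from the cleaning step above, a sketch of label encoding the string-valued columns (the column names are those of the penguins data) could look like this:

    from sklearn.preprocessing import LabelEncoder

    for column in ["species", "island", "sex"]:
        data[column] = LabelEncoder().fit_transform(data[column])
    print(data.head())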

4.2.4 Previewing Data


We can look at different statistics of the data, such as the mean, standard deviation, minimum
and maximum values, by calling data.describe(). Figure 4.6 displays some of these statistics.

Figure 4.6: Previewing the data

4.2.5 Displaying Data


Data visualisation gives deeper insights into the data. It tells how different features
are correlated and helps us figure out which features play a more important part than
others in the prediction. Some features are more clearly distinguishable than others and
make the prediction easier. Figure 4.7 displays the different features of the three species
of penguins.

Figure 4.7: Displaying the data

Figure 4.7 displays the relationships between different features. Four features of the
three species of penguins make up different combinations for comparison. It helps
determine how each feature varies with respect to another. Feature values which are
closely clustered show more dependence on one another than the ones which are far apart.
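A pairwise scatter plot of this kind can be produced with seaborn; a short sketch, assuming the cleaned dataframe from the steps above, is:

    import seaborn as sns
    import matplotlib.pyplot as plt

    sns.pairplot(data, hue="species")   # pairwise feature comparisons coloured by species
    plt.show()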

4.2.6 Removal of Weak Variables


In this step we get rid of the features which are weakly correlated, to enhance the
predictive accuracy. This is an important step, since weak correlations can decrease
the accuracy. In Figure 4.8, features with a correlation coefficient between -0.3 and 0.3
are removed and the number of features decreases from 6 to 4. As can be seen in
Figure 4.9, the features retained after the reduction are island, bill length, flipper
length and body mass.

Figure 4.8: Removing weak variables
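One plausible way to implement this filtering, assuming the correlation is taken with respect to the target column 'species', is sketched below; the exact logic used in the notebook may differ.

    correlations = data.corr()["species"].drop("species")
    weak = correlations[correlations.abs() < 0.3].index   # features in the (-0.3, 0.3) band
    data = data.drop(columns=weak)
    print(list(data.columns))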

4.2.7 Ordering the Data


Now we order the data so that the target feature, which is 'species' in our case, is
placed in the last column. The data after reordering is shown in Figure 4.9.

Figure 4.9: Ordering the data

4.2.8 Normalisation and Test/Train Split


Once the data is in order, the next step is to select the target and the labels. The target is
the variable against which the prediction is to be made. Figure 4.10 selects 'species' as the
target and the rest of the features as the labels.

Figure 4.10: Selecting target and labels

Next we normalise the data so that all values lie between 0 and 1. After
normalisation the data is split into two parts: one part, usually 80% of the data, is used
for training, and the rest is the test data that will be used to test the model,
as shown in Figure 4.11.

Figure 4.11: Splitting data into train and test data
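A minimal sketch of the normalisation and the 80/20 split, assuming the cleaned and reordered dataframe from the previous steps, is shown below.

    from sklearn.preprocessing import MinMaxScaler
    from sklearn.model_selection import train_test_split

    X = data.drop(columns=["species"]).values   # input features (labels)
    y = data["species"].values                  # target

    X = MinMaxScaler().fit_transform(X)         # rescale every feature to [0, 1]
    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8,
                                                        random_state=42)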

4.2.9 Encoding Classical Data into Quantum States


After the test and train split, the next important step is to encode the classical data into
quantum states so that it can be processed by the quantum model. There are many ways
to do so, as discussed in subsection 3.5.1. Here we use angle encoding through the
built-in feature map named ZZFeatureMap.

Figure 4.12: Data encoding through ZZ feature map

The feature map initialises the qubits (one per input feature) in the zero state. Hadamard
gates put each qubit into superposition. Next, P represents the Pauli rotation gates, which
are Rz gates in this case. The input features are encoded in the phases of these gates. The
phase angles are multiplied by a factor of 2 as a design choice, to amplify the angles and
derive some advantage during the gradient calculation. These are followed by the entangling
gates, whose pattern can be manually adjusted as desired [54].
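The feature map of Figure 4.12 can be constructed with the built-in ZZFeatureMap; a sketch with one repetition is shown below (the entanglement pattern is an assumption, since it can be chosen freely).

    from qiskit.circuit.library import ZZFeatureMap

    num_features = X_train.shape[1]
    feature_map = ZZFeatureMap(feature_dimension=num_features, reps=1,
                               entanglement="linear")
    print(feature_map.decompose().draw())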

4.2.10 Application of Suitable Ansatz


The next step is to feed the prepared quantum states into a parameterised circuit, the
ansatz. There are many built-in ansatzes in Qiskit, and we have used the one called
RealAmplitudes. The ansatz makes an educated guess about the ground state of the Hamiltonian
of the system. This is further optimised classically, and this step is iterated over
and over until the state converges to the actual ground state [55].

Figure 4.13: RealAmplitudes ansatz with Reps = 3

The RealAmplitudes circuit is a heuristic trial wave function commonly used as an
ansatz in classification circuits in machine learning. The circuit consists of alternating
layers of Ry rotation gates and CX entanglements. The entanglement pattern
and the number of reps can be selected manually when calling the RealAmplitudes
function. It is called RealAmplitudes since the prepared quantum states only have real
amplitudes; the complex part is always 0. The circuit above has three repetitions [56].
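A sketch of building the ansatz of Figure 4.13 is given below; the entanglement keyword is an assumption, since the default pattern differs between Qiskit versions.

    from qiskit.circuit.library import RealAmplitudes

    ansatz = RealAmplitudes(num_qubits=num_features, reps=3, entanglement="full")
    print(ansatz.decompose().draw())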

4.2.11 Optimisation
The VQC, like the VQE, is a hybrid model in which both classical and quantum computational
resources are used together. Here we use COBYLA, a classical optimiser, to minimise
the objective function. Figure 4.14 gives the code to call the optimiser.

Figure 4.14: COBYLA Optimizer

In this framework, we have two popular primitives for optimisation: the Sampler, which
generates probability distributions over bit-strings, and the Estimator, which calculates
expectation values [57]. Here we will use the Sampler, as shown in Figure 4.15.

Figure 4.15: Sampler
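A sketch of preparing the optimiser and the Sampler primitive is given below; as noted earlier, the optimiser import path depends on the Qiskit release.

    from qiskit.primitives import Sampler
    from qiskit_algorithms.optimizers import COBYLA

    optimizer = COBYLA(maxiter=100)   # classical, gradient-free optimiser
    sampler = Sampler()               # reference primitive returning probability distributions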

4.2.12 Calling VQC Function


Qiskit has a built-in VQC class for classification. Figure 4.16 presents the
code to call the VQC, which takes the feature map, ansatz, sampler, and optimiser as its
arguments.

Figure 4.16: Variational quantum classifier
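A sketch of assembling and training the classifier, reusing the feature map, ansatz, optimiser and sampler defined above, is shown below; the callback records the objective value after every evaluation so that an optimisation path like the one in Figure 4.17 can be reproduced.

    import matplotlib.pyplot as plt
    from qiskit_machine_learning.algorithms.classifiers import VQC

    objective_values = []

    def callback(weights, obj_value):
        objective_values.append(obj_value)   # called by VQC after each evaluation

    vqc = VQC(feature_map=feature_map, ansatz=ansatz, optimizer=optimizer,
              sampler=sampler, callback=callback)
    vqc.fit(X_train, y_train)

    print("train accuracy:", vqc.score(X_train, y_train))
    print("test accuracy :", vqc.score(X_test, y_test))

    plt.plot(objective_values)
    plt.xlabel("iteration")
    plt.ylabel("objective function value")
    plt.show()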

4.2.13 Displaying Optimisation Path


Figure 4.17 displays the optimisation pathway of COBYLA. The graph represents the
objective function values against the number of iterations. It can be seen that the
graph converges to the optimal value as the iterations increase. The time taken for
the complete process is also displayed at the bottom; it depends on the circuit
depth, the number of parameters, the hardware efficiency of the quantum computer, the local
computer's specifications, the choice of optimiser, and the data being processed [58].

Figure 4.17: Objective function optimization against the number of iterations

4.2.14 Displaying Accuracy
The last step in training the VQC classification model is to measure how well the model has
performed. For this instance we obtain 86% accuracy on the training data and
82% on the test data. It is not as good as the classical result, but it is still fairly accurate.
Considering the still-developing quantum technology and the hardware constraints, it is
quite an interesting result. QML is an active area of research and different frameworks are
proposed frequently; it is quite possible that the same data might give better or worse
accuracy if trained through algorithms other than VQC.

Figure 4.18: Accuracy of training and test data

4.3 Classification of Penguins Data Through Classical Support Vector Machine

For a comparison of the accuracy, the computational resources employed, and the training time,
we have also trained a classical machine learning model using a support vector machine.

Figure 4.19: Classification through the classical support vector machine algorithm
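A sketch of the classical baseline with scikit-learn's support vector classifier, using the same train/test split, is shown below.

    from sklearn.svm import SVC

    svm = SVC()   # default RBF-kernel support vector classifier
    svm.fit(X_train, y_train)
    print("train accuracy:", svm.score(X_train, y_train))
    print("test accuracy :", svm.score(X_test, y_test))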

It is quite evident from Figure 4.20 that the classical algorithm outperforms the
quantum one, but the silver lining is that inherent quantum mechanical properties
such as superposition, entanglement and interference give strong clues about
the potential advantage that can be extracted through a quantum machine learning
algorithm [59].

Figure 4.20: Accuracy of results using SVM

4.4 QML and CML Comparison for Titanic Data


Figures 4.21 and 4.22 give a comparison of the performance of the classification task by VQC
and SVM, respectively, on the Titanic dataset. The Titanic data consists of 12 features, namely
'PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp' (number of siblings or
spouses onboard), 'Parch' (number of parents or children onboard), 'Ticket', 'Fare',
'Cabin', and 'Embarked' (port of embarkation) [18]. The classification is made on
whether a passenger survived or not. Some of the features which were weakly correlated
were removed during data processing, and for this instance the number of features
used is 4, namely 'Pclass', 'Sex', 'Fare', and 'Embarked'. The numbers of reps for the feature
map and the ansatz are 1 and 3, respectively. It can be seen that there is only a narrow gap in
the results; some improvement in the hardware and the selection of an efficient ansatz
could boost the quantum side further.

Figure 4.21: VQC result for Titanic data

Figure 4.22: SVM result for Titanic data

In this chapter a detailed breakdown of the VQC algorithm has been presented, with Python
code snippets, for the penguins and Titanic datasets. This was one example each for a fixed
number of reps; it can be seen that the QML algorithm is slightly less accurate for the penguins
data but works equally well for the Titanic dataset. In the next chapter, accuracy results for
various combinations of reps of the feature map and ansatz will be presented for comparison.

Chapter 5

Optimal Number of Reps for Feature-Map and Ansatz

In the previous chapter, the data analysis and the VQC training and test data accuracy
in the Python framework were explained. In this chapter we discuss how the
numbers of reps of the feature map and ansatz contribute to the accuracy of the
classification algorithm VQC. In the following examples the number of reps of the feature map
varies from 1 to 3, and for the ansatz it varies from 1 to 5. We have used ZZFeatureMap as
the feature map and RealAmplitudes as the ansatz. The numbers of features taken for the
penguins data are 3, 4, 5 and 6, and for the Titanic data 3, 4, 5, 6, and 7. We have used
two different optimisers, COBYLA and SPSA, for a fair comparison.
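A sketch of the sweep carried out in this chapter is given below; it assumes the preprocessed arrays X_train, X_test, y_train, y_test from Chapter 4 and simply records the training and test accuracy for every combination of repetition counts.

    from qiskit.circuit.library import ZZFeatureMap, RealAmplitudes
    from qiskit.primitives import Sampler
    from qiskit_algorithms.optimizers import COBYLA
    from qiskit_machine_learning.algorithms.classifiers import VQC

    num_features = X_train.shape[1]
    results = []
    for fm_reps in range(1, 4):          # feature-map reps: 1 to 3
        for an_reps in range(1, 6):      # ansatz reps: 1 to 5
            vqc = VQC(feature_map=ZZFeatureMap(num_features, reps=fm_reps),
                      ansatz=RealAmplitudes(num_features, reps=an_reps),
                      optimizer=COBYLA(maxiter=100), sampler=Sampler())
            vqc.fit(X_train, y_train)
            results.append((fm_reps, an_reps,
                            vqc.score(X_train, y_train), vqc.score(X_test, y_test)))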

5.1 Relation between Numbers of Reps and Accuracy for Penguins Dataset

In VQC, during the encoding part the number of reps of the feature map and, following it, the
number of reps of the ansatz can be tweaked in favour of better accuracy. Finding the
optimal number of reps is quite an endeavour that requires trying each combination
and selecting the one with the best results. In the following sections we present the
results of multiple combinations, some of which perform significantly better than the
others.

5.1.1 Numbers of Reps vs Accuracy with Cobyla


Figures 5.1, 5.2, 5.3, and 5.4 show the accuracy bar graphs plotted against different
numbers of reps for varying numbers of features. The classical optimiser COBYLA has been
used for objective function minimisation. In the reps values on the x-axis, the first number
in each value represents the number of reps of the feature map, whereas the second number
represents the number of reps of the ansatz.

Figure 5.1: Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 3, optimizer = cobyla.

Figure 5.2: Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 4, optimizer = cobyla.

Figure 5.3: Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 5, optimizer = cobyla.

Figure 5.4: Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 6, optimizer = cobyla.

5.1.2 Numbers of Reps vs Accuracy with SPSA


In Figures 5.5, 5.6, 5.7, and 5.8 we present the accuracy results against the
numbers of reps of the feature map and ansatz for the same dataset, but this time with a
different optimiser called SPSA.

Figure 5.5: Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 3, optimizer = SPSA.

Figure 5.6: Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 4, optimizer = SPSA.

Figure 5.7: Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 5, optimizer = SPSA.

Figure 5.8: Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 6, optimizer = SPSA.

5.2 Relation between Numbers of Reps and Accuracy for Titanic Dataset

In this section we see how the accuracy and the numbers of reps of the feature map and ansatz
are related, using the Titanic dataset. The Titanic dataset originally consists of 12 features,
which can be trained for classification between survived and deceased passengers. For our
purpose of classification through VQC, 7 features have been retained; the remaining features,
which were deemed unnecessary or incomplete, were removed during data cleaning.

5.2.1 Numbers of Reps vs Accuracy with Cobyla


Figures 5.9, 5.10, 5.11, 5.12 and 5.13 show the accuracy bar graphs plotted against
different numbers of reps for varying numbers of features. The classical optimiser COBYLA
has been used for objective function minimisation. In the reps values on the x-axis, the
first number in each value represents the number of reps of the feature map, whereas the
second number represents the number of reps of the ansatz.

Figure 5.9: Training and test data accuracy against different number reps of feature-
map and ansatz, number of features = 3, optimizer = Cobyla.

Figure 5.10: Training and test data accuracy against different number reps of
feature-map and ansatz, number of features = 4, optimizer = Cobyla.

Figure 5.11: Training and test data accuracy against different number reps of
feature-map and ansatz, number of features = 5, optimizer = Cobyla.

Figure 5.12: Training and test data accuracy against different number reps of
feature-map and ansatz, number of features = 6, optimizer = Cobyla.

Figure 5.13: Training and test data accuracy against different number reps of
feature-map and ansatz, number of features = 7, optimizer = Cobyla.

5.2.2 Numbers of Reps vs Accuracy with SPSA


In Figures 5.14, 5.15, 5.16, 5.17 and 5.18 the accuracy results against the numbers of
reps of the feature map and ansatz for the same dataset are presented, this time
with SPSA used for the optimisation.

Figure 5.14: Training and test data accuracy against different number reps of
feature-map and ansatz, number of features = 3, optimizer = SPSA.

Figure 5.15: Training and test data accuracy against different number reps of
feature-map and ansatz, number of features = 4, optimizer = SPSA.

Figure 5.16: Training and test data accuracy against different number reps of
feature-map and ansatz, number of features = 5, optimizer = SPSA.

Figure 5.17: Training and test data accuracy against different number reps of
feature-map and ansatz, number of features = 6, optimizer = SPSA.

Figure 5.18: Training and test data accuracy against different reps of feature-map
and ansatz, number of features = 7, optimizer = SPSA.

Chapter 6

Analysis of The Results

In the previous chapters the detailed breakdown of the model training and the outputs of
individual instances have been presented. In this chapter we analyse the
results in comparison. Training and test data accuracy graphs against different
combinations of reps have been plotted for various numbers of features.

6.1 Training and Test Data Accuracy for Penguins Data

In the following subsections, accuracy graphs for various numbers of features have been
combined for easy comparison. Graphs of models optimised with COBYLA and SPSA
have been plotted separately for thorough investigation.

6.1.1 Training and Test Data Accuracy Comparison Using COBYLA

Figures 6.1 and 6.2 present combined graphs of training and test data accuracy against
the number of reps for various numbers of features, using the COBYLA optimizer.

Figure 6.1: Training data accuracy against different No. of reps for various No. of
features.

Figure 6.2: Test data accuracy against different No. of reps for various No. of fea-
tures.

6.1.2 Training and Test Data Accuracy Comparison Using SPSA
Figures 6.3 and 6.4 present combined graphs of training and test data accuracy against
the number of reps for various numbers of features, using the SPSA optimizer.

Figure 6.3: Training data accuracy against different No. of reps for various No. of
features.

Figure 6.4: Test data accuracy against different No. of reps for various No. of fea-
tures.

6.1.3 Optimal Combination of Reps
Table 6.1 below shows the combinations of repetitions of the feature map and ansatz which
give the best accuracy for various numbers of features.

No. of Features   Training Data Accuracy   Test Data Accuracy   Reps of Feature-map   Reps of Ansatz   Optimizer
3                 0.82                     0.78                 2                     5                COBYLA
4                 0.86                     0.82                 1                     3                COBYLA
5                 0.85                     0.76                 1                     4                COBYLA
6                 0.71                     0.67                 2                     4                COBYLA
3                 0.85                     0.70                 1                     4                SPSA
4                 0.83                     0.70                 1                     5                SPSA
5                 0.83                     0.76                 1                     5                SPSA
6                 0.76                     0.70                 1                     1                SPSA

Table 6.1. Optimal combinations of reps of feature map and ansatz, penguins data

6.2 Training and Test Data Accuracy for Titanic Data

In the following subsections, accuracy graphs for various numbers of features of the Titanic
dataset have been combined for easy comparison. Graphs of models optimised with
COBYLA and SPSA, and of training and test data, have been plotted separately for in-depth
investigation.

6.2.1 Training and Test Data Accuracy Comparison Using COBYLA

Figures 6.5 and 6.6 present combined graphs of training and test data accuracy against
the number of reps for various numbers of features, using the COBYLA optimizer.

Figure 6.5: Training data accuracy against different No. of reps for various No. of
features.

Figure 6.6: Test data accuracy against different No. of reps for various No. of fea-
tures.

6.2.2 Training and Test Data Accuracy Comparison Using SPSA

Figures 6.7 and 6.8 present combined graphs of training and test data accuracy against
the number of reps for various numbers of features, using the SPSA optimizer.

Figure 6.7: Training data accuracy against different No. of reps for various No. of
features.

Figure 6.8: Test data accuracy against different No. of reps for various No. of fea-
tures.

6.2.3 Optimal Combination of Reps
Table 6.2 below shows the combinations of repetitions of the feature map and ansatz which
give the best accuracy for various numbers of features.

No. of Features   Training Data Accuracy   Test Data Accuracy   Reps of Feature-map   Reps of Ansatz   Optimizer
3                 0.79                     0.80                 2                     5                COBYLA
4                 0.80                     0.83                 1                     3                COBYLA
5                 0.76                     0.80                 1                     4                COBYLA
6                 0.72                     0.73                 2                     4                COBYLA
7                 0.73                     0.73                 1                     5                COBYLA
3                 0.80                     0.83                 2                     5                SPSA
4                 0.79                     0.83                 1                     5                SPSA
5                 0.78                     0.83                 1                     5                SPSA
6                 0.75                     0.73                 2                     4                SPSA
7                 0.76                     0.82                 1                     3                SPSA

Table 6.2. Optimal combinations of reps of feature map and ansatz, Titanic data

6.3 Analysis of the Results

From the accuracy graphs above, the following conclusions can be drawn regarding the role
that the numbers of repetitions of the feature map and ansatz play in the accuracy of the
algorithm.

• The number of repetitions of the feature map and ansatz dramatically affects the
output of the VQC algorithm for both training and test data, and setting the reps
to the right combination produces optimal results.

• A trade-off between the number of reps and the circuit depth is important since, on
NISQ-era hardware, we are constrained to keep circuits shallow and the number of qubits small.

• In Figures 6.1 through 6.8 it can be seen that accuracy peaks appear as the
number of reps of the ansatz is increased from 1 to 5, while increasing the number of
reps of the feature map limits the performance.

• Our proposition is that by increasing the number of reps of the ansatz we increase
the expressibility of the circuit (although at the cost of increasing the circuit size),
which helps explore the state that minimises the cost function more quickly by
enlarging the search space [60].

• We also propose that, although increasing the number of reps of the feature map helps
capture intricate patterns in the data and might be of statistical value, at the same time
it negatively affects the accuracy of the model, as seen in Figures 6.1 through 6.8. It is
difficult to say exactly why this is the case; however, a likely reason is the increase in
circuit length, which invites noise.

• From the results in Sections 6.1 and 6.2 it can be concluded that the optimal
combinations of the numbers of repetitions of the feature map and ansatz are
(1,3), (1,4), (1,5), (2,3), (2,4) and (2,5), where the first number is the feature-map
reps and the second the ansatz reps.

Chapter 7

SUMMARY OF RESEARCH WORK

To summarise the results produced in the last chapter and before it: for both the penguins
and Titanic datasets, different combinations of the repetitions of the feature map and ansatz
play a pivotal role in determining the accuracy. Besides the choice of feature map and ansatz,
the number of times they are applied in a circuit noticeably affects the accuracy.
Quantum hardware of the NISQ era is prone to noise and can produce erroneous
outputs. Under this constraint, circuit depth and the entangling pattern can increase
decoherence. On the other hand, more repetitions of the feature map can represent complex
data structures in Hilbert space, and increasing the number of repetitions of the ansatz can
increase the expressibility, broadening the solution search space so that intricate data
patterns can potentially be captured. This not only helps enhance the accuracy but can also
make the model perform as well with fewer features. Here comes the trade-off between the
accuracy and the complexity of the algorithm. Until fault-tolerant quantum computers arrive,
we have to look for algorithms that use fewer qubits with minimum circuit depth.
Sections 6.1 and 6.2 present the combinations with the optimal numbers of reps. From
the graphs, accuracy peaks can be seen for the combinations (1,3), (1,4), (1,5), (2,4), and
(2,5), where the first number in each combination represents the feature map's reps and the
second number represents the number of reps of the ansatz.

Bibliography

[1] M. Demmer, R. Fonseca, F. Koushanfar, Richard feynman: Simulating physics


with computers (01 2008).
[2] M. A. Nielsen, I. L. Chuang, Quantum computation and quantum information,
Cambridge university press, 2010.
[3] S. S. Gill, A. Kumar, H. Singh, M. Singh, K. Kaur, M. Usman, R. Buyya, Quantum
computing: A taxonomy, systematic review and future directions (2021). arXiv:
2010.15559.
[4] T. coding school, Introduction to quantum computing 23-24, lecture week 8 (2023).
[5] P. A. M. Dirac, A new notation for quantum mechanics (1939).
[6] E. Rieffel, W. Polak, An introduction to quantum computing for non-physicists,
ACM Computing Surveys (CSUR) 32 (3) (2000) 300–335.
[7] F. Jazaeri, A. Beckers, A. Tajalli, J.-M. Sallese, A review on quantum comput-
ing: From qubits to front-end electronics and cryogenic mosfet physics, in: 2019
MIXDES-26th International Conference" Mixed Design of Integrated Circuits and
Systems", IEEE, 2019, pp. 15–25.
[8] I. B. Djordjevic, Chapter 1 - basics of quantum information, quan-
tum communication, quantum sensing, and quantum networking, in:
I. B. Djordjevic (Ed.), Quantum Communication, Quantum Networks,
and Quantum Sensing, Academic Press, 2022, pp. 1–30. doi:https:
//doi.org/10.1016/B978-0-12-822942-2.00002-9.
URL https://www.sciencedirect.com/science/article/pii/
B9780128229422000029
[9] J. Preskill, The physics of quantum information (2022). arXiv:2208.08064.
[10] R. Horodecki, P. Horodecki, M. Horodecki, K. Horodecki, Quantum entanglement,
Rev. Mod. Phys. 81 (2009) 865–942. doi:10.1103/RevModPhys.81.865.
URL https://link.aps.org/doi/10.1103/RevModPhys.81.865

[11] T. Albash, D. A. Lidar, Adiabatic quantum computation, Reviews of Modern
Physics 90 (1) (2018) 015002.

[12] Z. Bian, F. Chudak, W. G. Macready, G. Rose, The ising model: teaching an old
problem new tricks, D-wave systems 2 (2010) 1–32.

[13] E. Farhi, J. Goldstone, S. Gutmann, M. Sipser, Quantum computation by adia-


batic evolution, arXiv preprint quant-ph/0001106 (2000).

[14] J. A. Jones, D. Jaksch, Quantum information, computation and communication,


Cambridge University Press, 2012.

[15] Proceedings of the Royal Society of London. Series A: Mathematical and Physical
Sciences 449 (1937) (1995) 669–677. doi:10.1098/rspa.1995.0065, [link].
URL http://dx.doi.org/10.1098/rspa.1995.0065

[16] N. Herrmann, D. Arya, M. W. Doherty, A. Mingare, J. C. Pillay, F. Preis, S. Pres-


tel, Quantum utility – definition and assessment of a practical quantum advantage,
in: 2023 IEEE International Conference on Quantum Software (QSW), IEEE,
2023. doi:10.1109/qsw59989.2023.00028.
URL http://dx.doi.org/10.1109/QSW59989.2023.00028

[17] H. Yasuoka, Computational complexity of quadratic unconstrained binary opti-


mization (2022). arXiv:2109.10048.

[18] http://s3.amazonaws.com/sf-web-assets-prod/wp-content/uploads/2018/
05/29144646/Scott_AAronson_Quantum_And_Classical_Uncertainty.svg,
[Accessed 14-03-2024].

[19] J. H. Lutz, N. Lutz, E. Mayordomo, Dimension and the structure of complexity


classes (2021). arXiv:2109.05956.

[20] T. Cormen, C. Leiserson, R. Rivest, C. Stein, Introduction to Algorithms, Third


Edition, 2009.

[21] D. Bovet, P. Crescenzi, An introduction to the theory of computational complexity,


2006, pp. 102–111. doi:10.1007/3-540-50316-1_9.

[22] A. W. Harrow, A. Montanaro, Quantum computational supremacy, Nature


549 (7671) (2017) 203–209. doi:10.1038/nature23458.
URL http://dx.doi.org/10.1038/nature23458

[23] J. Watrous, Quantum computational complexity (2008). arXiv:0804.3401.

[24] J. Librande, Bqp is not in np (2022). arXiv:2209.10398.

[25] K. A. Tychola, T. Kalampokas, G. A. Papakostas, Quantum machine learning—an
overview, Electronics 12 (11) (2023) 2379.

[26] A. Géron, Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow,
" O’Reilly Media, Inc.", 2022.

[27] T. M. Mitchell, Machine learning (1997).

[28] T. Hastie, R. Tibshirani, J. H. Friedman, J. H. Friedman, The elements of statis-


tical learning: data mining, inference, and prediction, Vol. 2, Springer, 2009.

[29] R. S. Sutton, A. G. Barto, Reinforcement learning: An introduction, MIT press,


2018.

[30] L. Alchieri, D. Badalotti, P. Bonardi, S. Bianco, An introduction to quantum ma-


chine learning: from quantum logic to quantum deep learning, Quantum Machine
Intelligence 3 (2021) 1–30.

[31] P. Wittek, Quantum machine learning: what quantum computing means to data
mining, Academic Press, 2014.

[32] S. Aaronson, The learnability of quantum states, Proceedings of the Royal Society
A: Mathematical, Physical and Engineering Sciences 463 (2088) (2007) 3089–3114.

[33] V. Dunjko, J. M. Taylor, H. J. Briegel, Quantum-enhanced machine learning,


Physical review letters 117 (13) (2016) 130501.

[34] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, S. Lloyd, Quantum


machine learning, Nature 549 (7671) (2017) 195–202.

[35] P. Rebentrost, M. Mohseni, S. Lloyd, Quantum support vector machine for big
data classification, Physical review letters 113 (13) (2014) 130503.

[36] E. Badr, S. Almotairi, M. A. Salam, H. Ahmed, New sequential and parallel sup-
port vector machine with grey wolf optimizer for breast cancer diagnosis, Alexan-
dria Engineering Journal 61 (3) (2022) 2520–2534.

[37] O. T. Unke, S. Chmiela, H. E. Sauceda, M. Gastegger, I. Poltavsky, K. T. Schutt,


A. Tkatchenko, K.-R. Muller, Machine learning force fields, Chemical Reviews
121 (16) (2021) 10142–10186.

[38] M. Schuld, N. Killoran, Quantum machine learning in feature hilbert spaces, Phys-
ical Review Letters 122 (4) (Feb. 2019). doi:10.1103/physrevlett.122.040504.
URL http://dx.doi.org/10.1103/PhysRevLett.122.040504

[39] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair,
A. Courville, Y. Bengio, Generative adversarial networks (2014). arXiv:1406.
2661.

[40] various authors, Qiskit Textbook, Github, 2023.


URL https://github.com/Qiskit/textbook

[41] P.-L. Dallaire-Demers, N. Killoran, Quantum generative adversarial networks,


Physical Review A 98 (1) (2018) 012324.

[42] S. Lloyd, C. Weedbrook, Quantum generative adversarial learning, Physical review


letters 121 (4) (2018) 040502.

[43] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love,


A. Aspuru-Guzik, J. L. O’Brien, A variational eigenvalue solver on a photonic
quantum processor, Nature Communications 5 (1) (Jul. 2014). doi:10.1038/
ncomms5213.
URL http://dx.doi.org/10.1038/ncomms5213

[44] D. J. Griffiths, D. F. Schroeter, Introduction to quantum mechanics, Cambridge


university press, 2018.

[45] V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow,


J. M. Gambetta, Supervised learning with quantum-enhanced feature spaces, Na-
ture 567 (7747) (2019) 209–212.

[46] S. Lloyd, M. Schuld, A. Ijaz, J. Izaac, N. Killoran, Quantum embeddings for


machine learning (2020). arXiv:2001.03622.

[47] R. LaRose, B. Coyle, Robust data encodings for quantum classifiers, Physical
Review A 102 (3) (2020) 032420.

[48] S. Sim, P. D. Johnson, A. Aspuru-Guzik, Expressibility and entangling capabil-


ity of parameterized quantum circuits for hybrid quantum-classical algorithms,
Advanced Quantum Technologies 2 (12) (2019) 1900070.

[49] G. K. Brennen, An observable measure of entanglement for pure states of multi-


qubit systems (2003). arXiv:quant-ph/0305094.

[50] A. Kandala, A. Mezzacapo, K. Temme, M. Takita, M. Brink, J. M. Chow, J. M.


Gambetta, Hardware-efficient variational quantum eigensolver for small molecules
and quantum magnets, nature 549 (7671) (2017) 242–246.

[51] M. Schuld, V. Bergholm, C. Gogolin, J. Izaac, N. Killoran, Evaluating analytic


gradients on quantum hardware, Physical Review A 99 (3) (2019) 032331.

[52] A. Gilyén, S. Arunachalam, N. Wiebe, Optimizing quantum optimization algo-
rithms via faster quantum gradient computation, in: Proceedings of the Thirtieth
Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, 2019, pp. 1425–
1444.

[53] M. chandra, Github.


URL https://raw.githubusercontent.com/michellechandraa/
michellechandraa/main/penguins.csv

[54] V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow,


J. M. Gambetta, Supervised learning with quantum-enhanced feature spaces, Na-
ture 567 (7747) (2019) 209–212. doi:10.1038/s41586-019-0980-2.
URL http://dx.doi.org/10.1038/s41586-019-0980-2

[55] Y. Cao, J. Romero, J. P. Olson, M. Degroote, P. D. Johnson, M. Kieferová, I. D.


Kivlichan, T. Menke, B. Peropadre, N. P. D. Sawaya, S. Sim, L. Veis, A. Aspuru-
Guzik, Quantum chemistry in the age of quantum computing, Chemical Reviews
119 (19) (2019) 10856–10915. doi:10.1021/acs.chemrev.8b00803.
URL http://dx.doi.org/10.1021/acs.chemrev.8b00803

[56] D. Yoffe, A. Natan, A. Makmal, A qubit-efficient variational selected


configuration-interaction method (2023). arXiv:2302.06691.

[57] J. R. McClean, J. Romero, R. Babbush, A. Aspuru-Guzik, The theory of varia-


tional hybrid quantum-classical algorithms, New Journal of Physics 18 (2) (2016)
023023.

[58] A. Agarwal, Y. Song, W. Sun, K. Wang, M. Wang, X. Zhang, Provable benefits


of representational transfer in reinforcement learning (2023). arXiv:2205.14571.

[59] C. Ciliberto, M. Herbster, A. D. Ialongo, M. Pontil, A. Rocchetto, S. Severini,


L. Wossnig, Quantum machine learning: a classical perspective, Proceedings of
the Royal Society A: Mathematical, Physical and Engineering Sciences 474 (2209)
(2018) 20170551.

[60] J. Tilly, H. Chen, S. Cao, D. Picozzi, K. Setia, Y. Li, E. Grant, L. Wossnig,


I. Rungger, G. H. Booth, J. Tennyson, The variational quantum eigensolver: A
review of methods and best practices, Physics Reports 986 (2022) 1–128. doi:
10.1016/j.physrep.2022.08.003.
URL http://dx.doi.org/10.1016/j.physrep.2022.08.003

