Introduction to Quantum Information Science

Lecture 1: General introduction

1 The first quantum revolution

At the end of the nineteenth century physics seemed to be solidly built on the pillars of Newtonian
mechanics, Maxwell’s electromagnetism and statistical mechanics. A central tenet of what
we now call classical physics was that particles have definite position and momentum and their
trajectories are governed by laws of motion. However, at the beginning of the twentieth century
this view led to paradoxes like the instability of matter, and failed to account for the discreteness
of the atomic emission and absorption spectra. The paradoxes disappeared provided that one made
some ad hoc assumptions, for example that certain physical observables like energy can only take
discrete values; in particular, light is made up of discrete packets of energy called photons.
This eventually led to the birth of quantum mechanics in the 1920s as a fundamental theory of
nature. In Schrödinger’s version the state of a particle is described by a wave function $\psi(q, t)$ rather
than a point $(q, p)$ in the phase space. The time evolution of the wave function is described by
Schrödinger’s equation $i\hbar\,\partial\psi/\partial t = H\psi$. In Heisenberg’s version, the classical observables,
defined as functions on the phase space, were replaced by (infinite) matrices (or operators) which in
general do not commute with each other. For example the position and momentum observables
satisfy the commutation relation $QP - PQ = i\hbar\mathbf{1}$. The quantisation of energy observed in
experiments could be explained by the fact that the Hamiltonian $H$ has a discrete spectrum. The two
theories turned out to be equivalent and the mathematical foundation of quantum mechanics was
laid by von Neumann in his book “The Mathematical Foundations of Quantum Mechanics”, using
the theory of Hilbert spaces. Fast forward to present times, quantum mechanics and quantum field
theory are among the most successful scientific theories ever created, and testify to what Wigner
called “the unreasonable effectiveness of mathematics in physical sciences” [1].

2 The second quantum revolution

“We are currently in the midst of a second quantum revolution. The first quantum revolution gave
us new rules that govern physical reality. The second quantum revolution will take these rules and
use them to develop new technologies” [2].

Physicists agreed early on that quantum mechanics is a probabilistic theory, the notable exception
being Einstein, who considered quantum mechanics to be an incomplete theory: “God does not
play dice”. However, until the 1960s this randomness was not directly observable. Most experiments
were performed on huge ensembles of quantum systems and one would observe “frequencies” of
different outcomes which for practical purposes are indistinguishable from the actual probabilities.
To get an intuition about this, think of the difference between tossing a coin and observing heads
or tails, and tossing $10^{23}$ coins and observing that the proportions of heads and tails are 50% each.
Many physicists believed that quantum measurements can only be of the latter rather than the
former type, as summarised by Schrödinger [3]: “we never experiment with just one electron
or atom or (small) molecule. In thought experiments we sometimes assume that we do; this invariably
entails ridiculous consequences... we are not experimenting with single particles any more than
we can raise Ichthyosauria in the zoo.”
What happened in the last decades is precisely that (well, except for the Ichthyosauria bit). Thanks
to the discovery of new technologies, individual quantum systems can be prepared, manipulated
and measured with a high degree of control. A great variety of quantum devices, like ion traps,
quantum dots, optical cavities, nanomechanical systems and superconducting qubits, are engineered
and used to probe the limits of quantum theory in a systematic fashion. The drive for pushing
technology to the quantum limit has a double motivation. The first is the natural trend towards
miniaturisation, which will ultimately lead to building devices at nanometre length scales where
quantum mechanics starts to play a role. The second, more fundamental motivation is that quantum
mechanics offers the promise of a completely new technology which uses quantum effects to achieve
superior performance in comparison to the classical one. This possibility was famously envisaged
by Feynman, who realised that a quantum computer would be much more efficient than a classical
one in simulating the dynamics of quantum systems [4].

3 The branches of Quantum Information Science

Quantum Information Science is a cross-disciplinary field joining quantum theory with a number of
“classical” disciplines: information theory, computation, control theory, probability and statistics.
Although each of these directions has a distinct flavour, the common feature is that physics takes
a back seat and one concentrates on the mathematical formalism, the interplay, the similarities,
and especially the new possibilities brought by the quantum world.
Quantum Information and Cryptography. Classical information theory originates from Shannon’s
1948 paper [5], which establishes the entropy as a measure of information and shows that it
is possible to communicate reliably over noisy channels provided that the rate of communication
is below a certain threshold, called the channel capacity. The information’s format, whether spoken
language or radio waves, is not essential, as long as it can be transformed into a binary message, that
is a sequence of classical zeros and ones. Quantum Information starts from the idea that quantum
systems are the ultimate physical medium for storing and processing information, and tries to
extend Shannon’s theory by replacing bits of information by generic 2-dimensional quantum systems
called “qubits”, and classical channels by noisy quantum channels. Current research in this field
concentrates on understanding and characterising various types of capacities of quantum channels,
quantifying quantum correlations such as entanglement, and revisiting the foundations of quantum
mechanics from an information theory perspective. Closely related to this is the field of Quantum
Cryptography, whose protocols are already being tested in realistic conditions. Here one would like
to transmit or share information securely, by using the fact that quantum states cannot be learned
without being disturbed.
Quantum Computation and Quantum Error Correction. An algorithm on a classical computer
consists of a sequence of simple operations called gates performed on a register of bits (variables
with values 0 or 1). The result of the computation is given by the values of the bits at the end of
the algorithm. In a quantum computer, the bits are replaced by qubits, whose state is manipulated
by performing a sequence of quantum operations called quantum gates on one or two qubits. The
last step of a quantum algorithm is to measure all qubits, producing a sequence of classical bits.
Since the result is random, quantum algorithms give the right answer with a certain probability, so
the operation may need to be repeated several times. One of the most surprising results in quantum
computation is Shor’s algorithm [6], which shows that the problem of factoring large numbers could
be solved efficiently on a quantum computer, while it is believed to be a “hard” problem in the
classical set-up. Ongoing theoretical efforts point in the direction of other exciting applications in
learning and artificial intelligence, medicine and environmental science.
The promise of immense computational power offered by quantum computers motivates the
experimentalists’ effort in engineering the building blocks and giving proofs of principle of
their working. The scalability of the different physical architectures remains however an
important challenge! In a realistic setting most quantum operations are imperfect or noisy, and the
cumulative effect of the gate errors leads to decoherence and failure of the computation. Quantum
error correction is an important sub-field of quantum computation which deals with the encoding
of logical qubits, and the detection and correction of quantum errors. Recently, tech giants like
Google, Microsoft and IBM have joined the race to build a quantum computer, and Google is on
track to build a 72-qubit quantum computer. IBM has developed the "Q Experience" website
(https://www.research.ibm.com/ibm-q/) which currently allows external users to test and program a
16-qubit processor.
Quantum Probability, Statistics and Metrology. Quantum probability aims at understanding
quantum phenomena in the light of classical probabilistic concepts. This allows the transfer of
ideas and techniques from classical to quantum and forms a mathematical framework for related
fields such as quantum stochastic control theory. As an example of the deep connections unveiled
in this way, the quantum stochastic calculus [7] (developed in part by Robin Hudson in Nottingham)
allows one to interpret the dynamics of an atom interacting with the electromagnetic field as a
non-commutative version of a stochastic dynamics driven by Brownian motion. Quantum noise,
quantum Markov processes, stochastic Schrödinger equations, have become standard probabilistic
topics in quantum physics [8].
Quantum Statistics deals primarily with making statistical inference from the quantum measurement
data. While quantum mechanics specifies the direct map from quantum states to probability distri-
butions, quantum statistics is about the inverse problem of estimating an unknown state from the
random outcomes of quantum measurements [9]. In the quest for preparing new and exotic quantum
states (e.g. highly entangled states of ions, or highly squeezed states of light), the experimentalist
has to certify the result by performing a statistical reconstruction of the state, which often turns out
to be a very challenging problem in itself [10].
Quantum systems can also be used as high precision probes for the measurement of unknown pa-
rameters. For instance, the spin of a particle is sensitive to a magnetic field B; by preparing an
ensemble of n particles in a given spin direction and measuring them after the interaction with
the magnetic field we obtain information about the strength of the field. One of the key findings
of Quantum Metrology is that the estimation precision can be significantly improved by preparing
the systems in special entangled states [11]. Significant experimental efforts are dedicated to the
creation of measurement devices which are able to achieve quantum enhanced precision.
Quantum Control Engineering. In classical Filtering Theory one tries to make a good estimate of
a signal of interest which is not accessible, from another noisy signal which can be measured and
is correlated with the former. Viacheslav Belavkin (also formerly in Nottingham!) was the first to
realise that filtering theory offers the right mathematical perspective for understanding the problem
of continuous time measurements, and that of “quantum jumps”. Since most measurements are
performed indirectly (we do not measure an atom but the light emitted by it), we can think of the
system’s dynamics as the inaccessible signal, while the outcomes of measuring the environment are
the accessible signal. Then the “quantum jumps” represent the system’s conditional evolution, given
the information contained in the outcomes [12]. Quantum (stochastic) control goes one step further,
and aims at controlling the behaviour of the system by acting on it in a real time feedback loop
which takes into account this information. Judging by the role played by classical control theory
in technological development, starting with the Watt governor, it is probably safe to assume that
quantum control theory may play a similar role in the quantum technological revolution! [13]

REFERENCES
[1] E.P. Wigner, The unreasonable effectiveness of mathematics in the natural sciences, Communi-
cations on Pure and Applied Mathematics 13 1-14 (1960)
[2] J.P. Dowling and G. Milburn, Quantum technology: the second quantum revolution, Phil. Trans.
R. Soc. Lond. A 361 1655-1674 (2003)
[3] E. Schrödinger, Are there quantum jumps?, The British Journal for the Philosophy of Science 3
233-242 (1952)
[4] R. P. Feynman, Simulating physics with computers, Int. J. Theor. Phys. 21 467 (1982)
[5] C.E. Shannon, A mathematical theory of communication, Bell System Technical Journal 27
379-423 and 623-656 (1948)
[6] P.W. Shor, Algorithms for quantum computation: discrete logarithms and factoring, in Proceed-
ings, 35th Annual Symposium on Foundations of Computer Science, IEEE Press, Los Alamitos, CA
(1994)
[7] K. R. Parthasarathy, An introduction to quantum stochastic calculus, Birkhäuser (1992)
[8] C. W. Gardiner and P. Zoller, Quantum noise: a handbook of Markovian and non-Markovian
quantum stochastic methods with applications to quantum optics, Springer (2010)
[9] O. E. Barndorff-Nielsen, R. D. Gill and P. E. Jupp, On quantum statistical inference (with
discussion), J. R. Statist. Soc. B 65 775-816 (2003)
[10] H. Häffner et al, Scalable multiparticle entanglement of trapped ions, Nature 438 643-646
(2005)
[11] V. Giovannetti, S. Lloyd and L. Maccone, Advances in quantum metrology, Nature Photonics
5 222-229 (2011)
[12] H. M. Wiseman and G. Milburn, Quantum Measurement and Control, Cambridge University
Press (2009)
[13] H. Mabuchi and N. Khaneja, Principles and applications of control in quantum systems, Int. J.
Robust Nonlinear Control 15 647-667 (2005)

Introduction to Quantum Information Science (G14QIS)

Lecture 2: Hilbert spaces

Abstract: The Hilbert space is a key mathematical concept for quantum mechanics; each quantum system
has an associated Hilbert space whose vectors describe the states of the system. In this lecture we recap basic
notions of Hilbert spaces like orthonormal basis, Fourier decomposition, projection on a subspace.

1 Column and row vectors


Recall that the n-dimensional complex linear space $\mathbb{C}^n$ is the space of column vectors of the form
\[
z := \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix} \tag{1}
\]
with $z_1, \dots, z_n$ arbitrary complex numbers. In these lectures we adopt the physics (Dirac) convention and we
denote the column vector $z$ by the “ket” symbol $|z\rangle$. The linear structure on $\mathbb{C}^n$ is that of entry-wise addition
and multiplication by complex scalars:
\[
\alpha|z\rangle + \beta|w\rangle
= \alpha \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix}
+ \beta \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix}
= |\alpha z + \beta w\rangle
= \begin{pmatrix} \alpha z_1 + \beta w_1 \\ \alpha z_2 + \beta w_2 \\ \vdots \\ \alpha z_n + \beta w_n \end{pmatrix},
\qquad \alpha, \beta \in \mathbb{C}.
\]

By transposing the column vector (1) we obtain the row vector
\[
z^T = (z_1, \dots, z_n), \tag{2}
\]
and a similar rule applies for addition and multiplication by scalars. If in addition we complex conjugate all
entries we obtain a row vector $z^*$ called the adjoint of $z$:
\[
z^* = (\bar{z}_1, \dots, \bar{z}_n).
\]
In the Dirac notation the adjoint is denoted as the “bra” vector $\langle z|$. The advantage of this convention is that
many calculations become more transparent, the first instance of this being the inner product which is usually
defined by
\[
\langle z|w\rangle = z^* \cdot w = \sum_{j=1}^{n} \bar{z}_j w_j .
\]

The inner product encapsulates the idea of a linear space in which we have a notion of orthogonality, or more
generally of angle between vectors, and length of a vector.
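The bra, the inner product and the length of a vector can be sketched in a few lines of plain Python. This is only an illustration: representing kets as lists of complex numbers, and the helper names `bra`, `inner` and `norm`, are assumptions of this example and not notation from the lectures.

```python
# Illustrative sketch: a ket in C^n is stored as a Python list of complex numbers.

def bra(z):
    """Adjoint of a column vector: every entry is complex-conjugated."""
    return [x.conjugate() for x in z]

def inner(z, w):
    """Inner product <z|w> = sum_j conj(z_j) * w_j."""
    if len(z) != len(w):
        raise ValueError("vectors must have the same dimension")
    return sum(zc * wj for zc, wj in zip(bra(z), w))

def norm(z):
    """Length of a vector, ||z|| = sqrt(<z|z>)."""
    return inner(z, z).real ** 0.5

z = [1 + 1j, 2j]
w = [3, 1 - 1j]

print(inner(z, w))   # <z|w>
print(inner(w, z))   # <w|z>, the complex conjugate of <z|w>
print(norm(z))       # sqrt(|1+i|^2 + |2i|^2) = sqrt(6)
```

Swapping the two arguments conjugates the result, which is the first sign that the complex inner product is not symmetric.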

Most calculations in these lectures can be carried out by using the concrete column-row representation of
vectors described above. However, from a conceptual point of view it is useful to think of such objects in a
more abstract, “coordinate-free” fashion. The goal of the first few lectures is to get used to working in this
mathematical framework, which will then be applied to quantum mechanics and quantum information.

2 Hilbert spaces
We start by defining the general notion of an inner product and that of a Hilbert space. In these lectures
the focus will be on complex rather than real Hilbert spaces, and all results will be formulated for finite
dimensional spaces. In a separate section we will briefly discuss the additional features involved in the
infinite dimensional case, which sometimes have to do with “technicalities”, but are often of a qualitative
nature, and become indispensable when describing continuous variables such as position and momentum of
a free particle.
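Definition 1 (Hilbert space). Let $\mathcal{H}$ be a finite dimensional complex linear space. An inner product on $\mathcal{H}$ is
a map $\langle\cdot|\cdot\rangle : \mathcal{H} \times \mathcal{H} \to \mathbb{C}$ satisfying the following conditions. For all $|u\rangle, |v\rangle, |w\rangle \in \mathcal{H}$, $\alpha \in \mathbb{C}$:

1. $\langle u|u\rangle \geq 0$
2. $\langle u|u\rangle = 0$ if and only if $|u\rangle = 0$.
3. $\langle u|v + w\rangle = \langle u|v\rangle + \langle u|w\rangle$.
4. $\langle u|\alpha v\rangle = \alpha\langle u|v\rangle$
5. $\langle u|v\rangle = \overline{\langle v|u\rangle}$.

The norm of a vector $|u\rangle$ is defined as $\|u\| = \sqrt{\langle u|u\rangle}$.
A space $\mathcal{H}$ with an inner product $\langle\cdot|\cdot\rangle$ is called a Hilbert space.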

Note that property 5. is different from its analogue in real vector spaces, $\langle u|v\rangle = \langle v|u\rangle$! In particular 5.
together with 4. imply that $\langle\alpha u|v\rangle = \bar{\alpha}\langle u|v\rangle$, so the inner product is linear in the second entry and anti-linear
in the first. Such forms are also called sesquilinear. When $\langle u|v\rangle = 0 = \langle v|u\rangle$, we say that the two vectors are
orthogonal to each other.
Example 2. These are two basic examples, one of a finite and one of an infinite dimensional Hilbert space:

1. $\mathbb{C}^n$ as space of column vectors $z := (z_1, z_2, \dots, z_n)^T$ with inner product
\[
\langle z|w\rangle = \sum_{j=1}^{n} \bar{z}_j w_j = z^* w
\]

2. $L^2(\mathbb{R})$: the space of square integrable complex valued functions on $\mathbb{R}$ with
\[
\langle f|g\rangle = \int_{-\infty}^{\infty} \overline{f(x)}\, g(x)\, dx.
\]

The inner product structure provides us with two important inequalities.


Lemma 3 (Cauchy-Schwarz and triangle inequality). Let $(V, \langle\cdot|\cdot\rangle)$ be an inner product space and let
$|u\rangle, |v\rangle$ be arbitrary vectors in $V$. Then the following inequalities hold
\[
|\langle u|v\rangle| \leq \|u\| \cdot \|v\| \qquad \text{Cauchy-Schwarz inequality} \tag{3}
\]
\[
\|u + v\| \leq \|u\| + \|v\| \qquad \text{triangle inequality} \tag{4}
\]

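Proof. We prove the Cauchy-Schwarz inequality for $\|v\| = 1$, from which the general case follows by
multiplication with a constant (the case $|v\rangle = 0$ is trivial). We write the vector $|u\rangle$ as
\[
|u\rangle = \langle v|u\rangle |v\rangle + |w\rangle
\]
where $|w\rangle$ is a vector which is orthogonal to $|v\rangle$, i.e. $\langle v|w\rangle = 0$ (check that this is true). Then
\[
\|u\|^2 = \langle \langle v|u\rangle v + w \,|\, \langle v|u\rangle v + w \rangle = |\langle v|u\rangle|^2 + \|w\|^2
\]
which implies that $|\langle v|u\rangle|^2 \leq \|u\|^2$.

The triangle inequality follows from the sequence of inequalities
\begin{align*}
\|u + v\|^2 &= \langle u + v|u + v\rangle = \|u\|^2 + \langle u|v\rangle + \langle v|u\rangle + \|v\|^2 \\
&= \|u\|^2 + 2\,\mathrm{Re}(\langle u|v\rangle) + \|v\|^2 \\
&\leq \|u\|^2 + 2|\langle u|v\rangle| + \|v\|^2 \\
&\leq \|u\|^2 + 2\|u\| \cdot \|v\| + \|v\|^2 = (\|u\| + \|v\|)^2
\end{align*}
where we have used that $\mathrm{Re}(z) \leq |z|$ in the first inequality and Cauchy-Schwarz in the second inequality.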
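A numerical sanity check of the two inequalities, and of the orthogonal decomposition used in the proof, might look as follows. It is a sketch in plain Python and no substitute for the proof; the helper names, the dimension 4, the fixed seed and the tolerance are all choices made for this illustration.

```python
import random

def inner(u, v):
    """<u|v> = sum_j conj(u_j) * v_j."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def norm(u):
    """||u|| = sqrt(<u|u>)."""
    return abs(inner(u, u)) ** 0.5

def random_ket(d, rng):
    """A vector in C^d whose real and imaginary parts are uniform in [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(d)]

def decompose(u, v):
    """For a unit vector |v>, split |u> = <v|u>|v> + |w>; returns (<v|u>, w)."""
    c = inner(v, u)
    w = [a - c * b for a, b in zip(u, v)]
    return c, w

def inequalities_hold(u, v, tol=1e-12):
    """Check Cauchy-Schwarz and the triangle inequality up to rounding error."""
    cauchy_schwarz = abs(inner(u, v)) <= norm(u) * norm(v) + tol
    triangle = norm([a + b for a, b in zip(u, v)]) <= norm(u) + norm(v) + tol
    return cauchy_schwarz and triangle

rng = random.Random(0)  # fixed seed so that the run is reproducible
pairs = [(random_ket(4, rng), random_ket(4, rng)) for _ in range(1000)]
all_hold = all(inequalities_hold(u, v) for u, v in pairs)

# The decomposition from the proof, for one pair with |v> normalised.
u, v = pairs[0]
v = [b / norm(v) for b in v]
c, w = decompose(u, v)
overlap = abs(inner(v, w))                                  # should vanish
pythagoras_gap = abs(norm(u) ** 2 - (abs(c) ** 2 + norm(w) ** 2))

print(all_hold, overlap < 1e-12, pythagoras_gap < 1e-12)
```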

Definition 4 (basis). Let $\mathcal{H}$ be a Hilbert space. A set of vectors $\{|u_1\rangle, \dots, |u_d\rangle\}$ is a
basis of $\mathcal{H}$ if the vectors are linearly independent and span the whole space, so that any vector $|u\rangle \in \mathcal{H}$ has
a unique decomposition of the form
\[
|u\rangle = \sum_{i=1}^{d} c_i |u_i\rangle,
\]
with complex coefficients $c_1, \dots, c_d$.


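Example 5. The column vectors
\[
|u\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |v\rangle = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]
form a basis in $\mathbb{C}^2$. Indeed any vector $|z\rangle$ can be written as
\[
|z\rangle = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
= (z_1 - z_2) \begin{pmatrix} 1 \\ 0 \end{pmatrix} + z_2 \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]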

Definition 6 (orthonormal basis (ONB)). A basis $\{|e_1\rangle, \dots, |e_d\rangle\}$ of $\mathcal{H}$ is called orthonormal if and only if
\[
\langle e_i|e_j\rangle = \delta_{ij}, \qquad \text{for all } i, j = 1, \dots, d. \tag{5}
\]

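Example 7. The “standard basis” in $\mathbb{C}^n$ consists of the vectors
\[
|e_1\rangle := \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad
|e_2\rangle := \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \cdots \quad
|e_n\rangle := \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}
\]
In this case, any column vector $|z\rangle = (z_1, \dots, z_n)^T$ can be written as
\[
|z\rangle = \sum_{i=1}^{n} z_i |e_i\rangle.
\]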

The fact that there exists an orthonormal basis can be shown by performing the so-called Gram-Schmidt
process, which starts with an arbitrary basis and produces an ONB as the result of an orthonormalisation
procedure (see Ex. 3, Ex. Sheet 1 for a 3 dimensional example). An immediate consequence of this definition
is the formula
\[
|u\rangle = \sum_{i=1}^{d} \langle e_i|u\rangle |e_i\rangle \tag{6}
\]

which shows that the coefficients of a vector $|u\rangle$ with respect to an ONB can be easily computed as inner
products with the basis vectors. In analogy to the theory of Fourier series, we will call $\langle e_i|u\rangle$ the Fourier
coefficients of $|u\rangle$. The proof of the identity (6) follows by taking inner product of both sides with the basis
vectors $|e_i\rangle$ and using the orthogonality condition (5). Furthermore, the inner product $\langle u|v\rangle$ and the norm
$\|u\|$ can also be expressed in terms of the Fourier coefficients as:
\[
\langle u|v\rangle = \sum_{i=1}^{d} \langle u|e_i\rangle \langle e_i|v\rangle
\]
\[
\|u\|^2 = \langle u|u\rangle = \sum_{i=1}^{d} |\langle e_i|u\rangle|^2 .
\]
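The orthonormalisation procedure and the Fourier decomposition can be tried out numerically. The sketch below is one possible way to write Gram-Schmidt in plain Python (it subtracts the components one at a time, which is equivalent in exact arithmetic); the function names and the sample basis of $\mathbb{C}^2$ are assumptions of this example, not taken from the exercise sheet.

```python
def inner(u, v):
    """<u|v> = sum_j conj(u_j) * v_j."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def norm(u):
    """||u|| = sqrt(<u|u>)."""
    return abs(inner(u, u)) ** 0.5

def gram_schmidt(basis, tol=1e-12):
    """Turn a list of linearly independent vectors into an orthonormal list."""
    onb = []
    for v in basis:
        w = list(v)
        for e in onb:
            c = inner(e, w)                          # component of w along |e>
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = norm(w)
        if length < tol:
            raise ValueError("the input vectors are linearly dependent")
        onb.append([wi / length for wi in w])
    return onb

def fourier_coefficients(onb, u):
    """The coefficients <e_i|u> of |u> with respect to an ONB."""
    return [inner(e, u) for e in onb]

def reconstruct(onb, coeffs):
    """Rebuild sum_i c_i |e_i> from its coefficients."""
    d = len(onb[0])
    return [sum(c * e[k] for c, e in zip(coeffs, onb)) for k in range(d)]

basis = [[1, 1j], [1, 1]]            # a (non-orthogonal) basis of C^2
onb = gram_schmidt(basis)

u = [2 - 1j, 3j]
coeffs = fourier_coefficients(onb, u)
u_back = reconstruct(onb, coeffs)

# ||u||^2 should equal the sum of the squared moduli of the Fourier coefficients.
parseval_gap = abs(sum(abs(c) ** 2 for c in coeffs) - norm(u) ** 2)
print(onb, u_back, parseval_gap)
```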

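This means that for any given ONB, the Hilbert space $\mathcal{H}$ of dimension $d$ can be identified with $\mathbb{C}^d$ via the
correspondence
\[
|u\rangle = \sum_{i=1}^{d} u_i |e_i\rangle \in \mathcal{H}
\quad \longleftrightarrow \quad
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_d \end{pmatrix} \in \mathbb{C}^d .
\]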
In Hilbert space theory we often deal with (orthogonal) subspaces, and decompositions of vectors along
subspaces.
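Definition 8 (subspace, orthogonal complement). Let $\mathcal{H}$ be a Hilbert space. A linear subspace $\mathcal{K}$ of $\mathcal{H}$ is
a set of vectors such that
\[
|u\rangle, |v\rangle \in \mathcal{K} \implies \alpha|u\rangle + \beta|v\rangle \in \mathcal{K}, \qquad \text{for all } \alpha, \beta \in \mathbb{C}.
\]
The orthogonal complement of $\mathcal{K}$ is the linear subspace $\mathcal{K}^\perp$ defined by
\[
\mathcal{K}^\perp := \{ |v\rangle \in \mathcal{H} : \langle v|u\rangle = 0 \text{ for all } |u\rangle \in \mathcal{K} \}.
\]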

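We will show that each vector has a unique decomposition into its components in $\mathcal{K}$ and $\mathcal{K}^\perp$. Let us choose
an orthonormal basis $\{|e_1\rangle, \dots, |e_k\rangle, |f_1\rangle, \dots, |f_p\rangle\}$ in $\mathcal{H}$ such that $|e_i\rangle \in \mathcal{K}$ and $|f_j\rangle \in \mathcal{K}^\perp$. For any vector
$|u\rangle \in \mathcal{H}$ we can write the Fourier decomposition
\[
|u\rangle = \sum_{i=1}^{k} \langle e_i|u\rangle |e_i\rangle + \sum_{j=1}^{p} \langle f_j|u\rangle |f_j\rangle = |u_{\mathcal{K}}\rangle + |u_{\mathcal{K}^\perp}\rangle
\]
with two unique components $|u_{\mathcal{K}}\rangle \in \mathcal{K}$ and $|u_{\mathcal{K}^\perp}\rangle \in \mathcal{K}^\perp$. The geometric interpretation of these components
is given by the following lemma which is left as an exercise.
Lemma 9. Let $\mathcal{H}$ be a Hilbert space, and let $\mathcal{K}$ be a linear subspace of $\mathcal{H}$. Then for any $|u\rangle \in \mathcal{H}$
\[
\min_{|v\rangle \in \mathcal{K}} \|u - v\| = \|u_{\mathcal{K}^\perp}\|
\]
and the minimum is achieved for $|v\rangle = |u_{\mathcal{K}}\rangle$.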

Next lecture we revisit the map |ui 7! |uK i when defining the notion of orthogonal projection.
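Before attempting the proof of Lemma 9, it can help to see the statement at work on numbers. The following plain Python sketch computes the two components of a vector for a two-dimensional subspace of $\mathbb{C}^3$ and compares $\|u_{\mathcal{K}^\perp}\|$ with the distance from $|u\rangle$ to randomly chosen vectors of the subspace. The subspace, the vector and all helper names are made up for this illustration, and a random search of course proves nothing.

```python
import random

def inner(u, v):
    """<u|v> = sum_j conj(u_j) * v_j."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def norm(u):
    """||u|| = sqrt(<u|u>)."""
    return abs(inner(u, u)) ** 0.5

def component_in(onb_K, u):
    """u_K = sum_i <e_i|u> |e_i> for an orthonormal basis {e_i} of the subspace K."""
    u_K = [0j] * len(u)
    for e in onb_K:
        c = inner(e, u)
        u_K = [a + c * b for a, b in zip(u_K, e)]
    return u_K

s = 2 ** -0.5
onb_K = [[s, s * 1j, 0], [0, 0, 1]]   # orthonormal basis of a 2-dimensional subspace K of C^3
u = [1, 2, 3j]

u_K = component_in(onb_K, u)
u_perp = [a - b for a, b in zip(u, u_K)]

# u_perp should be orthogonal to every basis vector of K.
overlaps = [abs(inner(e, u_perp)) for e in onb_K]

# Distance from u to random vectors a*e_1 + b*e_2 of K, compared with ||u_perp||.
rng = random.Random(1)
distances = []
for _ in range(500):
    a = complex(rng.uniform(-3, 3), rng.uniform(-3, 3))
    b = complex(rng.uniform(-3, 3), rng.uniform(-3, 3))
    v = [a * x + b * y for x, y in zip(onb_K[0], onb_K[1])]
    distances.append(norm([p - q for p, q in zip(u, v)]))

print(norm(u_perp), min(distances))
```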

3 Infinite dimensional Hilbert spaces
Here we collect some brief remarks on how the notions introduced so far extend to infinite dimensional
spaces. This section is not examinable.
Completeness. In infinite dimensions, a space with an inner product (Definition 1) is a Hilbert space if
it satisfies the additional completeness property that any Cauchy sequence of vectors converges to a limit
vector. For instance, the space of continuous functions $C([0, 1])$ can be equipped with an inner product
\[
\langle f|g\rangle = \int_0^1 \overline{f(x)}\, g(x)\, dx
\]
but is not complete. For example one can approximate the indicator function $\chi_{[1/3,2/3]}$ arbitrarily well by
continuous functions but it is not a continuous function itself. An incomplete space such as $C([0, 1])$ can be
completed by a standard procedure which effectively amounts to enlarging the space by adding the limits of all
Cauchy sequences. In the case of $C([0, 1])$ the result is the space of square integrable functions $L^2([0, 1])$
which carries the same inner product. A very useful ONB in $L^2([0, 1])$ is the (infinite) Fourier basis which
consists of the vectors (functions) $|c_0\rangle = 1$, and $|c_n\rangle = \sqrt{2}\cos(2\pi n x)$, $|s_n\rangle = \sqrt{2}\sin(2\pi n x)$ for $n =
1, 2, \dots$ so that any function can be expanded as
\[
f = \langle 1|f\rangle + \sum_{n=1}^{\infty} \langle c_n|f\rangle \sqrt{2}\cos(2\pi n x) + \sum_{n=1}^{\infty} \langle s_n|f\rangle \sqrt{2}\sin(2\pi n x)
\]

4 Summary of (finite) Hilbert spaces

Notation / Property                                    Description

$|x\rangle$                                            vector in the Hilbert space, also known as “ket”;
                                                       can be thought of as the column vector of its coefficients

$\langle y|x\rangle$                                   inner product between the vectors $|y\rangle$ and $|x\rangle$

$\langle y|$                                           vector in the dual space of linear functionals, also known as “bra”;
                                                       its action on the Hilbert space is $\langle y| : |x\rangle \mapsto \langle y|x\rangle$;
                                                       can be thought of as a row vector

$\{|e_1\rangle, \dots, |e_d\rangle\}$                  if the vectors are linearly independent and span the space, this is a basis;
                                                       if additionally $\langle e_i|e_j\rangle = \delta_{ij}$, this denotes an orthonormal basis (ONB)

$|x\rangle = \sum_{i=1}^{d} \langle e_i|x\rangle |e_i\rangle$    Fourier decomposition of a vector $|x\rangle$ with respect to an
                                                       orthonormal basis $\{|e_1\rangle, \dots, |e_d\rangle\}$

$\mathcal{K}, \mathcal{K}^\perp \subset \mathcal{H}$   a linear subspace $\mathcal{K}$ of $\mathcal{H}$ and its orthogonal complement $\mathcal{K}^\perp$

$|x\rangle = |x_{\mathcal{K}}\rangle + |x_{\mathcal{K}^\perp}\rangle$    unique decomposition of a vector with $|x_{\mathcal{K}}\rangle \in \mathcal{K}$ and $|x_{\mathcal{K}^\perp}\rangle \in \mathcal{K}^\perp$


Lecture 3: Linear operators on Hilbert spaces

Abstract: In quantum mechanics, observables are described in terms of linear transformations (operators)
on the space of states. In this lecture we review the concept of linear operators on Hilbert spaces, and their
matrix representation.

1 Matrices
The rows and columns introduced in Lecture 2 are special cases of rectangular matrices. An $m \times n$ complex
matrix is an array of complex numbers with $m$ rows and $n$ columns:
\[
A = \begin{pmatrix}
A_{11} & A_{12} & \dots & A_{1n} \\
A_{21} & A_{22} & \dots & A_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
A_{m1} & A_{m2} & \dots & A_{mn}
\end{pmatrix}
\]
so in particular the column vector is an $n \times 1$ matrix and the row vector is a $1 \times n$ matrix.
Throughout the lectures we will denote the space of $m \times n$ complex matrices by $M_{m,n}$ or simply $M_n$ in the
special case of square $n \times n$ matrices. We will also write $A = [A_{ij}]$ to denote a matrix with matrix elements
$A_{ij}$. Let us recall the basic operations with matrices:
1. Transposition: if $A = [A_{ij}] \in M_{m,n}$ then its transpose is the matrix $A^T \in M_{n,m}$ with matrix elements
$(A^T)_{ij} := A_{ji}$.
2. Conjugation: if $A = [A_{ij}] \in M_{m,n}$ then its complex conjugate is the matrix $\bar{A} \in M_{m,n}$ with matrix
elements $(\bar{A})_{ij} := \overline{A_{ij}}$.
3. Adjoint: if $A = [A_{ij}] \in M_{m,n}$ then its adjoint $A^* \in M_{n,m}$ is obtained by taking transposition and
complex conjugation, $A^* = \bar{A}^T$.
4. Product: if $A = [A_{ij}] \in M_{m,n}$ and $B = [B_{ij}] \in M_{n,k}$ then the product $AB$ is the matrix in $M_{m,k}$ with
elements
\[
(AB)_{il} = \sum_{j=1}^{n} A_{ij} B_{jl}
\]

The product rule is best understood if we think of matrices as linear transformations rather than arrays of
complex numbers. Let us identify the space of column vectors $M_{n,1}$ with $\mathbb{C}^n$ as before, and define the linear
transformation $\mathbf{A} : \mathbb{C}^n \to \mathbb{C}^m$ associated to the matrix $A$, by its action on a column vector $z \in \mathbb{C}^n$:
\[
\mathbf{A} : z \mapsto Az \in \mathbb{C}^m
\]
or on components: $(Az)_i = \sum_j A_{ij} z_j$. With this identification, the product $AB$ becomes the matrix
associated to the composition of the linear transformations $\mathbf{A}$ and $\mathbf{B}$:
\[
\mathbf{A}\mathbf{B} : \mathbb{C}^k \xrightarrow{\ \mathbf{B}\ } \mathbb{C}^n \xrightarrow{\ \mathbf{A}\ } \mathbb{C}^m .
\]
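The four matrix operations and the composition rule can be written down directly from the formulas above. The sketch below uses plain Python with matrices stored as lists of rows; the function names and the two sample matrices are assumptions of this illustration.

```python
def transpose(A):
    """(A^T)_ij = A_ji."""
    return [list(column) for column in zip(*A)]

def conjugate(A):
    """Entry-wise complex conjugation."""
    return [[a.conjugate() for a in row] for row in A]

def adjoint(A):
    """A^* is the conjugate transpose of A."""
    return transpose(conjugate(A))

def matmul(A, B):
    """(AB)_il = sum_j A_ij B_jl for A in M_{m,n} and B in M_{n,k}."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [[sum(A[i][j] * B[j][l] for j in range(len(B)))
             for l in range(len(B[0]))]
            for i in range(len(A))]

def apply(A, z):
    """(Az)_i = sum_j A_ij z_j: the action of the linear map on a column vector."""
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]

A = [[1, 1j, 0], [2, 0, 1 - 1j]]      # 2 x 3 matrix, a map C^3 -> C^2
B = [[1, 0], [0, 1j], [1, 1]]         # 3 x 2 matrix, a map C^2 -> C^3
z = [1 + 1j, 2]

composed = apply(matmul(A, B), z)     # (AB) z
stepwise = apply(A, apply(B, z))      # A (B z): first B, then A

print(composed, stepwise)
print(adjoint(matmul(A, B)), matmul(adjoint(B), adjoint(A)))
```

Both lines print matching results: applying $AB$ is the same as applying $B$ first and then $A$, and the adjoint of a product reverses the order of the factors.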

To make the distinction between linear transformations and their associated matrices we will use boldface
letters (e.g. $\mathbf{A}$) for the former and normal letters (e.g. $A$) for the latter. Linear algebra and matrix analysis are
fundamental subjects in mathematics with numerous applications in physics, probability theory, engineering,
etc. While most of the concepts encountered in this module can be formulated in terms of matrices, there are
several reasons why it is preferable to work with linear transformations (or operators), and use matrices only
as representations of such transformations with respect to some particular basis:
1. many notions of quantum physics have a natural interpretation as transformations, e.g. the dynamics of
a quantum system is determined by a unitary transformation, a quantum channel is a completely positive
transformation over states.
2. the matrix formalism becomes inadequate when dealing with infinite dimensional spaces such as $L^2(\mathbb{R})$,
the state space of a free quantum particle. John von Neumann was the first to realise that quantum mechanics
should be founded on the theory of Hilbert spaces and linear operators on such spaces, rather than infinite
matrices as first introduced by Heisenberg.
3. the linear operator formalism is basis-free, which often offers a greater conceptual clarity.

2 Linear operators on Hilbert spaces


All Hilbert spaces of a given dimension are isomorphic, i.e. they are related by one to one linear maps which
preserve the inner products. Due to this fact the Hilbert spaces are less interesting in themselves and should
rather be seen as the scene, the framework, for the action of linear operators.
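Definition 1 (linear operator). Let $(\mathcal{H}, \langle\cdot|\cdot\rangle)$ be a Hilbert space. A linear operator on $\mathcal{H}$ is a map $\mathbf{A} : \mathcal{H} \to \mathcal{H}$
satisfying the linearity condition
\[
\mathbf{A}(\alpha|x\rangle + \beta|y\rangle) = \alpha\mathbf{A}|x\rangle + \beta\mathbf{A}|y\rangle,
\]
for all $|x\rangle, |y\rangle \in \mathcal{H}$ and $\alpha, \beta \in \mathbb{C}$.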

Any linear operator on a d-dimensional Hilbert space $\mathcal{H}$ can be represented as a matrix. Indeed if we choose
a basis $\{|v_1\rangle, \dots, |v_d\rangle\}$ of vectors in $\mathcal{H}$, then we can write
\[
\mathbf{A}|v_i\rangle = \sum_{j=1}^{d} A_{ji} |v_j\rangle, \qquad i = 1, \dots, d \tag{1}
\]
for some unique coefficients $A_{ji}$ (see below for the reason why the indices are in this order). If
\[
|x\rangle := \sum_{i=1}^{d} x_i |v_i\rangle
\]
is an arbitrary vector, then the action of $\mathbf{A}$ is the same as the left multiplication of the column vector of
coefficients $(x_1, \dots, x_d)^T$ by the matrix $A = [A_{ij}]$:
\[
\mathbf{A}|x\rangle = \sum_{i=1}^{d} \Big( \sum_{j=1}^{d} A_{ij} x_j \Big) |v_i\rangle.
\]
If additionally the basis $\{|v_1\rangle, \dots, |v_d\rangle\}$ is orthonormal, then by taking inner product with the vector
$|v_k\rangle$ in (1) we obtain the following expression of the matrix elements:
\[
A_{ki} = \langle v_k|\mathbf{A}|v_i\rangle.
\]

Example 2 (rank-one operators). Let $|u\rangle, |v\rangle$ be two vectors in $\mathcal{H}$. The operator $|u\rangle\langle v|$ is defined by
\[
|u\rangle\langle v| : |w\rangle \mapsto \langle v|w\rangle \cdot |u\rangle
\]
Its image is the one dimensional subspace $\mathbb{C}|u\rangle$ and the corresponding matrix has rank one. An arbitrary
operator $\mathbf{A}$ can be expressed as a linear combination of rank-one operators
\[
\mathbf{A} = \sum_{i,j=1}^{d} A_{ij} |v_i\rangle\langle v_j|,
\]
which can be verified by taking inner product with basis vectors on both sides.
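In the column-row picture the operator $|u\rangle\langle v|$ is the matrix with entries $u_i \bar{v}_j$. The plain Python sketch below checks its action on a vector and rebuilds a sample matrix from rank-one pieces in the standard basis of $\mathbb{C}^2$. The helper names and the sample vectors and matrix are assumptions made for this example.

```python
def inner(u, v):
    """<u|v> = sum_j conj(u_j) * v_j."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def ketbra(u, v):
    """Matrix of the rank-one operator |u><v|, with entries u_i * conj(v_j)."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]

def apply(A, z):
    """(Az)_i = sum_j A_ij z_j."""
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]

u = [1, 1j]
v = [1j, 2]
w = [3, 1 - 1j]

via_matrix = apply(ketbra(u, v), w)               # (|u><v|) |w>
via_definition = [inner(v, w) * ui for ui in u]   # <v|w> |u>

# Rebuild a matrix as sum_ij A_ij |e_i><e_j| in the standard basis of C^2.
A = [[1, 2j], [3, 4 - 1j]]
e = [[1, 0], [0, 1]]
rebuilt = [[0, 0], [0, 0]]
for i in range(2):
    for j in range(2):
        term = ketbra(e[i], e[j])
        rebuilt = [[rebuilt[r][c] + A[i][j] * term[r][c] for c in range(2)]
                   for r in range(2)]

# Matrix elements recovered as <e_k|A|e_i>.
elements = [[inner(e[k], apply(A, e[i])) for i in range(2)] for k in range(2)]

print(via_matrix, via_definition)
print(rebuilt, elements)
```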

Until now we have discussed only operators on a single Hilbert space $\mathcal{H}$. This can be easily extended
to linear operators acting between different spaces, $\mathbf{A} : \mathcal{H}_1 \to \mathcal{H}_2$, which will be represented as rectangular
matrices $A_{ij} = \langle v_i|\mathbf{A}|e_j\rangle$ for two fixed ONBs, $\{|e_j\rangle\}$ in $\mathcal{H}_1$ and $\{|v_i\rangle\}$ in $\mathcal{H}_2$. A special case is that of a linear
functional $\ell : \mathcal{H} \to \mathbb{C}$ which is always of the form (prove this!)
\[
\ell_u(|x\rangle) = \langle u|x\rangle
\]
for a certain vector $|u\rangle \in \mathcal{H}$. Its action suggests denoting $\ell_u$ by a “bra” vector $\langle u|$, which does not belong to
$\mathcal{H}$ but rather to the dual of $\mathcal{H}$, the space of linear functionals on $\mathcal{H}$.

3 The adjoint
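Definition 3 (adjoint). Let $\mathbf{A}$ be an operator acting on the Hilbert space $\mathcal{H}$. The adjoint of $\mathbf{A}$, denoted $\mathbf{A}^*$,
is the unique operator satisfying
\[
\langle x|\mathbf{A}^*|y\rangle = \overline{\langle y|\mathbf{A}|x\rangle}, \qquad \text{for all } |x\rangle, |y\rangle \in \mathcal{H}. \tag{2}
\]
An operator for which $\mathbf{A} = \mathbf{A}^*$ is called self-adjoint.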

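Let $A_{ij} := \langle v_i|\mathbf{A}|v_j\rangle$ be the matrix elements of $\mathbf{A}$ with respect to an orthonormal basis $\{|v_i\rangle\}$. It follows
from the definition that the matrix elements of $\mathbf{A}^*$ are $\langle v_i|\mathbf{A}^*|v_j\rangle = \overline{A_{ji}} = (A^*)_{ij}$, which agrees with the definition of
adjoint for matrices introduced earlier. The adjoint has the following properties which follow from their
matrix analogues:

1. $(a\mathbf{A} + b\mathbf{B})^* = \bar{a}\mathbf{A}^* + \bar{b}\mathbf{B}^*$

2. $(\mathbf{A}\mathbf{B})^* = \mathbf{B}^*\mathbf{A}^*$
3. $(\mathbf{A}^*)^* = \mathbf{A}$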
Remark 4. For real valued matrices, the adjoint is the same as transpose, while for complex matrices one
needs to perform an additional complex conjugation. From the following matrices
\[
A := \begin{pmatrix} 2 & 1+i \\ 1-i & 1 \end{pmatrix}, \qquad
B := \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}, \qquad
C := \begin{pmatrix} 1+i & 1 \\ 1 & 1-i \end{pmatrix}
\]
only $A$ is self-adjoint (hermitian).
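The claim of Remark 4 is easy to test by computing the conjugate transpose of each matrix. In the plain Python sketch below the entries are taken as $A = \begin{pmatrix} 2 & 1+i \\ 1-i & 1 \end{pmatrix}$, $B = \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$, $C = \begin{pmatrix} 1+i & 1 \\ 1 & 1-i \end{pmatrix}$; the placement of the minus signs is an assumption of this example, chosen to be consistent with the statement that only $A$ is self-adjoint. The function names are likewise illustrative.

```python
def adjoint(M):
    """Conjugate transpose: (M^*)_ij = conj(M_ji)."""
    return [[M[j][i].conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]

def is_self_adjoint(M):
    """True when M equals its conjugate transpose."""
    return adjoint(M) == M

def sandwich(x, M, y):
    """The number <x|M|y> = sum_ij conj(x_i) M_ij y_j."""
    return sum(x[i].conjugate() * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

# Entries assumed as stated in the lead-in.
A = [[2, 1 + 1j], [1 - 1j, 1]]
B = [[1, 1j], [1j, 1]]
C = [[1 + 1j, 1], [1, 1 - 1j]]

for name, M in (("A", A), ("B", B), ("C", C)):
    print(name, is_self_adjoint(M))

# The defining relation <x|M^*|y> = conj(<y|M|x>) for one pair of vectors.
x = [1, 2j]
y = [1 - 1j, 3]
lhs = sandwich(x, adjoint(B), y)
rhs = sandwich(y, B, x).conjugate()
print(lhs, rhs)
```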


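Remark 5. Sometimes we will use $|\mathbf{A}x\rangle$ as alternative notation for the vector $\mathbf{A}|x\rangle$. From the definition of
the adjoint we deduce the following useful property
\[
\langle x|\mathbf{A}y\rangle = \langle x|\mathbf{A}|y\rangle = \overline{\langle y|\mathbf{A}^*|x\rangle} = \overline{\langle y|\mathbf{A}^*x\rangle} = \langle \mathbf{A}^*x|y\rangle.
\]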

4 Projections
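We have seen that for each linear subspace $\mathcal{K}$ of a Hilbert space $\mathcal{H}$ there exists a unique decomposition of a
vector $|x\rangle$ into its component in $\mathcal{K}$ and that orthogonal to $\mathcal{K}$:
\[
|x\rangle = |x_{\mathcal{K}}\rangle + |x_{\mathcal{K}^\perp}\rangle, \qquad |x_{\mathcal{K}}\rangle \in \mathcal{K}, \quad |x_{\mathcal{K}^\perp}\rangle \in \mathcal{K}^\perp.
\]
We define the linear operator $\mathbf{P}_{\mathcal{K}}$ called the orthogonal projection onto $\mathcal{K}$ by
\[
\mathbf{P}_{\mathcal{K}} : |x\rangle \mapsto |x_{\mathcal{K}}\rangle.
\]
It follows immediately that $\mathbf{P}_{\mathcal{K}} \cdot \mathbf{P}_{\mathcal{K}} = \mathbf{P}_{\mathcal{K}}^2 = \mathbf{P}_{\mathcal{K}}$, and $\mathbf{P}_{\mathcal{K}} = \mathbf{P}_{\mathcal{K}}^*$ since
\[
\langle x|\mathbf{P}_{\mathcal{K}}|y\rangle = \langle x_{\mathcal{K}} + x_{\mathcal{K}^\perp}|\mathbf{P}_{\mathcal{K}}|y_{\mathcal{K}} + y_{\mathcal{K}^\perp}\rangle = \langle x_{\mathcal{K}}|y_{\mathcal{K}}\rangle = \overline{\langle y_{\mathcal{K}}|x_{\mathcal{K}}\rangle} = \overline{\langle y|\mathbf{P}_{\mathcal{K}}|x\rangle} = \langle x|\mathbf{P}_{\mathcal{K}}^*|y\rangle.
\]
We leave it as an exercise to show that any operator with these two properties must be an orthogonal
projection onto the range $\mathrm{Ran}(\mathbf{P}) := \{\mathbf{P}|x\rangle : |x\rangle \in \mathcal{H}\} = [\mathbf{P}\mathcal{H}]$, and make the following equivalent definition
which does not refer to the subspace $\mathcal{K}$.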
Definition 6 (projection). Let H be a Hilbert space. A projection operator P is a linear operator on H
satisfying

1. P = P⇤
2. P2 = P · P = P
Example 7. Let |xi be a unit vector (kxk = 1). Then the rank-one operator |xihx| is a projection onto the
one-dimensional subspace C|xi spanned by |xi.

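In the general case where $\mathcal{K} := \mathrm{Ran}(P) \equiv \{P|x\rangle : |x\rangle \in \mathcal{H}\}$ is multidimensional, we can choose an arbitrary orthonormal basis $\{|k_1\rangle, \dots, |k_r\rangle\}$ in $\mathcal{K}$ and express $P$ as the sum of one-dimensional projections onto these vectors

$$P = \sum_{i=1}^{r} |k_i\rangle\langle k_i|.$$

Indeed, by choosing an arbitrary orthonormal basis $\{|w_1\rangle, \dots, |w_p\rangle\}$ in $\mathcal{K}^\perp$ we can write any vector as a Fourier sum over all basis elements

$$|x\rangle = \sum_{i=1}^{r} \langle k_i|x\rangle |k_i\rangle + \sum_{j=1}^{p} \langle w_j|x\rangle |w_j\rangle = \Big(\sum_{i=1}^{r} |k_i\rangle\langle k_i|\Big)|x\rangle + |x_{\mathcal{K}^\perp}\rangle = P|x\rangle + |x_{\mathcal{K}^\perp}\rangle.$$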
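As an illustration of the formula $P = \sum_i |k_i\rangle\langle k_i|$, the sketch below builds the projection onto a two-dimensional subspace of $\mathbb{C}^3$ and checks the two defining properties of Definition 6. The particular vectors `k1`, `k2` and all helper names are assumptions made for the example.

```python
import math

def outer(x, y):
    """Rank-one operator |x><y| as a matrix."""
    return [[a * complex(b).conjugate() for b in y] for a in x]

def add(m1, m2):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def matmul(m1, m2):
    """Product of two square matrices of equal size."""
    n = len(m1)
    return [[sum(m1[i][k] * m2[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(m):
    """Conjugate transpose."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def close(m1, m2, tol=1e-12):
    """Entrywise comparison up to rounding errors."""
    return all(abs(a - b) < tol for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))

# Two orthonormal vectors spanning the subspace K (illustrative choice).
s = 1 / math.sqrt(2)
k1 = [1, 0, 0]
k2 = [0, s, 1j * s]

P = add(outer(k1, k1), outer(k2, k2))

idempotent = close(matmul(P, P), P)      # P^2 = P
self_adjoint = close(adjoint(P), P)      # P = P*
# The trace of a projection equals the dimension of its range.
rank = sum(P[i][i] for i in range(3)).real

print(idempotent, self_adjoint, round(rank, 12))
```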

5 Normal and unitary operators


Definition 8. Let $\mathcal{H}$ be a Hilbert space. An operator $N$ on $\mathcal{H}$ is called normal if $N^*N = NN^*$. An operator $U$ is called unitary if $UU^* = U^*U = I$, where $I$ is the identity operator $I : |x\rangle \mapsto |x\rangle$.

From the definition it follows that a unitary $U$ is invertible and its inverse is $U^{-1} = U^*$. Moreover, unitary operators have the property that both $U$ and $U^*$ preserve the inner products (are isometric):

$$\langle Ux|Uy\rangle = \langle x|U^*Uy\rangle = \langle x|y\rangle = \langle x|UU^*y\rangle = \langle U^*x|U^*y\rangle, \quad |x\rangle, |y\rangle \in \mathcal{H},$$

which is the complex analogue of the property $\langle Ox|Oy\rangle = \langle x|y\rangle$ for orthogonal operators on real vector spaces. If $\{|v_1\rangle, \dots, |v_d\rangle\}$ is an ONB in $\mathcal{H}$, then by the isometry property we find that $\{U|v_1\rangle, \dots, U|v_d\rangle\}$ is also an ONB, so unitary transformations can be seen as changes of ONBs. In fact the converse is also true: for each given pair of ONBs there is a unique unitary which transforms the vectors of the first basis into those of the second basis (exercise).
Normal operators include selfadjoint and unitary operators; an example of a normal operator which is in general neither of these is the partial isometry

$$V := \sum_{i=1}^{k} e^{i\phi_i} |v_i\rangle\langle v_i|$$

where $e^{i\phi_i} \in \mathbb{C}$ are some phases, and $k$ is strictly smaller than the dimension $d$ of $\mathcal{H}$. The spectral theorem will give the general characterisation of normal operators.

6 Trace
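Recall that the trace of a matrix $A \in M_d$ is given by

$$\mathrm{Tr}(A) := \sum_{i=1}^{d} A_{ii}.$$

Since any linear operator $A : \mathcal{H} \to \mathcal{H}$ on a $d$-dimensional Hilbert space $\mathcal{H}$ can be represented as a matrix with elements $A_{ij} = \langle v_i|A|v_j\rangle$, we will define its trace as the trace of this matrix

$$\mathrm{Tr}(A) := \sum_{i=1}^{d} A_{ii} = \sum_{i=1}^{d} \langle v_i|A|v_i\rangle.$$

The trace has the following properties:

1. linearity: $\mathrm{Tr}(aA + bB) = a\mathrm{Tr}(A) + b\mathrm{Tr}(B)$

2. cyclicity: $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$

3. basis independence: if $\{|w_1\rangle, \dots, |w_d\rangle\}$ is another ONB in $\mathcal{H}$ then
$$\mathrm{Tr}(A) = \sum_{i=1}^{d} \langle v_i|A|v_i\rangle = \sum_{i=1}^{d} \langle w_i|A|w_i\rangle$$

4. $\mathrm{Tr}(|x\rangle\langle y|) = \langle y|x\rangle$ for all $|x\rangle, |y\rangle \in \mathcal{H}$

5. If $P_x := |x\rangle\langle x|$ and $P_y := |y\rangle\langle y|$ are two one-dimensional projections (with $\|x\| = \|y\| = 1$) then $\mathrm{Tr}(P_x P_y) = |\langle x|y\rangle|^2$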

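Proof.
1. follows immediately from the definition.
2. we make use of the following useful trick: we write $AB$ as $A \cdot I \cdot B$ and then write the identity operator as $I = \sum_{j=1}^{d} |v_j\rangle\langle v_j|$. Then

$$\begin{aligned}
\mathrm{Tr}(AB) &= \sum_{i=1}^{d} \langle v_i|AB|v_i\rangle = \sum_{i=1}^{d} \Big\langle v_i \Big| A \Big(\sum_{j=1}^{d} |v_j\rangle\langle v_j|\Big) B \Big| v_i \Big\rangle \\
&= \sum_{i,j=1}^{d} \langle v_i|A|v_j\rangle \cdot \langle v_j|B|v_i\rangle = \sum_{j=1}^{d} \Big\langle v_j \Big| B \Big(\sum_{i=1}^{d} |v_i\rangle\langle v_i|\Big) A \Big| v_j \Big\rangle \\
&= \mathrm{Tr}(BA).
\end{aligned}$$

3. we use the fact that there exists a (unique) unitary $U$ such that $U|v_i\rangle = |w_i\rangle$ for all $i = 1, \dots, d$. Then

$$\sum_{i=1}^{d} \langle w_i|A|w_i\rangle = \sum_{i=1}^{d} \langle Uv_i|A|Uv_i\rangle = \sum_{i=1}^{d} \langle v_i|U^*AU|v_i\rangle = \mathrm{Tr}(U^*AU) = \mathrm{Tr}(UU^*A) = \mathrm{Tr}(A)$$

where in the last two steps we used the cyclicity of the trace and the fact that $U$ is unitary.
4. We use the handy Dirac notation and the decomposition of the identity $I = \sum_{i=1}^{d} |v_i\rangle\langle v_i|$ to get

$$\mathrm{Tr}(|x\rangle\langle y|) = \sum_{i=1}^{d} \langle v_i|\big(|x\rangle\langle y|\big)|v_i\rangle = \sum_{i=1}^{d} \langle v_i|x\rangle\langle y|v_i\rangle = \langle y|\Big(\sum_{i=1}^{d} |v_i\rangle\langle v_i|\Big)|x\rangle = \langle y|I|x\rangle = \langle y|x\rangle.$$

5. Similarly to point 4. we have

$$\mathrm{Tr}(P_x P_y) = \sum_{i=1}^{d} \langle v_i|x\rangle\langle x|y\rangle\langle y|v_i\rangle = \langle x|y\rangle \langle y|\Big(\sum_{i=1}^{d} |v_i\rangle\langle v_i|\Big)|x\rangle = \langle x|y\rangle\langle y|x\rangle = |\langle x|y\rangle|^2.$$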
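The trace properties 2, 4 and 5 can be spot-checked on small examples. This is only a sketch on $\mathbb{C}^2$ with arbitrarily chosen matrices and vectors (all names and numbers are assumptions for illustration), not a substitute for the proofs.

```python
def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))

def outer(x, y):
    """Rank-one operator |x><y|."""
    return [[a * complex(b).conjugate() for b in y] for a in x]

def inner(x, y):
    """Inner product <x|y>."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

# Property 2 (cyclicity) on two arbitrary matrices.
A = [[1, 2j], [3, 4 - 1j]]
B = [[0, 1 + 1j], [2, -1j]]
cyclic = abs(trace(matmul(A, B)) - trace(matmul(B, A))) < 1e-12

# Property 4: Tr(|x><y|) = <y|x>.
x = [1, 1j]
y = [2 - 1j, 3]
rank_one = abs(trace(outer(x, y)) - inner(y, x)) < 1e-12

# Property 5 for two unit vectors u, w: Tr(P_u P_w) = |<u|w>|^2.
u = [1, 0]
w = [0.6, 0.8j]
overlap = trace(matmul(outer(u, u), outer(w, w)))
overlap_ok = abs(overlap - abs(inner(u, w)) ** 2) < 1e-12

print(cyclic, rank_one, overlap_ok)
```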

7 Operators on infinite dimensional Hilbert spaces


Unboundedness. If $\mathcal{H}$ is infinite dimensional (e.g. $\mathcal{H} = L^2(\mathbb{R})$) then some of the linear transformations that occur naturally in quantum mechanics may not be defined on the whole space $\mathcal{H}$. For example the action of the ‘position’ operator
$$Q : \psi(x) \mapsto x\psi(x), \quad \psi \in L^2(\mathbb{R}),$$
is not defined for vectors $\psi$ such that $\int x^2 |\psi(x)|^2 \, dx = \infty$, since the right hand side is not a vector in $L^2(\mathbb{R})$.
Such transformations are called unbounded operators, and their proper mathematical treatment is the subject of operator theory. This involves the definition of their domain, as a dense linear subspace of $\mathcal{H}$ on which the action makes sense. Much of the finite dimensional theory can be extended to unbounded operators, a key example being the spectral theorem for unbounded self-adjoint operators, with direct applications in the measurement of unbounded observables.
Trace. If $\mathcal{H}$ is an infinite dimensional space, $\mathrm{Tr}(A)$ may be infinite or ill defined. For example $\mathrm{Tr}(I) = \infty$ since the space has a (countably) infinite ONB. However, the trace can be defined on a certain linear subspace $\mathcal{T}_1(\mathcal{H})$ of $\mathcal{B}(\mathcal{H})$, called the space of trace-class operators. Later we will see that this space is the span of density matrices (quantum states), and is the non-commutative (quantum) analogue of the $L^1$ space of probability densities (e.g. $L^1(\mathbb{R})$ for probability densities on $\mathbb{R}$).

8 Summary of linear operators on Hilbert spaces

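Notation / Property | Description

$A : \mathcal{H} \to \mathcal{H}$ | linear operator on the Hilbert space $\mathcal{H}$
 | its matrix in a given ONB $\{|e_1\rangle, \dots, |e_d\rangle\}$ is $A_{ij} := \langle e_i|A|e_j\rangle$
 | action on basis vectors $A : |e_j\rangle \mapsto \sum_i A_{ij}|e_i\rangle$
$A^* : \mathcal{H} \to \mathcal{H}$ | adjoint of $A$ with defining property $\langle x|A^*|y\rangle = \overline{\langle y|A|x\rangle}$
 | its matrix in a given ONB $\{|e_1\rangle, \dots, |e_d\rangle\}$ is $(A^*)_{ij} := \overline{A_{ji}}$
$\langle x|Ay\rangle = \langle A^*x|y\rangle$ | rule for passing $A$ on the other side of the inner product
$|x\rangle\langle y|$ | rank one operator
 | action on vectors $|x\rangle\langle y| : |z\rangle \mapsto \langle y|z\rangle|x\rangle$
 | the corresponding matrix is of rank one
 | its adjoint is $(|x\rangle\langle y|)^* = |y\rangle\langle x|$
$A = \sum_{ij} A_{ij}|e_i\rangle\langle e_j|$ | decomposition of $A$ into rank one “units”
$N^*N = NN^*$ | normal operator
$U^*U = UU^* = I$ | unitary operator; special case of normal operator
 | isometry property: $\langle Ux|Uy\rangle = \langle x|y\rangle$; maps an ONB into another ONB
$A = A^*$ | selfadjoint operator; special case of normal operator
$P = P^* = P^2$ | projection operator
 | $P$ projects a vector $|x\rangle = |x_{\mathcal{K}}\rangle + |x_{\mathcal{K}^\perp}\rangle$ onto a subspace $\mathcal{K}$, such that $P : |x\rangle \mapsto |x_{\mathcal{K}}\rangle$
$\mathrm{Tr}(A) = \sum_i \langle e_i|A|e_i\rangle$ | the trace of the operator $A$
 | trace is cyclic: $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$; it is basis independent; $\mathrm{Tr}(|x\rangle\langle y|) = \langle y|x\rangle$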


Lecture 4: Spectral Theorem

Abstract: This lecture deals with the characterisation of operators in terms of their “spectral data”, i.e. eigenvalues and eigenvectors. The key result is that, in the case of normal operators, the spectral data completely determines the operator. This theorem underpins the probabilistic interpretation of quantum mechanics.

1 The Spectral Theorem


Definition 1 (eigenvalue and eigenvector). Let $\mathcal{H}$ be a Hilbert space and let $A : \mathcal{H} \to \mathcal{H}$ be a linear operator. A (non-zero) vector $|x\rangle$ is called an eigenvector of $A$ with eigenvalue $\lambda \in \mathbb{C}$ if

$$A|x\rangle = \lambda|x\rangle.$$

The set of all eigenvalues of $A$ is called the spectrum of $A$ and is denoted $\sigma(A)$. The linear subspace spanned by the eigenvectors with a given eigenvalue $\lambda$ is called the eigenspace of $\lambda$, and its dimension is called the multiplicity of $\lambda$.

It can be shown that the spectrum of an operator on a finite dimensional complex Hilbert space is always a non-empty set. The following example shows that in general, the number of linearly independent eigenvectors may be smaller than the dimension of the space, or that a given eigenvalue may have several linearly independent eigenvectors.
Example 2. 1. Let $A$ be the operator on $\mathbb{C}^2$ whose matrix with respect to the standard basis is
$$A = \begin{pmatrix} 2 & 0 \\ 2 & 1 \end{pmatrix}.$$

The eigenvalue problem is to find the solutions (pairs of eigenvalues and eigenvectors) to the equation $A|x\rangle = \lambda|x\rangle$, or in terms of the matrix $A$
$$\begin{pmatrix} 2 & 0 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \lambda \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}, \quad \lambda, c_1, c_2 \in \mathbb{C}.$$

From this we obtain the following equations

$$\begin{cases} 2c_1 = \lambda c_1 \\ 2c_1 + c_2 = \lambda c_2 \end{cases} \qquad (1)$$

To find the eigenvalues we solve the equation $\mathrm{Det}(A - \lambda I) = 0$, and obtain two solutions $\lambda_1 = 1$, $\lambda_2 = 2$. Next we plug each of these values in (1) and solve for the eigenvector. For $\lambda_1 = 1$ we obtain the eigenvector $(c_1, c_2) = (0, 1)$, and for $\lambda_2 = 2$ we get $(c_1, c_2) = (1, 2)$. Note that the two eigenvectors are not orthogonal.
2. Consider now the selfadjoint operator $A$ with matrix
$$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

By solving the same equations as above we find two different eigenvalues $\lambda = 1$ and $\lambda = -1$ with eigenvectors $(1, 1)$ and respectively $(1, -1)$. The fact that the two eigenvectors are orthogonal is a general property of self-adjoint operators, as we will see in the Spectral Theorem.
3. Let $A$ be the operator with matrix
$$A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.$$
In this case there is only one eigenvalue $\lambda = 1$ with eigenvector $(0, 1)$, i.e. the basis vector $|e_2\rangle$.
4. Finally, if the matrix $A$ is proportional to the identity, e.g.
$$A = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix},$$

then there exists a single eigenvalue $\lambda = 3$ and the corresponding eigenvectors span the whole space, i.e. any non-zero vector is an eigenvector, so the multiplicity of $\lambda$ is 2.
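The four cases of Example 2 can be reproduced with a few lines of code. The sketch below solves $\mathrm{Det}(A - \lambda I) = \lambda^2 - \mathrm{Tr}(A)\lambda + \mathrm{Det}(A) = 0$ for a $2 \times 2$ matrix and counts the dimension of each eigenspace; the function names and the dictionary of examples are illustrative assumptions.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of Det(A - lambda*I) = lambda^2 - Tr(A)*lambda + Det(A)."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr - root) / 2, (tr + root) / 2

def eigenspace_dimension(m, lam, tol=1e-12):
    """Dimension of the kernel of A - lambda*I, for lambda an eigenvalue of a 2x2 matrix."""
    (a, b), (c, d) = m
    shifted = [a - lam, b, c, d - lam]
    # A - lambda*I is singular; its kernel is 2-dimensional only if it vanishes.
    return 2 if all(abs(z) < tol for z in shifted) else 1

examples = {
    "1. non-orthogonal eigenvectors": [[2, 0], [2, 1]],
    "2. selfadjoint": [[0, 1], [1, 0]],
    "3. single eigenvector": [[1, 0], [1, 1]],
    "4. multiple of identity": [[3, 0], [0, 3]],
}

for name, m in examples.items():
    l1, l2 = eigenvalues_2x2(m)
    print(name, l1, l2, eigenspace_dimension(m, l1), eigenspace_dimension(m, l2))
```

In case 3 the characteristic polynomial has a double root but the eigenspace is only one-dimensional, which is why that matrix cannot be diagonalised.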

In quantum mechanics, self-adjoint operators play the special role of observables, while unitaries represent
the dynamics of closed quantum systems. Such operators, and more generally any normal operator, can be
completely characterised in terms of their eigenvalues and eigenvectors.
Theorem 3 (Spectral Theorem). Let $N$ be a normal operator on a $d$-dimensional Hilbert space $\mathcal{H}$. Then there exists an ONB $\{|u_1\rangle, \dots, |u_d\rangle\}$ and a sequence of complex numbers $\lambda_1, \dots, \lambda_d$ such that

$$N = \sum_{i=1}^{d} \lambda_i |u_i\rangle\langle u_i|. \qquad (2)$$

Before giving the proof, let us make a few remarks on the statement of the Spectral Theorem.
1. If $N$ is of the form (2) then $N|u_i\rangle = \lambda_i|u_i\rangle$ since $\langle u_i|u_j\rangle = \delta_{ij}$; thus $\lambda_i$ is an eigenvalue of $N$ with eigenvector $|u_i\rangle$.
2. The matrix of $N$ with respect to the ONB of eigenvectors $\{|u_1\rangle, \dots, |u_d\rangle\}$ is the diagonal matrix $D := \mathrm{Diag}(\lambda_1, \dots, \lambda_d)$. If $\{|v_1\rangle, \dots, |v_d\rangle\}$ is another ONB, then the corresponding matrix is

$$N_{jk} = \langle v_j|N|v_k\rangle = \sum_i \langle v_j|u_i\rangle \lambda_i \langle u_i|v_k\rangle = [UDU^*]_{jk}$$

where $U$ is the unitary matrix $U_{ji} := \langle v_j|u_i\rangle$. Therefore the Spectral Theorem is equivalent to the statement that any normal matrix can be diagonalised by a unitary transformation.
3. Some of the eigenvalues in the sequence $\lambda_1, \dots, \lambda_d$ may be equal. In that case we can sum up all projectors $|u_i\rangle\langle u_i|$ which have the same eigenvalue $\lambda$ to obtain a larger projector $P_\lambda$ onto the eigenspace $[P_\lambda\mathcal{H}]$ and rewrite $N$ as

$$N = \sum_{\lambda \in \sigma(N)} \lambda P_\lambda$$

with the sum running over all distinct eigenvalues $\lambda$ with eigenprojectors $P_\lambda$.
4. The eigenprojectors $P_\lambda$ are uniquely defined and mutually orthogonal: $P_\lambda P_{\lambda'} = 0$ for $\lambda \neq \lambda'$.
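To make the statement concrete, the sketch below rebuilds the selfadjoint matrix of Example 2.2 from its spectral data, using the normalised eigenvectors $(1,1)/\sqrt{2}$ and $(1,-1)/\sqrt{2}$, and checks that the eigenprojectors are mutually orthogonal and sum to the identity. Helper names are assumptions made for the illustration.

```python
import math

def outer(x, y):
    """Rank-one operator |x><y|."""
    return [[a * complex(b).conjugate() for b in y] for a in x]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(m1, m2, tol=1e-12):
    """Entrywise comparison up to rounding errors."""
    return all(abs(a - b) < tol for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))

s = 1 / math.sqrt(2)
eigenvalues = [1, -1]
eigenvectors = [[s, s], [s, -s]]          # orthonormal eigenvectors of Example 2.2
projectors = [outer(u, u) for u in eigenvectors]

# N = sum_i lambda_i |u_i><u_i|
N = [[sum(lam * P[i][j] for lam, P in zip(eigenvalues, projectors))
      for j in range(2)] for i in range(2)]

# Sum of the eigenprojectors (should be the identity).
resolution = [[projectors[0][i][j] + projectors[1][i][j] for j in range(2)]
              for i in range(2)]

print(close(N, [[0, 1], [1, 0]]))
print(close(matmul(projectors[0], projectors[1]), [[0, 0], [0, 0]]))
print(close(resolution, [[1, 0], [0, 1]]))
```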

Proof of the Spectral Theorem. (not examinable)

The idea of the proof is to diagonalise $N$ step by step, adding a new eigenvalue at each step. Since the number of eigenvalues is finite the construction will end after a finite number of steps.
Let $\lambda$ be an eigenvalue of $N$, and let $P$ be the projection onto the corresponding eigenspace, and $Q := I - P$ the projector onto the orthogonal complement. Then

$$N = (P + Q)N(P + Q) = PNP + PNQ + QNP + QNQ.$$

Let us see how the 4 terms on the right side act on an arbitrary vector $|x\rangle = |Px\rangle + |Qx\rangle$.
Since $|Px\rangle$ is an eigenvector of $N$ (or zero) we have
$$PNP : |x\rangle \mapsto PN|Px\rangle = \lambda P|Px\rangle = \lambda|Px\rangle.$$
Then, since $QP = 0$ we have
$$QNP : |x\rangle \mapsto QN|Px\rangle = \lambda Q|Px\rangle = 0,$$
which means that $QNP = 0$.
We will now show that $PNQ = 0$. Note that if $N|v\rangle = \lambda|v\rangle$ then $NN^*|v\rangle = N^*N|v\rangle = \lambda N^*|v\rangle$, which means that $N^*|v\rangle$ is either zero or an eigenvector with eigenvalue $\lambda$, and in both cases belongs to the space $[P\mathcal{H}]$. This implies that $QN^*P = 0$ and by taking the adjoint we obtain $PNQ = 0$.
Thus far we have shown that $N$ acts separately on the orthogonal subspaces $[P\mathcal{H}]$ and $[Q\mathcal{H}]$, as a block-diagonal matrix
$$N = \begin{pmatrix} \lambda P & 0 \\ 0 & QNQ \end{pmatrix}.$$

We will now prove that the block $QNQ$ is a normal operator. Note that $QN = QN(Q + P) = QNQ$ and similarly $QN^* = QN^*(Q + P) = QN^*Q$. Using this we obtain
$$(QNQ) \cdot (QN^*Q) = QNQN^*Q = QNN^*Q = QN^*NQ = QN^*QNQ = (QN^*Q) \cdot (QNQ),$$
which means that $QNQ$ is normal.
We can now repeat the procedure used in the beginning, to further decompose $QNQ$ into a block of the form $\lambda' P_{\lambda'}$ and another normal operator, and so on, until we end up with the spectral form

$$N = \sum_{\lambda \in \sigma(N)} \lambda P_\lambda$$

where $P_\lambda$ is the eigenprojection corresponding to the eigenvalue $\lambda$.

Example 4. We apply the spectral theorem to three important classes of normal operators:
1. an operator $A$ is selfadjoint if and only if $A = \sum_\lambda \lambda P_\lambda$ where all eigenvalues $\lambda$ are real.
2. an operator $U$ is unitary if and only if $U = \sum_\lambda \lambda P_\lambda$ where all eigenvalues are complex phases $\lambda = e^{i\phi}$.
3. an operator $P$ is a projection if and only if $P = \sum_\lambda \lambda P_\lambda$ where the only possible eigenvalues are 0 and 1. The rank of $P$ is the multiplicity of the eigenvalue 1.

2 Positive operators
Definition 5 (positive operator). Let $\mathcal{H}$ be a Hilbert space. An operator $A$ is called positive if
$$\langle x|A|x\rangle \geq 0, \quad \text{for all vectors } |x\rangle \in \mathcal{H}.$$

The simplest example of a positive operator is a one-dimensional projection $P_y := |y\rangle\langle y|$ for which
$$\langle x|P_y|x\rangle = \langle x|y\rangle\langle y|x\rangle = |\langle x|y\rangle|^2 \geq 0.$$
Note that by definition, if $A \geq 0$ and $B \geq 0$ then $aA + bB \geq 0$ for any positive numbers $a, b$. Mathematically, this means that the positive operators form a cone. At the end of the section we will show that any positive operator is a linear combination of projections with positive coefficients.

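Lemma 6. Any positive operator $A$ is self-adjoint.

Proof. We will use a trick called the polarisation identity. Since $\langle x|A|x\rangle \geq 0$ for all $|x\rangle$ we have

$$\langle x + iy|A|x + iy\rangle = \langle x|A|x\rangle + \langle y|A|y\rangle + i\langle x|A|y\rangle - i\langle y|A|x\rangle \geq 0$$

$$\langle x - iy|A|x - iy\rangle = \langle x|A|x\rangle + \langle y|A|y\rangle - i\langle x|A|y\rangle + i\langle y|A|x\rangle \geq 0$$

By subtracting the two equations we find that

$$i\big(\langle x|A|y\rangle - \langle y|A|x\rangle\big) \in \mathbb{R}. \qquad (3)$$

Similarly, by subtracting the following

$$\langle x + y|A|x + y\rangle = \langle x|A|x\rangle + \langle y|A|y\rangle + \langle x|A|y\rangle + \langle y|A|x\rangle \geq 0$$

$$\langle x - y|A|x - y\rangle = \langle x|A|x\rangle + \langle y|A|y\rangle - \langle x|A|y\rangle - \langle y|A|x\rangle \geq 0$$

we obtain
$$\langle x|A|y\rangle + \langle y|A|x\rangle \in \mathbb{R}. \qquad (4)$$

From (3) and (4) we get $\langle x|A|y\rangle = \overline{\langle y|A|x\rangle} = \langle x|A^*|y\rangle$, which means that $A = A^*$.

We will use the symbol $A \geq 0$ to denote a positive operator and $A \geq B$ to denote the partial order on selfadjoint operators defined by $A - B \geq 0$. Using the spectral theorem we will show that any selfadjoint operator $A$ can be written as a difference of two positive operators. Indeed, by summing over positive and negative eigenvalues separately we get
$$A = \sum_{\lambda \in \sigma(A)} \lambda P_\lambda = \sum_{\lambda > 0} \lambda P_\lambda - \sum_{\lambda < 0} |\lambda| P_\lambda = A_+ - A_-$$

where $A_+$ and $A_-$ are positive operators called the positive and negative part of $A$.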
Lemma 7. An operator $A$ is positive if and only if it can be written as $A = B^*B$, where $B$ is another operator.

Proof. The converse implication is immediate: if $B$ is an arbitrary operator then

$$\langle x|B^*B|x\rangle = \|Bx\|^2 \geq 0.$$

For the direct implication, if $A \geq 0$ then all its eigenvalues are non-negative, and using the spectral theorem we have
$$A = \sum_{\lambda \in \sigma(A)} \lambda P_\lambda = \Big(\sum_{\lambda \in \sigma(A)} \sqrt{\lambda} P_\lambda\Big)\Big(\sum_{\lambda \in \sigma(A)} \sqrt{\lambda} P_\lambda\Big) = \sqrt{A}\sqrt{A} = \sqrt{A}^{\,*}\sqrt{A}.$$

Here we have used the fact that the spectral projections are mutually orthogonal, and we denoted the selfadjoint operator $\sum_\lambda \sqrt{\lambda} P_\lambda$ by $\sqrt{A}$. As we will see, this is a special case of a general functional calculus for self-adjoint operators.

In conclusion, we have shown that an operator $A$ is positive if and only if it is self-adjoint and all its eigenvalues are non-negative.

3 Functions of normal operators
We have already seen that for a positive operator $A$ we can define its square root as the unique positive operator denoted $\sqrt{A}$ which has the property that $\sqrt{A}\sqrt{A} = A$. Can this be done more generally: does $f(A)$ exist for any function $f$, and does it satisfy the right properties?
Definition 8 (functional calculus). Let $A$ be a normal operator with spectral decomposition
$$A = \sum_{\lambda \in \sigma(A)} \lambda P_\lambda$$

and let $f : \mathbb{C} \to \mathbb{C}$ be a function. We define the normal operator $f(A)$ by

$$f(A) = \sum_{\lambda \in \sigma(A)} f(\lambda) P_\lambda.$$

It is now easy to show that the definition “works” in the sense that the functions of $A$ have the following properties, which means that the map $f \mapsto f(A)$ is a $*$-morphism from functions to operators
1. $af(A) + bg(A) = (af + bg)(A)$
2. $(f \cdot g)(A) = f(A) \cdot g(A)$
3. $\bar{f}(A) = f(A)^*$
Example 9. Let $H = \sum_\lambda \lambda P_\lambda$ be a selfadjoint operator. Then $\exp(itH)$ is the unitary
$$\exp(itH) = \sum_\lambda \exp(it\lambda) P_\lambda.$$

In quantum mechanics $H$ could be the Hamiltonian of a system, in which case $\exp(itH)$ is the unitary describing the evolution over a time period $t$.
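Example 9 can be tried out on the simplest non-trivial case. The sketch below applies the functional calculus to the selfadjoint matrix with rows $(0, 1)$ and $(1, 0)$, whose eigenvalues are $\pm 1$, and compares the result with the closed form $\cos(t) I + i \sin(t) H$, which follows from $H^2 = I$. The function `f_of_operator`, the value of `t` and the other names are assumptions made for the illustration.

```python
import cmath
import math

def f_of_operator(f, eigenvalues, projectors):
    """f(A) = sum over eigenvalues of f(lambda) * P_lambda."""
    n = len(projectors[0])
    return [[sum(f(lam) * P[i][j] for lam, P in zip(eigenvalues, projectors))
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(m):
    """Conjugate transpose."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def close(m1, m2, tol=1e-12):
    """Entrywise comparison up to rounding errors."""
    return all(abs(a - b) < tol for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))

# Spectral data of H = [[0, 1], [1, 0]]: eigenvalues +1 and -1.
P_plus = [[0.5, 0.5], [0.5, 0.5]]
P_minus = [[0.5, -0.5], [-0.5, 0.5]]
t = 0.7  # arbitrary time, for illustration only

U = f_of_operator(lambda lam: cmath.exp(1j * t * lam), [1, -1], [P_plus, P_minus])

closed_form = [[math.cos(t), 1j * math.sin(t)],
               [1j * math.sin(t), math.cos(t)]]

print(close(U, closed_form))
print(close(matmul(U, adjoint(U)), [[1, 0], [0, 1]]))   # U is unitary
```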

4 Spectral Theorem in infinite dimensional Hilbert spaces


The Spectral Theorem can be extended to general (bounded or unbounded) normal operators on infinite dimensional Hilbert spaces, and is the backbone of the probabilistic interpretation of quantum mechanics. The proof goes beyond the scope of this module, but we mention here a few important features.
1. Some self-adjoint operators like the position and momentum operators on $L^2(\mathbb{R})$ do not have any eigenvectors. To develop their spectral theory one needs to extend the notion of spectrum based on eigenvalues to that of complex numbers $\lambda$ for which $\lambda I - A$ does not have a bounded inverse. An important class of operators for which the spectrum does consist only of eigenvalues (apart possibly from the point 0) are the compact operators, which include the finite rank, the trace-class and the Hilbert-Schmidt operators.
2. Since in general a point in the spectrum does not have an associated eigenvector, we also need to generalise the notion of eigenprojector. The appropriate notion is that of a projection valued measure which associates to each measurable subset $E$ of the spectrum an orthogonal projection, in such a way that $\sigma$-additivity holds, this time with projections rather than probabilities! Once this is done the spectral theorem says that any normal operator can be expressed as an integral over its spectrum of the form
$$A = \int_{\sigma(A)} \lambda \, P(d\lambda),$$

where $P(d\lambda)$ is the projection valued measure associated to $A$.

5 Summary of Spectral Theorem

Notation / Property | Description

$A|x\rangle = \lambda|x\rangle$ | $\lambda$ is an eigenvalue of $A$ with eigenvector $|x\rangle$
$A = \sum_i \lambda_i |e_i\rangle\langle e_i|$ | Spectral Theorem for normal operators:
 | $\{\lambda_1, \dots, \lambda_d\}$ are the eigenvalues (possibly repeated) of $A$
 | $\{|e_1\rangle, \dots, |e_d\rangle\}$ is the ONB of eigenvectors of $A$
$A \geq 0$ | positive operator, $\langle x|A|x\rangle \geq 0$ for all $|x\rangle$
 | it is a selfadjoint operator with non-negative eigenvalues
$B^*B \geq 0$ | for any operator $B$
$A = U|A|$ | polar decomposition of $A$
 | with $U$ unitary and $|A| = \sqrt{A^*A}$ the absolute value of $A$
$f(A) = \sum_i f(\lambda_i)|e_i\rangle\langle e_i|$ | for any normal operator $A$ we can define a function of $A$
 | by replacing $\lambda_i$ with $f(\lambda_i)$ in the spectral decomposition


Lecture 5: States, observables and qubits

Abstract: In classical physics the state of the system is typically described by a point in a phase space
P whose time evolution is governed by dynamical laws. In principle the state can be completely known
by measuring the different variables like positions and momenta of the constituents, and this can be done
without disturbing the system. The measurable quantities, or observables, are functions $f : P \to \mathbb{R}$ on the
phase space.
In quantum mechanics the fundamental notions are still those of state, dynamics, observable and measurement
but the mathematical framework in which they operate is radically different. Setting up this framework is
usually done by enumerating a (variable) number of postulates which cannot be proven, or derived from a
“simpler” theory. Since the experimental evidence up to date is strongly in favour of the quantum mechanical
framework, we will accept the postulates as such, and concentrate on understanding their consequences. The
4 postulates can be summarised as follows:
Postulate 1 defines the state space of a quantum system.
Postulate 2 describes the time evolution of the state.
Postulate 3 defines measurements on quantum systems.
Postulate 4 specifies how several quantum systems can be combined into a composite system.
The postulates should be seen as a set of general guiding rules rather than an axiomatic approach to quan-
tum mechanics. Their purpose is to prepare the ground for a more comprehensive operational framework
developed afterwards.

1 States

Postulate 1
a) Associated to any closed quantum system is a complex Hilbert space H, known as the state space of the
system.
b) The state of the system is represented by a unit vector in H known as the state vector. Vectors which differ
by a phase factor represent the same state.
The postulate does not tell us which particular Hilbert space is associated to each quantum system, nor
does it tell us what the state vector of the system is. To answer such questions physicists have developed
sophisticated theories describing atoms, light, the elementary particles, the interactions between them. For
our purposes it will suffice to know that the state spaces we deal with, do have a physical realisation, but
we will concentrate instead on the mathematical structures and the connections with other fields such as
computation and information theory. For instance, a quantum system whose state space is n-dimensional,
also called an n-level system, could be realised by restricting the state space of an atomic system to certain
energy levels. A continuous variables system with the infinite dimensional state space $L^2(\mathbb{R})$ could be a
particle moving on the real line, or a mode of light trapped in a cavity.

The simplest possible quantum system is the two level system. In Quantum Information and Computation this is called a qubit, and is the quantum counterpart to the classical bit. In this context, it is convenient to identify a special orthonormal basis called the “computational basis” of the qubit, whose vectors are denoted $|0\rangle$ and $|1\rangle$, as the analogues of the 0 and 1 states of the classical bit. However, the state postulate says that the qubit does not have only two states but a continuum of states described by the vectors
$$|\psi\rangle = a|0\rangle + b|1\rangle,$$
where $a$ and $b$ are complex numbers satisfying the condition $|a|^2 + |b|^2 = 1$, which is equivalent to the normalisation condition $\|\psi\|^2 = 1$. Such states are called (linear) superpositions and often have intriguing properties, e.g. Schrödinger considered the thought experiment in which a cat was in a superposition of $|\mathrm{dead}\rangle$ and $|\mathrm{alive}\rangle$ states, until this coherence was destroyed by a measurement of its state. In real experiments, creating such superpositions of “basic” states may sometimes come for free due to the interactions between the different components of the system, but is often an experimental challenge, e.g. in the implementation of a quantum algorithm.
Remark 1 (states as one dimensional projections). Another way of defining a state $|\psi\rangle$ is to specify its one-dimensional orthogonal projection $|\psi\rangle\langle\psi|$. Unlike the vector, which is defined only up to a phase ($|\psi\rangle$ and $|\psi'\rangle := e^{i\phi}|\psi\rangle$ represent the same state), the operator representation is unique: $|\psi'\rangle\langle\psi'| = |\psi\rangle\langle\psi|$. Other advantages of working with the “density matrix” representation will become clear when we discuss statistical mixtures of states and partial states of composite systems.

2 Observables
While the state describes the preparation of the system, the observables represent physical quantities that can
be measured like position, momentum and energy. We will first introduce the mathematical definition of an
observable and postpone the discussion of its measurement until postulate 3.
Definition 2 (observable). An observable of a system with state space H is a self-adjoint operator on H.

By the spectral theorem any observable $A$ on a $d$-dimensional space has a spectral decomposition
$$A = \sum_{i=1}^{d} \lambda_i |e_i\rangle\langle e_i| = \sum_{\lambda \in \sigma(A)} \lambda P_\lambda,$$

where $\{|e_1\rangle, \dots, |e_d\rangle\}$ is an orthonormal basis of eigenvectors and $\{P_\lambda : \lambda \in \sigma(A)\}$ is the set of mutually orthogonal eigenprojectors associated to the distinct eigenvalues. As we will see in the measurement postulate, the eigenvalues $\lambda_i$ represent the possible outcomes of a measurement of $A$. When the system is prepared in an eigenstate $|e_i\rangle$ the outcome of the measurement is $\lambda_i$ with probability one.

Non-commutative probability theory. In the classical set-up the observables form a commutative algebra: if $f, g : P \to \mathbb{R}$ are two observables then the product $f \cdot g = g \cdot f$ is also an observable. In quantum mechanics different observables do not commute in general, i.e. $AB \neq BA$, and in that case $AB$ is not an observable, since $(AB)^* = BA \neq AB$. This observation is the starting point of non-commutative (or quantum) probability theory, which tries to extend concepts and techniques from the classical world of commuting random variables to that of quantum mechanics. This philosophy has led to important achievements such as the development of the quantum stochastic calculus, quantum filtering and control, and applications in quantum open systems.

3 Dynamics
The second postulate describes the evolution of a closed quantum system, and is reminiscent of the hamilto-
nian dynamics in classical mechanics.

Postulate 2
The evolution of a closed quantum system is described by a unitary transformation of the state space. More precisely, the state $|\psi(t_2)\rangle$ of the system at time $t_2$ is related to the state of the system at an earlier time $t_1$ by a unitary operator $U(t_1, t_2)$:
$$|\psi(t_2)\rangle = U(t_1, t_2)|\psi(t_1)\rangle.$$

Typically, the evolution of an isolated system is given by the Schrödinger equation (which is sometimes included in the postulate)
$$\frac{d}{dt}|\psi(t)\rangle = -iH|\psi(t)\rangle$$
where $H = \sum_i E_i |E_i\rangle\langle E_i|$ is a special observable called the Hamiltonian of the system, whose eigenvalues $E_i$ are the energy levels. From the Schrödinger equation it follows that if the state at time $t_1$ is $|\psi(t_1)\rangle = |E_i\rangle$ then
$$|\psi(t_2)\rangle = e^{-i(t_2 - t_1)E_i}|E_i\rangle,$$
which means that the state is invariant under the evolution, since the two vectors differ only by a phase factor. In general, if the initial state $|\psi(t_1)\rangle = \sum_i \alpha_i |E_i\rangle$ is a superposition of energy eigenstates, the time evolution will develop relative phases between the different amplitudes, and the state is not time invariant anymore:
$$|\psi(t_2)\rangle = \sum_i \alpha_i e^{-i(t_2 - t_1)E_i}|E_i\rangle \neq |\psi(t_1)\rangle.$$

By using the functional calculus for the selfadjoint operator $H$ we find the postulated unitary operator describing the evolution from $t_1$ to $t_2$

$$U(t_1, t_2) = \exp\big(-i(t_2 - t_1)H\big).$$
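The difference between an energy eigenstate and a superposition can be seen in a small numerical sketch. Working in the energy eigenbasis, $U(t_1, t_2)$ simply multiplies each amplitude $\alpha_i$ by $e^{-i(t_2 - t_1)E_i}$. The energies, the time step and the function names below are assumptions chosen for illustration.

```python
import cmath
import math

def evolve(amplitudes, energies, dt):
    """Apply U = exp(-i*dt*H) in the energy eigenbasis:
    alpha_i -> alpha_i * exp(-i*dt*E_i)."""
    return [a * cmath.exp(-1j * dt * e) for a, e in zip(amplitudes, energies)]

def projector(amplitudes):
    """Matrix of |psi><psi| in the energy eigenbasis (insensitive to a global phase)."""
    return [[a * complex(b).conjugate() for b in amplitudes] for a in amplitudes]

def close(m1, m2, tol=1e-12):
    """Entrywise comparison up to rounding errors."""
    return all(abs(a - b) < tol for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))

energies = [0.5, 1.5]   # hypothetical energy levels E_0, E_1
dt = 1.3                # hypothetical elapsed time t_2 - t_1

eigenstate = [1, 0]
s = 1 / math.sqrt(2)
superposition = [s, s]

# An eigenstate only picks up a global phase, so its projector is unchanged.
eigenstate_invariant = close(projector(evolve(eigenstate, energies, dt)),
                             projector(eigenstate))

# A superposition develops a relative phase, so its projector changes.
evolved = evolve(superposition, energies, dt)
superposition_invariant = close(projector(evolved), projector(superposition))

# Unitarity: the norm is preserved in both cases.
norm_after = math.sqrt(sum(abs(a) ** 2 for a in evolved))

print(eigenstate_invariant, superposition_invariant, round(norm_after, 12))
```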

The unitary $U(t_1, t_2)$ does not always have to be of this form. For example, the dynamics of an atom can be changed by driving it with a laser, which (in a certain approximation) amounts to having a time-dependent Hamiltonian. In Quantum Computation, an algorithm consists of applying a certain sequence of simple unitary transformations called quantum gates to a register of qubits, followed by a measurement which produces the result of the computation. The challenge here is to engineer and control systems in which such transformations can be reliably implemented.

Remark 3. Since unitary transformations preserve the norms of the vectors, if $|\psi\rangle$ is a state then $|\psi'\rangle := U|\psi\rangle$ is a state. If we describe the state as a projection, then the action of the unitary is

$$|\psi\rangle\langle\psi| \mapsto |\psi'\rangle\langle\psi'| = U|\psi\rangle\langle\psi|U^*.$$

4 The qubit and the Bloch ball representation


In this section we look in more detail at the qubit by introducing a very useful representation of its states and observables called the Bloch sphere representation.
As before we use the orthonormal basis with vectors $|0\rangle$ and $|1\rangle$ and write an arbitrary vector state as $|\psi\rangle = a|0\rangle + b|1\rangle$, with $|a|^2 + |b|^2 = 1$. An equivalent way of writing this is
$$|\psi\rangle = e^{i\gamma}\Big(\cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle\Big)$$

where $\gamma, \theta, \phi$ are real, $0 \leq \theta \leq \pi$ and $0 \leq \phi \leq 2\pi$. Since the overall phase is irrelevant, the state is determined by the angles $\theta, \phi$ and can be represented as a vector $\mathbf{r}$ on the unit sphere with polar coordinates $(\theta, \phi)$, as in Figure 1. In particular, the basis vectors $|0\rangle$ and $|1\rangle$ are at the north and south pole of the sphere.

Figure 1: The Bloch ball representation of the state $|\psi\rangle$

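We now represent the observables in a similar fashion. For this we first introduce three observables $\sigma_x, \sigma_y, \sigma_z$ whose matrices with respect to the standard basis are the Pauli matrices
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
Physically, these observables may represent for instance the spin components along the axes $x, y, z$ for a spin-1/2 particle. They are also the generators of unitary transformations of determinant one (the special unitary group $SU(2)$), i.e. they satisfy the commutation relations
$$[\sigma_x, \sigma_y] = 2i\sigma_z, \quad [\sigma_y, \sigma_z] = 2i\sigma_x, \quad [\sigma_z, \sigma_x] = 2i\sigma_y.$$

The second property is that the Pauli matrices together with the identity span the linear space of selfadjoint matrices, i.e. any $A = A^*$ can be written in a unique way as (verify this!)
$$A = \frac{1}{2}\begin{pmatrix} a + r_z & r_x - ir_y \\ r_x + ir_y & a - r_z \end{pmatrix} = \frac{1}{2}\big(aI + r_x\sigma_x + r_y\sigma_y + r_z\sigma_z\big), \qquad (1)$$
where $a = \mathrm{Tr}(A) \in \mathbb{R}$, and $\mathbf{r} = (r_x, r_y, r_z) \in \mathbb{R}^3$ is a vector called the Bloch vector of $A$. The following lemma gives a simple characterisation of positive operators and states, seen as one dimensional projections.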
lemma gives a simple characterisation of positive operators and states, seen as one dimensional projections.
Lemma 4. Let $A$ be a selfadjoint operator with Bloch representation (1). Then
1. $A \geq 0$ if and only if $a \geq 0$ and $\|\mathbf{r}\| := \sqrt{r_x^2 + r_y^2 + r_z^2} \leq a$

2. $A$ is a one dimensional projection if and only if $a = 1$ and $\|\mathbf{r}\| = 1$

3. $A$ is a trace-one positive operator (density matrix) if and only if $a = 1$ and $\|\mathbf{r}\| \leq 1$.

Proof. 1. A selfadjoint $2 \times 2$ matrix $A$ is positive if and only if its trace and determinant are non-negative. From (1) the determinant is $\mathrm{Det}(A) = (a^2 - \|\mathbf{r}\|^2)/4$, which proves 1.
2. $A$ is a one dimensional projection if and only if its eigenvalues are 0 and 1, i.e. $\mathrm{Tr}(A) = 1$ and $\mathrm{Det}(A) = 0$, which means $\|\mathbf{r}\| = a = 1$.
3. This follows from 1. with $\mathrm{Tr}(A) = 1$.
In particular point 2. gives another way of writing the state as
$$|\psi\rangle\langle\psi| = \frac{1}{2}\big(I + r_x\sigma_x + r_y\sigma_y + r_z\sigma_z\big) \qquad (2)$$
and by taking matrix elements it is easy to verify that $\mathbf{r} = (r_x, r_y, r_z)$ coincides with the vector defined earlier through its polar coordinates $(\theta, \phi)$ (exercise). We will call (2) the Bloch vector representation of $|\psi\rangle\langle\psi|$.
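The correspondence between the angles $(\theta, \phi)$ and the Bloch vector can be checked numerically. The sketch below reads $\mathbf{r}$ off the matrix elements of $|\psi\rangle\langle\psi|$ using the form (1), namely $r_x - i r_y = 2\langle 0|\rho|1\rangle$ and $r_z = \langle 0|\rho|0\rangle - \langle 1|\rho|1\rangle$, and compares it with the point on the unit sphere with polar coordinates $(\theta, \phi)$. The function names and the sample angles are assumptions made for the illustration.

```python
import cmath
import math

def qubit_state(theta, phi):
    """Amplitudes (a, b) of cos(theta/2)|0> + exp(i*phi) sin(theta/2)|1>."""
    return [complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2)]

def bloch_vector(psi):
    """Bloch vector of |psi><psi|, read off from its matrix elements."""
    a, b = psi
    rho01 = a * b.conjugate()            # <0|rho|1> = (r_x - i r_y) / 2
    r_x = 2 * rho01.real
    r_y = -2 * rho01.imag
    r_z = abs(a) ** 2 - abs(b) ** 2      # <0|rho|0> - <1|rho|1>
    return (r_x, r_y, r_z)

theta, phi = 1.1, 2.3                    # arbitrary sample angles
r = bloch_vector(qubit_state(theta, phi))
expected = (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

matches = all(abs(x - y) < 1e-12 for x, y in zip(r, expected))
length = math.sqrt(sum(x * x for x in r))

print(matches, round(length, 12))
print(bloch_vector(qubit_state(0.0, 0.0)))   # |0> sits at the north pole
```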

5 Single qubit gates
A key concept in quantum computation is that of a quantum gate. This is a unitary transformation acting on one or several qubits, and is the basic building block of a quantum algorithm. Here we introduce the basic one-qubit gates and their relationships. We will return to this topic in a future lecture discussing quantum computation.
The Pauli gates denoted $X, Y, Z$ are given by the Pauli matrices $\sigma_x, \sigma_y, \sigma_z$ seen as unitary transformations
$$X := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad Z := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

In particular the $X$ gate has a similar action to the classical NOT gate

$$X : |0\rangle \mapsto |1\rangle, \quad X : |1\rangle \mapsto |0\rangle.$$

In quantum computation the $X$ gate is represented as a box with an input and an output wire, corresponding to the qubit state before and respectively after applying the gate. Similar symbols are used for the $Y$ and $Z$ gates.

Figure 2: Symbolic representation of an X gate in quantum computation

Additional single qubit gates are the Hadamard gate (denoted $H$), the phase gate (denoted $S$) and the $T$ gate
$$H := \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad S := \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \quad T := \begin{pmatrix} 1 & 0 \\ 0 & \exp(i\pi/4) \end{pmatrix}.$$

The Pauli gates satisfy the following conjugation relations with $H$ (exercise)

$$HXH = Z, \quad HYH = -Y, \quad HZH = X. \qquad (3)$$

An arbitrary single qubit unitary can be written in the form $U = \exp(i\alpha)R_{\vec{n}}(\theta)$ where
$$R_{\vec{n}}(\theta) = \exp(-i\theta\,\vec{n}\cdot\vec{\sigma}/2) = \cos\Big(\frac{\theta}{2}\Big) I - i\sin\Big(\frac{\theta}{2}\Big)(n_x X + n_y Y + n_z Z) \qquad (4)$$

denotes the unitary performing a rotation by an angle $\theta$ around the unit vector $\vec{n}$ in the Bloch sphere representation (exercise). Using (3) we find
$$HR_{\vec{n}}(\theta)H = R_{\vec{m}}(\theta) \qquad (5)$$
where the rotation axis is $\vec{m} = (n_z, -n_y, n_x)$.
In fact one can perform an arbitrary unitary $U$ by using only rotations around two axes given by two non-parallel unit vectors $\vec{n}$ and $\vec{m}$ (exercise):

$$U = \exp(i\alpha)R_{\vec{n}}(\beta)R_{\vec{m}}(\gamma)R_{\vec{n}}(\delta) \qquad (6)$$

for some appropriate angles $\beta, \gamma, \delta$. Following up on this, we will later show that one can approximate any given unitary gate with arbitrary precision by applying only the $H$ and $T$ gates in an appropriate sequence.
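Relations (3) and (5) are easy to confirm numerically. The sketch below multiplies the gate matrices directly and also checks the relation $T^2 = S$, which is not stated above but follows from the definitions. The rotation axis, the angle and all helper names are assumptions chosen for illustration.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(m1, m2, tol=1e-12):
    """Entrywise comparison up to rounding errors."""
    return all(abs(a - b) < tol for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
H = [[s, s], [s, -s]]
S = [[1, 0], [0, 1j]]
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]

def conjugate_by_H(m):
    """Return H m H."""
    return matmul(H, matmul(m, H))

def rotation(n, theta):
    """R_n(theta) = cos(theta/2) I - i sin(theta/2) (n_x X + n_y Y + n_z Z)."""
    c, sn = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * I2[i][j] - 1j * sn * (n[0] * X[i][j] + n[1] * Y[i][j] + n[2] * Z[i][j])
             for j in range(2)] for i in range(2)]

minus_Y = [[-v for v in row] for row in Y]

checks = {
    "HXH = Z": close(conjugate_by_H(X), Z),
    "HYH = -Y": close(conjugate_by_H(Y), minus_Y),
    "HZH = X": close(conjugate_by_H(Z), X),
    "T^2 = S": close(matmul(T, T), S),
}

# Relation (5) for a sample unit vector n and angle theta.
n = (2 / 3, 1 / 3, 2 / 3)
m = (n[2], -n[1], n[0])
theta = 0.9
checks["H R_n H = R_m"] = close(conjugate_by_H(rotation(n, theta)), rotation(m, theta))

for name, ok in checks.items():
    print(name, ok)
```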


Lecture 6: Composite systems and tensor products

Abstract: The Hilbert space of a composite system is the tensor product of the spaces associated to the
different components. In this lecture we study the properties of tensor products of Hilbert spaces and operators
and the concept of entangled state of a bipartite system.

1 Heuristic argument for entanglement


Until now we have only discussed how to describe a single isolated quantum system. However, the most interesting phenomena occur when several systems are brought together and interact with each other, whether by a natural interaction such as that between light and matter, or by a more "engineered" one as in Quantum Computation and Quantum Control. The last postulate addresses the question of how to describe the state space of several quantum systems. In the classical case the joint phase space of two systems is simply the Cartesian product $P_1 \times P_2$, so any joint state is a pair of individual states $x = (x_1, x_2) \in P_1 \times P_2$. We will now give a heuristic argument indicating that in the quantum case, the right notion of joint space is the tensor product of Hilbert spaces. Consider two isolated qubits with states

$$|\psi_1\rangle = a_0|0\rangle + a_1|1\rangle, \qquad |\psi_2\rangle = b_0|0\rangle + b_1|1\rangle.$$

Together, the two qubits form a larger isolated system, so by the first postulate the state space of this system must be a Hilbert space $\mathcal{H}_{12}$. Let us denote the vector state of the joint system by $|\psi\rangle := |\psi_1\rangle|\psi_2\rangle \in \mathcal{H}_{12}$. Since $\mathcal{H}_{12}$ is a linear space, it is natural to assume that the joint state is linear with respect to each of the two qubit vectors, i.e.

$$|\psi_1\rangle|\psi_2\rangle = a_0 b_0 |0\rangle|0\rangle + a_0 b_1 |0\rangle|1\rangle + a_1 b_0 |1\rangle|0\rangle + a_1 b_1 |1\rangle|1\rangle.$$

This means that any state $|\psi_1\rangle|\psi_2\rangle$ can be written as a linear combination of 4 ‘basic’ states $\{|0\rangle|0\rangle, |0\rangle|1\rangle, |1\rangle|0\rangle, |1\rangle|1\rangle\}$. But if we also assume that these states are linearly independent, we find that there exist vectors which are not of the type $|\psi_1\rangle|\psi_2\rangle$, for example (verify this!)

$$|0\rangle|0\rangle + |1\rangle|1\rangle.$$

Hence, the joint space of states is not the Cartesian product of the two separate state spaces as in the classical case, but contains additional linear superpositions of such vectors called entangled states. Now, if we assume additionally that the basic states form an orthonormal basis, we have constructed the tensor product Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^2$! This concept is analysed in more detail in the next section.
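The "verify this!" step can also be checked numerically. A vector $\sum_{ij} c_{ij}|i\rangle|j\rangle$ is a product $|\psi_1\rangle|\psi_2\rangle$ exactly when its coefficient matrix $(c_{ij})$ has rank one, since a product has $c_{ij} = a_i b_j$. The sketch below assumes NumPy; the function name `is_product_state` and the sample vectors are illustrative only.

```python
import numpy as np

def is_product_state(psi, d1=2, d2=2, tol=1e-12):
    """True when the d1 x d2 coefficient matrix of psi has rank one."""
    coeffs = np.asarray(psi, dtype=complex).reshape(d1, d2)
    return np.linalg.matrix_rank(coeffs, tol=tol) == 1

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# A product vector: coefficients a_i * b_j
product_vec = np.kron(0.6 * ket0 + 0.8 * ket1, ket0 - 1j * ket1)

# The vector |0>|0> + |1>|1> from the text (not normalised)
entangled_vec = np.kron(ket0, ket0) + np.kron(ket1, ket1)

print(is_product_state(product_vec), is_product_state(entangled_vec))
```

For $|0\rangle|0\rangle + |1\rangle|1\rangle$ the coefficient matrix is the identity, which has rank two, so no choice of $a_i, b_j$ reproduces it.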

2 Tensor products of Hilbert spaces
We start by defining the tensor product of column vectors. If $z = (z_1, \dots, z_n)^T \in \mathbb{C}^n$ and $w = (w_1, \dots, w_m)^T \in \mathbb{C}^m$ are two column vectors, then their tensor product is the vector $z \otimes w \in \mathbb{C}^{n \cdot m}$
$$z \otimes w = \begin{pmatrix} z_1 w_1 \\ \vdots \\ z_1 w_m \\ \vdots \\ z_n w_1 \\ \vdots \\ z_n w_m \end{pmatrix}$$
which consists of n blocks of the form $z_i \cdot w$ with $i = 1, \dots, n$. Note that $z \otimes w$ is linear with respect to each of z and w, e.g. $(az + by) \otimes w = a(z \otimes w) + b(y \otimes w)$ for $a, b \in \mathbb{C}$. We now take a "basis free" perspective by defining tensor products of Hilbert spaces.
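Definition 1 (tensor product of Hilbert spaces).
1. Let $V_1$ and $V_2$ be two linear spaces. The tensor product $V_1 \otimes V_2$ is the linear space spanned by elements of the form $u \otimes v$ where $u \in V_1$ and $v \in V_2$ such that the following relations hold

i. $(u + u') \otimes v = u \otimes v + u' \otimes v$,
ii. $u \otimes (v + v') = u \otimes v + u \otimes v'$,
iii. $(\alpha u) \otimes v = u \otimes (\alpha v) = \alpha (u \otimes v)$,

where $\alpha$ is a scalar, and $u, u' \in V_1$ and $v, v' \in V_2$ are arbitrary vectors.

2. Let $\mathcal{H}_1$ and $\mathcal{H}_2$ be two Hilbert spaces. The tensor product $\mathcal{H}_1 \otimes \mathcal{H}_2$ becomes a Hilbert space when endowed with the inner product obtained by extending
$$\langle u \otimes v | u' \otimes v' \rangle = \langle u | u' \rangle \langle v | v' \rangle$$
to a sesquilinear form $\langle \cdot | \cdot \rangle : (\mathcal{H}_1 \otimes \mathcal{H}_2) \times (\mathcal{H}_1 \otimes \mathcal{H}_2) \to \mathbb{C}$.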

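To understand the structure of the tensor product it is convenient to consider an orthonormal basis $\{|e_1\rangle, \dots, |e_{d_1}\rangle\}$ in $\mathcal{H}_1$ and another one $\{|f_1\rangle, \dots, |f_{d_2}\rangle\}$ in $\mathcal{H}_2$. Then
$$\langle e_{i_1} \otimes f_{j_1} | e_{i_2} \otimes f_{j_2} \rangle = \langle e_{i_1} | e_{i_2} \rangle \cdot \langle f_{j_1} | f_{j_2} \rangle = \delta_{i_1, i_2}\, \delta_{j_1, j_2}$$

which implies that the $d_1 \times d_2$ vectors

$$\{|e_1 \otimes f_1\rangle, \dots, |e_1 \otimes f_{d_2}\rangle, \dots, |e_{d_1} \otimes f_1\rangle, \dots, |e_{d_1} \otimes f_{d_2}\rangle\} \tag{1}$$
form an orthonormal basis of $\mathcal{H}_1 \otimes \mathcal{H}_2$. This means that any vector $|\psi\rangle \in \mathcal{H}_1 \otimes \mathcal{H}_2$ has a Fourier decomposition
$$|\psi\rangle = \sum_{i=1}^{d_1} \sum_{j=1}^{d_2} c_{ij} |e_i \otimes f_j\rangle = \sum_{i=1}^{d_1} \sum_{j=1}^{d_2} \langle e_i \otimes f_j | \psi \rangle\, |e_i \otimes f_j\rangle.$$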

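In particular, if
$$|\psi_1\rangle = \sum_{i=1}^{d_1} \alpha_i |e_i\rangle \in \mathcal{H}_1, \qquad |\psi_2\rangle = \sum_{j=1}^{d_2} \beta_j |f_j\rangle \in \mathcal{H}_2$$
are two vectors, and we denote their column vectors of coefficients by $\alpha \in M_{d_1,1}$ and $\beta \in M_{d_2,1}$, then $|\psi_1 \otimes \psi_2\rangle$ is the product vector
$$|\psi_1 \otimes \psi_2\rangle = \sum_{i=1}^{d_1} \sum_{j=1}^{d_2} \alpha_i \beta_j |e_i \otimes f_j\rangle,$$

whose column vector of coefficients is equal to $\alpha \otimes \beta \in M_{d_1 d_2, 1}$, when written in the same order as the basis vectors (1).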
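In coordinates, the tensor product of coefficient vectors is the Kronecker product. The sketch below, assuming NumPy's `np.kron` and arbitrarily chosen coefficient vectors, checks the ordering of the entries given by the basis (1) and the factorisation of the inner product from Definition 1.

```python
import numpy as np

# Coefficient vectors in H1 (d1 = 3) and H2 (d2 = 2); values are arbitrary
alpha = np.array([1, 2j, -1], dtype=complex)
beta = np.array([0.5, 1 - 1j], dtype=complex)
d1, d2 = len(alpha), len(beta)

# Ordering of basis (1): e1 f1, e1 f2, e2 f1, e2 f2, ...
prod = np.kron(alpha, beta)

# The coefficient of e_i (x) f_j sits at position i*d2 + j (0-based indices)
entries_match = all(
    np.isclose(prod[i * d2 + j], alpha[i] * beta[j])
    for i in range(d1)
    for j in range(d2)
)

# Inner product of product vectors factorises (np.vdot conjugates its first argument)
alpha2 = np.array([0, 1, 1j], dtype=complex)
beta2 = np.array([2, -1j], dtype=complex)
lhs = np.vdot(np.kron(alpha, beta), np.kron(alpha2, beta2))
rhs = np.vdot(alpha, alpha2) * np.vdot(beta, beta2)

print(entries_match, np.isclose(lhs, rhs))
```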

3 The 4th postulate and the concept of entanglement


We are now in the position to formulate the last postulate of quantum mechanics.

Postulate 4.
1. The state space of a composite system is the tensor product of the state spaces of the component systems.
2. If the components are numbered 1 through N, and system number i is prepared in state $|\psi_i\rangle$ in isolation from the others, then the joint state of the total system is

$$|\psi\rangle = |\psi_1\rangle \otimes \cdots \otimes |\psi_N\rangle.$$

As noted earlier, the postulate allows for states of composite systems which are not products. These are called entangled states, and were first discussed in a famous paper by Einstein, Podolsky and Rosen (EPR) as the basis of an argument claiming that quantum mechanics is incomplete [1]. The term entanglement was later coined by Schrödinger, who was the first to recognise the importance of the concept [2]:
“I would not call [entanglement] one but rather the characteristic trait of quantum mechanics, the one that enforces its entire departure from classical lines of thought.”
In 1964, John Bell devised an inequality for correlations between outcomes of certain measurements performed on spatially separated systems. The inequality must be respected by all classical “local realist” theories but is violated by quantum mechanics when the systems are prepared in an entangled state [3]. Since the 1970s and 1980s the inequality has been tested in numerous experiments, but a conclusive loophole-free experimental verification was achieved only in 2015 [4]. In Quantum Information Theory entanglement is seen as a resource of non-classical correlations whose usefulness has been demonstrated in protocols like teleportation and superdense coding, in quantum cryptography, and in quantum computation. Some of these protocols will be discussed in detail in future lectures, but for the moment we take a quick look at a simple example.

3.1 Bell basis for two qubits states

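Consider a system composed of 2 qubits, and for simplicity denote the product basis vectors of the corresponding space $\mathbb{C}^2 \otimes \mathbb{C}^2$ as $|00\rangle := |0\rangle \otimes |0\rangle$, $|01\rangle := |0\rangle \otimes |1\rangle$, $|10\rangle := |1\rangle \otimes |0\rangle$, $|11\rangle := |1\rangle \otimes |1\rangle$.
We now define the following orthonormal basis of (maximally) entangled states called the Bell basis (verify that they form an ONB!)
$$\begin{cases}
|\Phi^+\rangle = (|0\rangle \otimes |0\rangle + |1\rangle \otimes |1\rangle)/\sqrt{2} \\
|\Phi^-\rangle = (|0\rangle \otimes |0\rangle - |1\rangle \otimes |1\rangle)/\sqrt{2} \\
|\Psi^+\rangle = (|0\rangle \otimes |1\rangle + |1\rangle \otimes |0\rangle)/\sqrt{2} \\
|\Psi^-\rangle = (|0\rangle \otimes |1\rangle - |1\rangle \otimes |0\rangle)/\sqrt{2}.
\end{cases} \tag{2}$$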

How can we prepare the qubits in such entangled states? If the systems are initially independent, they have a joint product state, and performing local unitary operations on each system will not produce an entangled state. Instead, one needs to apply unitary transformations such as those arising from interactions between the systems. For instance we will show that the Bell states can be obtained by transforming the 4 standard basis product states according to the unitary prescribed by the following circuit which contains a two-qubit “entangling” gate.

[Circuit diagram: a Hadamard gate H on the first qubit, followed by a CNOT gate controlled by the first qubit and acting on the second.]

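Here the two lines represent the two qubits, H is the Hadamard gate (unitary)
$$H := \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$
while the second transformation is the controlled-NOT (CNOT) gate whose action is to leave the state unchanged if the first qubit is in state $|0\rangle$, and flip the second qubit if the first is in $|1\rangle$

$$\begin{aligned}
\mathrm{CNOT} &: |00\rangle \mapsto |00\rangle \\
\mathrm{CNOT} &: |01\rangle \mapsto |01\rangle \\
\mathrm{CNOT} &: |10\rangle \mapsto |11\rangle \\
\mathrm{CNOT} &: |11\rangle \mapsto |10\rangle
\end{aligned}$$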

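We initialise the qubits in one of the 4 basis states, and apply the two unitary transformations successively. The calculation below shows that the outputs are the 4 Bell states:
$$\begin{aligned}
|00\rangle &\mapsto \frac{1}{\sqrt{2}} (|0\rangle + |1\rangle)|0\rangle \mapsto \frac{1}{\sqrt{2}} (|0\rangle|0\rangle + |1\rangle|1\rangle) = |\Phi^+\rangle \\
|01\rangle &\mapsto \frac{1}{\sqrt{2}} (|0\rangle + |1\rangle)|1\rangle \mapsto \frac{1}{\sqrt{2}} (|0\rangle|1\rangle + |1\rangle|0\rangle) = |\Psi^+\rangle \\
|10\rangle &\mapsto \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle)|0\rangle \mapsto \frac{1}{\sqrt{2}} (|0\rangle|0\rangle - |1\rangle|1\rangle) = |\Phi^-\rangle \\
|11\rangle &\mapsto \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle)|1\rangle \mapsto \frac{1}{\sqrt{2}} (|0\rangle|1\rangle - |1\rangle|0\rangle) = |\Psi^-\rangle
\end{aligned}$$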
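The calculation above can be reproduced with 4 × 4 matrices. This is a minimal sketch assuming NumPy and the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$; the dictionary labels are illustrative only.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)
CNOT = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])

# Hadamard on the first qubit, then CNOT (rightmost factor acts first)
circuit = CNOT @ np.kron(H, I2)

s = 1 / np.sqrt(2)
expected = {
    "00": s * np.array([1, 0, 0, 1]),    # (|00> + |11>)/sqrt(2)
    "01": s * np.array([0, 1, 1, 0]),    # (|01> + |10>)/sqrt(2)
    "10": s * np.array([1, 0, 0, -1]),   # (|00> - |11>)/sqrt(2)
    "11": s * np.array([0, 1, -1, 0]),   # (|01> - |10>)/sqrt(2)
}

basis = np.eye(4)
outputs = {label: circuit @ basis[k] for k, label in enumerate(["00", "01", "10", "11"])}
all_match = all(np.allclose(outputs[label], expected[label]) for label in expected)
print(all_match)
```

Because the circuit is unitary and the inputs form an orthonormal basis, the four outputs are automatically orthonormal, which gives one way to verify that the Bell states form an ONB.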

4 Tensor products of operators


As $\mathcal{H}_1 \otimes \mathcal{H}_2$ is a Hilbert space, all notions related to linear operators apply to this space. A special type of operators are the tensor products, which we discuss in more detail here.
Definition 2 (tensor product of operators). Let $A : \mathcal{H}_1 \to \mathcal{H}_1$ and $B : \mathcal{H}_2 \to \mathcal{H}_2$ be linear operators. The tensor product $A \otimes B$ is the linear operator on $\mathcal{H}_1 \otimes \mathcal{H}_2$ whose action on product vectors is

$$A \otimes B : |\psi_1\rangle \otimes |\psi_2\rangle \mapsto A|\psi_1\rangle \otimes B|\psi_2\rangle.$$

The tensor products $A \otimes I_{\mathcal{H}_2}$ and $I_{\mathcal{H}_1} \otimes B$ are called the ampliations of A and B, respectively, to the tensor product space.

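Computing the matrix of $A \otimes B$ with respect to the orthonormal basis (1) we obtain

$$(A \otimes B)_{i_1 j_1, i_2 j_2} = \langle e_{i_1} | A | e_{i_2} \rangle \langle f_{j_1} | B | f_{j_2} \rangle = A_{i_1, i_2} B_{j_1, j_2} = (A \otimes B)_{i_1 j_1, i_2 j_2}.$$

The right side is the tensor product of the two matrices $A \in M_{d_1, d_1}$ and $B \in M_{d_2, d_2}$
$$A \otimes B = \begin{pmatrix}
A_{11} B & A_{12} B & \dots & A_{1 d_1} B \\
A_{21} B & A_{22} B & \dots & A_{2 d_1} B \\
\vdots & \vdots & \ddots & \vdots \\
A_{d_1 1} B & A_{d_1 2} B & \dots & A_{d_1 d_1} B
\end{pmatrix} \in M_{d_1 d_2, d_1 d_2}$$
in which the block (i, j) is the $d_2 \times d_2$ matrix equal to $A_{ij} B$.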
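Example 3. The tensor product of the Pauli operators $\sigma_x$ and $\sigma_z$ is given by the 4 × 4 matrix
$$\sigma_x \otimes \sigma_z = \begin{pmatrix} 0 & \sigma_z \\ \sigma_z & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}$$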

Properties of the tensor product. The following properties of the tensor product of operators (or matrices) can be verified directly from the definition, and are left as an exercise.
1. In general $A \otimes B \neq B \otimes A$.
2. $A \otimes (bB + cC) = bA \otimes B + cA \otimes C$, for all scalars b, c and operators A, B, C.
3. $(aA + bB) \otimes C = aA \otimes C + bB \otimes C$, for all scalars a, b and operators A, B, C.
4. $(A \otimes B) \otimes C = A \otimes (B \otimes C)$ for all operators A, B, C.
5. $(A \otimes B) \cdot (C \otimes D) = AC \otimes BD$, whenever the products AC and BD exist.
6. $A^* \otimes B^* = (A \otimes B)^*$.
7. The action of the tensor product can be realised by first acting with A on the left tensor factor and then with B on the right tensor factor, or vice versa,
$$A \otimes B = (I_{\mathcal{H}_1} \otimes B)(A \otimes I_{\mathcal{H}_2}) = (A \otimes I_{\mathcal{H}_2})(I_{\mathcal{H}_1} \otimes B).$$
8. If both A and B are normal, or selfadjoint, or unitary, or positive, then so is $A \otimes B$.
9. If A and B have spectral decompositions $A = \sum_i \lambda_i |e_i\rangle\langle e_i|$ and $B = \sum_j \mu_j |f_j\rangle\langle f_j|$ then
$$A \otimes B = \sum_{i,j} \lambda_i \mu_j\, |e_i\rangle\langle e_i| \otimes |f_j\rangle\langle f_j| = \sum_{i,j} \lambda_i \mu_j\, |e_i \otimes f_j\rangle\langle e_i \otimes f_j|$$
and its spectrum is $\sigma(A \otimes B) = \{\lambda \mu : \lambda \in \sigma(A), \mu \in \sigma(B)\}$.
10. The following trace rule holds: $\mathrm{Tr}(A \otimes B) = \mathrm{Tr}(A)\mathrm{Tr}(B)$.
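Several of these properties can be spot-checked numerically before attempting the proofs. The sketch below assumes NumPy, uses `np.kron` as the matrix tensor product, and tests randomly generated matrices (a sanity check, not a proof); all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_matrix(d):
    """Random complex d x d matrix."""
    return rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

A, C = random_matrix(2), random_matrix(2)
B, D = random_matrix(3), random_matrix(3)

# Property 1, using Example 3: sigma_x (x) sigma_z differs from sigma_z (x) sigma_x
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
example3 = np.kron(X, Z)
order_matters = not np.allclose(example3, np.kron(Z, X))

# Property 5: (A (x) B)(C (x) D) = AC (x) BD
mixed_product = np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))

# Property 6: adjoint of a tensor product
adjoint_rule = np.allclose(np.kron(A, B).conj().T, np.kron(A.conj().T, B.conj().T))

# Property 9: spectrum of a tensor product of selfadjoint matrices
Ah, Bh = A + A.conj().T, B + B.conj().T
products = np.sort(np.outer(np.linalg.eigvalsh(Ah), np.linalg.eigvalsh(Bh)).ravel())
spectrum_rule = np.allclose(np.linalg.eigvalsh(np.kron(Ah, Bh)), products)

# Property 10: trace rule
trace_rule = np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))

print(order_matters, mixed_product, adjoint_rule, spectrum_rule, trace_rule)
```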
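Example 4. For every $|u\rangle, |u'\rangle \in \mathcal{H}_1$ and $|v\rangle, |v'\rangle \in \mathcal{H}_2$ we construct the rank one tensor product $|u\rangle\langle u'| \otimes |v\rangle\langle v'|$ with the action
$$(|u\rangle\langle u'| \otimes |v\rangle\langle v'|) : |\psi_1\rangle \otimes |\psi_2\rangle \mapsto \langle u' | \psi_1 \rangle \cdot \langle v' | \psi_2 \rangle\, |u\rangle \otimes |v\rangle.$$
By applying both sides to a product vector one can verify the identity
$$|u\rangle\langle u'| \otimes |v\rangle\langle v'| = |u \otimes v\rangle\langle u' \otimes v'|.$$

As in the case of operators on a single space, any operator X on $\mathcal{H}_1 \otimes \mathcal{H}_2$ can be expanded in the basis of rank one operators
$$X = \sum_{i_1, i_2} \sum_{j_1, j_2} X_{i_1 j_1, i_2 j_2}\, |e_{i_1}\rangle\langle e_{i_2}| \otimes |f_{j_1}\rangle\langle f_{j_2}| = \sum_{i_1, i_2} \sum_{j_1, j_2} X_{i_1 j_1, i_2 j_2}\, |e_{i_1} \otimes f_{j_1}\rangle\langle e_{i_2} \otimes f_{j_2}|.$$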

Tensor product notations. In calculations it is convenient to use two different notations to represent the same tensor products of vectors, as we have already done. In order to avoid confusion about the meaning of such expressions, we state the following equivalences

$$|u\rangle \otimes |v\rangle \equiv |u \otimes v\rangle, \qquad \langle u| \otimes \langle v| \equiv \langle u \otimes v|, \qquad (|u\rangle \otimes |v\rangle)(\langle u'| \otimes \langle v'|) = |u \otimes v\rangle\langle u' \otimes v'|.$$

Sometimes the tensor product sign is omitted altogether and we write $|u\rangle|v\rangle$ or simply $|uv\rangle$.
References
[1] Einstein A., Podolsky B., Rosen N., Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?, Phys. Rev. 47 777-780 (1935).
[2] Schrödinger E., Discussion of probability relations between separated systems, Mathematical Proceedings of the Cambridge Philosophical Society 31 555-563 (1935).
[3] Bell J. S., On the Einstein-Podolsky-Rosen paradox, Physics 1 195-200 (1964).
[4] Hensen B. et al., Loophole-free Bell inequality violation using electron spins separated by 1.3 kilometres, Nature 526 682-686 (2015).


Lecture 7: Measurements

Abstract: Quantum mechanics is a probabilistic theory. In this lecture we discuss the measurement postulate
which prescribes the probabilities of different outcomes for a given state. For illustration, we look at examples
of one and two qubits measurements.

1 Basic notions of probability with finite spaces


Before discussing quantum measurements we briefly review some basic notions of probability.

1.1 Probability spaces

A finite probability space consists of a finite set $\Omega := \{\omega_1, \dots, \omega_k\}$ of “outcomes” and a probability distribution $\mathbb{P}$ on $\Omega$. This means that each $\omega_i \in \Omega$ is assigned a probability $\mathbb{P}(\omega_i) \geq 0$ such that $\sum_i \mathbb{P}(\omega_i) = 1$. For instance, when throwing a fair die, each outcome in $\{1, \dots, 6\}$ occurs with probability $\mathbb{P}(i) = 1/6$. A subset E of the space $\Omega$ is called an event and
$$\mathbb{P}(E) = \sum_{i \,:\, \omega_i \in E} \mathbb{P}(\omega_i)$$

is the probability that the event E occurs. In the example above, the event "die is even" is given by the set $E := \{2, 4, 6\}$ and its probability is $\mathbb{P}(E) = 1/2$. From the definition it follows that probabilities have the following properties. If two events are disjoint, $E \cap F = \emptyset$, then the probability of either of them occurring is the sum
$$\mathbb{P}(E \cup F) = \mathbb{P}(E) + \mathbb{P}(F).$$
In particular if E and F are complementary, i.e. disjoint with $E \cup F = \Omega$, then $\mathbb{P}(E) = 1 - \mathbb{P}(F)$. Similarly, if $E \subset F$, that is F occurs whenever E occurs, then $\mathbb{P}(E) \leq \mathbb{P}(F)$.

1.2 Joint distributions and independence

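Often we are interested in the relationship (e.g. correlations) between random phenomena described by elements of different sets. For this we need to consider the Cartesian product $\mathcal{X} \times \mathcal{Y} = \{(x, y) : x \in \mathcal{X}, y \in \mathcal{Y}\}$ and a common probability distribution given by a set of probabilities $\mathbb{P}_{\mathcal{X} \times \mathcal{Y}}((x, y))$. The probability that a certain outcome $x \in \mathcal{X}$ occurs is obtained by summing over all possible values of $y \in \mathcal{Y}$. This gives the marginal distribution over $\mathcal{X}$
$$\mathbb{P}_{\mathcal{X}}(x) = \sum_{y \in \mathcal{Y}} \mathbb{P}_{\mathcal{X} \times \mathcal{Y}}((x, y)),$$

and similarly for the marginal $\mathbb{P}_{\mathcal{Y}}$. The events in $\mathcal{X}$ and $\mathcal{Y}$ are said to occur independently if
$$\mathbb{P}_{\mathcal{X} \times \mathcal{Y}}(E \times F) = \mathbb{P}_{\mathcal{X} \times \mathcal{Y}}((E \times \mathcal{Y}) \cap (\mathcal{X} \times F)) = \mathbb{P}_{\mathcal{X}}(E) \cdot \mathbb{P}_{\mathcal{Y}}(F)$$
for all events $E \subseteq \mathcal{X}$ and $F \subseteq \mathcal{Y}$. Here $\tilde{E} := E \times \mathcal{Y}$ needs to be interpreted as E occurring in the space $\mathcal{X}$ (while any outcome is allowed in $\mathcal{Y}$), and similarly for $\tilde{F} := \mathcal{X} \times F$, so that $\tilde{E} \cap \tilde{F}$ represents both events occurring simultaneously.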
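Marginals and independence can be illustrated with exact rational arithmetic. The sketch below uses only the Python standard library; the two joint distributions are made-up examples, and independence is tested on single outcomes $(x, y)$, which for finite spaces is equivalent to the condition on all events $E \times F$.

```python
from fractions import Fraction

def marginals(joint):
    """Marginal distributions P_X and P_Y of a joint distribution {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return px, py

def independent(joint):
    """True when P(x, y) = P_X(x) * P_Y(y) for every pair (x, y)."""
    px, py = marginals(joint)
    return all(joint.get((x, y), 0) == px[x] * py[y] for x in px for y in py)

# Two fair coins thrown separately: independent
two_coins = {(x, y): Fraction(1, 4) for x in "HT" for y in "HT"}

# A correlated example: y = "b" can only occur together with x = 1
correlated = {
    (0, "a"): Fraction(1, 2),
    (0, "b"): Fraction(0),
    (1, "a"): Fraction(1, 4),
    (1, "b"): Fraction(1, 4),
}

print(independent(two_coins), independent(correlated))
print(marginals(correlated))
```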

1.3 Conditioning

When additional information about the state of the “world” $\Omega$ becomes available, the probability distribution is adjusted in order to take this information into account. For example, consider an urn containing two black and two red balls, from which we repeatedly draw a ball without putting it back. Before the first draw the probabilities of drawing a red or a black ball are both equal to 1/2. Now suppose that the first ball was red. Then one red and two black balls remain, so the probabilities of drawing a red or a black ball in the next draw have changed to 1/3 and 2/3, respectively.
Definition 1 (conditional distribution). Let $(\Omega, \mathbb{P})$ be a probability space, and let E be an event such that $\mathbb{P}(E) > 0$. We define the conditional distribution $\mathbb{P}(\cdot | E)$ on $\Omega$ by
$$\mathbb{P}(F | E) := \frac{\mathbb{P}(E \cap F)}{\mathbb{P}(E)}.$$

Note that $\mathbb{P}(\cdot | E)$ satisfies the properties of a probability distribution, in particular $\mathbb{P}(\Omega | E) = 1$. $\mathbb{P}(F | E)$ is interpreted as the probability that the event F occurs, given the knowledge that the event E has occurred. If F is incompatible with E, that is $E \cap F = \emptyset$, then F will have probability zero under the conditional distribution. If E and F are independent events, i.e. $\mathbb{P}(E \cap F) = \mathbb{P}(E) \cdot \mathbb{P}(F)$, then $\mathbb{P}(F | E) = \mathbb{P}(F)$, so the probability does not change because the information was about an independent event. In the die throw example, if $E = \{2, 4, 6\}$ is the event "outcome is even" then the conditional distribution is

$$\mathbb{P}(i | E) = \begin{cases} 1/3 & \text{if } i \text{ is even} \\ 0 & \text{if } i \text{ is odd.} \end{cases}$$
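Both examples of this section, the die and the urn, can be reproduced by applying the definition directly. The sketch below uses only the Python standard library; the helper `conditional` and the ball labels are illustrative names, and the urn is modelled as the uniform distribution over ordered pairs of distinct balls.

```python
from fractions import Fraction
from itertools import permutations

def conditional(dist, event):
    """Conditional distribution P(.|E) of a finite distribution {outcome: p}."""
    p_event = sum(p for w, p in dist.items() if w in event)
    if p_event == 0:
        raise ValueError("conditioning event must have positive probability")
    return {w: (p / p_event if w in event else Fraction(0)) for w, p in dist.items()}

# Fair die conditioned on the event "outcome is even"
die = {i: Fraction(1, 6) for i in range(1, 7)}
given_even = conditional(die, {2, 4, 6})

# Urn with two red and two black balls: two draws without replacement
balls = ["R1", "R2", "B1", "B2"]
draws = {pair: Fraction(1, 12) for pair in permutations(balls, 2)}
first_red = {pair for pair in draws if pair[0].startswith("R")}
given_first_red = conditional(draws, first_red)
p_second_red = sum(p for pair, p in given_first_red.items() if pair[1].startswith("R"))
p_second_black = sum(p for pair, p in given_first_red.items() if pair[1].startswith("B"))

print(given_even)
print(p_second_red, p_second_black)
```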

2 Measurements
The state of a classical system can in principle be known to arbitrary accuracy by performing an appropriate
measurement, which moreover does not disturb the system. In particular, this provides us with the values
of all the observables at any time. Not surprisingly the concept of measurement does not play such an
important role here. Imagine now that the system is inside a black box, which only has some inputs and
outputs, and we would like to know its internal state. Such problems appear in many areas of science (e.g.
reverse engineering a computer program) and are the subject of an engineering field called systems theory. In
some sense, quantum systems are like black boxes of a very special nature. To learn about them we need to
“wire” them with the outside world. The measurement postulate makes this crucial connection between the
mathematical concepts of states and observables (the inside of the box), and the observations made in the lab
(the outside world).
Definition 2 (projection valued measure (PVM)). Let $\mathcal{H}$ be the state space of a quantum system. A projection valued measure over a set $\{\lambda_1, \dots, \lambda_k\}$ is given by a collection of orthogonal projections $\{P_{\lambda_1}, \dots, P_{\lambda_k}\}$ onto subspaces of $\mathcal{H}$, with the following properties:
1. orthogonality: $P_{\lambda_i} P_{\lambda_j} = \delta_{ij} P_{\lambda_i}$.
2. completeness: $\sum_{i=1}^{k} P_{\lambda_i} = I$.
For each subset $E \subseteq \{\lambda_1, \dots, \lambda_k\}$ we define
$$P(E) := \sum_{\lambda_i \in E} P_{\lambda_i}.$$

Note that there is a strong similarity between the notion of PVM and that of $\sigma$-algebra of events in probability theory. In fact the projection $P(E)$ will be used to define the probability that the event E occurs. Note also that the set elements $\lambda_i$ merely play the role of labels for the projections, which could just as well be called $P_i$, and sometimes we will adopt the latter convention.

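Postulate 3
1. A quantum measurement with outcomes $\{\lambda_1, \dots, \lambda_k\}$ on a system with state space $\mathcal{H}$ is described by a PVM $\{P_{\lambda_1}, \dots, P_{\lambda_k}\}$ on $\mathcal{H}$.
2. The result of the measurement is random and its probability distribution is

$$\mathbb{P}(\lambda_i) := \| P_{\lambda_i} |\psi\rangle \|^2 = \langle \psi | P_{\lambda_i} | \psi \rangle = \mathrm{Tr}(|\psi\rangle\langle\psi|\, P_{\lambda_i})$$

where $|\psi\rangle$ is the state of the system before the measurement.

3. If the measurement outcome is $\lambda_i$ then the state of the system immediately after the measurement is

$$|\psi'\rangle = \frac{1}{\sqrt{\mathbb{P}(\lambda_i)}} P_{\lambda_i} |\psi\rangle = \frac{P_{\lambda_i} |\psi\rangle}{\| P_{\lambda_i} |\psi\rangle \|}.$$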

The measurement postulate is perhaps the most intriguing postulate of quantum mechanics. It tells us that reality, as the collection of our observations, is intrinsically probabilistic, rather than deterministic, as modelled by classical physics. The quantum state is not directly accessible, no matter how precise the measurement device is. There is a limited amount of information that can be gained about the quantum state, and this information is of a statistical type; for example if a qubit is prepared in the state $|\psi\rangle = a|0\rangle + b|1\rangle$ and we measure in the standard basis, then we obtain two possible outcomes with probabilities $|a|^2$ and $|b|^2$. Since after the measurement the state is projected to either $|0\rangle$ or $|1\rangle$, further measurements cannot provide additional information about $|\psi\rangle$. An upshot of this is that quantum systems and measurements can be used as “truly” random number generators, which are difficult to implement on classical computers! The other striking consequence of the postulate is the fact that measurements disturb the state of the quantum system. To model the measurement process one has to consider the interaction between the system and the measurement device, which leads to a transfer of information from the former to the latter. In a future lecture we will see that quantum information cannot be copied, so the transfer is accompanied by a change in the system’s state.
The type of measurement described in the postulate is called von Neumann or projection valued measurement (PVM). In the special case of one dimensional projections onto vectors of an ONB, $P_{\lambda_i} = |e_i\rangle\langle e_i|$, the probabilities are simply the squared absolute values of the Fourier coefficients:

$$\mathbb{P}(\lambda_i) = |\langle e_i | \psi \rangle|^2,$$

and the state after the measurement is the basis vector $|e_i\rangle$. Note that if $|\psi\rangle$ is one of the basis vectors, say $|e_j\rangle$, then the measurement gives result $\lambda_j$ with probability one (i.e. the other outcomes do not occur) and the state remains undisturbed.
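The postulate translates directly into a small simulation. The following is a sketch assuming NumPy, with a pseudo-random generator standing in for the genuinely random measurement outcome; the function name `measure` and the amplitudes 0.6 and 0.8i are illustrative choices.

```python
import numpy as np

def measure(psi, projections, rng):
    """Sample an outcome index i with probability ||P_i psi||^2.

    Returns the index, the normalised post-measurement state and the
    full vector of outcome probabilities.
    """
    psi = np.asarray(psi, dtype=complex)
    probs = np.array([np.vdot(psi, P @ psi).real for P in projections])
    outcome = rng.choice(len(projections), p=probs / probs.sum())
    post = projections[outcome] @ psi
    return outcome, post / np.linalg.norm(post), probs

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
projections = [np.outer(ket0, ket0.conj()), np.outer(ket1, ket1.conj())]

psi = 0.6 * ket0 + 0.8j * ket1          # |a|^2 = 0.36, |b|^2 = 0.64
rng = np.random.default_rng(0)

first, post, probs = measure(psi, projections, rng)
# Repeating the same measurement on the post-measurement state
second, post_again, probs_after = measure(post, projections, rng)

print(probs, first, second, probs_after)
```

The second measurement reproduces the first outcome with probability one and leaves the state unchanged, so it reveals nothing further about the original amplitudes.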
When the outcomes $\lambda_i$ are real numbers (as is the case in many experiments), we are in fact talking about the measurement of the observable $A = \sum_i \lambda_i P_{\lambda_i}$. In this case, the following important rules follow directly from the measurement postulate.

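Classical and quantum expectation. Let a denote the random variable describing the result of measuring A, which takes values in $\sigma(A) = \{\lambda_1, \dots, \lambda_k\}$. Then for any function $f : \sigma(A) \to \mathbb{C}$

$$\mathbb{E}(f(a)) = \sum_i f(\lambda_i) \mathbb{P}(\lambda_i) = \sum_i f(\lambda_i) \langle \psi | P_{\lambda_i} | \psi \rangle = \langle \psi | f(A) | \psi \rangle = \mathrm{Tr}(|\psi\rangle\langle\psi|\, f(A)), \tag{1}$$

and in particular

$$\mathbb{E}(a) = \langle \psi | A | \psi \rangle, \qquad \mathrm{Var}(a) := \mathbb{E}(a^2) - \mathbb{E}(a)^2 = \langle \psi | A^2 | \psi \rangle - \langle \psi | A | \psi \rangle^2. \tag{2}$$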
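Formulas (1) and (2) say that the classical mean and variance of the outcome distribution coincide with quantum expectation values. A minimal numerical check, assuming NumPy and an arbitrarily chosen Hermitian matrix `A` and state `psi`:

```python
import numpy as np

# A hypothetical qubit observable (Hermitian) and a normalised state
A = np.array([[2, 1 - 1j], [1 + 1j, -1]], dtype=complex)
psi = np.array([0.6, 0.8j], dtype=complex)

# Spectral decomposition: eigenvalues lambda_i, eigenvectors as columns
eigvals, eigvecs = np.linalg.eigh(A)

# Outcome probabilities: squared moduli of the Fourier coefficients of psi
probs = np.abs(eigvecs.conj().T @ psi) ** 2

# "Classical" mean and variance of the outcome distribution
mean_classical = np.sum(eigvals * probs)
var_classical = np.sum(eigvals**2 * probs) - mean_classical**2

# "Quantum" expressions from (2)
mean_quantum = np.vdot(psi, A @ psi).real
var_quantum = np.vdot(psi, A @ A @ psi).real - mean_quantum**2

print(mean_classical, mean_quantum)
print(var_classical, var_quantum)
```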

Joint measurements. Let $A = \sum_i a_i |e_i\rangle\langle e_i|$ and $B = \sum_j b_j |f_j\rangle\langle f_j|$ be two observables. Suppose that the system is prepared in a state $|\psi\rangle$ and we measure first the observable A and then the observable B. What is the distribution of the outcomes, and the posterior state of the system? Do we get the same result if the measurements are performed in the opposite order?
By the 3rd postulate, the probability that the outcome of the first measurement is $a_i$ is

$$\mathbb{P}_A(a_i) = |\langle e_i | \psi \rangle|^2, \tag{3}$$

and the corresponding posterior state is $|e_i\rangle$. Given the outcome $a_i$ of the first measurement, we now measure the observable B and obtain the outcome $b_j$ with the conditional distribution

$$\mathbb{P}_{B|A}(b_j | a_i) = |\langle f_j | e_i \rangle|^2 \tag{4}$$

and the posterior state is $|f_j\rangle$. From (3) and (4) we get the joint distribution of the two outcomes (verify that this is indeed a probability distribution!)

$$\mathbb{P}_{BA}(b_j, a_i) = |\langle e_i | \psi \rangle|^2 \cdot |\langle f_j | e_i \rangle|^2.$$

As an exercise, you can show that the distribution of the reverse order measurement $\mathbb{P}_{AB}$ is equal to $\mathbb{P}_{BA}$ for every initial state if and only if the two operators commute, $AB = BA$. More generally, you can show that a ‘joint measurement’ of A and B exists if and only if $AB = BA$.
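The dependence on the order of measurement shows up already for $\sigma_z$ and $\sigma_x$, which do not commute. The sketch below assumes NumPy; bases are passed as matrices whose columns are the eigenvectors, and the function name `sequential_joint` is illustrative.

```python
import numpy as np

def sequential_joint(psi, first_basis, second_basis):
    """Joint distribution of two successive measurements in ONBs.

    joint[i, j] = P(first outcome i) * P(second outcome j | first outcome i),
    following (3) and (4); basis vectors are the columns of each matrix.
    """
    p_first = np.abs(first_basis.conj().T @ psi) ** 2
    p_cond = np.abs(second_basis.conj().T @ first_basis) ** 2   # entry [j, i]
    return p_first[:, None] * p_cond.T

z_basis = np.eye(2, dtype=complex)                               # eigenvectors of sigma_z
x_basis = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # eigenvectors of sigma_x

psi = np.array([1, 0], dtype=complex)                            # the state |0>

joint_zx = sequential_joint(psi, z_basis, x_basis)   # indexed [z outcome, x outcome]
joint_xz = sequential_joint(psi, x_basis, z_basis)   # indexed [x outcome, z outcome]

print(joint_zx)
print(joint_xz.T)   # transposed so both tables are indexed [z outcome, x outcome]
```

Measuring $\sigma_z$ first leaves $|0\rangle$ untouched, so the z outcome is certain; measuring $\sigma_x$ first randomises the later z outcome, and the two tables differ.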

3 One qubit measurements


We consider now the special case of qubit measurements and use the Bloch sphere notation introduced in Lecture 5.
Suppose that we measure one of the Pauli observables, for instance $\sigma_x$. Since the spectrum of each Pauli matrix is $\{+1, -1\}$, the outcome $s_x$ of the measurement belongs to this set. We can now use (2) to compute the expectation

$$\mathbb{E}(s_x) = (+1) \cdot \mathbb{P}([s_x = +1]) + (-1) \cdot \mathbb{P}([s_x = -1]) = 2\,\mathbb{P}([s_x = +1]) - 1 = \mathrm{Tr}(|\psi\rangle\langle\psi|\, \sigma_x) \tag{5}$$

where we have used that $\mathbb{P}([s_x = -1]) = 1 - \mathbb{P}([s_x = +1])$. Now by using the Bloch representation
$$|\psi\rangle\langle\psi| = \frac{1}{2} (I + r_x \sigma_x + r_y \sigma_y + r_z \sigma_z)$$
and the relations $\mathrm{Tr}(\sigma_x) = \mathrm{Tr}(\sigma_x \sigma_y) = \mathrm{Tr}(\sigma_x \sigma_z) = 0$ and $\sigma_x^2 = I$ we get

$$\mathrm{Tr}(|\psi\rangle\langle\psi|\, \sigma_x) = r_x. \tag{6}$$

From (5) and (6) we obtain the two probabilities

$$\mathbb{P}([s_x = +1]) = \frac{1 + r_x}{2} \qquad \text{and} \qquad \mathbb{P}([s_x = -1]) = \frac{1 - r_x}{2},$$
which have a simple geometric interpretation as halves of the lengths of the two segments of $[-1, 1]$ determined by the projection of r onto the x axis. Similar results hold for the measurement of $\sigma_y$, $\sigma_z$ and in fact for arbitrary spin directions (exercise).
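The formula $(1 \pm r_x)/2$ can be compared with a direct application of the measurement postulate. A minimal sketch assuming NumPy; the angles `theta` and `phi` are arbitrary and the helper `bloch_vector` is an illustrative name.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def bloch_vector(psi):
    """Components r_i = Tr(|psi><psi| sigma_i) of the Bloch vector."""
    rho = np.outer(psi, psi.conj())
    return np.array([np.trace(rho @ s).real for s in (X, Y, Z)])

theta, phi = 1.1, 0.4
psi = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])
r = bloch_vector(psi)

# Direct computation: squared overlap with the +1 eigenvector of sigma_x
plus_x = np.array([1, 1], dtype=complex) / np.sqrt(2)
p_plus_direct = abs(np.vdot(plus_x, psi)) ** 2

# Bloch vector formula
p_plus_bloch = (1 + r[0]) / 2

print(r, p_plus_direct, p_plus_bloch)
```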

4 Two qubits measurements


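Consider a system composed of 2 qubits, and the product basis $\{|0\rangle \otimes |0\rangle, |0\rangle \otimes |1\rangle, |1\rangle \otimes |0\rangle, |1\rangle \otimes |1\rangle\}$ of the corresponding space $\mathbb{C}^2 \otimes \mathbb{C}^2$. We now define the following orthonormal basis of entangled states called the Bell basis (verify that they form an ONB!)
$$\begin{cases}
|\Phi^+\rangle = (|0\rangle \otimes |0\rangle + |1\rangle \otimes |1\rangle)/\sqrt{2} \\
|\Phi^-\rangle = (|0\rangle \otimes |0\rangle - |1\rangle \otimes |1\rangle)/\sqrt{2} \\
|\Psi^+\rangle = (|0\rangle \otimes |1\rangle + |1\rangle \otimes |0\rangle)/\sqrt{2} \\
|\Psi^-\rangle = (|0\rangle \otimes |1\rangle - |1\rangle \otimes |0\rangle)/\sqrt{2}.
\end{cases} \tag{7}$$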

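Suppose that the qubits are prepared in the state $|\Phi^+\rangle$ and we perform a measurement of the observable $\sigma_z = |0\rangle\langle 0| - |1\rangle\langle 1|$ on each of the two qubits separately. Each qubit measurement has outcomes $\{+1, -1\}$ and the corresponding projections are $P_+ = |0\rangle\langle 0|$ and $P_- = |1\rangle\langle 1|$. The joint measurement has 4 outcomes $\{(+1, +1), (+1, -1), (-1, +1), (-1, -1)\}$ and the projections are the tensor products of the projections corresponding to the outcome for each qubit

$$P_+ \otimes P_+, \qquad P_+ \otimes P_-, \qquad P_- \otimes P_+, \qquad P_- \otimes P_-.$$

By applying the third postulate we find that the probabilities of the 4 outcomes are
$$\begin{aligned}
\mathbb{P}(+1, +1) &= \mathrm{Tr}\left[(P_+ \otimes P_+) |\Phi^+\rangle\langle\Phi^+|\right] = |\langle 0 \otimes 0 | \Phi^+ \rangle|^2 = \frac{1}{2} \\
\mathbb{P}(+1, -1) &= \mathrm{Tr}\left[(P_+ \otimes P_-) |\Phi^+\rangle\langle\Phi^+|\right] = |\langle 0 \otimes 1 | \Phi^+ \rangle|^2 = 0 \\
\mathbb{P}(-1, +1) &= \mathrm{Tr}\left[(P_- \otimes P_+) |\Phi^+\rangle\langle\Phi^+|\right] = |\langle 1 \otimes 0 | \Phi^+ \rangle|^2 = 0 \\
\mathbb{P}(-1, -1) &= \mathrm{Tr}\left[(P_- \otimes P_-) |\Phi^+\rangle\langle\Phi^+|\right] = |\langle 1 \otimes 1 | \Phi^+ \rangle|^2 = \frac{1}{2}.
\end{aligned}$$
Clearly, the outcomes of the two measurements are perfectly correlated, more precisely

$$\mathbb{P}([+1 \text{ on left}] \,|\, [+1 \text{ on right}]) = \mathbb{P}([-1 \text{ on left}] \,|\, [-1 \text{ on right}]) = 1.$$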
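These four probabilities and the conditional probabilities can be recomputed with tensor products of projections. A minimal sketch assuming NumPy; the dictionary keys are the outcome pairs (left, right), and the helper name is illustrative.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# Projections for the outcomes +1 and -1 of sigma_z on a single qubit
projections = {+1: np.outer(ket0, ket0.conj()), -1: np.outer(ket1, ket1.conj())}

# The Bell state (|00> + |11>)/sqrt(2)
phi_plus = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)

# Joint outcome probabilities from the measurement postulate
probs = {
    (a, b): np.vdot(phi_plus, np.kron(Pa, Pb) @ phi_plus).real
    for a, Pa in projections.items()
    for b, Pb in projections.items()
}

def left_given_right(a, b):
    """Conditional probability of outcome a on the left given b on the right."""
    p_right = sum(p for (_, right), p in probs.items() if right == b)
    return probs[(a, b)] / p_right

print(probs)
print(left_given_right(+1, +1), left_given_right(-1, -1))
```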

If we imagine that the two qubits are at distant locations, this means that knowing the result of one measurement immediately tells us the result of the distant measurement, a phenomenon which Einstein derided as “spooky action at a distance” since it seems to contradict our intuition that information cannot travel faster than the speed of light. A closer look however shows that the correlations cannot be used to transmit information, in other words quantum mechanics is a no-signalling theory. In fact perfect correlations can occur in the "classical set-up" as well; imagine for instance an experiment in which two balls are sent in opposite directions towards two distant labs, and the sender arranges that both balls are either red or black, each with probability 1/2. The two experimenters would see the same distribution of outcomes as in the qubits measurement, but clearly this is not a "spooky action at a distance", but the consequence of the sender’s distribution. However, as we will see later, Bell’s theorem shows that the balls set-up (or so called hidden variables theories) cannot explain the correlations predicted by slightly more elaborate measurements on the 2 qubits!

5 Summary of the postulates of quantum mechanics
Postulate 1. The state space of each quantum system is a Hilbert space. A state is a normalised vector $|\psi\rangle$, modulo phase factors. Equivalently, a state can be described by the projection $|\psi\rangle\langle\psi|$.

Postulate 2. The evolution of a closed quantum system is given by a unitary transformation

$$U : |\psi(t_1)\rangle \mapsto |\psi(t_2)\rangle = U|\psi(t_1)\rangle.$$

Postulate 3. Any measurement with outcomes in $\{\lambda_1, \dots, \lambda_k\}$ is given by a set of mutually orthogonal projections $\{P_{\lambda_1}, \dots, P_{\lambda_k}\}$ adding up to the identity. The probability of obtaining the outcome $\lambda_i$ when the system was prepared in state $|\psi\rangle$ is
$$\mathbb{P}(\lambda_i) := \| P_{\lambda_i} |\psi\rangle \|^2 = \langle \psi | P_{\lambda_i} | \psi \rangle = \mathrm{Tr}(|\psi\rangle\langle\psi|\, P_{\lambda_i}).$$
The state of the system immediately after the measurement is
$$|\psi'\rangle = \frac{1}{\sqrt{\mathbb{P}(\lambda_i)}} P_{\lambda_i} |\psi\rangle = \frac{P_{\lambda_i} |\psi\rangle}{\| P_{\lambda_i} |\psi\rangle \|}.$$
When the $\lambda_i$ are real numbers, we say that we measure the observable $A = \sum_i \lambda_i P_{\lambda_i}$.

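Bloch vector representation: any vector state can be written as

$$|\psi\rangle = e^{i\gamma} \left( \cos\frac{\theta}{2}\, |0\rangle + e^{i\phi} \sin\frac{\theta}{2}\, |1\rangle \right)$$
and the corresponding projection has the following representation
$$|\psi\rangle\langle\psi| = \frac{1}{2} (I + r_x \sigma_x + r_y \sigma_y + r_z \sigma_z)$$
where $r = (r_x, r_y, r_z)$ is a vector of length one with polar coordinates $(\theta, \phi)$.
The measurement of the spin observable $\sigma_i$ has an outcome $s_i \in \{+1, -1\}$ with distribution
$$\mathbb{P}(s_i = +1) = \frac{1 + r_i}{2}, \qquad \mathbb{P}(s_i = -1) = \frac{1 - r_i}{2}.$$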

Tensor product of two Hilbert spaces: $\mathcal{H}_1 \otimes \mathcal{H}_2$ is the span of tensor product vectors $|u \otimes v\rangle$ with inner product obtained by sesquilinear extension of
$$\langle u \otimes v | u' \otimes v' \rangle = \langle u | u' \rangle \langle v | v' \rangle.$$

Tensor product of two linear operators: if $A : \mathcal{H}_1 \to \mathcal{H}_1$ and $B : \mathcal{H}_2 \to \mathcal{H}_2$ then $A \otimes B$ is the linear operator on $\mathcal{H}_1 \otimes \mathcal{H}_2$ whose action on product vectors is
$$A \otimes B : |\psi_1\rangle \otimes |\psi_2\rangle \mapsto A|\psi_1\rangle \otimes B|\psi_2\rangle.$$

Postulate 4. The state space of a composite system is the tensor product of the state spaces of the component systems. If the systems are prepared in isolation from each other, then the joint state is
$$|\psi\rangle = |\psi_1\rangle \otimes \cdots \otimes |\psi_N\rangle.$$

Entangled states: States which are not of the product form are called entangled, for example the 2 qubit state
$$(|0\rangle \otimes |0\rangle + |1\rangle \otimes |1\rangle)/\sqrt{2}.$$
These states play an important role in several quantum information protocols.


Lecture 8: Bell’s inequality

Abstract: In 1935 Einstein, Podolsky and Rosen put forward the idea that quantum mechanics is incomplete, and may have a classical description in terms of random variables on a probability space (hidden variables). John Bell devised an experimental test to settle this question. The Bell inequality shows that the correlations between outcomes of certain measurements on an entangled bipartite system cannot be described in terms of classical random variables. Recently, this has been confirmed experimentally, which shows that quantum mechanics is not a hidden variables theory.

1 The EPR paradox


Throughout the lectures we have compared the formalism of quantum mechanics with that of classical physics and probability. We already found evidence that quantum mechanics is ‘special’, e.g. the fact that measurements are intrinsically probabilistic, that quantum systems are disturbed by observation, and that there exist pure states of bipartite systems which have a special type of correlation called entanglement. However, a sceptical physicist can argue that Nature is ultimately classical and all the special features of quantum mechanics are just artefacts of the mathematical formalism (remember that a quantum state is ‘just’ a mathematical concept), and that ultimately we may find a different theory which makes the same predictions but is completely classical. Einstein was such a physicist, and together with Nathan Rosen and Boris Podolsky he devised a thought experiment meant to prove that quantum mechanics is an incomplete theory of reality [1]. Consider the singlet state
$$|\Psi^-\rangle = \frac{|0\rangle \otimes |1\rangle - |1\rangle \otimes |0\rangle}{\sqrt{2}} \tag{1}$$
which has the property that if we measure on both sides (Alice and Bob) in the standard basis (i.e. we measure $\sigma_z$) then we obtain either the results $(+1, -1)$ or $(-1, +1)$ and never the other results. Here the outcomes are $\{+1, -1\}$, the eigenvalues of $\sigma_z$, and the projectors are $P_{+1} := |0\rangle\langle 0|$ and $P_{-1} := |1\rangle\langle 1|$. This means that whenever Alice obtains result +1, Bob knows for sure before doing his measurement that the outcome will be −1, and the same for the pair $(-1, +1)$. In fact if we measure the spin on both sides in an arbitrary direction n (i.e. we measure $\sigma_n = n_x \sigma_x + n_y \sigma_y + n_z \sigma_z$, with $\|n\| = 1$) the results of the measurements will have the same property! To see that this is the case let $|0, n\rangle$ and $|1, n\rangle$ be the two eigenvectors of $\sigma_n$ and write

$$|0\rangle = a|0, n\rangle + b|1, n\rangle, \qquad |1\rangle = c|0, n\rangle + d|1, n\rangle$$
where $U := \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is a unitary matrix. Expanding the products and noting that the $|0, n\rangle \otimes |0, n\rangle$ and $|1, n\rangle \otimes |1, n\rangle$ terms cancel, we get

$$|\Psi^-\rangle = \frac{|0\rangle \otimes |1\rangle - |1\rangle \otimes |0\rangle}{\sqrt{2}} = (ad - bc)\, \frac{|0, n\rangle \otimes |1, n\rangle - |1, n\rangle \otimes |0, n\rangle}{\sqrt{2}}$$
where $ad - bc = \mathrm{Det}(U)$ is an irrelevant phase factor. Since the state has the same form in the new basis, the measurement of $\sigma_n$ exhibits the same correlations as that of $\sigma_z$.
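The basis independence of the singlet can be checked numerically in the equivalent form $(U \otimes U)|\Psi^-\rangle = \mathrm{Det}(U)\,|\Psi^-\rangle$ for any unitary U. The sketch below assumes NumPy and generates a random unitary from a QR decomposition, which is just one convenient way to obtain a test matrix.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random 2x2 unitary: the Q factor of a random complex matrix
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
U, _ = np.linalg.qr(M)

# Singlet state (|01> - |10>)/sqrt(2) in the basis |00>, |01>, |10>, |11>
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

# Applying the same change of basis on both qubits only multiplies by det(U)
rotated = np.kron(U, U) @ singlet
det_U = np.linalg.det(U)

is_unitary = np.allclose(U.conj().T @ U, np.eye(2))
singlet_invariant = np.allclose(rotated, det_U * singlet)
print(is_unitary, singlet_invariant, abs(det_U))
```

The same test fails for the other Bell states, which are invariant only under restricted families of local basis changes.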

Einstein, Podolsky and Rosen (EPR) considered that since Bob knows for sure the outcome of the measurement on his side, the spin component along the axis n must be an element of reality, just like the position of a particle, i.e. it is a classical variable. Since quantum mechanics does not talk about the value of the spin in a certain direction, but only predicts probabilities of the outcomes, EPR concluded that it must be incomplete.

2 The Bell inequality


If the above reasoning left you in the dark, it is partly due the fact that it is based on philosophical arguments
about the nature of reality. Nearly thirty years later John Bell found a quantitative way of testing the EPR
paradigm of ‘hidden variables’ [2]. He showed that certain correlations in classical (hidden variables) theories
must obey an inequality while the correlations arising from quantum measurements violate this inequality.
Subsequent experiments confirmed the violation of Bell’s inequality, in agreement with quantum mechanics.
Bell’s inequality is about random variables, so for the moment we can forget about quantum mechanics
and assume that physics is classical. This means that performing a measurement is essentially observing
an aspect of the world, like the position or momentum of a particle, and all these aspects are described by
random variables X, Y, .... on some probability space, so we can talk about their joint distribution.
Bell’s experiment is the following (see Figure 1). A third party C (Charlie) has a device that can prepare two
particles in a certain state, so that he can repeat this preparation many times independently. After preparing
the particles, he sends one to Alice and the other to Bob who can perform some measurements. Suppose that
Alice can choose between measuring two different properties of her particle like position and momentum,
and assume for simplicity that the measurements have outcomes in $\{+1,-1\}$. We model these outcomes
by a pair of random variables $X_1$ and $X_2$. Bob has a similar device and his outcomes are described by the
random variables $Y_1$ and $Y_2$ with values in $\{+1,-1\}$.

[Figure 1 diagram: Charlie prepares 2 particles and sends one to Alice, who measures X1 or X2, and one to Bob, who measures Y1 or Y2.]

Figure 1: Bell’s experiment. Charlie sends particles to Alice and Bob; Alice measures one of the
‘observables’ $X_1$ or $X_2$ and Bob measures one of $Y_1$ or $Y_2$, all with values in $\{-1,+1\}$.

CHSH ‘Bell inequality’. The following inequality [3], called CHSH after its authors Clauser,
Horne, Shimony and Holt, is a particular instance of a Bell inequality:

$$\mathbb{E}(X_1 Y_1) + \mathbb{E}(X_2 Y_1) + \mathbb{E}(X_2 Y_2) - \mathbb{E}(X_1 Y_2) \le 2. \tag{2}$$

Proof. Note that

$$X_1 Y_1 + X_2 Y_1 + X_2 Y_2 - X_1 Y_2 = (X_1 + X_2) Y_1 + (X_2 - X_1) Y_2. \tag{3}$$

Since $X_1, X_2 \in \{-1,+1\}$ it follows that either $X_1 + X_2 = 0$ or $X_1 - X_2 = 0$. From (3) we get in either
case that $X_1 Y_1 + X_2 Y_1 + X_2 Y_2 - X_1 Y_2 = \pm 2$. If the joint distribution of $(X_1, X_2, Y_1, Y_2)$ is

$$p(i,j,k,l) = \mathbb{P}(X_1 = i, X_2 = j, Y_1 = k, Y_2 = l)$$

then by taking the expectation of (3) we get

$$\begin{aligned}
\mathbb{E}(X_1 Y_1) + \mathbb{E}(X_2 Y_1) + \mathbb{E}(X_2 Y_2) - \mathbb{E}(X_1 Y_2)
&= \sum_{i,j,k,l \in \{+1,-1\}} p(i,j,k,l)\,(ik + jk + jl - il) \\
&\le \sum_{i,j,k,l \in \{+1,-1\}} p(i,j,k,l) \times 2 = 2.
\end{aligned}$$
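The key step of the proof, that the CHSH combination equals $\pm 2$ for every deterministic assignment of values, can also be checked by brute force. The following is a minimal sketch; the function name `chsh_values` is chosen here for illustration only.

```python
from itertools import product

def chsh_values():
    """Collect all values of X1*Y1 + X2*Y1 + X2*Y2 - X1*Y2 over the
    16 assignments of +1/-1 to (X1, X2, Y1, Y2)."""
    values = set()
    for x1, x2, y1, y2 in product((+1, -1), repeat=4):
        values.add(x1 * y1 + x2 * y1 + x2 * y2 - x1 * y2)
    return values

if __name__ == "__main__":
    # Only the values +2 and -2 occur, so any average is at most 2.
    print(chsh_values())
```

Since every assignment gives $+2$ or $-2$, any probability mixture of assignments has expectation between $-2$ and $2$, which is exactly the content of (2).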

Experimental verification procedure. How can such an inequality be verified in an experiment? For each
pair of particles produced by Charlie, Alice and Bob each perform one of their two possible measurements and note
down the results. They repeat this many times with each of the 4 combinations of measurements. Then they
use their lists of results to estimate each of the expectations in (2). For example, to compute $\mathbb{E}(X_1 Y_1)$ they
perform repeated measurements in settings $S_A = 1$ and $S_B = 1$ and obtain a sequence of independent pairs
of outcomes $(X_1^{(i)}, Y_1^{(i)})$. Then they take the products $X_1^{(i)} \cdot Y_1^{(i)}$ and average over the whole sequence. For
a large number of measurements this average approximates the expectation with arbitrary accuracy:

$$\frac{1}{n} \sum_{i=1}^{n} X_1^{(i)} \cdot Y_1^{(i)} \approx \mathbb{E}(X_1 Y_1).$$

What we have used here is a fundamental theorem of probability called the Law of Large Numbers. Think of
estimating the probability of heads and tails for a biased coin by tossing the coin many times and noting the
frequencies of the two outcomes.
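The estimation procedure can be sketched as a small simulation. The hidden variables model below is a toy model invented for this illustration (it is not taken from the notes): with probability 0.9 all four variables equal $+1$, otherwise they are independent fair coin flips, so each exact correlation is 0.9 and the exact CHSH value is 1.8. All function names are hypothetical.

```python
import random

def sample_hidden_variables(rng):
    """Draw one tuple (x1, x2, y1, y2) from a toy classical model."""
    if rng.random() < 0.9:
        return (1, 1, 1, 1)  # perfectly correlated preparation
    return tuple(rng.choice((1, -1)) for _ in range(4))  # independent fair coin flips

def estimate_correlation(a, b, n, rng):
    """Estimate E(X_a Y_b) from n independent runs in settings S_A = a, S_B = b."""
    total = 0
    for _ in range(n):
        x1, x2, y1, y2 = sample_hidden_variables(rng)
        x = (x1, x2)[a - 1]  # Alice only sees the variable selected by her setting
        y = (y1, y2)[b - 1]  # Bob only sees the variable selected by his setting
        total += x * y
    return total / n

def estimate_chsh(n=40000, seed=0):
    """Estimate the left-hand side of the CHSH inequality from sample averages."""
    rng = random.Random(seed)
    e = {(a, b): estimate_correlation(a, b, n, rng) for a in (1, 2) for b in (1, 2)}
    return e[(1, 1)] + e[(2, 1)] + e[(2, 2)] - e[(1, 2)]

if __name__ == "__main__":
    print(estimate_chsh())  # close to 1.8, and in particular below 2
```

Each run uses only one pair of variables, as in the real experiment, yet the four sample averages combine into an estimate that respects the classical bound.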
The quantum experiment. We will now show that experiments involving measurements on quantum particles
violate the CHSH inequality. Suppose that Charlie prepares the state $|\psi\rangle$ defined in (1). He gives the
first qubit to Alice and the second to Bob. Alice measures one of the two observables $X_1 = \sigma_z$ and $X_2 = \sigma_x$,
and Bob measures one of the two observables $Y_1 = -(\sigma_z + \sigma_x)/\sqrt{2}$ and $Y_2 = (\sigma_z - \sigma_x)/\sqrt{2}$. The results
are random variables, also denoted $X_1$ and $X_2$ and respectively $Y_1$ and $Y_2$, which take values in $\{+1,-1\}$. The
four expectations appearing in the CHSH inequality are (exercise)

$$\begin{aligned}
\mathbb{E}(X_1 Y_1) &= \langle\psi| X_1 \otimes Y_1 |\psi\rangle = \frac{1}{\sqrt{2}} \\
\mathbb{E}(X_2 Y_1) &= \langle\psi| X_2 \otimes Y_1 |\psi\rangle = \frac{1}{\sqrt{2}} \\
\mathbb{E}(X_2 Y_2) &= \langle\psi| X_2 \otimes Y_2 |\psi\rangle = \frac{1}{\sqrt{2}} \\
\mathbb{E}(X_1 Y_2) &= \langle\psi| X_1 \otimes Y_2 |\psi\rangle = -\frac{1}{\sqrt{2}}
\end{aligned}$$

which implies that

$$\mathbb{E}(X_1 Y_1) + \mathbb{E}(X_2 Y_1) + \mathbb{E}(X_2 Y_2) - \mathbb{E}(X_1 Y_2) = 2\sqrt{2} > 2$$

so the CHSH inequality is violated!
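The four expectations can be checked numerically. The sketch below assumes that the state in (1) is the singlet $(|01\rangle - |10\rangle)/\sqrt{2}$ (as in the summary of this lecture) and takes Bob's observables with the signs $Y_1 = -(\sigma_z + \sigma_x)/\sqrt{2}$ and $Y_2 = (\sigma_z - \sigma_x)/\sqrt{2}$, which is the choice that gives a CHSH value of $2\sqrt{2}$ for this state. The helper names are illustrative only.

```python
from math import sqrt

SX = [[0, 1], [1, 0]]    # sigma_x
SZ = [[1, 0], [0, -1]]   # sigma_z

def combine(c1, m1, c2, m2):
    """Return the 2x2 matrix c1*m1 + c2*m2."""
    return [[c1 * m1[r][c] + c2 * m2[r][c] for c in range(2)] for r in range(2)]

def expectation(psi, a, b):
    """<psi| a (x) b |psi> for a two-qubit vector with psi[2*i + j] the amplitude of |i>|j>."""
    total = 0
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    total += (complex(psi[2 * i + j]).conjugate()
                              * a[i][k] * b[j][l] * psi[2 * k + l])
    return total.real

def chsh_singlet():
    s = 1 / sqrt(2)
    psi = [0, s, -s, 0]             # singlet (|01> - |10>)/sqrt(2)
    x1, x2 = SZ, SX                 # Alice's observables
    y1 = combine(-s, SZ, -s, SX)    # -(sigma_z + sigma_x)/sqrt(2)
    y2 = combine(s, SZ, -s, SX)     #  (sigma_z - sigma_x)/sqrt(2)
    return (expectation(psi, x1, y1) + expectation(psi, x2, y1)
            + expectation(psi, x2, y2) - expectation(psi, x1, y2))

if __name__ == "__main__":
    print(chsh_singlet(), 2 * sqrt(2))
```

The computation uses only the rule $\mathbb{E}(X_a Y_b) = \langle\psi|X_a \otimes Y_b|\psi\rangle$ stated above.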
Let’s reflect for a moment on the meaning of this violation. We have just shown that the inequality must be
true for all random variables having a joint probability distribution. What has changed in the quantum case?
The point is that the random variables in the quantum experiment do not have a joint distribution. When Alice
measures $X_1$ and Bob measures $Y_1$, the results $(X_1, Y_1)$ do have a joint probability distribution. However
we cannot speak of the distribution of all of $(X_1, Y_1, X_2, Y_2)$. This is because the latter two, $(X_2, Y_2)$, represent
results which would have been obtained had we measured the (different) observables $X_2, Y_2$. But since
$X_1$ and $X_2$ do not commute, and similarly for $Y_1$ and $Y_2$, these two measurements cannot be performed
simultaneously, and quantum mechanics tells us that we cannot attribute definite values to noncommuting
observables. So the spin components in different directions are not ‘elements of reality’ in the sense of EPR,
they are not hidden variables waiting to be measured, and Nature cannot be described by a hidden variables
theory.

Local realism (LR). In the statement of the Bell inequality we have two hidden assumptions.
1. We assume that the variables X1 , X2 , Y1 , Y2 exist independent of observation. Strictly speaking, the
experiment involves only pairs of variables at each time, so how do we know that the other two exist ? This
assumption is called realism and is a fundamental feature of classical physics.

[Figure 2 diagram: Charlie prepares 2 particles and sends one to Alice’s device (setting SA, outcome X) and one to Bob’s device (setting SB, outcome Y).]

Figure 2: Experimental setting for testing Bell’s inequality. Alice has a device on which she can choose a
setting $S_A \in \{1,2\}$; the device ‘measures’ the particle and outputs outcome $X \in \{+1,-1\}$. Similarly Bob’s
setting is $S_B \in \{1,2\}$ and the output is $Y \in \{+1,-1\}$.

2. We assumed that Bob’s choice of measurement cannot influence Alice’s measurement. This was done
covertly by saying that what Alice and Bob measure are pairs of ‘observables’ like (X1 , Y2 ). This assumption
is called locality and most physicists consider it as a fundamental aspect of Nature. Here is a more general
description of the experiment in which an explicit no-signalling assumption is necessary. Instead of assuming
the existence of the random variables X1 , X2 , Y1 , Y2 we assume that Alice has a measurement device which
has two settings that can be chosen by pressing one of the two buttons on the device (see Figure 2). She first
chooses a setting $S_A \in \{1,2\}$, and then the device ‘measures’ the particle sent by Charlie and outputs the
outcome $X$. Similarly on Bob’s side the setting is $S_B \in \{1,2\}$ and the outcome is $Y$. The choice of settings
can be random and Alice and Bob may even allow for correlations in the choices of settings. At the end of the
day we have a probability distribution over settings and outcomes

$$q(x, y, s_a, s_b) = \mathbb{P}(X = x, Y = y, S_A = s_a, S_B = s_b).$$

How does the previous description fit into this more general one? Observing the random variable $X_{s_a}$ in the
previous description means that Alice fixes a setting $S_A = s_a$ and then observes $X$. This can be expressed in
terms of conditional probabilities

$$q(x, y \,|\, s_a, s_b) = \mathbb{P}(X_{s_a} = x, Y_{s_b} = y).$$

An important feature of these probabilities is the no-signalling condition, which says that the conditional
distribution of $X$ does not depend on the setting $s_b$:

$$q(x \,|\, s_a, s_b) = \sum_{y} q(x, y \,|\, s_a, s_b) = \sum_{y} \mathbb{P}(X_{s_a} = x, Y_{s_b} = y) = \mathbb{P}(X_{s_a} = x)$$

so Alice’s result cannot be influenced by the choice of setting on Bob’s side. This no-signalling condition,
which comes for free in the $X_1, X_2, Y_1, Y_2$ formulation, needs to be added as an assumption in the general
formulation in order to prove Bell’s inequality.
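The no-signalling condition can be illustrated with the quantum correlations themselves. The sketch below relies on a standard fact that is assumed here rather than derived in the notes: for $\pm 1$-valued observables measured on the singlet state, the local averages vanish, so $q(x,y|s_a,s_b) = (1 + x\,y\,E(s_a,s_b))/4$, where $E(s_a,s_b)$ is the correlation in those settings. The correlation table uses the values $\pm 1/\sqrt{2}$ of the CHSH experiment; the names `CORR`, `q` and `alice_marginal` are hypothetical.

```python
S = 2 ** -0.5

# Correlations E(X_sa Y_sb) in the CHSH experiment, indexed by (sa, sb).
CORR = {(1, 1): S, (2, 1): S, (2, 2): S, (1, 2): -S}

def q(x, y, sa, sb):
    """Conditional probability of outcomes (x, y) given settings (sa, sb)."""
    return (1 + x * y * CORR[(sa, sb)]) / 4

def alice_marginal(x, sa, sb):
    """q(x | sa, sb): sum the joint conditional probability over Bob's outcome."""
    return sum(q(x, y, sa, sb) for y in (+1, -1))

if __name__ == "__main__":
    for sa in (1, 2):
        for x in (+1, -1):
            # The two printed numbers coincide: Bob's setting does not affect Alice.
            print(sa, x, alice_marginal(x, sa, 1), alice_marginal(x, sa, 2))
```

So correlations that violate the CHSH inequality can still be no-signalling: Alice's marginal is 1/2 whatever Bob chooses.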
But this means that an experimentalist who wants to demonstrate the violation of Bell’s inequality in quantum
mechanics needs to make sure that the set-up is non-signalling. Since relativity theory says that signals
cannot travel faster than the speed of light, the non-signalling condition can be fulfilled if Alice and Bob are
sufficiently far apart and the settings are chosen very close to each other in time. Such an experiment was
done by Alain Aspect in Paris in the early 1980s using entangled photons [4], but the quest to close some of the
loopholes in Bell experiments continued until the definitive experimental confirmation in 2015 [5].

3 The GHZ experiment
The incompatibility of local realism (or hidden variables theory) with quantum mechanics can be captured
in a variety of different ways [6]. We present here an even more striking argument due to Greenberger, Horne
and Zeilinger (GHZ) [7], adopting a more elegant version due to Mermin [8]. The experimental set-up
is similar to that of Figure 1, with the difference that we have 3 parties, Alice, Bob and Charlie, each
receiving a particle that can be ‘measured’ with the corresponding device. Each device can measure one of
two ‘observables’, $X^A_{1,2}$, $X^B_{1,2}$ and respectively $X^C_{1,2}$, taking values in $\{-1,+1\}$. It is easy to check that there
is no set of values for the hidden variables such that the following equalities hold simultaneously:

$$\begin{aligned}
X^A_1 \cdot X^B_2 \cdot X^C_2 &= +1 \\
X^A_2 \cdot X^B_1 \cdot X^C_2 &= +1 \\
X^A_2 \cdot X^B_2 \cdot X^C_1 &= +1 \\
X^A_1 \cdot X^B_1 \cdot X^C_1 &= -1.
\end{aligned} \tag{4}$$

Indeed, multiplying the first three equalities gives $X^A_1 \cdot X^B_1 \cdot X^C_1 = +1$, because each of $X^A_2, X^B_2, X^C_2$ appears twice and squares to 1; this contradicts the fourth equality.
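The claim that the four equalities have no common solution can be confirmed by enumerating all 64 assignments of $\pm 1$ to the six hidden variables, with the right-hand sides $+1, +1, +1, -1$. This is a small illustrative sketch; the function name is hypothetical.

```python
from itertools import product

def ghz_solutions():
    """Return all +1/-1 assignments (a1, a2, b1, b2, c1, c2) satisfying the
    four GHZ equalities; a1 stands for X_1^A, b2 for X_2^B, and so on."""
    solutions = []
    for a1, a2, b1, b2, c1, c2 in product((+1, -1), repeat=6):
        if (a1 * b2 * c2 == 1 and a2 * b1 * c2 == 1
                and a2 * b2 * c1 == 1 and a1 * b1 * c1 == -1):
            solutions.append((a1, a2, b1, b2, c1, c2))
    return solutions

if __name__ == "__main__":
    print(ghz_solutions())  # the list is empty
```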

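The quantum experiment. Suppose that the parties share the 3-qubit GHZ state

$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |000\rangle - |111\rangle \right),$$

and each party can measure either $\sigma_x$ or $\sigma_y$, which correspond to $X_1$ and respectively $X_2$ above. As an exercise
you can verify that

$$\begin{aligned}
\sigma_x \otimes \sigma_y \otimes \sigma_y |\psi\rangle &= +1\,|\psi\rangle \\
\sigma_y \otimes \sigma_x \otimes \sigma_y |\psi\rangle &= +1\,|\psi\rangle \\
\sigma_y \otimes \sigma_y \otimes \sigma_x |\psi\rangle &= +1\,|\psi\rangle \\
\sigma_x \otimes \sigma_x \otimes \sigma_x |\psi\rangle &= -1\,|\psi\rangle.
\end{aligned}$$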
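The eigenvalue equations can be verified with a few lines of code. The sketch below takes the GHZ state with a relative minus sign, $(|000\rangle - |111\rangle)/\sqrt{2}$, which is the sign convention for which the four eigenvalues come out as $+1, +1, +1, -1$; it uses $\sigma_x|b\rangle = |1-b\rangle$, $\sigma_y|0\rangle = i|1\rangle$ and $\sigma_y|1\rangle = -i|0\rangle$. The representation of states as dictionaries and all function names are choices made for this illustration.

```python
from math import sqrt

def apply_pauli(label, bit):
    """Action of sigma_x or sigma_y on a basis state |bit>: returns (new_bit, coefficient)."""
    if label == "x":
        return 1 - bit, 1
    if label == "y":
        return 1 - bit, (1j if bit == 0 else -1j)
    raise ValueError(label)

def apply_product(labels, state):
    """Apply a tensor product of three Pauli matrices to a state {(b0, b1, b2): amplitude}."""
    out = {}
    for bits, amp in state.items():
        new_bits, coeff = [], amp
        for label, bit in zip(labels, bits):
            new_bit, c = apply_pauli(label, bit)
            new_bits.append(new_bit)
            coeff *= c
        key = tuple(new_bits)
        out[key] = out.get(key, 0) + coeff
    return out

def eigenvalue(labels, state):
    """Return lam if the product operator maps state to lam*state, otherwise None."""
    image = apply_product(labels, state)
    lam = sum(complex(state.get(k, 0)).conjugate() * v for k, v in image.items())
    keys = set(image) | set(state)
    ok = all(abs(image.get(k, 0) - lam * state.get(k, 0)) < 1e-12 for k in keys)
    return lam.real if ok else None

GHZ = {(0, 0, 0): 1 / sqrt(2), (1, 1, 1): -1 / sqrt(2)}

if __name__ == "__main__":
    for labels in ("xyy", "yxy", "yyx", "xxx"):
        print(labels, eigenvalue(labels, GHZ))
```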

This means that in each of the 4 measurement configurations, when the three observables are measured, the
product will always be equal to the eigenvalue $\pm 1$ on the right side of the equation. This seems to contradict
the fact that equations (4) cannot hold simultaneously, and thus to prove that quantum mechanics violates local
realism. Moreover, it seems that this can be done in a single experiment, rather than by collecting statistical
evidence from repeated experiments as in the case of the CHSH inequality. Indeed, if something can never
happen classically, but happens with probability one quantumly, we only need to verify it once!
However a careful analysis shows [9] that while the GHZ set-up is incompatible with local realism, this
cannot be established by performing a single experiment. The 4 triples of observables specified above cannot
be measured simultaneously since $\sigma_x$ and $\sigma_y$ do not commute. In a single experiment we can only obtain
the values of a single triple, e.g. $(X^A_1, X^B_2, X^C_2)$, and any sequence of successive (independent) outcomes
has a non-zero probability in a hidden variables model. The incompatibility of the two models can however
be established statistically [9,10], as the accumulating measurement data points to the fact that the correlations
between outcomes cannot be explained by a hidden variables model. Similarly to the CHSH case, this can be
expressed via an inequality. As an exercise you can show that in the hidden variables model the following is
always true

$$\left| \mathbb{E}(X^A_1 X^B_2 X^C_2) + \mathbb{E}(X^A_2 X^B_1 X^C_2) + \mathbb{E}(X^A_2 X^B_2 X^C_1) - \mathbb{E}(X^A_1 X^B_1 X^C_1) \right| \le 2$$

while on the quantum level, the expectations with respect to the state $|\psi\rangle$ satisfy

$$\langle\psi| \sigma_x \otimes \sigma_y \otimes \sigma_y |\psi\rangle + \langle\psi| \sigma_y \otimes \sigma_x \otimes \sigma_y |\psi\rangle + \langle\psi| \sigma_y \otimes \sigma_y \otimes \sigma_x |\psi\rangle - \langle\psi| \sigma_x \otimes \sigma_x \otimes \sigma_x |\psi\rangle = 4.$$
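The classical bound can be checked in the same brute-force spirit as for CHSH: for every deterministic assignment of $\pm 1$ values the combination of the four products equals $\pm 2$, so its expectation under any hidden variables distribution lies in $[-2, 2]$. A minimal sketch, with an illustrative function name:

```python
from itertools import product

def mermin_classical_values():
    """Values of a1*b2*c2 + a2*b1*c2 + a2*b2*c1 - a1*b1*c1 over all +1/-1 assignments."""
    values = set()
    for a1, a2, b1, b2, c1, c2 in product((+1, -1), repeat=6):
        values.add(a1 * b2 * c2 + a2 * b1 * c2 + a2 * b2 * c1 - a1 * b1 * c1)
    return values

if __name__ == "__main__":
    print(mermin_classical_values())  # only +2 and -2 occur, never 4
```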

References.
[1] A. Einstein, B. Podolsky, N. Rosen, Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?, Phys. Rev. 47 777-780 (1935).
[2] J.S. Bell, On the Einstein-Podolsky-Rosen paradox, Physics 1 195-200 (1964).
[3] J.F. Clauser, M.A. Horne, A. Shimony, R.A. Holt, Proposed Experiment to Test Local Hidden-Variable Theories, Phys. Rev. Lett. 23 880-884 (1969).
[4] A. Aspect, P. Grangier, G. Roger, Experimental Realization of Einstein-Podolsky-Rosen-Bohm Gedankenexperiment: A New Violation of Bell’s Inequalities, Phys. Rev. Lett. 49 91-94 (1982).
[5] B. Hensen, et al, Loophole-free Bell inequality violation using electron spins separated by 1.3 kilometres, Nature 526 682-686 (2015).
[6] N. Brunner, et al, Bell nonlocality, Rev. Mod. Phys. 86 419 (2014).
[7] D.M. Greenberger, M. Horne, A. Zeilinger, Bell’s Theorem, Quantum Theory, and Conceptions of the Universe, Kluwer Academic (1989).
[8] N.D. Mermin, Quantum mysteries revisited, American Journal of Physics 58 731-733 (1990).
[9] A. Peres, Bayesian analysis of Bell inequalities, Fortschritte der Physik 48 531-535 (2000).
[10] W. van Dam, P. Grunwald, R. Gill, The statistical strength of nonlocality proofs, IEEE Transactions on Information Theory 51 2812-2835 (2005).

Summary of Bell’s inequality


The singlet state $(|01\rangle - |10\rangle)/\sqrt{2}$ has the property that if the qubits are measured in some given but identical
direction on each side, the results are always opposite. Hence, conditional on the measurement on one side,
the other measurement gives a fixed answer. According to EPR this means that the spin component should be
an ‘element of reality’, i.e. there should be some hidden variable which determines the precise value of
the spin component. Since quantum mechanics only describes probability distributions, it must therefore be
incomplete.
Bell devised an experiment and a correlations inequality that goes with it, and showed that the inequality
must be obeyed by any classical hidden variables, or local realist theory. This inequality is violated by
correlations arising in measurements on entangled qubits, which shows that Nature is not governed by a
local hidden variables theory. The violation of Bell’s inequality has been verified with different physical
systems, e.g. entangled photons, and recently with electron spins. The fact that quantum mechanics does not
obey the Bell inequality is closely connected to the fact that non-commuting observables cannot be measured
simultaneously, so we cannot talk about their joint distribution.


Lecture 9: Partial trace and density matrices

The first postulate says that the state of a system is a normalised vector, but when the system consists of the
two entangled qubits we cannot attribute a vector state to each of the two subsystems since that would imply
that the joint state is a product state! So it appears that the first postulate is violated and we need to replace
the vector state by a more general notion of state. However one may argue that the postulate refers to the
state of a closed system such as the two qubits together, or the entire universe, and should not be applied to
states of subsystems. We take a pragmatic approach and show how the notion of state can be extended in a
consistent way without requiring a major change of the postulate but rather a more general interpretation.
The guiding principle will be that the quantum state –whatever that may be – must encode the probability
distribution of all possible measurements on the system.

1 Partial trace
The partial trace is the trace operation acting on one of the terms in a tensor product, and leaving the other
one unchanged.
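Definition 1 (partial trace). Let $A \otimes B$ be a tensor product operator on the space $\mathcal{H}_1 \otimes \mathcal{H}_2$. Its partial
traces over $\mathcal{H}_2$ and $\mathcal{H}_1$ are the operators on $\mathcal{H}_1$ and respectively $\mathcal{H}_2$, defined as

$$\mathrm{Tr}_2(A \otimes B) := A \cdot \mathrm{Tr}(B), \qquad \mathrm{Tr}_1(A \otimes B) := B \cdot \mathrm{Tr}(A).$$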

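Since any operator $X$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$ can be written as a linear combination of rank one tensor products, the
partial trace can be extended by linearity to all operators. In particular if

$$X = \sum_{i_1,j_1,i_2,j_2} X_{i_1 j_1, i_2 j_2}\, |e_{i_1}\rangle\langle e_{i_2}| \otimes |f_{j_1}\rangle\langle f_{j_2}| = \sum_{i_1,j_1,i_2,j_2} X_{i_1 j_1, i_2 j_2}\, |e_{i_1} \otimes f_{j_1}\rangle\langle e_{i_2} \otimes f_{j_2}|$$

for some orthonormal bases $\{|e_1\rangle, \dots, |e_{d_1}\rangle\}$ and $\{|f_1\rangle, \dots, |f_{d_2}\rangle\}$ then

$$\begin{aligned}
\mathrm{Tr}_2(X) &:= \sum_{i_1,i_2=1}^{d_1} \Big( \sum_{j=1}^{d_2} X_{i_1 j, i_2 j} \Big) |e_{i_1}\rangle\langle e_{i_2}| \\
\mathrm{Tr}_1(X) &:= \sum_{j_1,j_2=1}^{d_2} \Big( \sum_{i=1}^{d_1} X_{i j_1, i j_2} \Big) |f_{j_1}\rangle\langle f_{j_2}|
\end{aligned}$$

Equivalently, the partial traces can be defined through their matrix elements

$$\begin{aligned}
\langle e_{i_1}| \mathrm{Tr}_2(X) |e_{i_2}\rangle &= \sum_{j=1}^{d_2} \langle e_{i_1} \otimes f_j| X |e_{i_2} \otimes f_j\rangle = \sum_{j=1}^{d_2} X_{i_1 j, i_2 j} \\
\langle f_{j_1}| \mathrm{Tr}_1(X) |f_{j_2}\rangle &= \sum_{i=1}^{d_1} \langle e_i \otimes f_{j_1}| X |e_i \otimes f_{j_2}\rangle = \sum_{i=1}^{d_1} X_{i j_1, i j_2}.
\end{aligned}$$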
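The matrix-element formulas translate directly into code. The sketch below stores an operator on $\mathcal{H}_1 \otimes \mathcal{H}_2$ as a nested list whose row index is $i_1 d_2 + j_1$ and whose column index is $i_2 d_2 + j_2$ (an indexing convention assumed for this illustration); the function names are hypothetical.

```python
def kron(a, b):
    """Kronecker (tensor) product of two square matrices given as nested lists."""
    da, db = len(a), len(b)
    return [[a[i1][i2] * b[j1][j2] for i2 in range(da) for j2 in range(db)]
            for i1 in range(da) for j1 in range(db)]

def partial_trace_2(x, d1, d2):
    """Tr_2(X): sum over the index j of the second system."""
    return [[sum(x[i1 * d2 + j][i2 * d2 + j] for j in range(d2))
             for i2 in range(d1)] for i1 in range(d1)]

def partial_trace_1(x, d1, d2):
    """Tr_1(X): sum over the index i of the first system."""
    return [[sum(x[i * d2 + j1][i * d2 + j2] for i in range(d1))
             for j2 in range(d2)] for j1 in range(d2)]

if __name__ == "__main__":
    a = [[1, 2], [3, 4]]          # Tr(A) = 5
    b = [[0, 1], [1, 5]]          # Tr(B) = 5
    x = kron(a, b)
    print(partial_trace_2(x, 2, 2))   # equals A * Tr(B)
    print(partial_trace_1(x, 2, 2))   # equals B * Tr(A)
```

On a product operator the two functions reproduce Definition 1, and by linearity they agree with the general formulas above.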

Properties. The following properties of the partial trace will be useful in the next lectures.

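1. taking both partial traces gives the trace

$$\mathrm{Tr}_1(\mathrm{Tr}_2(X)) = \mathrm{Tr}_2(\mathrm{Tr}_1(X)) = \mathrm{Tr}_{12}(X) := \mathrm{Tr}(X) = \sum_{i,j} X_{ij,ij}$$

2. partial trace of tensor product of rank one operators

$$\mathrm{Tr}_2(|u\rangle\langle u'| \otimes |v\rangle\langle v'|) = |u\rangle\langle u'| \cdot \mathrm{Tr}(|v\rangle\langle v'|) = \langle v'|v\rangle \cdot |u\rangle\langle u'|$$

3. operators acting on one space can be ‘pulled out’ when taking the partial trace over the other. Let $A : \mathcal{H}_1 \to \mathcal{H}_1$
and $X : \mathcal{H}_1 \otimes \mathcal{H}_2 \to \mathcal{H}_1 \otimes \mathcal{H}_2$, then

$$\mathrm{Tr}_2((A \otimes I)X) = A \cdot \mathrm{Tr}_2(X), \qquad \mathrm{Tr}_2(X(A \otimes I)) = \mathrm{Tr}_2(X) \cdot A$$

4. unlike the (full) trace, the partial trace is not cyclic: in general

$$\mathrm{Tr}_1(XY) \neq \mathrm{Tr}_1(YX)$$

5. the partial trace of a positive operator is a positive operator

$$X \ge 0 \implies \mathrm{Tr}_1(X) \ge 0 \ \text{ and } \ \mathrm{Tr}_2(X) \ge 0$$

Points 1. and 2. follow directly from the definition. For 3. you can write $X$ as a linear combination of terms
of the type $X_1 \otimes X_2$ and use the fact that $(A \otimes I)(X_1 \otimes X_2) = AX_1 \otimes X_2$. Note that here, as in point 4.,
the order of the terms matters, since they are (in general non-commuting) operators. Point 5. follows from

$$\langle u| \mathrm{Tr}_2(X) |u\rangle = \sum_j \langle u \otimes f_j| X |u \otimes f_j\rangle \ge 0.$$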

2 Partial states and density matrices


Let us return now to the question concerning Alice’s and Bob’s state. Suppose Alice and Bob share a bipartite
system $\mathcal{H}_A \otimes \mathcal{H}_B$ prepared in a possibly entangled state $|\psi\rangle$.
Now suppose that Alice performs a measurement on her system with PVM elements $\{P_1, \dots, P_k\}$. The
probability distribution of the outcome $X \in \{1, \dots, k\}$ is

$$\begin{aligned}
p(x) := \mathbb{P}([X = x]) &= \langle\psi| P_x \otimes I |\psi\rangle = \mathrm{Tr}\left( (P_x \otimes I) |\psi\rangle\langle\psi| \right) \\
&= \mathrm{Tr}_A\left[ \mathrm{Tr}_B\left( (P_x \otimes I) |\psi\rangle\langle\psi| \right) \right] = \mathrm{Tr}_A(P_x \rho_A)
\end{aligned}$$

where on the second line we have broken the trace into the two partial traces, after which we used property
3. of the partial trace and denoted

$$\rho_A := \mathrm{Tr}_B(|\psi\rangle\langle\psi|).$$

The point of this calculation is to show that from Alice’s viewpoint, the probabilities of measurements on
her system have the same expression as before, but the usual one-dimensional projection is replaced by the
operator $\rho_A$. Moreover, using properties 1. and 5. we find that $\rho_A$ is a positive operator of trace one.
As an example, consider the 2-qubit entangled state $|\Phi^+\rangle = (|0\rangle \otimes |0\rangle + |1\rangle \otimes |1\rangle)/\sqrt{2}$, and the projections
$P_0 := |0\rangle\langle 0|$ and $P_1 := |1\rangle\langle 1|$. Then $\rho_A = I/2$ and the probabilities are (verify this!)

$$\begin{aligned}
p(0) &= \langle\Phi^+| P_0 \otimes I |\Phi^+\rangle = \mathrm{Tr}(\rho_A P_0) = \frac{1}{2} \\
p(1) &= \langle\Phi^+| P_1 \otimes I |\Phi^+\rangle = \mathrm{Tr}(\rho_A P_1) = \frac{1}{2}.
\end{aligned}$$
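The example can be verified numerically. Writing a bipartite vector as $|\psi\rangle = \sum_{j,k} c_{jk} |j\rangle \otimes |k\rangle$, the definition of the partial trace gives $\rho_A = C C^*$ with $C = (c_{jk})$; the sketch below uses this formula for the state $(|00\rangle + |11\rangle)/\sqrt{2}$. The function name is illustrative.

```python
from math import sqrt

def reduced_state_A(c):
    """rho_A = C C^* for |psi> = sum_{j,k} c[j][k] |j>|k>, i.e. the partial trace over B."""
    d_a, d_b = len(c), len(c[0])
    return [[sum(c[i][k] * complex(c[j][k]).conjugate() for k in range(d_b))
             for j in range(d_a)] for i in range(d_a)]

if __name__ == "__main__":
    s = 1 / sqrt(2)
    bell = [[s, 0], [0, s]]          # coefficients of (|00> + |11>)/sqrt(2)
    rho_a = reduced_state_A(bell)
    # For P_0 = |0><0| and P_1 = |1><1| the probabilities are the diagonal entries.
    print(rho_a[0][0].real, rho_a[1][1].real)
```

The diagonal entries are the probabilities $p(0)$ and $p(1)$, and the off-diagonal entries vanish, so $\rho_A = I/2$ up to rounding.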

We now have sufficient motivation to call $\rho_A$ the state of Alice’s system. To emphasise the fact that this is
the restriction of the joint state $|\psi\rangle\langle\psi|$ to Alice’s system, we will also use the term partial state.
Definition 2 (density matrix/partial states). An operator $\rho$ on $\mathcal{H}$ is called a density matrix if it has the
properties

i) positivity: $\rho \ge 0$;
ii) trace one (normalisation): $\mathrm{Tr}(\rho) = 1$.

If $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ is a pure state on a bipartite system, the partial states of the two subsystems are given by
the density matrices $\rho_A := \mathrm{Tr}_B(|\psi\rangle\langle\psi|)$ and respectively $\rho_B := \mathrm{Tr}_A(|\psi\rangle\langle\psi|)$.

These states are thus sufficient to describe any measurement distribution or to compute expectations of local
observables, as the following relations show

$$\begin{aligned}
\langle\psi| A \otimes I |\psi\rangle &= \mathrm{Tr}((A \otimes I)|\psi\rangle\langle\psi|) = \mathrm{Tr}_A(A \cdot \mathrm{Tr}_B(|\psi\rangle\langle\psi|)) = \mathrm{Tr}(A \rho_A) \\
\langle\psi| I \otimes B |\psi\rangle &= \mathrm{Tr}((I \otimes B)|\psi\rangle\langle\psi|) = \mathrm{Tr}_B(B \cdot \mathrm{Tr}_A(|\psi\rangle\langle\psi|)) = \mathrm{Tr}(B \rho_B).
\end{aligned}$$

From now on we will abandon the notion of state as a vector (or one dimensional projection), which is too
restrictive, and replace it by the more general one of density matrix. To reconcile this with the first postulate
we can always think that the density matrix is the partial state of the system, when considered as a part of a
larger system (the universe) which does find itself in a vector state (see exercise on purification).

3 Ensembles, mixtures and the convex structure of states


We will now give a second motivation and interpretation of the density matrix. Suppose an experimenter
prepares the state of a quantum system in the following way: he/she draws a random number from $\{1, \dots, k\}$
with probabilities $p(1), \dots, p(k)$, and depending on the result $X = i$ he/she then prepares the state $\rho(i)$ from
a given set of states $\{\rho(1), \dots, \rho(k)\}$. The experimenter explains the procedure to a colleague and gives
him the system, but forgets to tell him which of the states was prepared, i.e. what the value of $X$ was. The
question is, what state should the colleague attribute to the system? You probably noticed that this is the
same situation as we had before, where $X$ was the outcome of a measurement and $\rho(i)$ was the conditional
posterior state. The point is that if we do not know $X$ we have to average the states to obtain

$$\rho = \sum_{i=1}^{k} p(i) \rho(i). \tag{1}$$

This is the state which gives the right probabilities for future measurements.
Definition 3. A statistical ensemble of a system $\mathcal{H}$ consists of a collection of states $\{\rho(1), \dots, \rho(k)\}$ on $\mathcal{H}$,
together with a probability distribution $\{p(1), \dots, p(k)\}$. The state (1) is called the mean of the ensemble,
and is a statistical mixture of the individual states.

The statistical ensemble can be thought of as the way to describe the ‘state’ of a classical-quantum system,
e.g. the coin and the quantum system, or the outcome of a measurement and the conditional posterior state.
Warning! A frequent mistake is to confuse the statistical mixture with the coherent superposition. Recall
that $|\psi\rangle = a|0\rangle + b|1\rangle$ is called a coherent (or linear) superposition of $|0\rangle$ and $|1\rangle$, and it is a vector state just as
the two basis vectors. The mixture $p(0)|0\rangle\langle 0| + p(1)|1\rangle\langle 1|$ is not a vector state, i.e. it is not a one dimensional
projection unless $p(0)p(1) = 0$. In general there is no simple recipe to create coherent superpositions of
simpler states, but mixtures can be easily prepared by flipping a coin! The alternative way of obtaining a
mixture is to entangle the system with another system, which necessarily produces a mixed partial state. For
instance the partial states of $(|0\rangle_A |0\rangle_B + |1\rangle_A |1\rangle_B)/\sqrt{2}$ are $\rho_A = \rho_B = I/2$.
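To make the warning concrete, here is a small worked comparison (an illustration added here, with the notation $|\pm\rangle := (|0\rangle \pm |1\rangle)/\sqrt{2}$ and $P_\pm := |\pm\rangle\langle\pm|$ introduced only for this purpose). The superposition $|+\rangle$ and the equal mixture of $|0\rangle$ and $|1\rangle$ both give probabilities $1/2, 1/2$ when measured in the basis $\{|0\rangle, |1\rangle\}$, but the measurement $\{P_+, P_-\}$ tells them apart:

```latex
\begin{aligned}
\rho_{\mathrm{sup}} &= |+\rangle\langle +| = \tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},
 & \mathrm{Tr}(\rho_{\mathrm{sup}} P_+) &= 1,
 & \mathrm{Tr}(\rho_{\mathrm{sup}} P_-) &= 0, \\
\rho_{\mathrm{mix}} &= \tfrac{1}{2}|0\rangle\langle 0| + \tfrac{1}{2}|1\rangle\langle 1| = \tfrac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
 & \mathrm{Tr}(\rho_{\mathrm{mix}} P_+) &= \tfrac{1}{2},
 & \mathrm{Tr}(\rho_{\mathrm{mix}} P_-) &= \tfrac{1}{2}.
\end{aligned}
```

The difference sits in the off-diagonal entries of the density matrix, which are present for the superposition and absent for the mixture.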
Lemma 4. Let $S_d$ denote the space of density matrices on $\mathbb{C}^d$. Then $S_d$ is a convex set, i.e. for any $\rho_1, \rho_2 \in S_d$
and $\lambda \in [0, 1]$

$$\rho := \lambda \rho_1 + (1 - \lambda) \rho_2 \in S_d.$$

The extremal points of this convex set (states which cannot be written as mixtures in a non-trivial way) are the
vector states $|\psi\rangle\langle\psi|$, which are also called pure states.

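Proof. To prove convexity we only need to show that $\rho$ is positive and has trace one. The positivity follows
from

$$\langle u|\rho|u\rangle = \lambda \langle u|\rho_1|u\rangle + (1 - \lambda) \langle u|\rho_2|u\rangle \ge 0$$

and the trace is

$$\mathrm{Tr}(\rho) = \lambda \mathrm{Tr}(\rho_1) + (1 - \lambda) \mathrm{Tr}(\rho_2) = \lambda + (1 - \lambda) = 1.$$

By the spectral theorem, any density matrix has a decomposition

$$\rho = \sum_{i=1}^{d} \mu_i |u_i\rangle\langle u_i|$$

where $\{\mu_1, \dots, \mu_d\}$ form a probability distribution, i.e. they are non-negative and $\sum_i \mu_i = \mathrm{Tr}(\rho) = 1$. This
means that if at least two eigenvalues are non-zero then $\rho$ is a mixture of other states.
It remains to prove that if $\rho$ is a vector state $\rho = |\psi\rangle\langle\psi|$, then it is extremal. Indeed suppose that
$\rho = \lambda \rho_1 + (1 - \lambda) \rho_2$ for some states $\rho_1 \neq \rho_2$ and $\lambda \in (0, 1)$. Then for any vector $|\phi\rangle$ which is orthogonal to $|\psi\rangle$
we must have

$$0 = |\langle\psi|\phi\rangle|^2 = \langle\phi|\rho|\phi\rangle = \lambda \langle\phi|\rho_1|\phi\rangle + (1 - \lambda) \langle\phi|\rho_2|\phi\rangle = \lambda \|\sqrt{\rho_1}|\phi\rangle\|^2 + (1 - \lambda) \|\sqrt{\rho_2}|\phi\rangle\|^2,$$

which means that

$$\rho_1|\phi\rangle = \sqrt{\rho_1}\sqrt{\rho_1}|\phi\rangle = 0, \qquad \rho_2|\phi\rangle = \sqrt{\rho_2}\sqrt{\rho_2}|\phi\rangle = 0.$$

Since this is true for all $|\phi\rangle$ which are orthogonal to $|\psi\rangle$, and $\rho_1, \rho_2$ are selfadjoint with trace one, we conclude that
$\rho_1 = \rho_2 = |\psi\rangle\langle\psi|$, which contradicts $\rho_1 \neq \rho_2$.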

Mixed qubit states. As an example, let us look at the convex set of qubit states $S_2$ using the Bloch representation.
Recall that any positive trace one operator on $\mathbb{C}^2$ is of the form

$$\rho(\mathbf{r}) = \frac{1}{2}\left( I + r_x \sigma_x + r_y \sigma_y + r_z \sigma_z \right)$$

where $\mathbf{r} = (r_x, r_y, r_z)$ is a vector with $\|\mathbf{r}\| \le 1$. Since $\rho(\mathbf{r})$ depends linearly on $\mathbf{r}$ we have

$$\lambda \rho(\mathbf{r}_1) + (1 - \lambda) \rho(\mathbf{r}_2) = \rho(\lambda \mathbf{r}_1 + (1 - \lambda) \mathbf{r}_2)$$

which means that the Bloch representation respects the convex structure of $S_2$: convex combinations of Bloch
vectors correspond to convex combinations of states. Now, the extremal points of the Bloch ball are the points
on the sphere (vectors of length one), which correspond to pure states, as stated in the previous lemma.

[Figure 1 diagram. Left: Bloch ball with axes x, y, z and a Bloch vector r. Right: a simplex with labelled points A, B, C, D, E.]

Figure 1: Left: a mixed state can be decomposed in different ways into pure states, each corresponding to a
line passing through the tip of the vector $\mathbf{r}$, which intersects the sphere in two points. Right: a simplex with
3 extremal points A, B, C. Any other vector D has a unique decomposition into extremals.

An important fact which is immediately evident from Figure 1 is that the decomposition of a mixed state into
pure states is non-unique. For instance, for any Bloch vector $\mathbf{r}$ with $\|\mathbf{r}\| = 1$ we have

$$\frac{I}{2} = \frac{1}{2} \rho(\mathbf{r}) + \frac{1}{2} \rho(-\mathbf{r}).$$
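The non-uniqueness can be seen numerically by mixing antipodal pure states along two different axes of the Bloch ball. The sketch below builds $\rho(\mathbf{r}) = \frac{1}{2}(I + r_x\sigma_x + r_y\sigma_y + r_z\sigma_z)$ entry by entry; the function names are chosen for this illustration.

```python
def bloch_state(r):
    """Density matrix (I + rx*sigma_x + ry*sigma_y + rz*sigma_z)/2 as a nested list."""
    rx, ry, rz = r
    return [[(1 + rz) / 2, (rx - 1j * ry) / 2],
            [(rx + 1j * ry) / 2, (1 - rz) / 2]]

def mix(lam, rho1, rho2):
    """Convex combination lam*rho1 + (1 - lam)*rho2."""
    return [[lam * rho1[i][j] + (1 - lam) * rho2[i][j] for j in range(2)]
            for i in range(2)]

if __name__ == "__main__":
    along_z = mix(0.5, bloch_state((0, 0, 1)), bloch_state((0, 0, -1)))
    along_x = mix(0.5, bloch_state((1, 0, 0)), bloch_state((-1, 0, 0)))
    print(along_z)   # I/2
    print(along_x)   # also I/2, from a different pair of pure states
```

Both ensembles have the same mean $I/2$, so no measurement can distinguish them.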
This is very different from the convex structure of the set of probability measures on the set $\{1, \dots, d\}$, also
called the simplex. For the latter, the extremals are the delta measures $\delta_i(j) := \delta_{ij}$ and any probability
distribution $\{p(1), \dots, p(d)\}$ can be written in a unique way as a convex combination of extremal measures:

$$p = \sum_i p(i) \delta_i.$$

A natural question to ask is which ensembles of pure states give rise to a given mixed state? The answer is
given by the following theorem, which we state here without proof but will return to in the exercises.
Theorem 5 (ambiguity of the ensemble decomposition). Let $(\{|\psi_1\rangle, \dots, |\psi_d\rangle\}, \{p(1), \dots, p(d)\})$ and
$(\{|\phi_1\rangle, \dots, |\phi_d\rangle\}, \{q(1), \dots, q(d)\})$ be two ensembles of states on $\mathbb{C}^d$. The ensembles have the same mean

$$\sum_{i=1}^{d} p(i) |\psi_i\rangle\langle\psi_i| = \sum_{j=1}^{d} q(j) |\phi_j\rangle\langle\phi_j|$$

if and only if

$$\sqrt{p(i)}\, |\psi_i\rangle = \sum_j U_{ij} \sqrt{q(j)}\, |\phi_j\rangle$$

for all $i$, with $U_{ij}$ a unitary matrix.


Lecture 10: Superdense coding and Schmidt decomposition

Abstract: In this lecture we examine in more detail the structure of bipartite states. We show that any such
state can be expressed in a canonical form called Schmidt decomposition, which can be used to compute the
reduced states, or determine how entangled the state is. We will start by looking at the “superdense coding”
protocol which shows how entanglement can be used to communicate two bits of information by transmitting
a single qubit!

1 Superdense coding
In information theory one often discusses information transmission protocols between a sender and a receiver.
The two parties are traditionally called Alice and Bob, with other actors making occasional appearances, for
instance Eve the eavesdropper who tries to listen to their conversation.
The set-up of a typical communication protocol is the following. Alice wants to transmit a message to Bob,
which is originally written in some alphabet, e.g. the English alphabet with its 26 letters. To transmit the message
she uses a classical channel modelled as a Markov kernel which describes probabilistically how the input is
related to the output. For example a one-bit channel has both input and output set $\{0, 1\}$ and is defined by the
conditional probabilities

$$t(i|j) = \mathbb{P}([\text{output} = i] \,|\, [\text{input} = j]), \qquad i, j \in \{0, 1\}.$$

The protocol consists of 3 steps: first Alice encodes her message into a binary string which serves as the
input of the communication channel. Then the bits are sent one by one, using the channel repeatedly. On the
other end Bob decodes the output bit string to recover the original message. Information theory deals with
the questions of how to faithfully transmit information through noisy channels, and how to optimise the use
of the channel. For instance, in a typical message the different letters appear with different frequencies, and
this can be used to compress the message by coding frequent letters with short bit strings and rare letters with
long bit strings, the key concept here being that of entropy as a measure of randomness. For our purposes it
suffices to note that Alice can send Bob at most one bit of information per use of a one-bit ideal channel (i.e.
where t(0|0) = t(1|1) = 1, t(0|1) = t(1|0) = 0).
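We pass now to the quantum version of the protocol, in which Alice will try to send classical information
to Bob by using a perfect one-qubit channel. For this, it will be useful to define the following orthonormal
basis of entangled states called the Bell basis (verify that they form an ONB!)

$$\begin{cases}
|\Phi^+\rangle = (|0\rangle \otimes |0\rangle + |1\rangle \otimes |1\rangle)/\sqrt{2} \\
|\Phi^-\rangle = (|0\rangle \otimes |0\rangle - |1\rangle \otimes |1\rangle)/\sqrt{2} \\
|\Psi^+\rangle = (|0\rangle \otimes |1\rangle + |1\rangle \otimes |0\rangle)/\sqrt{2} \\
|\Psi^-\rangle = (|0\rangle \otimes |1\rangle - |1\rangle \otimes |0\rangle)/\sqrt{2}.
\end{cases} \tag{1}$$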

Suppose now that Alice and Bob share a pair of qubits prepared in the entangled state $|\psi_{AB}\rangle := |\Phi^+\rangle$. Since
the two parties are in separate locations, each of them can manipulate only the system on her/his side, i.e.
only local operations are allowed. For instance Alice can apply a unitary transformation $U$ to her qubit while
Bob does nothing to his qubit. The action on the joint state is then

$$|\psi\rangle \mapsto (U \otimes I)|\psi\rangle.$$

The goal is still to transmit classical information, but this time the two participants do not have a classical
channel as before but a quantum channel, i.e. a way of transmitting a qubit from Alice to Bob without chang-
ing the states of the systems involved in the protocol. Superdense coding shows that Alice can communicate
to Bob 2 bits of information (i.e. 4 possible messages 00, 01, 10, 11) by sending him only one qubit, after
encoding the message by means of a local unitary operation. In other words one use of a one-qubit channel
can transmit two classical bits, in contrast to a one bit classical channel which can only transmit one bit of
information.

The superdense coding protocol goes as follows.


Step 1. Depending on which message (bit string) she wants to send, Alice performs a local unitary transformation
on her qubit. If the bit string is 00 she does nothing at all. If the bit string is 01 she applies the unitary
$\sigma_z$ (also called phase flip). If the bit string is 10 she applies the unitary $\sigma_x$ (also called quantum NOT gate). If
the bit string is 11 she applies the unitary $i\sigma_y$. The four vectors obtained in this way are the Bell basis vectors
defined in (1) (verify this!)

$$\begin{aligned}
I \otimes I\, |\Phi^+\rangle &= |\Phi^+\rangle \\
\sigma_z \otimes I\, |\Phi^+\rangle &= (|0\rangle_A \otimes |0\rangle_B - |1\rangle_A \otimes |1\rangle_B)/\sqrt{2} = |\Phi^-\rangle \\
\sigma_x \otimes I\, |\Phi^+\rangle &= (|0\rangle_A \otimes |1\rangle_B + |1\rangle_A \otimes |0\rangle_B)/\sqrt{2} = |\Psi^+\rangle \\
i\sigma_y \otimes I\, |\Phi^+\rangle &= (|0\rangle_A \otimes |1\rangle_B - |1\rangle_A \otimes |0\rangle_B)/\sqrt{2} = |\Psi^-\rangle.
\end{aligned}$$

Step 2. Alice sends her qubit to Bob. We suppose that the act of transmitting the ‘quantum information’ is
‘noiseless’, so that Bob has now two qubits prepared in one of the four Bell states.
Step 3. Bob measures the 2-qubit state in order to decode the message sent by Alice. He chooses the measurement
with outcomes $\{00, 01, 10, 11\}$ and associated PVM consisting of the one dimensional projections

$$P_{00} := |\Phi^+\rangle\langle\Phi^+|, \quad P_{01} := |\Phi^-\rangle\langle\Phi^-|, \quad P_{10} := |\Psi^+\rangle\langle\Psi^+|, \quad P_{11} := |\Psi^-\rangle\langle\Psi^-|.$$

Since the Bell states form a basis in $\mathbb{C}^2 \otimes \mathbb{C}^2$, this is indeed a measurement. Moreover, since the four possible
states sent by Alice are precisely the Bell states, the outcome of the measurement is identical to the bit string
encoded by Alice. We have thus transmitted two bits by sending a single qubit! This is surprising because
classically it would not be possible to communicate two bits by sending only one, no matter what the prior
classical correlations between Alice and Bob are.
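The three steps of the protocol can be simulated with plain vectors of four amplitudes. The sketch below orders the amplitudes as $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ (Alice's qubit first) and uses the matrix of $i\sigma_y$ written out explicitly; the dictionary and function names are illustrative choices, not part of the notes.

```python
from math import sqrt

S = 1 / sqrt(2)

# Bob's decoding PVM: bit string -> Bell vector (amplitudes of |00>, |01>, |10>, |11>).
BELL = {
    "00": [S, 0, 0, S],     # (|00> + |11>)/sqrt(2)
    "01": [S, 0, 0, -S],    # (|00> - |11>)/sqrt(2)
    "10": [0, S, S, 0],     # (|01> + |10>)/sqrt(2)
    "11": [0, S, -S, 0],    # (|01> - |10>)/sqrt(2)
}

# Alice's encoding unitaries, applied to her qubit only.
ENCODERS = {
    "00": [[1, 0], [0, 1]],     # identity
    "01": [[1, 0], [0, -1]],    # sigma_z
    "10": [[0, 1], [1, 0]],     # sigma_x
    "11": [[0, 1], [-1, 0]],    # i * sigma_y
}

def apply_on_alice(u, psi):
    """(u (x) I)|psi>, where psi[2*a + b] is the amplitude of |a>_A |b>_B."""
    return [sum(u[a][k] * psi[2 * k + b] for k in range(2))
            for a in range(2) for b in range(2)]

def outcome_probabilities(psi):
    """Probabilities |<Bell|psi>|^2 of Bob's four outcomes (the Bell vectors are real)."""
    return {bits: abs(sum(v * p for v, p in zip(vec, psi))) ** 2
            for bits, vec in BELL.items()}

def send(bits):
    """Encode two bits on Alice's half of the shared pair and return Bob's most likely outcome."""
    psi = apply_on_alice(ENCODERS[bits], BELL["00"])
    probs = outcome_probabilities(psi)
    return max(probs, key=probs.get)

if __name__ == "__main__":
    for message in ("00", "01", "10", "11"):
        print(message, "->", send(message))
```

For each message the decoded outcome has probability one (up to rounding), since the encoded states are exactly the four orthogonal Bell vectors.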

2 Polar and singular values decompositions


Definition 1 (absolute value). Let $A$ be an operator on $\mathcal{H}$. The absolute value of $A$ is the positive operator
defined by

$$|A| := \sqrt{A^* A}$$

Note that since $A^* A$ is positive, the above square root is well defined. The polar decomposition is the
operator analogue of the polar form for complex numbers: $z = e^{i\phi}|z|$.
Theorem 2 (polar decomposition). Let $A$ be an operator on $\mathcal{H}$. Then there exists a unitary $U$ such that

$$A = U|A|.$$

Proof. Let $|A| = \sum_i \lambda_i |e_i\rangle\langle e_i|$ be the spectral decomposition of $|A|$ with $\{|e_1\rangle, \dots, |e_d\rangle\}$ an ONB in $\mathcal{H}$. If
$|f_i\rangle := A|e_i\rangle$ then

$$\langle f_i|f_j\rangle = \langle e_i|A^* A|e_j\rangle = \lambda_i^2 \delta_{ij}.$$

This means that we can define an ONB $\{|g_1\rangle, \dots, |g_d\rangle\}$ with vectors $|g_i\rangle := |f_i\rangle/\lambda_i$ for $\lambda_i \neq 0$, and the rest
of the vectors chosen arbitrarily in order to form an ONB.
We define the unitary operator

$$U := \sum_{i=1}^{d} |g_i\rangle\langle e_i|$$

and claim that $A = U|A|$. To verify this we apply both sides to all the basis vectors $|e_i\rangle$. If $\lambda_i \neq 0$ then

$$U|A||e_i\rangle = \lambda_i U|e_i\rangle = \lambda_i |g_i\rangle = |f_i\rangle = A|e_i\rangle,$$

and if $\lambda_i = 0$ then $|A||e_i\rangle = A|e_i\rangle = 0$, hence $A = U|A|$.
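As a small worked example of the construction in the proof (an illustration chosen here, not taken from the notes), consider a $2 \times 2$ matrix with one vanishing singular value, so that one of the vectors $|g_i\rangle$ has to be chosen by hand:

```latex
\begin{aligned}
A &= \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}, \qquad
A^* A = \begin{pmatrix} 0 & 0 \\ 0 & 4 \end{pmatrix}, \qquad
|A| = \sqrt{A^* A} = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}, \\
|e_1\rangle &= \begin{pmatrix} 1 \\ 0 \end{pmatrix},\ \lambda_1 = 0, \qquad
|e_2\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix},\ \lambda_2 = 2, \\
|g_2\rangle &= \frac{A|e_2\rangle}{\lambda_2} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
|g_1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \ \text{(chosen to complete the ONB)}, \\
U &= |g_1\rangle\langle e_1| + |g_2\rangle\langle e_2| = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
U|A| = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix} = A.
\end{aligned}
```

Any other unit vector orthogonal to $|g_2\rangle$, for instance $-|g_1\rangle$, would work equally well, so the unitary $U$ is not unique when $A$ is not invertible.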

The following result is an equivalent formulation of the polar decomposition in terms of singular values of $A$.
Theorem 3 (singular value decomposition). Let $A \in M_d$ be a square matrix. Then there exist unitary
matrices $U, V$ and a diagonal matrix $D$ such that

$$A = U D V.$$

The diagonal elements of $D$ are the eigenvalues of $|A|$, and are called singular values of $A$.

Proof. By the polar decomposition we have $A = J|A|$ for some unitary $J$. Now the positive matrix $|A|$ can be diagonalised by a
unitary matrix, $|A| = V^* D V$, which implies that $A = J V^* D V = U D V$ with $U := J V^*$.

3 Schmidt decomposition
After discussing partial states, density matrices and ensembles, we introduce another important tool of quantum
information theory: the Schmidt decomposition. This shows that any pure bipartite state has a canonical
form, which is closely related to the spectral decomposition of the partial states.
Theorem 4. Let $|\psi\rangle$ be a pure state of a composite system $\mathcal{H}_A \otimes \mathcal{H}_B$, with $\mathcal{H}_A$ and $\mathcal{H}_B$ Hilbert spaces
of dimensions $d_A$ and respectively $d_B$. Then there exist an integer $r \le \min(d_A, d_B)$ and two sets of
orthonormal vectors $\{|e_1\rangle, \dots, |e_r\rangle\}$ in $\mathcal{H}_A$ and $\{|f_1\rangle, \dots, |f_r\rangle\}$ in $\mathcal{H}_B$ such that

$$|\psi\rangle = \sum_{i=1}^{r} \sqrt{\mu_i}\, |e_i\rangle \otimes |f_i\rangle$$

where the $\mu_i$ are strictly positive and satisfy $\sum_{i=1}^{r} \mu_i = 1$. The coefficients $\sqrt{\mu_i}$ are called the Schmidt
coefficients of $|\psi\rangle$.

Proof. We assume that $\mathcal{H}_A$ and $\mathcal{H}_B$ have the same dimension $d$ and leave the general case as an exercise. Consider two arbitrary orthonormal bases in $\mathcal{H}_A$ and $\mathcal{H}_B$ and write $|\psi\rangle$ in the tensor product basis
$$|\psi\rangle = \sum_{j,k=1}^{d} A_{jk}\, |j\rangle \otimes |k\rangle$$
for some $d \times d$ matrix of coefficients $A_{jk}$. By the singular value decomposition we have
$$A = UDV$$
where $D$ is a diagonal matrix with non-negative entries and $U, V$ are unitary matrices. Thus
$$|\psi\rangle = \sum_{i,j,k} U_{ji} D_{ii} V_{ik}\, |j\rangle \otimes |k\rangle.$$
We define $|e_i\rangle := \sum_j U_{ji} |j\rangle$ and $|f_i\rangle := \sum_k V_{ik} |k\rangle$ and note that the two sets are orthonormal because the matrices $U, V$ are unitary. Then
$$|\psi\rangle = \sum_{i=1}^{r} \sqrt{\mu_i}\, |e_i\rangle \otimes |f_i\rangle$$
where $r$ is the rank of $D$ and $\sqrt{\mu_i} = D_{ii}$ are the strictly positive diagonal elements of $D$.
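The proof translates directly into a numerical recipe. The sketch below uses NumPy's `numpy.linalg.svd`, which returns factors in the same form $A = U\,\mathrm{diag}(D)\,V$ used above; the function name `schmidt_decomposition` and the tolerance `tol` are illustrative assumptions.

```python
import numpy as np

def schmidt_decomposition(psi, d_A, d_B, tol=1e-12):
    """Return (mu, e, f) with psi = sum_i sqrt(mu[i]) * kron(e[i], f[i])."""
    A = psi.reshape(d_A, d_B)           # coefficient matrix A_jk
    U, D, V = np.linalg.svd(A)          # A = U @ diag(D) @ V (rectangular if d_A != d_B)
    r = int(np.sum(D > tol))            # Schmidt rank = number of non-zero singular values
    mu = D[:r] ** 2                     # sqrt(mu_i) = D_ii
    e = [U[:, i] for i in range(r)]     # |e_i> = sum_j U_ji |j>
    f = [V[i, :] for i in range(r)]     # |f_i> = sum_k V_ik |k>
    return mu, e, f

rng = np.random.default_rng(1)
psi = rng.normal(size=6) + 1j * rng.normal(size=6)
psi /= np.linalg.norm(psi)
mu, e, f = schmidt_decomposition(psi, d_A=2, d_B=3)
rebuilt = sum(np.sqrt(m) * np.kron(ei, fi) for m, ei, fi in zip(mu, e, f))
```

Here `rebuilt` should reproduce `psi`, and the entries of `mu` should sum to one. The example uses $d_A \neq d_B$, which is the general case left as an exercise in the proof.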

3.1 Measurement in the Schmidt basis

A consequence of the Schmidt decomposition is that there exist measurements on the A and B systems whose outcomes are perfectly correlated. For simplicity we assume here that $r = d_A = d_B$. Let $\{P_1, \dots, P_r\}$ be the PVM on $\mathcal{H}_A$ consisting of one-dimensional projections onto the Schmidt vectors $\{|e_1\rangle, \dots, |e_r\rangle\}$, and similarly let $\{Q_1, \dots, Q_r\}$ be the PVM on $\mathcal{H}_B$ consisting of one-dimensional projections onto the Schmidt vectors $\{|f_1\rangle, \dots, |f_r\rangle\}$. The joint probability distribution of the two measurements is then
$$\mathbb{P}_{|\psi\rangle}([X_A = i, X_B = j]) = \langle\psi| P_i \otimes Q_j |\psi\rangle = |\langle\psi| e_i \otimes f_j\rangle|^2 = \mu_i\, \delta_{i,j}$$
so in particular $X_A = X_B$ always!

3.2 Partial states in the Schmidt decomposition

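To better understand the meaning of the Schmidt data, we consider the partial states of the two subsystems. For system A we have
$$\begin{aligned}
\rho_A = \mathrm{Tr}_B(|\psi\rangle\langle\psi|) &= \sum_{i,j=1}^{r} \sqrt{\mu_i \mu_j}\; \mathrm{Tr}_B\big(|e_i \otimes f_i\rangle\langle e_j \otimes f_j|\big) \\
&= \sum_{i,j=1}^{r} \sqrt{\mu_i \mu_j}\; \mathrm{Tr}_B\big(|e_i\rangle\langle e_j| \otimes |f_i\rangle\langle f_j|\big) \\
&= \sum_{i=1}^{r} \mu_i\, |e_i\rangle\langle e_i|
\end{aligned}$$
and similarly for system B we obtain
$$\rho_B = \sum_{j=1}^{r} \mu_j\, |f_j\rangle\langle f_j|.$$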

Thus the two partial states have the same non-zero eigenvalues $\mu_1, \dots, \mu_r$, and the corresponding eigenvectors are $|e_1\rangle, \dots, |e_r\rangle$ and $|f_1\rangle, \dots, |f_r\rangle$ respectively. The number $r$ is called the Schmidt rank and together with the eigenvalues $\mu_i$ it quantifies the amount of entanglement between the systems A and B, seen as a measure of correlations which should not change when performing local unitary operations. To see that this is indeed the case, consider two local unitaries $U_A$ and $U_B$ which Alice and Bob apply separately to the state $|\psi\rangle$. Then
$$U_A \otimes U_B |\psi\rangle = \sum_i \sqrt{\mu_i}\; U_A|e_i\rangle \otimes U_B|f_i\rangle$$
from which we see that the Schmidt coefficients remain unchanged, while the eigenbases of the partial states are rotated by the unitaries.
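Both statements (equal non-zero spectra of the partial states, and invariance under local unitaries) can be illustrated numerically. This is a minimal NumPy sketch; the helper names `partial_states` and `random_unitary` are hypothetical, and the random unitaries are generated by a QR decomposition purely for illustration.

```python
import numpy as np

def partial_states(psi, d_A, d_B):
    """Partial states (rho_A, rho_B) of a pure state psi on H_A (x) H_B."""
    A = psi.reshape(d_A, d_B)       # psi = sum_jk A_jk |j>|k>
    rho_A = A @ A.conj().T          # Tr_B |psi><psi|
    rho_B = A.T @ A.conj()          # Tr_A |psi><psi|
    return rho_A, rho_B

def random_unitary(d, rng):
    """Some unitary matrix, obtained as the Q factor of a random complex matrix."""
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    Q, _ = np.linalg.qr(M)
    return Q

rng = np.random.default_rng(2)
psi = rng.normal(size=6) + 1j * rng.normal(size=6)
psi /= np.linalg.norm(psi)

rho_A, rho_B = partial_states(psi, 2, 3)
spec_A = np.sort(np.linalg.eigvalsh(rho_A))[::-1]
spec_B = np.sort(np.linalg.eigvalsh(rho_B))[::-1]

# Apply local unitaries and recompute the spectrum of Alice's partial state
U_A, U_B = random_unitary(2, rng), random_unitary(3, rng)
phi = np.kron(U_A, U_B) @ psi
spec_A_rotated = np.sort(np.linalg.eigvalsh(partial_states(phi, 2, 3)[0]))[::-1]
```

The two largest entries of `spec_B` should agree with `spec_A`, the remaining eigenvalue of $\rho_B$ should vanish, and `spec_A_rotated` should equal `spec_A`.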


Lecture 11: Quantum teleportation

Abstract: In this lecture we investigate how entanglement can be used to transmit quantum information
by only using classical information transmission. The quantum teleportation protocol shows that one can
transmit one qubit state by using a maximally entangled qubit pair and two bits of classical communication.

1 Quantum Teleportation
In one of the earlier lectures we discussed the fact that the unknown state of a quantum system cannot be
learned by measuring only one copy. This can be done only by performing many measurements on identically
prepared systems. Such a procedure is called state tomography and will be discussed in more detail later. For
our purposes, it suffices to note that if Alice has an isolated system prepared in an unknown state, it is not
possible to measure it and use the measurement results to prepare a state which is identical to the original
one.
In this lecture we will see that Alice can transfer the (unknown) state of a qubit from her lab to Bob’s lab in
a remote location, without sending any physical system but only 2 bits of classical information. To achieve
this Alice and Bob need to share a pair of maximally entangled qubits. This protocol is called quantum
teleportation and is illustrated in Figure 1.

Figure 1: Teleportation protocol (circuit diagram; time goes from left to right). Alice and Bob share a pair of entangled qubits in state $|\Phi^+\rangle$, and Alice has an additional qubit in state $|\psi\rangle_{in}$. Alice performs a unitary $U$ on her qubits, measures them (boxes M, with outcomes $X_1 = i$ and $X_2 = j$), and transmits the result $(i, j)$ to Bob. He performs a unitary rotation $U_{(i,j)}$ on his qubit to recover the input state $|\psi\rangle_{in}$.

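The steps are as follows:

1. Alice and Bob share a pair of qubits prepared in the state
$$|\Phi^+\rangle = \frac{|0\rangle_A \otimes |0\rangle_B + |1\rangle_A \otimes |1\rangle_B}{\sqrt{2}}$$
where we have used the subscripts A, B to emphasise their location. Additionally, Alice has an ‘input’ qubit prepared in an unknown state $|\psi\rangle_{in} = \alpha|0\rangle_{in} + \beta|1\rangle_{in}$. Initially the input is independent of the shared qubits, so the joint state of the three qubits is
$$\begin{aligned}
|\Psi\rangle &= |\psi\rangle_{in} \otimes |\Phi^+\rangle \\
&= \frac{1}{\sqrt{2}} \big[ \alpha|0\rangle_{in} \otimes (|0\rangle_A \otimes |0\rangle_B + |1\rangle_A \otimes |1\rangle_B) + \beta|1\rangle_{in} \otimes (|0\rangle_A \otimes |0\rangle_B + |1\rangle_A \otimes |1\rangle_B) \big] \\
&= \frac{1}{\sqrt{2}} \big[ \alpha|0\rangle_{in} (|0\rangle_A |0\rangle_B + |1\rangle_A |1\rangle_B) + \beta|1\rangle_{in} (|0\rangle_A |0\rangle_B + |1\rangle_A |1\rangle_B) \big]
\end{aligned}$$
where in the third line we omitted the tensor product symbols for simplicity.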
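2. Alice applies a unitary transformation $U$ to her two qubits ($|\cdot\rangle_{in}$ and $|\cdot\rangle_A$), whose action on the product basis is the following (verify that this is a unitary!)
$$\begin{aligned}
U : |0\rangle_{in}|0\rangle_A &\mapsto \frac{1}{\sqrt{2}} \big(|0\rangle_{in}|0\rangle_A + |1\rangle_{in}|0\rangle_A\big) \\
U : |0\rangle_{in}|1\rangle_A &\mapsto \frac{1}{\sqrt{2}} \big(|0\rangle_{in}|1\rangle_A + |1\rangle_{in}|1\rangle_A\big) \\
U : |1\rangle_{in}|0\rangle_A &\mapsto \frac{1}{\sqrt{2}} \big(|0\rangle_{in}|1\rangle_A - |1\rangle_{in}|1\rangle_A\big) \\
U : |1\rangle_{in}|1\rangle_A &\mapsto \frac{1}{\sqrt{2}} \big(|0\rangle_{in}|0\rangle_A - |1\rangle_{in}|0\rangle_A\big).
\end{aligned}$$

The state of the three qubits after applying the unitary is (verify this!)
$$\begin{aligned}
|\tilde\Psi\rangle = (U \otimes I_B)|\Psi\rangle
&= \frac{1}{2}\, |0\rangle_{in}|0\rangle_A \big(\alpha|0\rangle_B + \beta|1\rangle_B\big) \\
&+ \frac{1}{2}\, |0\rangle_{in}|1\rangle_A \big(\alpha|1\rangle_B + \beta|0\rangle_B\big) \\
&+ \frac{1}{2}\, |1\rangle_{in}|0\rangle_A \big(\alpha|0\rangle_B - \beta|1\rangle_B\big) \\
&+ \frac{1}{2}\, |1\rangle_{in}|1\rangle_A \big(\alpha|1\rangle_B - \beta|0\rangle_B\big).
\end{aligned}$$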

This expression is a superposition of tensor products of Alice and Bob states, where the coefficients $\alpha, \beta$ of the state we want to teleport now appear on Bob’s side. This is an encouraging sign!
3. Alice measures her qubits in the standard basis $|0\rangle_{in}|0\rangle_A$, $|0\rangle_{in}|1\rangle_A$, $|1\rangle_{in}|0\rangle_A$, $|1\rangle_{in}|1\rangle_A$. The conditional state of Bob given the outcome $(i, j)$ is obtained by applying the projection $|i\rangle\langle i|_{in} \otimes |j\rangle\langle j|_A \otimes I_B$ to the joint state $|\tilde\Psi\rangle$, tracing over Alice’s qubits and normalising. This gives the following set of conditional states
$$\begin{aligned}
(0, 0) &\to \alpha|0\rangle_B + \beta|1\rangle_B \\
(0, 1) &\to \alpha|1\rangle_B + \beta|0\rangle_B \\
(1, 0) &\to \alpha|0\rangle_B - \beta|1\rangle_B \\
(1, 1) &\to \alpha|1\rangle_B - \beta|0\rangle_B
\end{aligned}$$

4. Alice communicates to Bob the results of her measurement (two bits of classical information).
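5. At this point Bob knows in which of the four (conditional) states his qubit is. Note that this does not mean that he knows what that state is, since $\alpha, \beta$ are unknown. However he can apply a unitary transformation $U_{(i,j)}$ which rotates the state to $|\psi\rangle = \alpha|0\rangle_B + \beta|1\rangle_B$ as follows

if outcome is (0, 0) then Bob applies $U_{(0,0)} = I$ : $\alpha|0\rangle_B + \beta|1\rangle_B \mapsto \alpha|0\rangle_B + \beta|1\rangle_B$
if outcome is (0, 1) then Bob applies $U_{(0,1)} = \sigma_x$ : $\alpha|1\rangle_B + \beta|0\rangle_B \mapsto \alpha|0\rangle_B + \beta|1\rangle_B$
if outcome is (1, 0) then Bob applies $U_{(1,0)} = \sigma_z$ : $\alpha|0\rangle_B - \beta|1\rangle_B \mapsto \alpha|0\rangle_B + \beta|1\rangle_B$
if outcome is (1, 1) then Bob applies $U_{(1,1)} = i\sigma_y$ : $\alpha|1\rangle_B - \beta|0\rangle_B \mapsto \alpha|0\rangle_B + \beta|1\rangle_B$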
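The five steps above can be checked with a small state-vector simulation. The sketch below is one possible NumPy implementation, assuming the qubit ordering (in, A, B) and realising Alice's unitary as a CNOT (control: input qubit, target: A) followed by a Hadamard on the input qubit; the function name `teleport` and the sample amplitudes are illustrative choices.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)       # sigma_x
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)    # sigma_y
Z = np.array([[1, 0], [0, -1]], dtype=complex)      # sigma_z
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

U = np.kron(H, I2) @ CNOT                            # Alice's unitary (step 2)
corrections = {(0, 0): I2, (0, 1): X, (1, 0): Z, (1, 1): 1j * Y}  # Bob's U_(i,j)

def teleport(alpha, beta):
    """Return {(i, j): (probability, Bob's corrected state)} for input alpha|0> + beta|1>."""
    psi_in = alpha * ket0 + beta * ket1
    bell = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)
    state = np.kron(U, I2) @ np.kron(psi_in, bell)   # state after step 2
    amplitudes = state.reshape(2, 2, 2)              # indices (i, j, Bob's qubit)
    results = {}
    for (i, j), correction in corrections.items():
        bob = amplitudes[i, j, :]                    # unnormalised conditional state (step 3)
        prob = np.vdot(bob, bob).real
        results[(i, j)] = (prob, correction @ (bob / np.sqrt(prob)))
    return results

results = teleport(0.6, 0.8j)
```

For every outcome $(i, j)$ the corrected state should equal $(\alpha, \beta)$, and each outcome should occur with probability $1/4$ independently of $\alpha, \beta$.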

At the end of the protocol, Bob’s qubit is prepared in the state $|\psi\rangle$ which is identical to the state of Alice’s input, without any physical transfer taking place.
The quantum teleportation protocol was proposed in 1993 in [1] and the first experimental realisation was reported in 1997 [2]. More recently, teleportation over a distance of 1400 km has been realised between ground and satellite bases [3].

2 Discussion
Here are some comments on the interpretation of the teleportation protocol described above.
1. The protocol seems to show that we can measure a quantum system and recreate the state from the measurement result. This is impossible, since non-orthogonal states cannot be distinguished with certainty by measurements (exercise). Moreover, if it were possible to recreate one copy, one could create many copies as well, which contradicts the no-cloning theorem (proof in a later lecture). In fact teleportation does not prepare multiple copies of an unknown state; it merely transfers the state from Alice to Bob. After the measurement, the original input qubit is in one of the states $|0\rangle_{in}$, $|1\rangle_{in}$, so it no longer carries any information about the state $|\psi\rangle_{in}$. We destroy one state to create an identical one elsewhere.
2. But the protocol does produce some measurement outcomes. Surely this classical information tells us something about the state $|\psi\rangle_{in}$? As an exercise, you can show that the distribution of the measurement outcomes is fixed, so it does not depend on $|\psi\rangle_{in}$ and we do not learn anything about $|\psi\rangle_{in}$ in the process.
3. When Alice has performed her measurement, she knows that Bob’s (conditional) state has changed. Does Bob know it, and does it mean that an instant information transfer is taking place? No, because Bob knows what his conditional state is only after he has received the two bits of classical information from Alice. Think of what his state is (from his own point of view) between the moment when the measurement has been performed and the moment when he learns about the outcome. You see how the state is a measure of the information we have about the system, and may differ from one observer to another.
4. Quantum teleportation would not be possible without the shared entangled pair. Here, as in superdense coding, we see that entanglement is a resource which can be used to perform different tasks. This idea is extensively exploited in Quantum Information. A number of methods have been developed for exchanging and transforming classical and quantum ‘resources’. In teleportation we can say that one entangled qubit pair (or ebit) plus 2 bits form a resource which is at least as powerful as a one-qubit communication channel.
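5. As shown in lecture 6, the maximally entangled state $|\Phi^+\rangle$ can be prepared by using the circuit

[Circuit diagram: a Hadamard gate H on the first qubit followed by a CNOT gate controlled by the first qubit]

where the two lines represent the two qubits (initially prepared in $|0\rangle$), $H$ is the Hadamard gate
$$H := \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$
and the 2-qubit gate CNOT is
$$\begin{aligned}
\mathrm{CNOT} : |00\rangle &\mapsto |00\rangle \\
\mathrm{CNOT} : |01\rangle &\mapsto |01\rangle \\
\mathrm{CNOT} : |10\rangle &\mapsto |11\rangle \\
\mathrm{CNOT} : |11\rangle &\mapsto |10\rangle
\end{aligned}$$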

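6. The unitary $U$ acting on Alice’s qubits can be implemented by the circuit

[Circuit diagram: a CNOT gate controlled by the input qubit, followed by a Hadamard gate H on the input qubit]

Indeed the transformations of the 4 product basis vectors are as follows
$$\begin{aligned}
|00\rangle &\mapsto |00\rangle \mapsto (|00\rangle + |10\rangle)/\sqrt{2} \\
|01\rangle &\mapsto |01\rangle \mapsto (|01\rangle + |11\rangle)/\sqrt{2} \\
|10\rangle &\mapsto |11\rangle \mapsto (|01\rangle - |11\rangle)/\sqrt{2} \\
|11\rangle &\mapsto |10\rangle \mapsto (|00\rangle - |10\rangle)/\sqrt{2}
\end{aligned}$$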
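7. The unitary transformation $U$ followed by the two $\sigma_z$ measurements can be seen as a Bell measurement on the input state of the circuit, which is the initial state of Alice’s qubits. Indeed the circuit is the inverse of the circuit for the preparation of Bell states, so it maps the Bell states into the standard basis states
$$\begin{aligned}
|\Phi^+\rangle = \frac{|00\rangle + |11\rangle}{\sqrt{2}} &\mapsto \frac{|00\rangle + |10\rangle}{\sqrt{2}} \mapsto \frac{(|0\rangle + |1\rangle)|0\rangle}{2} + \frac{(|0\rangle - |1\rangle)|0\rangle}{2} = |00\rangle \\
|\Phi^-\rangle = \frac{|00\rangle - |11\rangle}{\sqrt{2}} &\mapsto \frac{|00\rangle - |10\rangle}{\sqrt{2}} \mapsto \frac{(|0\rangle + |1\rangle)|0\rangle}{2} - \frac{(|0\rangle - |1\rangle)|0\rangle}{2} = |10\rangle \\
|\Psi^+\rangle = \frac{|01\rangle + |10\rangle}{\sqrt{2}} &\mapsto \frac{|01\rangle + |11\rangle}{\sqrt{2}} \mapsto \frac{(|0\rangle + |1\rangle)|1\rangle}{2} + \frac{(|0\rangle - |1\rangle)|1\rangle}{2} = |01\rangle \\
|\Psi^-\rangle = \frac{|01\rangle - |10\rangle}{\sqrt{2}} &\mapsto \frac{|01\rangle - |11\rangle}{\sqrt{2}} \mapsto \frac{(|0\rangle + |1\rangle)|1\rangle}{2} - \frac{(|0\rangle - |1\rangle)|1\rangle}{2} = |11\rangle
\end{aligned}$$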
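The claims in comments 6 and 7 can be verified numerically. The following NumPy sketch builds $U$ as a CNOT followed by a Hadamard on the first qubit and applies it to the four Bell states; the dictionary names are illustrative.

```python
import numpy as np

I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
U = np.kron(H, I2) @ CNOT            # CNOT first, then H on the first qubit

s = 1 / np.sqrt(2)
# Components in the order |00>, |01>, |10>, |11>
bell_states = {
    "Phi+": s * np.array([1, 0, 0, 1]),
    "Phi-": s * np.array([1, 0, 0, -1]),
    "Psi+": s * np.array([0, 1, 1, 0]),
    "Psi-": s * np.array([0, 1, -1, 0]),
}
labels = ["00", "01", "10", "11"]
# Each Bell state is mapped to a single standard basis vector
images = {name: labels[int(np.argmax(np.abs(U @ v)))] for name, v in bell_states.items()}
```

Measuring both qubits in the standard basis after $U$ therefore identifies which Bell state was present, which is what is meant by a Bell measurement.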

Summary
In quantum teleportation Alice can transfer the (unknown) state of an input qubit from her lab to Bob’s lab
in a remote location, by only sending 2 bits of classical information. For this, Alice and Bob need to share
a (maximally) entangled state $|\Phi^+\rangle$. In the protocol Alice performs a unitary transformation on her qubits,
and then measures them in the standard basis. She transmits the 2 bits of information to Bob who applies a
unitary which maps his qubit state to that of the input qubit. Since the state of the original qubit is destroyed, we
deal with a state transfer rather than copying. Moreover, the outcomes of Alice’s measurement do not contain
any information about the unknown state. Teleportation as well as superdense coding are examples of how
quantum resources such as entanglement can be used to perform quantum information processing tasks.

References.
[1] C. H. Bennett, et al., Teleporting an Unknown Quantum State via Dual Classical and Einstein-Podolsky-
Rosen Channels, Phys. Rev. Lett. 70, 1895-1899 (1993)
[2] D. Bouwmeester, et al., Experimental Quantum Teleportation, Nature 390, 575-579 (1997)
[3] J-G. Ren et al., Ground-to-satellite quantum teleportation, Nature 549, 70-73 (2017)


Lecture 12: The BB84 protocol for quantum key distribution

1 Classical cryptography
The goal of public cryptographic algorithms is to allow the secure communication of secret messages over
public channels, e.g. between two parties, Alice and Bob, in the presence of an eavesdropper, Eve.
One way to achieve this is with the one-time pad algorithm which uses a secret key known only to Alice
and Bob. In this protocol, Bob encodes the plaintext message he wants to send to Alice, in the following
way. He first pairs the message’s characters with those of the secret key (or pad). After that, each character
of the plaintext is "added" to the corresponding character from the pad. For example, if B is a character of
the message (which is the letter with number 2 in the alphabet), and C is the corresponding pad character
(which is number 3) then the encoded character is E (which is number 2+3). After receiving the encoded
message, Alice decodes it by reversing the encoding procedure with the help of her own copy of the secret
pad. Although in theory this method is perfectly secure provided that the pad consists of a sequence of
random characters, one of the drawbacks of this method is that it requires that the two parties have access to
a secret key.
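As an illustration of the one-time pad described above, here is a minimal Python sketch for messages made of the letters A to Z. The lecture only gives the example B + C = E; the wrap-around modulo 26 for sums exceeding 26, and the function names, are assumptions made for this sketch.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase          # A has number 1, ..., Z has number 26

def to_number(letter):
    return ALPHABET.index(letter) + 1

def to_letter(number):
    return ALPHABET[(number - 1) % 26]     # assumed wrap-around: 27 -> A, 0 -> Z

def make_pad(length):
    """Random secret key, to be shared by the two parties and used only once."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def encode(message, pad):
    return "".join(to_letter(to_number(m) + to_number(p)) for m, p in zip(message, pad))

def decode(cipher, pad):
    return "".join(to_letter(to_number(c) - to_number(p)) for c, p in zip(cipher, pad))

message = "QUANTUM"
pad = make_pad(len(message))
cipher = encode(message, pad)
recovered = decode(cipher, pad)
```

Decoding with the same pad recovers the message, while without the pad every plaintext of the same length is equally compatible with the cipher text.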
Other cryptographic methods do not require that the two parties share a secret key, but rather use a publicly
accessible key. These methods rely instead on the fact that certain mathematical tasks are computationally
hard, for instance the factorisation of a large number into prime factors. In the RSA cryptosystem [1] for
example, the public key consists of the product n = p · q of two large prime numbers p and q, together with an
auxiliary number e. The private key consists of n and a secret number d. The communication protocol has the
following steps. Alice transmits the public key to Bob and keeps the private key to herself. Bob encodes his
message using the public key and transmits the encoded message to Alice. By applying a certain mathematical
algorithm Alice can now recover the message by using the private key. The RSA algorithm is believed to be
secure, and relies in particular on the difficulty of factoring large numbers. With current computers, the largest
RSA-type number (a product of two large primes) which has been factored has around 250 decimal digits, but as
computers become more powerful, larger prime factors need to be used in order to preserve the security of the algorithm. Note however, that
in 1994 Peter Shor showed that a quantum computer would be able to factor large numbers in polynomial
time [2], which presumably would compromise the security of the RSA algorithm. This discovery has given
the field of Quantum Computation a huge boost, and has motivated many of the experimental advances in
Quantum Technologies, aimed at building a quantum computer.
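The lecture does not spell out the "certain mathematical algorithm" used in RSA. As a hedged illustration, the sketch below uses the standard textbook scheme, in which $d$ is chosen such that $e \cdot d \equiv 1 \pmod{(p-1)(q-1)}$, encryption is $c = m^e \bmod n$ and decryption is $m = c^d \bmod n$. The tiny primes and the function names are illustrative assumptions; real keys use primes with hundreds of digits and additional padding.

```python
from math import gcd

def make_keys(p, q, e):
    """Return (public_key, private_key) = ((n, e), (n, d)) for primes p, q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p - 1)(q - 1)")
    d = pow(e, -1, phi)               # modular inverse of e (Python 3.8+)
    return (n, e), (n, d)

def encrypt(message, public_key):
    n, e = public_key
    return pow(message, e, n)         # c = m^e mod n

def decrypt(cipher, private_key):
    n, d = private_key
    return pow(cipher, d, n)          # m = c^d mod n

# Alice generates the keys and publishes only the public one
public_key, private_key = make_keys(p=61, q=53, e=17)
message = 65                          # Bob's message, a number smaller than n
cipher = encrypt(message, public_key)
recovered = decrypt(cipher, private_key)
```

Anyone who can factor $n$ into $p$ and $q$ can recompute $d$ exactly as `make_keys` does, which is why the security rests on the difficulty of factoring.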

2 Quantum key distribution


We will now describe a key distribution protocol known as BB84, which was proposed by Bennett and Brassard in 1984 [3]. The goal of the protocol is to establish a key between Alice and Bob, such that Eve cannot learn it by listening to their "conversation". The secret key can then be used to communicate securely by using for instance the one-time pad algorithm.
The basic idea is that Alice and Bob use quantum systems to establish the key, and if Eve tries to learn the state of the system this will inevitably disturb the state, which can be detected by Alice and Bob. Eve can sabotage the protocol, in which case Alice and Bob cannot establish the key, but the goal is to make sure that if they agree that the protocol was successful, then the chance that Eve knows the key is very small.

Figure 1: The BB84 steps. Each column represents one round of the protocol. The first row is the sequence of states prepared by Alice. The second row is the sequence of bases chosen by Bob. The third row is the sequence of outcomes obtained by Bob. In the fourth row, the instances when the bases chosen by Alice and Bob coincide are marked with a black dot. The fifth row is the sequence of bits which constitute the secret key. If Eve has not tampered with the quantum communication, the bits should agree with rows one and three. In order to make sure that this is the case, Alice and Bob sacrifice a small proportion of randomly chosen bits from the list and compare them by using the classical channel.
To perform the protocol, Alice and Bob need to be able to perform the following tasks:
A) Alice and Bob can communicate through a public classical channel which Eve can eavesdrop.
B) Alice and Bob can use a quantum channel, more precisely Alice can send qubits to Bob, and expect that they arrive in the same state provided that Eve does not interfere. However Eve is allowed to tamper with this channel, e.g. by intercepting the qubits, measuring them, and passing them to Bob in a new state.
C) Alice can prepare qubits in any one of the four states
$$|↕\rangle := |0\rangle, \qquad |↔\rangle := |1\rangle, \qquad |⤢\rangle = \frac{1}{\sqrt{2}} (|0\rangle + |1\rangle), \qquad |⤡\rangle = \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle).$$

D) Bob can measure the received qubits in one of the two orthonormal bases: $\{|↕\rangle, |↔\rangle\}$ and $\{|⤢\rangle, |⤡\rangle\}$.
Here there are two alternative scenarios that can occur:
a) Bob measures in the same basis as that of the received state. In this case, the measurement outcome coincides with the label of the basis vector. For example if Alice sends the qubit in state $|⤢\rangle$ and Bob measures in basis $\{|⤢\rangle, |⤡\rangle\}$ then the result is ⤢ with probability one since
$$\mathbb{P}(⤢) = |\langle ⤢ | ⤢ \rangle|^2 = 1, \qquad \mathbb{P}(⤡) = |\langle ⤡ | ⤢ \rangle|^2 = 0.$$

b) Bob measures in the other basis than that corresponding to the received state. In this case the two possible outcomes have equal probabilities 1/2. For example if Alice sends the qubit in state $|⤢\rangle$ and Bob measures in basis $\{|↕\rangle, |↔\rangle\}$ then the probabilities are
$$\mathbb{P}(↕) = |\langle ↕ | ⤢ \rangle|^2 = \frac{1}{2}, \qquad \mathbb{P}(↔) = |\langle ↔ | ⤢ \rangle|^2 = \frac{1}{2}.$$

The BB84 protocol consists of repeated independent rounds of the following steps, as illustrated in Figure 1:
1) Alice prepares a qubit in one of the four states $|↕\rangle, |↔\rangle, |⤢\rangle, |⤡\rangle$, chosen at random, and sends it to Bob.

2) Bob measures the received qubit in either basis $\{|↕\rangle, |↔\rangle\}$ or basis $\{|⤢\rangle, |⤡\rangle\}$, chosen randomly with probability 1/2 for each basis.

3) Alice and Bob reveal the bases in which the qubit was prepared, and respectively measured, via the
classical channel (but not the state or respectively the outcome). If these bases coincide (which happens with
probability 1/2) then the two parties add the associated one bit of information to their agreed list. If the bases
differ, the two parties discard the information regarding that qubit.
For instance if the basis on both sides is $\{|↕\rangle, |↔\rangle\}$ and Alice has sent the state $|↕\rangle$ then Bob’s measurement
has outcome ↕ with probability one, unless Eve has tampered with the state. So both parties share the same
bit of information.
4) In order to make sure that Eve has not tampered with their quantum communication channel, they pick
a proportion (say half) of their agreed bits and compare them by using the classical channel. If there was
no interference, the bits should coincide. If they find significant differences, it means that the quantum
communication channel had been intercepted, and they abort the key distribution protocol.
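The rounds of the protocol, without an eavesdropper, can be simulated as follows. This is a minimal NumPy sketch: the encoding of bases and outcomes as bits (basis 0 for $\{|0\rangle, |1\rangle\}$, basis 1 for the diagonal basis, bit 0 for the first vector of each basis), the uniform choice of Alice's state, and the function names are all assumptions made for illustration.

```python
import numpy as np

s = 1 / np.sqrt(2)
# STATES[basis][bit]: basis 0 = {|0>, |1>}, basis 1 = {(|0>+|1>)/sqrt2, (|0>-|1>)/sqrt2}
STATES = np.array([[[1, 0], [0, 1]], [[s, s], [s, -s]]])

def measure(state, basis, rng):
    """Measure `state` in the given basis and return the outcome bit."""
    p0 = round(abs(np.vdot(STATES[basis][0], state)) ** 2, 12)  # rounding guards against float error
    return 0 if rng.random() < p0 else 1

def bb84_round(rng):
    a_basis, a_bit = int(rng.integers(2)), int(rng.integers(2))  # step 1: Alice's random state
    qubit = STATES[a_basis][a_bit]
    b_basis = int(rng.integers(2))                               # step 2: Bob's random basis
    b_bit = measure(qubit, b_basis, rng)
    return a_basis, a_bit, b_basis, b_bit

rng = np.random.default_rng(3)
rounds = [bb84_round(rng) for _ in range(2000)]
# Step 3: keep only the rounds in which the announced bases coincide
sifted = [(a_bit, b_bit) for a_basis, a_bit, b_basis, b_bit in rounds if a_basis == b_basis]
alice_key = [a for a, _ in sifted]
bob_key = [b for _, b in sifted]
```

In the absence of tampering the two keys coincide, and roughly half of the rounds survive the sifting in step 3.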

3 Security of the BB84 protocol


It is clear from the protocol that in the absence of any tampering, or any noise in the transmission of classical
and quantum information, the two parties establish a common key. But is the protocol secure against the
eavesdropper? A mathematical proof of this property can be found in [4], but this goes beyond the scope of
this lecture. Instead, we limit ourselves to giving the intuition behind the proof. The main idea is that in order
to learn the key, Eve must intercept and measure the qubits sent by Alice. If Eve knew the basis to which
the prepared state belongs, then she could measure the state in that basis, learn the state with probability one
and moreover, the state would remain the same after the measurement and Bob could not find out about the
tampering. In this case the encoded information is essentially classical, and any classical information can be
copied. However, Eve does not know which basis has been used, which means that she cannot perform the
above measurement. Let us explore two possible alternatives:
1) Eve chooses one of the two bases randomly, measures in that basis, and passes the system to Bob.
For example if the state was $|↕\rangle$ and Eve measures in basis $\{|↕\rangle, |↔\rangle\}$ then she obtains the result ↕ (with probability one) and the state is sent to Bob undisturbed. In this case, when Alice and Bob check their randomly chosen results, they will get the same value, so they would conclude that everything is all right.
On the other hand, if Eve measures in basis $\{|⤢\rangle, |⤡\rangle\}$, then she obtains either one of the two possible outcomes with equal probabilities:

If Outcome = ⤢ then posterior state (sent to Bob) = $|⤢\rangle$
If Outcome = ⤡ then posterior state (sent to Bob) = $|⤡\rangle$

Since we are only interested in instances in which Alice and Bob have used the same basis, it means that Bob measures in basis $\{|↕\rangle, |↔\rangle\}$, so that in either of the two situations the probabilities are 1/2, 1/2:
$$\mathbb{P}(↕) = |\langle ↕ | ⤢ \rangle|^2 = |\langle ↕ | ⤡ \rangle|^2 = \frac{1}{2}, \qquad \mathbb{P}(↔) = |\langle ↔ | ⤢ \rangle|^2 = |\langle ↔ | ⤡ \rangle|^2 = \frac{1}{2}.$$
Therefore with overall probability $1/4 = 1/2 \cdot 1/2$ Bob will obtain the result ↔ although he measured the state in the same basis as Alice, in which case they can conclude that the quantum channel has been tampered with.
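The error rate of 1/4 caused by this intercept-and-resend attack can be estimated by simulation. The sketch below is self-contained and uses the same illustrative encoding as before (basis 0 for $\{|0\rangle, |1\rangle\}$, basis 1 for the diagonal basis); Eve's uniformly random choice of basis follows scenario 1), while the names and the number of rounds are assumptions.

```python
import numpy as np

s = 1 / np.sqrt(2)
STATES = np.array([[[1, 0], [0, 1]], [[s, s], [s, -s]]])   # STATES[basis][bit]

def measure(state, basis, rng):
    """Projective measurement; returns (outcome bit, posterior state)."""
    p0 = round(abs(np.vdot(STATES[basis][0], state)) ** 2, 12)
    outcome = 0 if rng.random() < p0 else 1
    return outcome, STATES[basis][outcome]

rng = np.random.default_rng(4)
n_rounds = 4000
sifted = errors = 0
for _ in range(n_rounds):
    a_basis, a_bit = int(rng.integers(2)), int(rng.integers(2))
    qubit = STATES[a_basis][a_bit]
    # Eve measures in a random basis and forwards the posterior state to Bob
    _, qubit = measure(qubit, int(rng.integers(2)), rng)
    b_basis = int(rng.integers(2))
    b_bit, _ = measure(qubit, b_basis, rng)
    if a_basis == b_basis:               # only same-basis rounds enter the key
        sifted += 1
        errors += int(a_bit != b_bit)

error_rate = errors / sifted
```

The value of `error_rate` should be close to 1/4, so comparing even a modest number of sacrificed bits reveals Eve's presence with high probability.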

2) The problem with the previous scenario was that Eve did not know the basis used by Alice. To counter
this, Eve can keep the system sent by Alice and measure it only after the classical communication in which
the bases are revealed, has occurred. In this way she is sure to have learned the correct bit of information.
However, she needs to send Bob a qubit prepared in a certain state. Now, no matter what state she chooses,
there is a significant chance that Bob’s measurement will give a different result than the original state sent by
Alice. Otherwise it would mean that Eve could correctly guess the state sent by Alice without measuring it.
This is impossible due to a result called the no-cloning theorem, which states that no physical device can copy an
unknown state, i.e. implement the transformation $|\psi\rangle \mapsto |\psi\rangle \otimes |\psi\rangle$. The proof will be given in a later lecture.

In general, Eve may employ more sophisticated attacks in which several successive qubits are measured,
which makes the proof of security more complicated. However, at the heart of the proof is the fact that
quantum information cannot be copied.
Quantum cryptography is already in the stage of technological applications, with several companies producing
cryptographic systems based on the BB84 protocol where the 4 states of the qubits are implemented with
4 polarisations of a single photon state.

4 The EPR protocol


Several other protocols for quantum key distribution have been proposed. In the EPR protocol, Alice and Bob start by sharing a (large) number of maximally entangled qubit pairs in state $|\Psi^-\rangle = (|01\rangle - |10\rangle)/\sqrt{2}$.
1. Each of the two parties measures their qubit in one of the bases $\{|↕\rangle, |↔\rangle\}$ or $\{|⤢\rangle, |⤡\rangle\}$, chosen randomly and independently with equal probabilities.
2. Alice and Bob communicate the measurement bases to each other, and discard the results obtained when they measured in different bases, while keeping the results obtained for same-basis measurements. From lecture 8 (Bell inequality) we know that when the two parties measure in the same basis in the singlet state $|\Psi^-\rangle$, the outcomes are perfectly “anti-correlated", e.g. when Alice obtains ⤢, Bob obtains ⤡, and so on. Therefore if one of them flips their outcomes, they will share a list of random identical bits.
3. In order to rule out any eavesdropping, they sacrifice a small proportion of randomly chosen rounds, and check via a public channel that the outcomes were indeed anti-correlated.
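The anti-correlation used in step 2 can be checked directly from the singlet state. The following NumPy sketch computes the joint outcome probabilities for same-basis and different-basis measurements; the basis labels and the function name `joint_probabilities` are illustrative.

```python
import numpy as np

s = 1 / np.sqrt(2)
BASES = {
    "rectilinear": [np.array([1.0, 0.0]), np.array([0.0, 1.0])],   # |0>, |1>
    "diagonal": [np.array([s, s]), np.array([s, -s])],             # (|0> +/- |1>)/sqrt2
}
# Singlet state (|01> - |10>)/sqrt2
singlet = s * (np.kron([1.0, 0.0], [0.0, 1.0]) - np.kron([0.0, 1.0], [1.0, 0.0]))

def joint_probabilities(basis_a, basis_b):
    """2x2 table p[a, b] = |<a| (x) <b| singlet>|^2 for Alice's and Bob's outcomes."""
    return np.array([[abs(np.vdot(np.kron(a, b), singlet)) ** 2
                      for b in BASES[basis_b]]
                     for a in BASES[basis_a]])

same_rect = joint_probabilities("rectilinear", "rectilinear")
same_diag = joint_probabilities("diagonal", "diagonal")
different = joint_probabilities("rectilinear", "diagonal")
```

For equal bases only the off-diagonal entries are non-zero (each 1/2), so the outcomes are always opposite; for different bases all four entries are 1/4, so those rounds carry no correlation and are discarded.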
Security of the EPR protocol. As in the BB84 protocol, a watertight proof of security is quite involved since one needs to consider the possibility of the eavesdropper being correlated with all the entangled qubit pairs. However a basic “one round” argument is as follows. In order for Eve to learn about the secret key, she needs to have some form of correlation with the AB pair. This can be modelled by assuming that there exists a tripartite state $|\psi_{ABE}\rangle \in \mathbb{C}^2_A \otimes \mathbb{C}^2_B \otimes \mathbb{C}^d_E$ such that Eve has access to the marginal $\rho_E := \mathrm{Tr}_{AB}(|\psi_{ABE}\rangle\langle\psi_{ABE}|)$ while Alice and Bob share the marginal $\rho_{AB} := \mathrm{Tr}_E(|\psi_{ABE}\rangle\langle\psi_{ABE}|)$. By checking the outcomes, Alice and Bob can make sure that their joint state is indeed $\rho_{AB} = |\Psi^-\rangle\langle\Psi^-|$, since this is the only state that gives anti-correlated outcomes (exercise).
However, in this case, Eve cannot be correlated with the Alice-Bob state, i.e. the joint state must be of the form
$$|\psi_{ABE}\rangle = |\psi_E\rangle \otimes |\Psi^-\rangle$$
where $|\psi_E\rangle$ is Eve’s state. Therefore by measuring her state, Eve cannot learn anything about Alice’s and Bob’s outcomes. To understand why $|\psi_{ABE}\rangle$ must be of the product form, consider it as a bipartite state between Eve on one side and Alice and Bob on the other side. Since the AB state is pure ($|\Psi^-\rangle$), the partial state $\rho_{AB}$ has a single non-zero eigenvalue, so the Schmidt rank is $r = 1$ and the Schmidt decomposition requires that the joint state is a product of two pure states.

References
[1] R. Rivest, A. Shamir and L. Adleman, "A Method for Obtaining Digital Signatures and Public-Key
Cryptosystems". Communications of the ACM 21 120-126 (1978).
[2] P. Shor, "Algorithms for quantum computation: discrete logarithms and factoring", Proceedings 35th
Annual Symposium on Foundations of Computer Science, 124-134, Santa Fe, NM (1994)
[3] C.H. Bennett and G. Brassard, "Quantum cryptography: Public-key distribution and coin tossing", Pro-
ceedings of IEEE International Conference on Computers, Systems and Signal Processing, pages 175-179,
Bangalore, India (1984).
[4] P. W. Shor and J. Preskill, "Simple Proof of Security of the BB84 Quantum Key Distribution Protocol",
Phys. Rev. Lett. 85 441-444 (2000)


Lectures 13-14: Classical and quantum computation

Abstract: In these lectures we introduce some basic ideas from classical computation, after which we pass to
the quantum setting, where we discuss single-qubit and two-qubit gates, and universal gates.

1 Classical computation

1.1 History of algorithms and computers

One of the main topics of computer science is the design of efficient algorithms for solving computational
problems such as factoring (large) numbers, matrix diagonalisation, finding the shortest path in a graph, etc.
Algorithms were already known to ancient Greeks (e.g. the sieve of Eratosthenes for finding prime numbers)
but the word algorithm originates from the Persian mathematician al-Khwarizmi whose book Algoritmi de
numero Indorum was widely read in late medieval Europe. The computations prescribed by algorithms need
to be implemented by physical means. Ancient civilisations used objects such as pebbles, sticks and the abacus
for computations. The Middle Ages brought the invention of mechanical clocks and other “automata” in Europe.
While Blaise Pascal constructed the first mechanical calculator in 1642, it was Charles Babbage in the 19th
century who conceived the first mechanical computer, the “analytical engine” whose logical structure is
essentially the same as that of modern computers. The mathematician Ada Lovelace wrote what is considered
as the first computer program - an algorithm designed to be carried out by the “analytical engine”.
The principles of modern computing were laid out by Alan Turing in 1936. In his paper On Computable
Numbers he described what is now known as a Turing machine, which is capable of performing any mathematical
computation represented by an algorithm. Later on, he and John von Neumann separately presented
the first designs of a modern computer architecture. On the practical side, the Second World War saw the
development of electromechanical computers such as the Z3 in Germany, later replaced by electronic digital
computers such as Colossus in the UK and ENIAC in the USA. The invention of microprocessors and introduction of personal
computers were the next great leaps in computer technology.

1.2 Classical gates and circuits

The circuit model is the standard digital computing paradigm. A circuit consists of wires which transmit
information and gates which perform simple computations. Each wire carries a bit of information with
values {0, 1}, while a gate acts as a transformation on a single bit or several bits. For instance, the single bit
NOT gate flips the bit taking 0 to 1 and 1 to 0. Other gates implementing logical operations are AND, OR,
XOR, NAND, etc. The AND gate takes two bits $a, b$ as input and outputs one bit $a \wedge b$, which is
equal to 0 unless both $a$ and $b$ are equal to 1. The OR gate outputs 1 if and only if at least one of the inputs is
1. The XOR gate outputs the sum modulo 2 of its inputs. The NAND gate consists of an AND gate followed
by a NOT applied to the output.
A circuit consists of a number of wires connected through gates which are applied sequentially in time, such that the resulting diagram has no loops. For each sequence of input bits $a = (a_1, \dots, a_n)$, the circuit produces a unique output sequence $b = (b_1, \dots, b_m)$, so it can be seen as computing a certain Boolean function $f : \{0, 1\}^n \to \{0, 1\}^m$. It is natural to ask if there is a finite set of gates with which one can construct circuits implementing arbitrary Boolean functions. Such sets are called universal, and one example is {NOT, AND, XOR, FANOUT} where FANOUT is the gate which copies the information from one wire into several wires. Figure 1 shows a circuit which computes a given function $f : \{0, 1\}^{n+1} \to \{0, 1\}$ using the above gates, where the two blocks represent circuits for computing the $n$-bit functions $f_0(x_1, \dots, x_n) := f(0, x_1, \dots, x_n)$ and $f_1(x_1, \dots, x_n) := f(1, x_1, \dots, x_n)$. As an exercise you can verify that the output is indeed the value $f(x_0, x_1, \dots, x_n)$. Using induction we see that any function with one bit output can be computed in this way using the above gates. In fact, it can be shown that the set of universal gates given above can be reduced: the gates NOT, AND, XOR can be constructed using only a NAND gate, FANOUT and ancillary bits (exercise).

Figure 1: Circuit for computing an $(n + 1)$-bit Boolean function (from [1])
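The two statements of this paragraph can be explored with a short Python sketch. It shows one possible way of building NOT, AND and XOR from NAND (using an ancilla bit fixed to 1 and reusing an intermediate value, which plays the role of FANOUT), and one possible form of the inductive step, $f = (x_0 \wedge f_1) \oplus (\neg x_0 \wedge f_0)$. The figure itself is not reproduced in these notes, so this formula and all function names are assumptions and may differ from the circuit shown there.

```python
from itertools import product

def nand(a, b):
    return 1 - (a & b)

def not_(a):
    return nand(a, 1)                     # ancillary bit fixed to 1

def and_(a, b):
    return not_(nand(a, b))

def xor_(a, b):
    c = nand(a, b)                        # c is used twice below (FANOUT)
    return nand(nand(a, c), nand(b, c))

def combine(f0, f1):
    """Build f(x0, x1, ..., xn) from f0 = f(0, ...) and f1 = f(1, ...)."""
    def f(x0, *xs):
        return xor_(and_(x0, f1(*xs)), and_(not_(x0), f0(*xs)))
    return f

# Example: the 3-bit majority function, with f0 = AND and f1 = OR of the last two bits
majority = combine(lambda x1, x2: x1 & x2, lambda x1, x2: x1 | x2)
table = {bits: majority(*bits) for bits in product((0, 1), repeat=3)}
```

Applying `combine` repeatedly, starting from functions of a single bit, gives the induction over the number of input bits mentioned in the text.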

2 Quantum gates and circuits


We would now like to emulate the classical circuit model and devise circuits which can perform “quantum computations”. The wires will represent qubits, and the gates will be unitary transformations of one or several qubit states. However, a naive attempt to do this will encounter an obstacle, due to the fact that unitary transformations are reversible while some of the classical gates are not. For instance, can we define a quantum AND gate? If such a gate existed, it should have the action
$$\mathrm{AND} : |a\rangle \otimes |b\rangle \mapsto |a \wedge b\rangle \otimes |\cdot\rangle$$
where the inputs are two basis qubit states $|a\rangle$ and $|b\rangle$, one of the outputs contains the result of the computation $|a \wedge b\rangle$, and the other qubit’s state is arbitrary. However there is no unitary $U$ on $\mathbb{C}^2 \otimes \mathbb{C}^2$ which can perform such a transformation. Indeed $U$ would need to satisfy
$$\begin{aligned}
U : |0\rangle \otimes |0\rangle &\mapsto |0\rangle \otimes |x\rangle \\
U : |0\rangle \otimes |1\rangle &\mapsto |0\rangle \otimes |y\rangle \\
U : |1\rangle \otimes |0\rangle &\mapsto |0\rangle \otimes |z\rangle \\
U : |1\rangle \otimes |1\rangle &\mapsto |1\rangle \otimes |w\rangle
\end{aligned}$$
which is impossible: a unitary maps orthogonal vectors into orthogonal vectors, but the first 3 vectors on the right side cannot be mutually orthogonal, since $x, y, z \in \{0, 1\}$ and therefore at least two of them coincide.
For this reason, the classical model which is more suitable for quantum generalisations is that of computation
with reversible gates, i.e. transformations in which the input is in one-to-one correspondence with the output.
For instance the NOT gate is reversible, but the AND and OR gates are not reversible since they take two
bits as input and output a single one. Fortunately, any classical computation can be implemented using only
reversible gates. In fact any computation can be implemented using only a 3-bit reversible gate called the
Fredkin gate (see Figure 2) and ancillary bits. The third input bit $c$ of the Fredkin gate acts as control bit and
remains unchanged, $c' = c$. If $c = 0$ then $a$ and $b$ are left unchanged: $a' = a$, $b' = b$. If $c = 1$ then $a$ and
$b$ are swapped: $a' = b$, $b' = a$. By fixing one or two of the input bits, the gate can be configured to perform
an elementary gate; for instance if $a = 0$ then $a' = b \wedge c$, so $(b, c) \to a'$ is the AND gate. Similarly, the
Fredkin gate can perform NOT, FANOUT and CROSSOVER (exercise), so it can be cascaded to perform any
classical computation.

Figure 2: Fredkin gate input-output table and circuit representation (from [1])
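
The configurations of the Fredkin gate described above can be checked directly. This is a small illustrative sketch: the helper names are hypothetical, and the gate is modelled as a function on three bits $(a, b, c) \mapsto (a', b', c')$.

```python
def fredkin(a: int, b: int, c: int):
    """Return (a', b', c'): swap a and b when the control bit c is 1."""
    return (b, a, c) if c == 1 else (a, b, c)

def and_via_fredkin(b: int, c: int) -> int:
    # Fixing a = 0 gives a' = b AND c.
    a_out, _, _ = fredkin(0, b, c)
    return a_out

def not_and_fanout_via_fredkin(c: int):
    # Fixing a = 0, b = 1 gives a' = c (a copy of c), b' = NOT c, c' = c.
    return fredkin(0, 1, c)

# Reversibility: the gate is its own inverse on all 8 inputs.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert fredkin(*fredkin(a, b, c)) == (a, b, c)
```

Since the gate is a bijection on the eight 3-bit strings, it corresponds to a permutation of basis vectors.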
From a quantum viewpoint, any reversible classical gate amounts to a permutation of basis vectors of the
input qubits, which is a special example of a unitary transformation. Therefore, reversible classical circuits
can be implemented as quantum transformations of basis vectors. However quantum mechanics allows for
more unitary transformations which do not map basis vectors into each other, and indeed produce entangled
states of the inputs. Below we discuss some of the key quantum gates and their properties.

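Single qubit gates. Let us start by reviewing the single qubit gates. The Pauli gates $X, Y, Z$ are given by the Pauli
matrices $\sigma_x, \sigma_y, \sigma_z$ seen as unitary transformations
\[
X := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
Y := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
Z := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]
In particular the $X$ gate has a similar action to the classical NOT gate
\[
X : |0\rangle \mapsto |1\rangle, \qquad X : |1\rangle \mapsto |0\rangle.
\]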

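Additional single qubit gates are the Hadamard gate (denoted $H$), the phase gate (denoted $S$) and the $T$ gate
\[
H := \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
S := \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \qquad
T := \begin{pmatrix} 1 & 0 \\ 0 & \exp(i\pi/4) \end{pmatrix}.
\]
The Pauli gates satisfy the following conjugation relations with $H$ (exercise)
\[
HXH = Z, \qquad HYH = -Y, \qquad HZH = X. \tag{1}
\]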
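
The relations between the Pauli gates and $H$ (namely $HXH = Z$, $HYH = -Y$, $HZH = X$) can be confirmed numerically. The sketch below uses plain nested lists for $2 \times 2$ matrices; the helper names `matmul`, `close` and `conjugate_by_h` are illustrative only.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to a numerical tolerance."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

r = 1 / math.sqrt(2)
H = [[r, r], [r, -r]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
MINUS_Y = [[-entry for entry in row] for row in Y]

def conjugate_by_h(A):
    """Return H A H."""
    return matmul(H, matmul(A, H))

assert close(conjugate_by_h(X), Z)
assert close(conjugate_by_h(Y), MINUS_Y)   # note the minus sign
assert close(conjugate_by_h(Z), X)
```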
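An arbitrary single qubit unitary can be written in the form $U = \exp(i\alpha) R_{\vec n}(\theta)$ where
\[
R_{\vec n}(\theta) = \exp(-i\theta\, \vec n \cdot \vec\sigma / 2)
= \cos\left(\frac{\theta}{2}\right) I - i \sin\left(\frac{\theta}{2}\right) (n_x X + n_y Y + n_z Z) \tag{2}
\]
denotes the unitary performing a rotation by an angle $\theta$ around the unit vector $\vec n$ in the Bloch sphere
representation. Using (1) we find
\[
H R_{\vec n}(\theta) H = R_{\vec m}(\theta) \tag{3}
\]
where the rotation axis is $\vec m = (n_z, -n_y, n_x)$.
In fact one can perform an arbitrary unitary $U$ by using only rotations around two axes given by two non-parallel
unit vectors $\vec n$ and $\vec m$ (exercise). When the two axes are orthogonal, three rotations suffice:
\[
U = \exp(i\alpha) R_{\vec n}(\beta) R_{\vec m}(\gamma) R_{\vec n}(\delta) \tag{4}
\]
for some appropriate angles $\beta, \gamma, \delta$; for general non-parallel axes, $U$ is a finite product of such alternating
rotations around $\vec n$ and $\vec m$.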
Similarly to the classical case, it is natural to ask whether arbitrary unitary transformations can be imple-
mented by quantum circuits using only a finite set of “universal” quantum gates. As such, the answer to the
question is no, since even in the case of a single qubit the space of unitaries is a smooth manifold while the set
of finite circuits built from a finite set of gates is countable. The right question to ask is whether any unitary
can be approximated with arbitrary precision by such a circuit. The answer in this case is yes, and we will
first prove this in the case of one qubit gates.
Theorem 1. The gates H and T form a universal set of gates for one qubit transformations.

Proof. Up to a global phase, $T$ is a $\pi/4$ rotation around the $z$ axis. By conjugating with $H$ we obtain
$HTH = R_x(\pi/4)$, up to the same phase (exercise). By composing the two operations we get
\[
\begin{aligned}
\exp\left(-i\frac{\pi}{8} Z\right) \exp\left(-i\frac{\pi}{8} X\right)
&= \left[\cos\frac{\pi}{8}\, I - i \sin\frac{\pi}{8}\, Z\right] \left[\cos\frac{\pi}{8}\, I - i \sin\frac{\pi}{8}\, X\right] \\
&= \cos^2\frac{\pi}{8}\, I - i \left[\cos\frac{\pi}{8}\,(X + Z) + \sin\frac{\pi}{8}\, Y\right] \sin\frac{\pi}{8}.
\end{aligned}
\]
From equation (2) we see that in the Bloch representation, this is a rotation $R_{\vec n}(\theta)$ around the axis $\vec n = \vec r/\|\vec r\|$
given by the vector $\vec r = (\cos\frac{\pi}{8}, \sin\frac{\pi}{8}, \cos\frac{\pi}{8})$ with an angle $\theta$ defined by $\cos(\theta/2) = \cos^2\frac{\pi}{8}$.
It can be shown that $\theta$ is an irrational multiple of $2\pi$. Using this fact we show that by iterating $R_{\vec n}(\theta)$ we
can approximate any rotation $R_{\vec n}(\alpha)$ around the same axis with arbitrary precision $\delta > 0$ in $\alpha$. Indeed,
$R_{\vec n}(\theta)^k = R_{\vec n}(\theta_k)$ with $\theta_k = k\theta \bmod 2\pi$. Let $N > 2\pi/\delta$ be an integer. Since $\theta$ is an irrational multiple
of $2\pi$, the angles $\theta_1, \dots, \theta_N$ are distinct, so by the pigeonhole principle there exist two indices
$i < j \le N$ such that $0 < |\theta_i - \theta_j| < \delta$; in particular $\theta_{j-i}$ is within $\delta$ of $0$ modulo $2\pi$. This means that by
varying $l$ over the integers, the angles $\theta_{l(j-i)}$ fill the interval $[0, 2\pi)$ in steps smaller than $\delta$, so there is an
integer $l_0$ such that $|\theta_{l_0(j-i)} - \alpha| \le \delta$.
Next, we show that by using the Hadamard gate again we can change the rotation axis and approximate
arbitrary unitaries (up to an irrelevant global phase). The following property follows from (3)
\[
H R_{\vec n}(\alpha) H = R_{\vec m}(\alpha)
\]
where $\vec m$ is the axis of the vector $(\cos\frac{\pi}{8}, -\sin\frac{\pi}{8}, \cos\frac{\pi}{8})$. Using equation (4) we see that any unitary can be
obtained as a product of appropriate rotations around $\vec n$ and $\vec m$, which implies that we can approximate
arbitrary one qubit unitaries by using only the H and T gates.
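
The approximation step of the proof can be illustrated numerically. The sketch below searches for a power $k$ such that $k\theta$ is within $\delta$ of a target angle $\alpha$ modulo $2\pi$; the function name `smallest_power` and the search bound `max_k` are assumptions made for illustration, and the brute-force search merely stands in for the pigeonhole argument.

```python
import math

# Rotation angle of the composite gate: cos(theta / 2) = cos^2(pi / 8).
THETA = 2 * math.acos(math.cos(math.pi / 8) ** 2)

def circular_distance(x: float, y: float) -> float:
    """Distance between two angles modulo 2*pi."""
    d = (x - y) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def smallest_power(alpha: float, delta: float, max_k: int = 100_000):
    """Smallest k >= 1 with k*THETA within delta of alpha (mod 2*pi), or None."""
    for k in range(1, max_k + 1):
        if circular_distance(k * THETA, alpha) < delta:
            return k
    return None

k = smallest_power(alpha=1.0, delta=0.05)
# R_n(THETA)^k then approximates R_n(1.0) to within 0.05 in the angle.
```

A smaller `delta` generally requires a larger `k`, i.e. a longer sequence of H and T gates.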

Two qubit gates. To perform interesting computations with multiple qubits we need more general gates which
act on two or more qubits. We will focus on the two-qubit controlled-NOT gate (denoted CNOT); this gate
has two input qubits, known as the control and target qubit. If the control qubit is in $|0\rangle$ the target qubit is left
unchanged, while if the control is in $|1\rangle$ the target qubit is flipped. The matrix representation of the unitary
$\mathrm{CNOT} : \mathbb{C}^2 \otimes \mathbb{C}^2 \to \mathbb{C}^2 \otimes \mathbb{C}^2$ is
\[
\mathrm{CNOT} = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{pmatrix}
\]
while the graphic representation is

[Circuit: CNOT gate]

where the top line is the control qubit while the bottom line is the target qubit. The control and target qubits
can be swapped with the help of Hadamard gates (exercise):

[Circuit identity: a CNOT with Hadamard gates applied to both qubits before and after it equals a CNOT with control and target exchanged]
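
The Hadamard identity for exchanging control and target can be verified on $4 \times 4$ matrices. This is a minimal sketch using nested lists; `kron`, `matmul` and `CNOT_REVERSED` are illustrative names, and the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ (first qubit on top) is an assumption of the sketch.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(A, B):
    """Tensor (Kronecker) product of A and B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

r = 1 / math.sqrt(2)
H = [[r, r], [r, -r]]
HH = kron(H, H)  # Hadamard on both qubits

# Control = first qubit, target = second qubit.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
# Control = second qubit, target = first qubit: |01> <-> |11>.
CNOT_REVERSED = [[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]

conjugated = matmul(HH, matmul(CNOT, HH))
max_error = max(abs(conjugated[i][j] - CNOT_REVERSED[i][j])
                for i in range(4) for j in range(4))
assert max_error < 1e-12
```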

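More generally, for any one qubit unitary $U$ we can define the two-qubit controlled-$U$ operation whose action
on the qubit states is
\[
\begin{aligned}
|0\rangle \otimes |\psi\rangle &\mapsto |0\rangle \otimes |\psi\rangle \\
|1\rangle \otimes |\psi\rangle &\mapsto |1\rangle \otimes U|\psi\rangle
\end{aligned}
\]
and is represented as

[Circuit: controlled-$U$ gate, with the control on the top qubit and $U$ acting on the bottom qubit]

For the next result we will need the following simple identity characterising the controlled-$U$ transformations
where $U$ is a phase unitary, $U = e^{i\alpha} I$. The diagram below says that such a controlled-$U$ is equivalent to a
one-qubit phase gate applied to the top qubit.

[Circuit identity: the controlled-$\begin{pmatrix} e^{i\alpha} & 0 \\ 0 & e^{i\alpha} \end{pmatrix}$ gate equals the gate $\begin{pmatrix} 1 & 0 \\ 0 & e^{i\alpha} \end{pmatrix}$ applied to the top qubit, with the identity on the bottom qubit]

This can be verified by computing the action on basis vectors on both sides
\[
|0\rangle \otimes |0\rangle \mapsto |0\rangle \otimes |0\rangle, \quad
|0\rangle \otimes |1\rangle \mapsto |0\rangle \otimes |1\rangle, \quad
|1\rangle \otimes |0\rangle \mapsto e^{i\alpha} |1\rangle \otimes |0\rangle, \quad
|1\rangle \otimes |1\rangle \mapsto e^{i\alpha} |1\rangle \otimes |1\rangle.
\]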

How difficult is it to perform a general controlled-$U$ transformation, compared to CNOT? The following
theorem shows that CNOT is essentially sufficient to perform any controlled-$U$ gate.
Theorem 2. The controlled-$U$ unitary can be implemented using single qubit gates and the CNOT gate.

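Proof. The key to the proof is the following fact which we state without proof: any qubit unitary can be
written as $U = e^{i\alpha} A X B X C$ where $A, B, C$ are unitaries such that $ABC = I$ and $e^{i\alpha}$ is a phase. With this,
we can construct the following circuit implementing the controlled-$U$ operation

[Circuit identity: controlled-$U$ equals the circuit in which the target (bottom) qubit undergoes $C$, CNOT, $B$, CNOT, $A$ in this order, and the phase gate $\begin{pmatrix} 1 & 0 \\ 0 & e^{i\alpha} \end{pmatrix}$ is applied to the control (top) qubit]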

The identity can be verified by checking the action for the two basis vectors of the top qubit (control). If the
control is |0i then the left side is identity while the right side performs the unitary ABC = I on the second
qubit, so we have equality.
If the control is |1i then the left side applies U to the target qubit while the right side performs the unitary
ei↵ AXBXC which is equal to U .
Now we come to the main question of this section: is there a finite set of one and two qubit gates with
which we can construct circuits that approximate arbitrary unitaries on a finite number of qubits? We have
seen that this is possible for one qubit and requires only two gates, $T$ and $H$. The next theorem shows that the
only additional two-qubit gate we need is CNOT.
Theorem 3. The single qubit gates $T$ and $H$ together with the two-qubit gate CNOT are universal.

Main ideas of the proof. The proof can be broken down into three steps.
1. Any unitary on $\mathbb{C}^d$ can be written as a product of at most $d(d-1)/2$ two-level unitaries. These are special
unitary transformations which act nontrivially only on the subspace spanned by two of the standard basis
vectors. For instance if $1 \le i < j \le d$, a two-level unitary between the levels $|i\rangle$ and $|j\rangle$ acts as
\[
U : \begin{cases}
|k\rangle \mapsto |k\rangle & \text{if } k \notin \{i, j\} \\
|i\rangle \mapsto \alpha |i\rangle + \beta |j\rangle \\
|j\rangle \mapsto \gamma |i\rangle + \delta |j\rangle
\end{cases}
\]
for some coefficients $\alpha, \beta, \gamma, \delta \in \mathbb{C}$.
2. Any two-level unitary can be implemented using CNOT and controlled-$U$ gates.
3. As shown in Theorem 2, controlled-$U$ can be implemented with CNOT and single qubit unitaries, which
in turn by Theorem 1 can be approximated by circuits containing only $H$ and $T$ gates.
A complete proof cannot be reproduced here but the details can be found in Chapter 4 of [1].

References:
[1] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University
Press 2000.


Lecture 15: The Deutsch-Jozsa algorithm

Abstract: After discussing the basic quantum gates and the idea of universality, we look now at a simple
quantum algorithm which gives us a first glimpse into the new possibilities opened up by quantum computa-
tion.

1 Quantum implementation of boolean functions


Similarly to the example of the “quantum AND gate”, unitarity imposes constraints on how to define the
quantum analogue of a boolean function $f : \{0,1\}^n \to \{0,1\}^n$. Naively, one would define the linear map
$U_f : (\mathbb{C}^2)^{\otimes n} \to (\mathbb{C}^2)^{\otimes n}$ acting on the basis vectors $|a\rangle := |a_1\rangle \otimes \dots \otimes |a_n\rangle$ as
\[
U_f : |a\rangle \mapsto |f(a)\rangle
\]
where $a = (a_1, \dots, a_n) \in \{0,1\}^n$. However this map is not unitary unless $f$ is a one-to-one function.
Instead, we will define the “quantum function” as a $2n$-qubit unitary transformation of the form

[Circuit: gate $U_f$ with inputs $|a\rangle$, $|b\rangle$ and outputs $|a\rangle$, $|b \oplus f(a)\rangle$]

On basis vectors this means $U_f : |a\rangle \otimes |b\rangle \mapsto |a\rangle \otimes |b \oplus f(a)\rangle$ where $a \oplus b$ is the bitwise addition
modulo 2 of the two $n$-bit strings, i.e. $(a \oplus b)_i = (a_i + b_i) \bmod 2$. One can check that $U_f$ is indeed unitary and
that it completely determines the function $f$, and therefore constitutes a reasonable definition of the
“circuit for $f$” in the quantum setting.
In the next section we discuss a quantum algorithm using such a transformation as a resource.
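
To see why the map $|a\rangle \otimes |b\rangle \mapsto |a\rangle \otimes |b \oplus f(a)\rangle$ is unitary even when $f$ is not one-to-one, note that it permutes the basis vectors. The sketch below checks this for a small example; bit strings are encoded as integers, and the names `oracle_permutation` and `is_bijection` are illustrative.

```python
def oracle_permutation(f, n: int, m: int):
    """Map each basis label (a, b) to (a, b XOR f(a)); a has n bits, b has m bits."""
    return {(a, b): (a, b ^ f(a)) for a in range(2 ** n) for b in range(2 ** m)}

def is_bijection(mapping) -> bool:
    """A map on a finite set is a bijection iff its images are all distinct."""
    return len(set(mapping.values())) == len(mapping)

def f(a: int) -> int:
    # Example function on 2 bits that is NOT one-to-one: keep only the last bit.
    return a & 1

naive = {a: f(a) for a in range(4)}          # |a> -> |f(a)>
perm = oracle_permutation(f, n=2, m=2)       # |a>|b> -> |a>|b XOR f(a)>

assert not is_bijection(naive)               # the naive map cannot be unitary
assert is_bijection(perm)                    # basis permutation, hence unitary
assert all(perm[perm[key]] == key for key in perm)   # U_f is its own inverse
```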

2 The Deutsch-Jozsa algorithm


The Deutsch-Jozsa algorithm [1,2] deals with a quantum version of the following problem. Suppose we have
access to a “black box” which can be used to compute a function $f : \{0, 1\} \to \{0, 1\}$. For each input $x$ the
black box outputs $f(x)$. The problem is to find how many queries of the black box are necessary in order to
determine whether the function $f$ is constant (i.e. $f(0) = f(1)$) or balanced ($f(0) \neq f(1)$). Clearly, in the
classical setting one needs to compute both $f(0)$ and $f(1)$ in order to solve the task.
Let us consider now the quantum version of this problem. We are given a quantum black box which
“computes” the function $f$ in the following sense. The box is described by a two-qubit unitary

[Circuit: gate $U_f$ with inputs $|a\rangle$, $|b\rangle$ and outputs $|a\rangle$, $|b \oplus f(a)\rangle$]

whose action on the product basis is
\[
U_f : |a\rangle \otimes |b\rangle \mapsto |a\rangle \otimes |b \oplus f(a)\rangle, \qquad a, b \in \{0, 1\}.
\]
We claim that the following simple circuit using a single quantum box suffices to answer the “constant versus
balanced” question.

[Circuit: the top qubit starts in $|0\rangle$ and passes through $H$, $U_f$ and $H$; the bottom qubit starts in $(|0\rangle - |1\rangle)/\sqrt{2}$ and passes through $U_f$]

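In order to show this, let us compute the output of the circuit when the input state is $|0\rangle \otimes (|0\rangle - |1\rangle)/\sqrt{2}$.
Note that if $f(a) = 0$ then $U_f$ has the action
\[
U_f : |a\rangle|0\rangle \mapsto |a\rangle|0\rangle, \qquad U_f : |a\rangle|1\rangle \mapsto |a\rangle|1\rangle
\]
while if $f(a) = 1$ the action is
\[
U_f : |a\rangle|0\rangle \mapsto |a\rangle|1\rangle, \qquad U_f : |a\rangle|1\rangle \mapsto |a\rangle|0\rangle.
\]
Therefore, by linearity we have
\[
U_f : |a\rangle \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \mapsto (-1)^{f(a)} |a\rangle \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}}. \tag{1}
\]
Now, recall that $H$ is the one qubit Hadamard gate with matrix
\[
H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.
\]
The first step of the circuit consists of applying $H$ and produces the state
\[
|0\rangle \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \xrightarrow{\;H \otimes I\;} \frac{|0\rangle + |1\rangle}{\sqrt{2}} \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}}.
\]
In the next step we apply $U_f$ using equation (1)
\[
\frac{|0\rangle + |1\rangle}{\sqrt{2}} \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \xrightarrow{\;U_f\;}
\frac{(-1)^{f(0)} |0\rangle + (-1)^{f(1)} |1\rangle}{\sqrt{2}} \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}}.
\]
Finally, by applying $H$ again we get
\[
\begin{aligned}
\frac{(-1)^{f(0)} |0\rangle + (-1)^{f(1)} |1\rangle}{\sqrt{2}} \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}}
&\xrightarrow{\;H \otimes I\;} \frac{1}{2} \left[ (-1)^{f(0)} (|0\rangle + |1\rangle) + (-1)^{f(1)} (|0\rangle - |1\rangle) \right] \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \\
&= \left[ \frac{(-1)^{f(0)} + (-1)^{f(1)}}{2} |0\rangle + \frac{(-1)^{f(0)} - (-1)^{f(1)}}{2} |1\rangle \right] \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}}.
\end{aligned}
\]
Note that this is a product state. If we measure the first qubit in the standard basis, we obtain outcome 0
with probability one when the function is constant, while if the function is balanced we get outcome 1 with
probability one
\[
P(X = 0 \mid f(0) = f(1)) = 1 = P(X = 1 \mid f(0) \neq f(1)).
\]
Therefore the outcome of the measurement is the answer to the constant-versus-balanced question, and has
been obtained with a single query of the box $U_f$.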
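
The computation above can be replayed on a state vector. The following is a minimal simulation sketch, assuming amplitudes are stored in a list indexed by $2a + b$ for the basis vector $|a\rangle \otimes |b\rangle$; the function names are illustrative, and the oracle is applied exactly once.

```python
import math

R = 1 / math.sqrt(2)

def hadamard_on_first(state):
    """Apply H (x) I to amplitudes indexed by 2*a + b."""
    out = [0.0] * 4
    for b in (0, 1):
        s0, s1 = state[b], state[2 + b]
        out[b] = R * (s0 + s1)
        out[2 + b] = R * (s0 - s1)
    return out

def apply_oracle(state, f):
    """Apply U_f: |a>|b> -> |a>|b XOR f(a)>."""
    out = [0.0] * 4
    for a in (0, 1):
        for b in (0, 1):
            out[2 * a + (b ^ f(a))] = state[2 * a + b]
    return out

def deutsch(f) -> str:
    """Decide 'constant' or 'balanced' with a single application of U_f."""
    state = [R, -R, 0.0, 0.0]                  # |0> (x) (|0> - |1>)/sqrt(2)
    state = hadamard_on_first(state)
    state = apply_oracle(state, f)
    state = hadamard_on_first(state)
    prob_one = state[2] ** 2 + state[3] ** 2   # P(first qubit gives outcome 1)
    return "balanced" if prob_one > 0.5 else "constant"
```

For example, `deutsch(lambda a: a)` reports a balanced function and `deutsch(lambda a: 1)` a constant one.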
Remark. Although the algorithm does not have a particular practical application, it illustrates an important
phenomenon called quantum parallelism; after applying the first Hadamard, the top qubit is in the
superposition $(|0\rangle + |1\rangle)/\sqrt{2}$ of the two basis states, and therefore the subsequent application of $U_f$ “computes” both
$f(0)$ and $f(1)$ at the same time. More precisely, its outcome is sensitive to both these values. This allows us
to find the answer with a single use of the box.

2.1 General form of the Deutsch-Jozsa algorithm

We consider now a more general version of the Deutsch-Jozsa problem. Let $f : \{0, 1\}^n \to \{0, 1\}$ be a
function which has the property that it is either constant or balanced (the latter meaning that it is equal to zero
on exactly half of the inputs). The task is to find which of the two alternatives is correct, with a minimum number of
uses of the box which computes $f$.
Classically, if the function is constant, one needs at least $2^n/2 + 1$ calls to $f$ in order to check this property,
since it may happen that the first $2^n/2$ values were zero but all the rest were 1. Similarly, if the function is
balanced, then one may need $2^n/2 + 1$ calls, since it may happen that the first $2^n/2$ values were all equal.
Consider now that we have a “quantum oracle”, i.e. a black box which “computes” $f$ similarly to the previous
section, with the difference that the top line represents $n$ qubits (the input of the function is $|a\rangle \in (\mathbb{C}^2)^{\otimes n}$)
and the lower line is one qubit (which stores the outcome of the computation). One can check that the
transformation
\[
U_f : |a\rangle \otimes |b\rangle \mapsto |a\rangle \otimes |f(a) \oplus b\rangle
\]
is a unitary on $(\mathbb{C}^2)^{\otimes n} \otimes \mathbb{C}^2$.
We will show that a single use of $U_f$ suffices to answer the question, i.e. using exponentially fewer resources
than in the classical setting. The quantum circuit in this case is

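[Circuit: the top $n$ qubits start in $|00\dots0\rangle$ and pass through $H^{\otimes n}$, $U_f$ and $H^{\otimes n}$; the lower qubit starts in $|1\rangle$ and passes through $H$ and $U_f$. The states at the successive stages are labelled $|\psi_1\rangle, |\psi_2\rangle, |\psi_3\rangle, |\psi_4\rangle$.]

where the initial state on the top $n$ qubits is $|00\dots0\rangle$ and the lower qubit is in state $|1\rangle$. The symbol $H^{\otimes n}$
denotes the tensor product of $n$ Hadamard gates, each applied to one of the $n$ qubits separately. The additional
Hadamard gate on the lower qubit ensures that its state is changed to $(|0\rangle - |1\rangle)/\sqrt{2}$, which was the initial
state in the previous case.
The output state can be obtained by successively applying the gates to the initial state $|\psi_1\rangle = |00\dots0\rangle \otimes |1\rangle$
and computing the intermediary states $|\psi_2\rangle, \dots, |\psi_4\rangle$. We have (verify this!)
\[
\begin{aligned}
|\psi_2\rangle &= \sum_{a \in \{0,1\}^n} \frac{|a\rangle}{\sqrt{2^n}} \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \\
|\psi_3\rangle &= \sum_{a \in \{0,1\}^n} (-1)^{f(a)} \frac{|a\rangle}{\sqrt{2^n}} \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \\
|\psi_4\rangle &= \sum_{a, x \in \{0,1\}^n} (-1)^{x \cdot a + f(a)} \frac{|x\rangle}{2^n} \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}}
\end{aligned}
\]
In the last line we used the fact (verify this!) that
\[
H^{\otimes n} |a\rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} (-1)^{x \cdot a} |x\rangle
\]
where $x \cdot a = x_1 a_1 + \dots + x_n a_n$.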
Finally, the register of the $n$ top qubits is measured in the standard basis. The outcome distribution depends
on whether $f$ is constant or balanced in the following way (verify this!):
1) if $f$ is constant then the coefficient of $|x\rangle$ is
\[
c(x) = \frac{(-1)^{f(0)}}{2^n} \sum_{a \in \{0,1\}^n} (-1)^{x \cdot a} =
\begin{cases}
(-1)^{f(0)} & \text{if } x = (0, 0, \dots, 0) \\
0 & \text{if } x \neq (0, 0, \dots, 0)
\end{cases}
\]
so the outcome is $x = (0, 0, \dots, 0)$ with probability one.
2) if $f$ is balanced then the coefficient of $|00\dots0\rangle$ is equal to 0, so the outcome is different from
$x = (0, 0, \dots, 0)$ with probability one.
Therefore with a single query to the “function” $U_f$ we can decide whether it is constant or balanced, in
contrast to the classical case where at least $2^{n-1} + 1$ queries are required.
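
The coefficients of the final state can be evaluated directly from the formula $c(x) = 2^{-n} \sum_a (-1)^{x \cdot a + f(a)}$. The sketch below does this for small $n$; bit strings are encoded as integers, the parity of $x \cdot a$ is computed from the bits shared by `x` and `a`, and the function names are illustrative.

```python
def coefficient(f, n: int, x: int) -> float:
    """Amplitude c(x) of |x> in the top register after the final Hadamards."""
    total = 0
    for a in range(2 ** n):
        parity = bin(x & a).count("1")       # x . a (only its parity matters)
        total += (-1) ** (parity + f(a))
    return total / 2 ** n

def deutsch_jozsa(f, n: int) -> str:
    """Outcome 00...0 has probability |c(0)|^2: one if constant, zero if balanced."""
    return "constant" if abs(coefficient(f, n, 0)) ** 2 > 0.5 else "balanced"

n = 3
constant_f = lambda a: 1
balanced_f = lambda a: a & 1                 # equals 1 on exactly half of the inputs

probabilities = [coefficient(balanced_f, n, x) ** 2 for x in range(2 ** n)]
```

This evaluates the amplitudes classically with $2^n$ evaluations of $f$; it illustrates the formula rather than the query complexity of the quantum circuit.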

References:
[1] D. Deutsch and R. Jozsa, Rapid solution of problems by quantum computation, Proceedings of the Royal
Society of London A 439, 553 (1992).
[2] R. Cleve, A. Ekert, C. Macchiavello and M. Mosca, Quantum algorithms revisited, Proceedings of the
Royal Society of London A 454, 339-354 (1998).


Lecture 16: The Grover algorithm

Abstract: The Grover algorithm is motivated by the problem of searching an unsorted database. The sought
for item is encoded as a quantum function, seen as a “black box” unitary. While the classical search problem
requires a number of queries of the order of the database size $N$, Grover’s algorithm solves the quantum
search problem with of the order of $\sqrt{N}$ uses of the black box.

1 Unsorted database search problem


The database search problem is to find a particular item in a database containing $N \gg 1$ items. The
complexity of the problem depends significantly on whether the database is sorted or unsorted. For example a
telephone book lists all subscribers’ names and telephone numbers in alphabetical order. It is easy to find a
particular name on the list but it is hard to find a particular phone number.
The mathematical formulation is the following. The database is represented by a one-to-one function
\[
f : \{0, 1, \dots, N-1\} \to \mathcal{N}
\]
where $\mathcal{N}$ is a set of items. If $a \in \mathcal{N}$ is a desired item, then there exists a unique $\omega \in \{0, 1, \dots, N-1\}$ such
that $f(\omega) = a$. The problem is to find $\omega$ by making as few calls to the function $f$ as possible. The database
is sorted if the items are ordered (so $i < j$ implies $f(i) < f(j)$). In this case the solution can be found in
about $\log_2 N$ steps. One starts by checking the item in the middle of the array, $f([N/2])$; since the database
is ordered, this will tell us if the desired item is in the first $[N/2]$ items or the last $[N/2]$ items. We then
check the item in the middle of the corresponding half, etc. This is how we can find a particular name in a
phonebook. On the other hand, if the database is not sorted, then one cannot do better than check various
items randomly, which on average will require about $N/2$ calls to the database to find the desired item.
An equivalent formulation can be given in terms of a black box (oracle) described by the function
$f_\omega : \{0, 1, \dots, N-1\} \to \{0, 1\}$
\[
f_\omega : x \mapsto \begin{cases} 0 & \text{if } f(x) \neq a \\ 1 & \text{if } f(x) = a \end{cases}
\]
such that $f_\omega(\omega) = 1$. This function only tells us whether $x$ is the right label for item $a$ or not. The question
is how many times we need to use the oracle in order to find $\omega$.

2 Oracle formulation of quantum search


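With the same definitions as above, let us consider that $N = 2^n$ so that the input of $f_\omega$ is an $n$-bit string. As
in the Deutsch-Jozsa algorithm, we define the quantum function $f_\omega$ as the black box with unitary
\[
U_{f_\omega} : (\mathbb{C}^2)^{\otimes n} \otimes \mathbb{C}^2 \to (\mathbb{C}^2)^{\otimes n} \otimes \mathbb{C}^2, \qquad
U_{f_\omega} : |x\rangle \otimes |y\rangle \mapsto |x\rangle \otimes |y \oplus f_\omega(x)\rangle.
\]

[Circuit: gate $U_{f_\omega}$ with inputs $|x\rangle$, $|y\rangle$ and outputs $|x\rangle$, $|y \oplus f_\omega(x)\rangle$]

If the single (lower) qubit is in state $(|0\rangle - |1\rangle)/\sqrt{2}$ then
\[
U_{f_\omega} : |x\rangle \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}} \mapsto (-1)^{f_\omega(x)} |x\rangle \otimes \frac{|0\rangle - |1\rangle}{\sqrt{2}}
\]
so the state of the lower qubit is unchanged while the top qubits are transformed by the $n$-qubit unitary
\[
U_\omega : (\mathbb{C}^2)^{\otimes n} \to (\mathbb{C}^2)^{\otimes n}, \qquad U_\omega : |x\rangle \mapsto (-1)^{f_\omega(x)} |x\rangle.
\]
So the effect of the black box is to flip the sign of a single basis vector $|\omega\rangle$ while leaving all the others
unchanged. From now on we will ignore the lower qubit and work with the unitary $U_\omega$ instead of $U_{f_\omega}$. The
task is to find $\omega$ using the oracle black box as few times as possible.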

3 The Grover algorithm


Grover proposed a quantum algorithm [1] which finds the solution with $O(\sqrt{N})$ (order of $\sqrt{N}$) uses of the
black box, compared with $O(N)$ for the classical unsorted database search problem.
The Grover algorithm is illustrated below, where we have ignored the additional qubit which remains in state
$(|0\rangle - |1\rangle)/\sqrt{2}$ throughout the process.

[Circuit: the register $|0\rangle^{\otimes n}$ passes through $H^{\otimes n}$, followed by $O(\sqrt{N})$ repetitions of the block $R$, each consisting of the oracle $U_\omega$ followed by $U_s$]

Step 1. Initialise $n$ qubits in state $|0\rangle^{\otimes n}$;

Step 2. Apply a Hadamard transformation to each qubit separately; the state is now
\[
|s\rangle := (H|0\rangle)^{\otimes n} = \frac{1}{\sqrt{N}} \sum_x |x\rangle
\]
At this point the amplitudes of all basis vectors, including the “solution” $|\omega\rangle$, are equal to $1/\sqrt{N}$. The aim is
to apply further transformations in order to raise the amplitude of $|\omega\rangle$ until it is very close to 1, so that a
measurement in the standard basis will reveal $\omega$.
Step 3. Apply the unitary transformation
\[
R = U_s \cdot U_\omega
\]
for a number of times to be determined below. Here, $U_s$ is the unitary transformation which flips the sign of
all vectors orthogonal to $|s\rangle$
\[
U_s = 2|s\rangle\langle s| - I.
\]
Below we will show that $U_s$ can be implemented using $O(n)$ (order of $n$) gates.
Step 4. Measure in the standard basis. With high probability, the result will be the sought solution $\omega$.

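[Figure: the vectors $|\omega\rangle$, $|\omega^\perp\rangle$, $|s\rangle$, $U_\omega|s\rangle$ and $U_s U_\omega|s\rangle$ in the plane spanned by $|\omega\rangle$ and $|\omega^\perp\rangle$; $|s\rangle$ makes an angle $\theta$ with $|\omega^\perp\rangle$]

Figure 1: Grover rotation: by applying the two reflections to $|s\rangle$ we get the vector $U_s U_\omega |s\rangle$ which has angle
$2\theta$ with respect to $|s\rangle$. Each iteration amounts to a rotation by $2\theta$ towards $|\omega\rangle$.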

3.1 How it works

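Based on the definitions of $U_\omega$ and $U_s$ we make the following key observations
\[
\begin{aligned}
\langle \psi | \omega \rangle = 0 &\implies U_\omega |\psi\rangle = |\psi\rangle \\
\langle \psi | s \rangle = 0 &\implies U_s |\psi\rangle = -|\psi\rangle.
\end{aligned}
\]
Therefore, $R = U_s \cdot U_\omega$ leaves invariant the subspace of all vectors orthogonal to $|\omega\rangle$ and $|s\rangle$ and has a
non-trivial action only in the two dimensional subspace $\mathcal{H}_0 \subset (\mathbb{C}^2)^{\otimes n}$ spanned by $|\omega\rangle$ and $|s\rangle$.
Let us see what this action is; consider the orthonormal basis $\{|\omega\rangle, |\omega^\perp\rangle\}$ in $\mathcal{H}_0$. After step 2 the state is
$|s\rangle \in \mathcal{H}_0$ which is at an angle $\pi/2 - \theta$ with $|\omega\rangle$, or equivalently $\theta$ with $|\omega^\perp\rangle$ (see Figure 1) where
\[
|\langle s | \omega \rangle|^2 = \frac{1}{N} = \sin^2\theta.
\]
From the geometry we see that the first reflection produces the vector $U_\omega |s\rangle$, while the second reflection gives
$R|s\rangle$ which is rotated by an angle $2\theta$ towards $|\omega\rangle$, compared to $|s\rangle$.

Example. For concreteness, consider the case $N = 4$ (2 qubits). Then $\sin\theta = 1/\sqrt{N} = 1/2$ so $\theta = 30^\circ$.
Therefore $2\theta = 60^\circ$, so after one application of $R$ the state is at $90^\circ$ from $|\omega^\perp\rangle$ and we have
\[
R|s\rangle = |\omega\rangle.
\]
At this moment, a standard basis measurement will reveal (with probability one) the solution $\omega$.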

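What happens in the case when we have a large number of qubits (large database)? Then, since $\sin\theta = 1/\sqrt{N}$,
the angle $\theta$ is small, so we will need to iterate the unitary $R$ several times until the vector is
(approximately) aligned with $|\omega\rangle$. After $T$ iterations the state is at an angle $(2T+1)\theta$ with $|\omega^\perp\rangle$, so we need
$(2T+1)\theta \approx \pi/2$. By approximating $\sin\theta \approx \theta$ we get that the number of iterations needs to be
$T = [\frac{\pi}{4}\sqrt{N}]$ (where $[\cdot]$ denotes the integer part), and the probability of outcome $\omega$ is
\[
p(\omega) = \sin^2((2T + 1)\theta) = 1 - O\left(\frac{1}{N}\right)
\]
where the small correction $O(\frac{1}{N})$ is due to the fact that $(2T + 1)\theta$ is not precisely $\pi/2$ but differs from it by
$O(\frac{1}{\sqrt{N}})$.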
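
The growth of the amplitude of $|\omega\rangle$ can be followed in a small state-vector simulation. The sketch below is illustrative only: it stores the $N = 2^n$ real amplitudes in a list, applies $U_\omega$ as a sign flip, and applies $U_s = 2|s\rangle\langle s| - I$ as the map $a_x \mapsto 2\bar a - a_x$, where $\bar a$ is the mean amplitude; the function name and its arguments are assumptions.

```python
import math

def grover_success_probability(n_qubits: int, omega: int, iterations=None) -> float:
    """Probability of measuring omega after the given number of Grover iterations."""
    N = 2 ** n_qubits
    if iterations is None:
        iterations = int(math.pi / 4 * math.sqrt(N))   # T = [pi/4 * sqrt(N)]
    amps = [1 / math.sqrt(N)] * N                       # the state |s>
    for _ in range(iterations):
        amps[omega] = -amps[omega]                      # oracle U_omega
        mean = sum(amps) / N
        amps = [2 * mean - a for a in amps]             # U_s = 2|s><s| - I
    return amps[omega] ** 2

p_two_qubits = grover_success_probability(2, omega=3)     # the N = 4 example
p_ten_qubits = grover_success_probability(10, omega=123)  # N = 1024, T = 25
```

For $N = 4$ a single iteration already gives probability one, in agreement with the example above; for larger $N$ the simulated probability agrees with $\sin^2((2T+1)\theta)$.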

Conclusion. We have shown that the Grover algorithm can find the solution $\omega$ using about $\sqrt{N}$ “calls” to the oracle,
compared to $N$ in the classical case.

3.2 Implementing the reflection Us

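Let us answer now the question concerning the implementation of the unitary $U_s = 2|s\rangle\langle s| - I$. By breaking
the operation down into basic gates we can show that it can be implemented with $O(n) = O(\log N)$ gates.
First, since
\[
|s\rangle = H^{\otimes n} |0\rangle^{\otimes n}
\]
we find
\[
U_s = H^{\otimes n} \left( 2|0^{\otimes n}\rangle\langle 0^{\otimes n}| - I \right) H^{\otimes n}.
\]
Now, the middle term is a unitary which flips the sign of all basis vectors except $|0^{\otimes n}\rangle$. This operation can be
implemented (up to an irrelevant minus sign) by a multiple controlled-Z gate $C^{n-1}(Z)$, which flips the sign of only the
vector $|1^{\otimes n}\rangle$, sandwiched between NOT (or $X$) gates which exchange $|0\rangle$ and $|1\rangle$. Therefore $U_s$ can be realised with the
following circuit

[Circuit: each of the $n$ qubits passes through $H$ and $X$, then the multiple controlled-Z gate acts on all qubits (with $Z$ on the last one), then each qubit passes through $X$ and $H$]

Figure 2: Circuit implementation of the unitary $U_s$ using single qubit gates and a multiple controlled-Z
operation.

Finally, the multiple controlled-Z operation $C^{n-1}(Z)$ can be broken down into $O(n)$ simpler gates [2]. We
will not show how this works in general, but illustrate the case of 3 qubits. The circuit identity below
shows (exercise) that $C^2(Z)$ can be implemented using CNOT and controlled-$U$ operations, with unitaries
given by the phase gates
\[
S := \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \qquad S^* := \begin{pmatrix} 1 & 0 \\ 0 & -i \end{pmatrix},
\]
such that $S^2 = Z$.

[Circuit identity: the doubly controlled-Z gate $C^2(Z)$ expressed through CNOT gates and controlled-$S$, controlled-$S^*$ and controlled-$S$ gates]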

References:
[1] L. K. Grover, A fast quantum mechanical algorithm for database search, Proceedings, 28th Annual ACM
Symposium on the Theory of Computing, 212 (1996)
[2] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University
Press 2000.


Lectures 17-18: Indirect Measurements and Quantum Channels

Abstract. In the previous lectures we started to build an operational framework for describing states of sub-
systems, and the relation between such states and the probabilities of measurements performed on composite
systems. We will now expand this framework to describe measurements and evolution of systems, which
interact with other systems (ancillas) or with the environment.

1 Indirect measurement and quantum channels


Postulate 2 says that a closed quantum system evolves unitarily. What if the system is not closed, but interacts
with other components of a larger system, or with the environment, e.g. an atom interacting with the
electromagnetic field? To answer this question we will assume that the system together with its environment form
a larger closed system with Hilbert space $\mathcal{H}_s \otimes \mathcal{H}_a$. Typically, the unitary operator describing the evolution
of this larger system over a period of time $t$ has the form $U(t) = \exp(-itH)$ where $H$ is the total Hamiltonian
taking the form $H = H_s \otimes I_a + I_s \otimes H_a + H_{int}$, with $H_{int}$ the interaction Hamiltonian. If the initial joint
state is $|\Psi\rangle := |\psi\rangle \otimes |\varphi\rangle$, then the state at a later time $U(t)|\Psi\rangle$ will typically be entangled for $H_{int} \neq 0$,
which implies that the system’s partial state at time $t$ will be mixed. Thus the system undergoes a non-unitary
evolution, changing from a pure state to a mixed state. Our goal is to understand this type of evolution, as
illustrated in Figure 1.

[Figure: the system state $\rho$ and the environment state $\tau$ enter the unitary $U$; the system output is $\rho'(i)$ and the environment is measured by $N$ with outcome $X = i$]

Figure 1: A noisy channel modelled as unitary coupling to the environment followed by tracing out the
environment, or averaging over conditional states with respect to a measurement in the environment.

The unitary $U(t)$ has been replaced by a generic unitary $U$, and for simplicity we consider first the case
where the system and the environment are initially in the pure states $\rho := |\psi\rangle\langle\psi|$ and $\tau := |\varphi\rangle\langle\varphi|$. Suppose
that after the unitary transformation we perform a measurement $N$ in the environment, with PVM consisting
of one dimensional projections onto the vectors of an ONB, $\{P_1 = |f_1\rangle\langle f_1|, \dots, P_m = |f_m\rangle\langle f_m|\}$. Below
we describe the probability distribution of the outcomes and the (conditional) posterior state.

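Conditional state. The conditional joint system and environment state, given the outcome $X = i$, is
\[
|\psi'_{s+a}(i)\rangle = \frac{(I \otimes P_i) U |\psi \otimes \varphi\rangle}{\|(I \otimes P_i) U |\psi \otimes \varphi\rangle\|}
= \frac{K_i|\psi\rangle \otimes |f_i\rangle}{\|K_i |\psi\rangle\|}
\]
where $K_i$ is the operator on $\mathcal{H}_s$ with matrix elements
\[
\langle e_j | K_i | e_k \rangle := \langle e_j \otimes f_i | U | e_k \otimes \varphi \rangle. \tag{1}
\]
The second equality can be verified by taking the inner product with the basis vectors $|e_j \otimes f_l\rangle$.
Note that the joint conditional state is a product of pure states; in particular the system is in state
\[
|\psi'(i)\rangle = \frac{K_i |\psi\rangle}{\|K_i |\psi\rangle\|}.
\]
Written in terms of density matrices, this becomes
\[
\rho'(i) = \frac{K_i |\psi\rangle\langle\psi| K_i^*}{\|K_i |\psi\rangle\|^2}
= \frac{K_i \rho K_i^*}{\mathrm{Tr}(K_i^* K_i \rho)}
= \frac{K_i \rho K_i^*}{\mathrm{Tr}(M_i \rho)} \tag{2}
\]
where $\rho = |\psi\rangle\langle\psi|$ is the ‘input’ state and $M_i := K_i^* K_i$ is a positive operator.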

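Probability distribution. The probability of outcome $X = i$ is
\[
P(X = i) = \|(I \otimes P_i) U (\psi \otimes \varphi)\|^2 = \|K_i \psi\|^2 = \langle K_i \psi | K_i \psi \rangle = \langle \psi | K_i^* K_i \psi \rangle = \mathrm{Tr}(\rho M_i).
\]
Note that this formula is similar to the one which prescribes the probabilities associated to a projective
measurement, with the difference that the projections are replaced by positive operators $M_i$. These operators
satisfy the identity
\[
\sum_{i=1}^m M_i = \sum_{i=1}^m K_i^* K_i = I
\]
which ensures that the probabilities satisfy $\sum_i P(X = i) = 1$. Indeed, using equation (1) and inserting the
identity $I = \sum_l |e_l\rangle\langle e_l|$, we have
\[
\begin{aligned}
\langle e_j | \sum_i K_i^* K_i | e_k \rangle
&= \sum_{i,l} \langle e_j | K_i^* | e_l \rangle \langle e_l | K_i | e_k \rangle
= \sum_{i,l} \langle e_j \otimes \varphi | U^* | e_l \otimes f_i \rangle \langle e_l \otimes f_i | U | e_k \otimes \varphi \rangle \\
&= \langle e_j \otimes \varphi | U^* U | e_k \otimes \varphi \rangle
= \langle e_j \otimes \varphi | I \otimes I | e_k \otimes \varphi \rangle = \langle e_j | I | e_k \rangle.
\end{aligned}
\]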

Average posterior state. Suppose now that the measurement outcome is ignored. Then the state of the
system is the average
\[
\rho' := \sum_i P(X = i)\, \rho'(i) = \sum_i \mathrm{Tr}(M_i \rho) \frac{K_i \rho K_i^*}{\mathrm{Tr}(M_i \rho)} = \sum_i K_i \rho K_i^*. \tag{3}
\]
What if no measurement is performed? Since the measurement is performed in the environment, the partial
state of the system after the unitary transformation is the same as the average state after the measurement.
Alternatively it can be directly verified that
\[
\mathrm{Tr}_a \left( U (\rho \otimes |\varphi\rangle\langle\varphi|) U^* \right)
= \sum_i \mathrm{Tr}_a \left[ (I \otimes P_i)\, U (\rho \otimes |\varphi\rangle\langle\varphi|) U^* (I \otimes P_i) \right]
= \sum_i K_i \rho K_i^*.
\]
In the above arguments we assumed for simplicity that the initial state of the system is pure. However, since
any mixed state is a convex combination of pure states, formulas (2) and (3) for the conditional and mean
states hold for arbitrary input states.

In conclusion we found that the transformation from the input to the output of the purple box in Figure 1
(ignoring the classical outcome) is the transformation mapping states into states
\[
\mathcal{E} : \rho \mapsto \rho' = \sum_{i=1}^m K_i \rho K_i^*, \tag{4}
\]
where $K_i$ are operators satisfying the ‘normalisation’ condition $\sum_i K_i^* K_i = I$, which is necessary and
sufficient for $\rho'$ to be a density matrix. Conversely, it can be shown that any collection of operators satisfying
the above identity can be realised through the indirect measurement scheme for an appropriately chosen
unitary $U$ (exercise). We collect our findings in the following definition.

Definition 1 (channels and POVMs).

1. Let $\{K_1, \dots, K_m\}$ be a collection of operators on $\mathcal{H}_s = \mathbb{C}^d$ satisfying $\sum_i K_i^* K_i = I$. The state
transformation $\mathcal{E} : M(\mathbb{C}^d) \to M(\mathbb{C}^d)$
\[
\mathcal{E} : \rho \mapsto \sum_{i=1}^m K_i \rho K_i^*
\]
is called a quantum channel.

2. A collection of positive operators $\{M_1, \dots, M_m\}$ which satisfy $\sum_i M_i = I$ is called a positive operator
valued measure (POVM). This defines a measurement with outcomes $\{1, \dots, m\}$ and probability distribution
\[
P(X = i) = \mathrm{Tr}(\rho M_i).
\]
The operators $K_i$ are often called Kraus operators for reasons which will be explained later on. The
decomposition of $\mathcal{E}$ with respect to Kraus operators is not unique, as it depends on the particular measurement basis,
as defined in (1).

Constructing a channel for arbitrary Kraus operators. Until now it is not clear if, given an arbitrary set of
operators $V_i$ satisfying $\sum_i V_i^* V_i = I$, we can find an indirect measurement set-up such that the channel $\mathcal{E}$
defined in (4) has Kraus operators $V_i$. One of the exercises will be to show that this is indeed the case, so all
maps of the form (4) represent physical transformations.

Example: projective Kraus operators. Let $P_0, P_1$ be the projections onto the basis vectors $|0\rangle, |1\rangle$ of $\mathbb{C}^2$.
Then
\[
\mathcal{E}(\rho) = P_0 \rho P_0 + P_1 \rho P_1
\]
is a channel describing the change of state due to a direct measurement in the standard basis. Its main feature
is that it destroys the coherence between the two basis vectors
\[
\mathcal{E} : \begin{pmatrix} \rho_{00} & \rho_{01} \\ \rho_{10} & \rho_{11} \end{pmatrix} \mapsto \begin{pmatrix} \rho_{00} & 0 \\ 0 & \rho_{11} \end{pmatrix}.
\]

Example: random unitary channel. Let $U_1$ and $U_2$ be two unitary operators. Then
\[
\mathcal{E} : \rho \mapsto p\, U_1 \rho U_1^* + (1 - p)\, U_2 \rho U_2^*, \qquad 0 \le p \le 1,
\]
is the channel with Kraus operators $K_1 = \sqrt{p}\, U_1$ and $K_2 = \sqrt{1-p}\, U_2$ obtained by randomly rotating the
system with one of the unitaries.

3
2 Examples of noisy qubit channels
In this section we give a few examples of qubit channels and discuss their physical interpretation.
The depolarising channel. It can be described by saying that with probability 1 the qubit remains intact
while with probability an ‘error’ occurs. There are three types of errors which we consider as equally likely
1. bit flip: | i 7! X| i (swaps the basis vectors)
2. phase flip: | i 7! Z| i (changes the phase of the vector |1i)
3. both flips: | i 7! Y | i
The Kraus operators of the depolarising channel are
p p p p
K0 = 1 I, K1 = /3X, K2 = /3Z, K3 = /3Y
P ⇤
and verify Ki Ki = I. These Kraus operators can be obtained by coupling the qubit with a 4 dimensional
environment (ancilla) prepared in state |0i with action
p p
U : | i ⌦ |0i 7! 1 | i ⌦ |0i + [X| i ⌦ |1i + Z| i ⌦ |2i + Y | i ⌦ |3i]

The action of the channel on a state is

$$\mathcal{D} : \rho \longmapsto \sum_{i=0}^{3} K_i \rho K_i^* = (1-\lambda)\rho + \frac{\lambda}{3}\,(X\rho X + Y\rho Y + Z\rho Z).$$

The channel can also be represented by its action on the Bloch vector $r$ of the state $\rho = \rho_r$. As an exercise
you can show that this action is a uniform contraction of the Bloch sphere:
$$r \longmapsto r' = \left(1 - \frac{4\lambda}{3}\right) r.$$
By applying the channel $n$ times, the Bloch vector shrinks by a factor $\left(1 - \frac{4\lambda}{3}\right)^n$, so for $0 < \lambda \leq 1$ the state
converges in the limit $n \to \infty$ to the “completely unpolarised” state with $r = 0$:

$$\lim_{n\to\infty} \mathcal{D}^n \rho = I/2.$$
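The contraction of the Bloch vector can be checked numerically. The following is a minimal sketch assuming NumPy; the function names, the value `lam = 0.2` and the starting Bloch vector are arbitrary illustrative choices.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def depolarise(rho, lam):
    """rho -> (1 - lam) rho + (lam / 3)(X rho X + Y rho Y + Z rho Z)."""
    return (1 - lam) * rho + (lam / 3) * (X @ rho @ X + Y @ rho @ Y + Z @ rho @ Z)

def bloch_vector(rho):
    """Components r_a = Tr(rho sigma_a) for a = x, y, z."""
    return np.array([np.trace(rho @ P).real for P in (X, Y, Z)])

def state_from_bloch(r):
    """rho_r = (I + r . sigma) / 2."""
    return 0.5 * (I2 + r[0] * X + r[1] * Y + r[2] * Z)

lam = 0.2
r = np.array([0.3, -0.4, 0.5])
rho = state_from_bloch(r)

# One application contracts the Bloch vector by 1 - 4 lam / 3.
r_out = bloch_vector(depolarise(rho, lam))

# Repeated application drives the state towards I/2.
n = 50
rho_n = rho
for _ in range(n):
    rho_n = depolarise(rho_n, lam)
```

Comparing `r_out` with `(1 - 4 * lam / 3) * r` confirms the uniform contraction for this input, and `rho_n` is close to `I2 / 2`.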

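The phase damping channel. This channel can be implemented by coupling with a 3-level system, with
initial state $|0\rangle_a$. The action of the unitary interaction $U$ (on the relevant vectors $|\psi\rangle_s \otimes |0\rangle_a$) is
$$|0\rangle_s \otimes |0\rangle_a \longmapsto \sqrt{1-\lambda}\, |0\rangle_s \otimes |0\rangle_a + \sqrt{\lambda}\, |0\rangle_s \otimes |1\rangle_a$$
$$|1\rangle_s \otimes |0\rangle_a \longmapsto \sqrt{1-\lambda}\, |1\rangle_s \otimes |0\rangle_a + \sqrt{\lambda}\, |1\rangle_s \otimes |2\rangle_a$$

The interpretation is that the environment ‘scatters’ off the two-level atom with probability $\lambda$ and changes its
state to $|1\rangle_a$ or $|2\rangle_a$ depending on the state of the atom.
The Kraus operators of this channel are
$$K_0 = \langle 0|U|0\rangle = \sqrt{1-\lambda}\, I, \qquad K_1 = \langle 1|U|0\rangle = \sqrt{\lambda} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad K_2 = \langle 2|U|0\rangle = \sqrt{\lambda} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$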

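The action of the channel on a state $\rho$ is

$$\mathcal{P} : \rho \longmapsto \sum_{i=0}^{2} K_i \rho K_i^* = (1-\lambda) \begin{pmatrix} \rho_{00} & \rho_{01} \\ \rho_{10} & \rho_{11} \end{pmatrix} + \lambda \begin{pmatrix} \rho_{00} & 0 \\ 0 & \rho_{11} \end{pmatrix} = \begin{pmatrix} \rho_{00} & (1-\lambda)\rho_{01} \\ (1-\lambda)\rho_{10} & \rho_{11} \end{pmatrix},$$

which means that with probability $\lambda$ the coherence of the state is destroyed. If the channel is applied $n$ times
then the off-diagonal elements (the “coherences”) decrease as $(1-\lambda)^n \rho_{01}$, so for $\lambda > 0$ the state becomes
diagonal in the limit $n \to \infty$:
$$\lim_{n\to\infty} \mathcal{P}^n \rho = \begin{pmatrix} \rho_{00} & 0 \\ 0 & \rho_{11} \end{pmatrix}.$$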

The action of the channel on the Bloch sphere is left as an exercise.
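The decay of the coherences under repeated phase damping can be illustrated numerically. This is a small sketch assuming NumPy; the function names and the values `lam = 0.1`, `n = 20` and the input state are illustrative assumptions.

```python
import numpy as np

def phase_damping_kraus(lam):
    """Kraus operators K0, K1, K2 of the phase damping channel."""
    K0 = np.sqrt(1 - lam) * np.eye(2)
    K1 = np.sqrt(lam) * np.array([[1.0, 0.0], [0.0, 0.0]])
    K2 = np.sqrt(lam) * np.array([[0.0, 0.0], [0.0, 1.0]])
    return [K0, K1, K2]

def apply_channel(kraus, rho):
    """Apply rho -> sum_i K_i rho K_i^*."""
    return sum(K @ rho @ K.conj().T for K in kraus)

lam, n = 0.1, 20
rho = np.array([[0.6, 0.3 - 0.1j], [0.3 + 0.1j, 0.4]])

# Apply the channel n times.
rho_n = rho.copy()
for _ in range(n):
    rho_n = apply_channel(phase_damping_kraus(lam), rho_n)
```

After `n` steps the diagonal of `rho_n` is unchanged, while the off-diagonal entry equals `(1 - lam) ** n * rho[0, 1]` up to rounding.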

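The amplitude damping channel. This channel describes the decay of an excited state of a two-level atom
due to spontaneous emission of a photon. The environment (electromagnetic field) is modelled as a two-level
system in the vacuum state $|0\rangle_a$. The action of the unitary interaction $U$ (on the relevant vectors
$|\psi\rangle_s \otimes |0\rangle_a$) is

$$|0\rangle_s \otimes |0\rangle_a \longmapsto |0\rangle_s \otimes |0\rangle_a$$
$$|1\rangle_s \otimes |0\rangle_a \longmapsto \sqrt{1-\lambda}\, |1\rangle_s \otimes |0\rangle_a + \sqrt{\lambda}\, |0\rangle_s \otimes |1\rangle_a$$

The Kraus operators are

$$K_0 = \langle 0|U|0\rangle = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\lambda} \end{pmatrix}, \qquad K_1 = \langle 1|U|0\rangle = \begin{pmatrix} 0 & \sqrt{\lambda} \\ 0 & 0 \end{pmatrix}.$$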

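The action of the channel on the states is

$$\mathcal{A} : \rho \longmapsto \sum_{i=0}^{1} K_i \rho K_i^* = \begin{pmatrix} \rho_{00} & \sqrt{1-\lambda}\,\rho_{01} \\ \sqrt{1-\lambda}\,\rho_{10} & (1-\lambda)\rho_{11} \end{pmatrix} + \begin{pmatrix} \lambda\rho_{11} & 0 \\ 0 & 0 \end{pmatrix}.$$

The second term describes the jump of the atom from state $|1\rangle_s$ to $|0\rangle_s$, and the first describes the evolution when no
jump occurs.
If we apply the channel $n$ times, the off-diagonal 01 element becomes $(\sqrt{1-\lambda})^n \rho_{01}$ and the diagonal 11
element becomes $(1-\lambda)^n \rho_{11}$. Therefore, for $\lambda > 0$, in the limit $n \to \infty$ the state converges to $|0\rangle\langle 0|$, i.e. the atom
decays to the “ground state”:

$$\lim_{n\to\infty} \mathcal{A}^n \rho = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$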
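The decay towards the ground state can be checked with a few lines of code. This is a sketch assuming NumPy; the function names and the values `lam = 0.3`, `n = 10` and the input state are illustrative assumptions.

```python
import numpy as np

def amplitude_damping_kraus(lam):
    """Kraus operators K0 (no jump) and K1 (jump from |1> to |0>)."""
    K0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1 - lam)]])
    K1 = np.array([[0.0, np.sqrt(lam)], [0.0, 0.0]])
    return [K0, K1]

def apply_channel(kraus, rho):
    """Apply rho -> sum_i K_i rho K_i^*."""
    return sum(K @ rho @ K.conj().T for K in kraus)

lam, n = 0.3, 10
rho = np.array([[0.2, 0.3], [0.3, 0.8]])

# Apply the channel n times.
rho_n = rho.copy()
for _ in range(n):
    rho_n = apply_channel(amplitude_damping_kraus(lam), rho_n)
```

The excited-state population `rho_n[1, 1]` equals `(1 - lam) ** n * rho[1, 1]`, the coherence `rho_n[0, 1]` equals `(1 - lam) ** (n / 2) * rho[0, 1]`, and the trace stays equal to one.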

Introduction to Quantum Information Science

Lecture 19: Complete Positivity and the Kraus Theorem

Abstract. In this lecture we investigate the general form of state transformation allowed by quantum mechan-
ics, and study the key concept of complete positivity. The key result is the Kraus Theorem which characterises
all quantum state transformations.

1 Intermezzo on superoperators
We collect here a few mathematical definitions which will be needed below. The quantum channel
$$\mathcal{E} : \rho \longmapsto \sum_i K_i \rho K_i^* \qquad (1)$$

is a particular case of a linear map from operators to operators,

$$\mathcal{E} : M(\mathbb{C}^d) \to M(\mathbb{C}^d),$$

for which reason it is sometimes called a superoperator. As $\mathcal{E}$ is a linear operator on a vector space, we can
define its matrix with respect to a basis in $M(\mathbb{C}^d)$ in the usual way. Furthermore we can define the tensor
product of two superoperators $\mathcal{F} : M(\mathbb{C}^n) \to M(\mathbb{C}^n)$ and $\mathcal{E} : M(\mathbb{C}^d) \to M(\mathbb{C}^d)$ as the linear extension of

$$\mathcal{F} \otimes \mathcal{E} : A \otimes B \longmapsto \mathcal{F}(A) \otimes \mathcal{E}(B)$$

to a linear map on $M(\mathbb{C}^n) \otimes M(\mathbb{C}^d)$. We denote by $\mathcal{I}_n$ the identity superoperator on $M(\mathbb{C}^n)$, i.e.

$$\mathcal{I}_n : A \longmapsto A, \qquad A \in M(\mathbb{C}^n).$$

The superoperator $\mathcal{E}$ will be called positive if $\mathcal{E}(A) \geq 0$ for any operator $A \geq 0$. Surprisingly, we will see
that the tensor product of two positive superoperators is not necessarily positive.

2 Completely positive maps


Are there more quantum channels than those of the form given by equation (1) ? In this section we look at
channels from a more ‘axiomatic’ viewpoint, similarly to the way we characterised general measurements as
positive convex maps from quantum states to classical distributions. The aim is to find a general mathematical
and operational characterisation of such transformations.
Let us first note that quantum channels are analogues of classical channels (or randomisations), the latter being
positive, convex maps sending probability distributions to probability distributions. It is easy to see that such
maps are of the form $T : q \mapsto q'$ where $q'$ is the distribution
$$q'(i) := \sum_{j=1}^{k} q(j)\, P(i|j), \qquad i \in \{1, \dots, k\},$$
with $P(i|j)$ the probability of ‘jumping’ from state $j$ to state $i$ of the state space $\{1, \dots, k\}$.
Therefore it is tempting to define a quantum channel for a $d$-dimensional system by the minimum requirement
that it maps density matrices to density matrices,

$$\mathcal{E} : S_d \to S_d,$$

and is convex on the space of states $S_d \subset M(\mathbb{C}^d)$. Such a map can be extended by linearity to a positive map
$\mathcal{E} : M(\mathbb{C}^d) \to M(\mathbb{C}^d)$. However, it turns out that this definition is too broad, since it allows for unphysical
transformations. To see this, suppose that our system is a component of a bipartite system with Hilbert
space $\mathbb{C}^n \otimes \mathbb{C}^d$, and the channel $\mathcal{E}$ acts on the right-side system, while the left side does not undergo any
transformation. Note that the positions of the system and the ‘environment’ have been swapped compared to
the previous section. This change is inessential but keeps with the standard conventions used in the literature.
If the two systems are initially independent then the joint transformation is

$$\tau \otimes \rho \longmapsto \tau \otimes \mathcal{E}(\rho) = (\mathcal{I}_n \otimes \mathcal{E})(\tau \otimes \rho),$$

where $\mathcal{I}_n : M(\mathbb{C}^n) \to M(\mathbb{C}^n)$ is the identity transformation. On physical grounds, we would expect that
$\mathcal{I}_n \otimes \mathcal{E}$ is itself a channel, i.e. it maps all joint states into joint states. This is clearly true for the product state
$\tau \otimes \rho$, but we claim that it may no longer be the case if the input is an entangled state! Here is an example.
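The transposition is not a physical transformation. Consider a two-qubit bipartite system $\mathbb{C}^2 \otimes \mathbb{C}^2$, and
let $T : M(\mathbb{C}^2) \to M(\mathbb{C}^2)$ be the transposition with respect to the standard basis, such that
$$T : \begin{pmatrix} \rho_{11} & \rho_{12} \\ \rho_{21} & \rho_{22} \end{pmatrix} \longmapsto \begin{pmatrix} \rho_{11} & \rho_{21} \\ \rho_{12} & \rho_{22} \end{pmatrix}.$$
Since $T$ leaves the eigenvalues invariant, it maps states into states. Consider now the maximally entangled
state $|\Phi^+\rangle := (|00\rangle + |11\rangle)/\sqrt{2}$, whose density matrix is
$$|\Phi^+\rangle\langle\Phi^+| = \frac{1}{2}\big(|0\rangle\langle 0| \otimes |0\rangle\langle 0| + |0\rangle\langle 1| \otimes |0\rangle\langle 1| + |1\rangle\langle 0| \otimes |1\rangle\langle 0| + |1\rangle\langle 1| \otimes |1\rangle\langle 1|\big).$$
By acting with $\mathcal{I}_2 \otimes T$ we obtain
$$(\mathcal{I}_2 \otimes T) : |\Phi^+\rangle\langle\Phi^+| \longmapsto \frac{1}{2}\big(|0\rangle\langle 0| \otimes |0\rangle\langle 0| + |1\rangle\langle 0| \otimes |0\rangle\langle 1| + |0\rangle\langle 1| \otimes |1\rangle\langle 0| + |1\rangle\langle 1| \otimes |1\rangle\langle 1|\big).$$
The matrix of the right side with respect to the basis $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$ is
$$\frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$
which has a negative eigenvalue (namely $-1/2$, with eigenvector $(|01\rangle - |10\rangle)/\sqrt{2}$), and hence it is not a density matrix.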
This example shows that it is not enough to require that the channel $\mathcal{E}$ be a positive map, but one also needs
to require that $\mathcal{I}_n \otimes \mathcal{E}$ is positive for all $n$, in order to exclude unphysical examples such as the transposition.
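The transposition example can be reproduced numerically. The sketch below assumes NumPy and the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$; the variable names are illustrative.

```python
import numpy as np

# Bell state (|00> + |11>)/sqrt(2) in the basis |00>, |01>, |10>, |11>.
phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho = np.outer(phi, phi.conj())

# View rho as a 4-index array with entries <i k| rho |j l>, then transpose
# only the second tensor factor by swapping the indices k and l.
rho_pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)

# Eigenvalues in ascending order; the smallest one is negative.
eigenvalues = np.linalg.eigvalsh(rho_pt)
```

The smallest entry of `eigenvalues` is $-1/2$, so the partially transposed matrix is not positive even though the transposition itself is a positive map.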
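Definition 1 (completely positive maps/channels). Let $\mathcal{E} : M(\mathbb{C}^d) \to M(\mathbb{C}^d)$ be a linear map.
1. $\mathcal{E}$ is called completely positive if $\mathcal{I}_n \otimes \mathcal{E}$ is positive for all $n \geq 1$;
2. $\mathcal{E}$ is called trace preserving if $\mathrm{Tr}(\mathcal{E}(A)) = \mathrm{Tr}(A)$ for all $A \in M(\mathbb{C}^d)$.

We can now spell out the three requirements for a channel $\mathcal{E}$ on a $d$-dimensional system:
i) $\mathcal{E} : M(\mathbb{C}^d) \to M(\mathbb{C}^d)$ is a linear transformation
ii) $\mathcal{E}$ is completely positive
iii) $\mathcal{E}$ is trace preserving; together with positivity this implies that it maps states to states, $\mathcal{E} : S_d \to S_d$.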

Lemma 2. The quantum channel
$$\mathcal{E} : \rho \longmapsto \sum_i K_i \rho K_i^* = \sum_i \mathcal{E}_i(\rho)$$

defined in (1) satisfies the requirements i)-iii).

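Proof. Since properties i) and iii) have already been established, we only need to verify the complete positivity,
and it suffices to prove it for a single term $\mathcal{E}_i$ of the sum. Let $A \in M(\mathbb{C}^n) \otimes M(\mathbb{C}^d)$ be a positive
operator, which can be written as
$$A = \sum_{j,k=1}^{n} |f_j\rangle\langle f_k| \otimes A_{jk}$$

with $A_{jk} \in M(\mathbb{C}^d)$. Then

$$(\mathcal{I}_n \otimes \mathcal{E}_i)(A) = \sum_{j,k=1}^{n} |f_j\rangle\langle f_k| \otimes \mathcal{E}_i(A_{jk})$$

and the positivity follows from the fact that, for any vector $|\psi\rangle \in \mathbb{C}^n \otimes \mathbb{C}^d$,

$$\begin{aligned}
\langle\psi|(\mathcal{I}_n \otimes \mathcal{E}_i)(A)|\psi\rangle
&= \Big\langle \psi \Big| \sum_{j,k=1}^{n} |f_j\rangle\langle f_k| \otimes \mathcal{E}_i(A_{jk}) \Big| \psi \Big\rangle \\
&= \Big\langle \psi \Big| \sum_{j,k=1}^{n} |f_j\rangle\langle f_k| \otimes K_i A_{jk} K_i^* \Big| \psi \Big\rangle \\
&= \Big\langle \psi \Big| \sum_{j,k=1}^{n} (I \otimes K_i)\big(|f_j\rangle\langle f_k| \otimes A_{jk}\big)(I \otimes K_i^*) \Big| \psi \Big\rangle \\
&= \Big\langle \psi \Big| (I \otimes K_i) \Big[ \sum_{j,k=1}^{n} |f_j\rangle\langle f_k| \otimes A_{jk} \Big] (I \otimes K_i^*) \Big| \psi \Big\rangle \\
&= \big\langle (I \otimes K_i^*)\psi \big| A \big| (I \otimes K_i^*)\psi \big\rangle \geq 0,
\end{aligned}$$

where the trick was to push the operator $K_i^*$ towards the vectors and then use the positivity of $A$.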

3 The Kraus Theorem


We have seen that the channels (1) are CPTP maps. We will now address the following questions: are all the
CPTP maps physical, and are they always of the form (1)? The answer to both questions is yes, and this is
usually called the Kraus Theorem, which is a special case of a more general theorem of Stinespring.
Theorem 3 (Kraus Theorem). A linear map $\mathcal{E} : M(\mathbb{C}^d) \to M(\mathbb{C}^d)$ is completely positive and trace preserving
if and only if it is a channel, i.e. it is of the form
$$\mathcal{E}(\rho) = \sum_i K_i \rho K_i^*$$
for some operators $K_i \in M(\mathbb{C}^d)$ satisfying $\sum_i K_i^* K_i = I$.

Proof. We have already shown that channels are CPTP. We will prove that the converse is also true. Let
$\{|e_1\rangle, \dots, |e_d\rangle\}$ be an ONB of the system, and consider an additional ‘environment’ of the same dimension
$d$, for which we fix an ONB $\{|f_1\rangle, \dots, |f_d\rangle\}$.

Define the vector
$$|\alpha\rangle := \sum_{i=1}^{d} |f_i\rangle \otimes |e_i\rangle,$$

which is a maximally entangled state, up to the normalisation factor. Next, we define the operator $\Omega$ on $\mathbb{C}^d \otimes \mathbb{C}^d$
by
$$\Omega := (\mathcal{I}_d \otimes \mathcal{E})(|\alpha\rangle\langle\alpha|).$$
Physically, this would be the result of applying the transformation $\mathcal{E}$ to the system when system and environment
are maximally entangled. The key fact which we prove below is that $\mathcal{E}$ is in one-to-one correspondence
with $\Omega$, that is, the channel is completely characterised by a single input state on a larger system!
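Note that it is enough to prove the Kraus decomposition for pure states, since the general statement follows
from the fact that any mixed state is a convex combination of pure states. For each vector state
$$|\psi\rangle = \sum_{i=1}^{d} c_i |e_i\rangle$$

of the system, we define a corresponding state of the environment, with complex conjugate coefficients,

$$|\tilde\psi\rangle := \sum_{i=1}^{d} \bar{c}_i |f_i\rangle.$$

In the following argument we will use the notion of partial inner product over the environment. If $|u\rangle, |v\rangle$
are vectors on the left side of the tensor product (the environment) then

$$\langle u|A \otimes B|v\rangle := \langle u|A|v\rangle\, B,$$

which can be extended by linearity to all operators in $M(\mathbb{C}^d) \otimes M(\mathbb{C}^d)$. This notion is closely related to that
of partial trace:
$$\mathrm{Tr}_e(C) = \sum_j \langle f_j|C|f_j\rangle.$$

Using the partial inner product and the operator $\Omega = (\mathcal{I}_d \otimes \mathcal{E})(|\alpha\rangle\langle\alpha|)$ defined above, we have

$$\begin{aligned}
\langle\tilde\psi|\Omega|\tilde\psi\rangle
&= \Big\langle \tilde\psi \Big| \sum_{i,j} |f_i\rangle\langle f_j| \otimes \mathcal{E}(|e_i\rangle\langle e_j|) \Big| \tilde\psi \Big\rangle \\
&= \sum_{i,j} c_i \bar{c}_j\, \mathcal{E}(|e_i\rangle\langle e_j|) \\
&= \mathcal{E}(|\psi\rangle\langle\psi|).
\end{aligned}$$

Consider a decomposition of the positive operator $\Omega$ of the form $\Omega = \sum_i |s_i\rangle\langle s_i|$, where $|s_i\rangle \in \mathbb{C}^d \otimes \mathbb{C}^d$ are
vectors which need not be normalised. Define the system operator

$$K_i : |\psi\rangle \longmapsto \langle\tilde\psi|s_i\rangle,$$

which is linear with respect to $|\psi\rangle$ (verify!). Then

$$\sum_i K_i |\psi\rangle\langle\psi| K_i^* = \sum_i \langle\tilde\psi|s_i\rangle\langle s_i|\tilde\psi\rangle = \langle\tilde\psi|\Omega|\tilde\psi\rangle = \mathcal{E}(|\psi\rangle\langle\psi|),$$

which proves the statement of the theorem for pure states.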

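A consequence of the proof is that any channel on $M(\mathbb{C}^d)$ can be represented in the Kraus form with at most
$d^2$ Kraus operators, since the operator $\Omega = (\mathcal{I}_d \otimes \mathcal{E})(|\alpha\rangle\langle\alpha|)$ acts on a $d^2$-dimensional space and can therefore
always be decomposed into at most $d^2$ rank-one terms $|s_i\rangle\langle s_i|$ using the spectral theorem.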
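The construction used in the proof can be sketched numerically: build the operator $(\mathcal{I}_d \otimes \mathcal{E})(|\alpha\rangle\langle\alpha|)$ for a given map, diagonalise it, and read off Kraus operators from the eigenvectors. The sketch assumes NumPy and standard bases for both system and environment; the function names `choi_matrix` and `kraus_from_choi` are illustrative, and the amplitude damping channel with `lam = 0.25` is used only as a test case.

```python
import numpy as np

def choi_matrix(channel, d):
    """Omega = sum_ij |f_i><f_j| (x) E(|e_i><e_j|), environment factor first."""
    omega = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            unit = np.zeros((d, d), dtype=complex)
            unit[i, j] = 1.0  # matrix unit |e_i><e_j|
            omega += np.kron(unit, channel(unit))
    return omega

def kraus_from_choi(omega, d, tol=1e-12):
    """Spectral decomposition Omega = sum_k |s_k><s_k|, then K_k|e_i> = <f_i|s_k>."""
    mu, vecs = np.linalg.eigh(omega)
    kraus = []
    for m, v in zip(mu, vecs.T):
        if m > tol:  # skip (numerically) zero eigenvalues
            s = np.sqrt(m) * v
            # s has components s[(i, a)]; K_k has matrix entries K[a, i].
            kraus.append(s.reshape(d, d).T)
    return kraus

def apply_channel(kraus, rho):
    """Apply rho -> sum_i K_i rho K_i^*."""
    return sum(K @ rho @ K.conj().T for K in kraus)

# Test case (an assumption): amplitude damping with lam = 0.25.
lam = 0.25
original = [np.array([[1, 0], [0, np.sqrt(1 - lam)]], dtype=complex),
            np.array([[0, np.sqrt(lam)], [0, 0]], dtype=complex)]
channel = lambda rho: apply_channel(original, rho)

recovered = kraus_from_choi(choi_matrix(channel, 2), 2)
rho = np.array([[0.5, 0.2 + 0.1j], [0.2 - 0.1j, 0.5]])
```

The list `recovered` contains at most $d^2 = 4$ operators (two in this case), and applying them to `rho` reproduces `channel(rho)` up to rounding. The recovered operators need not coincide with `original`, which reflects the non-uniqueness of the Kraus decomposition.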

Freedom in choosing the Kraus operators. As already noticed in the indirect measurement set-up, the
Kraus operators of a channel are not unique, since one has the freedom to measure in different ONBs of the
environment. The following theorem, which we state without proof, gives a precise characterisation of the
relation between the different sets of Kraus operators.
Theorem 4 (unitary freedom in the Kraus decomposition). Let $(K_1, \dots, K_m)$ and $(V_1, \dots, V_n)$ be the
Kraus operators of the channels $\mathcal{E}$ and $\mathcal{F}$ respectively, where $m \geq n$. Then $\mathcal{E} = \mathcal{F}$ if and only if there exists
a unitary $m \times m$ matrix $(u_{ij})$ such that $K_i = \sum_j u_{ij} V_j$, where $V_{n+1} = \dots = V_m = 0$.
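A small numerical illustration of the unitary freedom, assuming NumPy: mixing the Kraus operators $I/\sqrt{2}$, $Z/\sqrt{2}$ with the Hadamard matrix gives the two projections onto $|0\rangle$ and $|1\rangle$, and both sets define the same channel. The specific operators and the test state are illustrative choices.

```python
import numpy as np

def apply_channel(kraus, rho):
    """Apply rho -> sum_i K_i rho K_i^*."""
    return sum(K @ rho @ K.conj().T for K in kraus)

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])

# First set: random unitary channel applying I or Z with probability 1/2.
V = [I2 / np.sqrt(2), Z / np.sqrt(2)]

# Unitary mixing matrix (u_ij), here the Hadamard matrix.
u = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

# Second set: K_i = sum_j u_ij V_j.
K = [sum(u[i, j] * V[j] for j in range(2)) for i in range(2)]

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
out_V = apply_channel(V, rho)
out_K = apply_channel(K, rho)
```

Here `K[0]` and `K[1]` are the projections $|0\rangle\langle 0|$ and $|1\rangle\langle 1|$, so the random phase flip with probability $1/2$ is the same channel as the projective measurement in the standard basis.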

4 The no-cloning theorem


During the previous lectures we have alluded to the fact that the state of a quantum system cannot be copied,
or cloned. This statement can be formulated with various degrees of sophistication, but the simplest one is
saying that there exists no physical transformation (quantum channel) $\mathcal{E}$ from $M(\mathbb{C}^d)$ to $M(\mathbb{C}^d) \otimes M(\mathbb{C}^d)$
such that
$$\mathcal{E}(|\psi\rangle\langle\psi|) = |\psi\rangle\langle\psi| \otimes |\psi\rangle\langle\psi|$$
for all states $|\psi\rangle \in \mathbb{C}^d$. A possible proof goes via the Kraus theorem (adapted to channels between different
systems). Since the output of the channel must be pure for any pure input, it is easy to see that the channel
must have a single Kraus operator $V$ which is an isometry, $V^* V = I$. Then we must have

$$V|\psi\rangle = |\psi\rangle \otimes |\psi\rangle,$$

which is impossible since the map does not preserve inner products: whenever $0 < |\langle\psi_1|\psi_2\rangle| < 1$,
$$\langle\psi_1|\psi_2\rangle \neq \langle\psi_1 \otimes \psi_1|\psi_2 \otimes \psi_2\rangle = \langle\psi_1|\psi_2\rangle^2.$$

Although it is a very simple result, the no-cloning theorem shows the important difference between classical
and quantum information and is crucial in quantum cryptography.
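The inner product argument can be illustrated with two concrete non-orthogonal states. This is a sketch assuming NumPy; the two states $|0\rangle$ and $(|0\rangle + |1\rangle)/\sqrt{2}$ are an arbitrary choice.

```python
import numpy as np

psi1 = np.array([1.0, 0.0])                 # |0>
psi2 = np.array([1.0, 1.0]) / np.sqrt(2)    # (|0> + |1>)/sqrt(2)

# Inner product before "cloning".
overlap = np.vdot(psi1, psi2)

# Inner product of the hypothetical clones psi (x) psi.
overlap_cloned = np.vdot(np.kron(psi1, psi1), np.kron(psi2, psi2))
```

The overlap drops from $1/\sqrt{2}$ to $1/2$, whereas an isometry would have to leave it unchanged.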

5 Summary of quantum channels and complete positivity
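1. The evolution of a system undergoing an indirect measurement can be described by a quantum channel of
the form
$$\mathcal{E} : \rho \longmapsto \sum_i K_i \rho K_i^*$$
with Kraus operators $K_i = \langle f_i|U|\varphi\rangle$ satisfying $\sum_i K_i^* K_i = I$, where $|\varphi\rangle$ denotes the initial state of the
environment and $\{|f_i\rangle\}$ the measurement basis.

2. If $\mathcal{E} : M(\mathbb{C}^d) \to M(\mathbb{C}^d)$ is a positive superoperator (i.e. $\mathcal{E}(A) \geq 0$ for $A \geq 0$), then $\mathcal{I}_n \otimes \mathcal{E}$ need not be
a positive superoperator. Example: the transposition.

3. A map $\mathcal{E} : M(\mathbb{C}^d) \to M(\mathbb{C}^d)$ is completely positive (CP) if $\mathcal{I}_n \otimes \mathcal{E}$ is positive for all $n$. A channel in the
Kraus form is CP and trace preserving (TP).

4. Kraus Theorem. Any CPTP map $\mathcal{E} : M(\mathbb{C}^d) \to M(\mathbb{C}^d)$ is a quantum channel, i.e. it is of the form
$$\mathcal{E} : \rho \longmapsto \sum_i K_i \rho K_i^*$$
with Kraus operators satisfying $\sum_i K_i^* K_i = I$. The number of Kraus operators can be chosen to be at most
$d^2$.

5. Three important examples of qubit channels: depolarising channel, phase damping channel, and amplitude
damping channel.

No Cloning Theorem: There exists no quantum channel such that

$$\mathcal{E}(|\psi\rangle\langle\psi|) = |\psi\rangle\langle\psi| \otimes |\psi\rangle\langle\psi|$$

for all $|\psi\rangle$.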

Introduction to Quantum Information Science

Lectures 20-21: Quantum Error Correction

Abstract. Real quantum computers are affected by quantum noise and other imperfections. The ability to
perform quantum error correction is a key requirement for successfully implementing quantum algorithms
on a physical quantum computer. In these lectures we introduce the quantum error correction problem and
analyse basic methods for correcting bit flip, phase flip and other noisy qubit channels. The highlight of the
lectures is the 9 qubits code proposed by Shor which can correct one qubit errors for arbitrary noise models.

1 Classical error correction


Any computing device is affected by various forms of physical noise, e.g. current fluctuations, heating, or
even cosmic rays corrupting its memory or logical operations. To counter this, the logical operations need to
be accompanied by error correction procedures capable of keeping the effects of noise under control.
For one bit, the error process can be modelled by a classical channel whereby in each given time unit the state
0 (or 1) remains unchanged with probability $1-\epsilon$ or flips to 1 (respectively 0) with probability $\epsilon$.

[Figure: transition diagram of the bit flip channel. 0 → 0 and 1 → 1 with probability $1-\epsilon$; 0 → 1 and 1 → 0 with probability $\epsilon$.]

The simplest error correction scheme is the repetition code. In this code, each bit of a string of logical bits
is physically encoded as 3 (or more) identical bits. For instance, the binary sequence 1011 is encoded as the
string 111 000 111 111 of physical bits. We assume that each of the physical bits is affected independently by
the bit flip noise. Focusing on a particular block of 3 identical bits, the probability of exactly one bit flipping
is $3\epsilon(1-\epsilon)^2$. The block 000, for instance, can change as

000 → 100
000 → 010
000 → 001

The error detection procedure consists of comparing the bits in each block. If they are equal, no error correction
is performed. If one is different from the other two, it is deemed to be an error, and it is flipped so that all
bits are equal. Although it corrects one-bit flip errors, the scheme fails if two or three of the bits flip. However,
this happens with probability $3\epsilon^2(1-\epsilon) + \epsilon^3$, which is much smaller than $\epsilon$ for small error level $\epsilon$. Therefore,
the probability of an error on the logical bit level has been significantly reduced by using redundant encoding
and error correction.
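The failure probability of the 3-bit repetition code can be checked by enumerating all flip patterns. The sketch below uses only the Python standard library; the function names and the value `eps = 0.01` are illustrative assumptions.

```python
from itertools import product

def majority(block):
    """Decode a 3-bit block by majority vote."""
    return 1 if sum(block) >= 2 else 0

def logical_error_probability(eps):
    """Exact probability that majority voting decodes the block 000 wrongly."""
    total = 0.0
    for flips in product([0, 1], repeat=3):
        k = sum(flips)  # number of flipped bits
        prob = eps ** k * (1 - eps) ** (3 - k)
        received = flips  # the transmitted block is 000, so received bits equal the flips
        if majority(received) != 0:
            total += prob
    return total

eps = 0.01
p_fail = logical_error_probability(eps)
```

For `eps = 0.01` the enumeration agrees with $3\epsilon^2(1-\epsilon) + \epsilon^3 \approx 2.98 \times 10^{-4}$, which is well below the physical error rate.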

2 Bit flip quantum error correction
Consider now the quantum analogue of the classical bit flip channel. The one-qubit quantum channel

$$\mathcal{B} : \rho \longmapsto (1-\epsilon)\rho + \epsilon X \rho X$$

can be interpreted as flipping the qubit (applying $X$) with probability $\epsilon$ and leaving it intact with probability
$1-\epsilon$. This transformation is irreversible: there exists no quantum channel $\mathcal{R}$ which reverses the action of $\mathcal{B}$,
i.e. $\mathcal{R} : \mathcal{B}(\rho) \mapsto \rho$ for all $\rho$ (exercise).
We will show that by encoding the quantum information (logical qubit) redundantly as a state of multiple
physical qubits, we can protect it from the action of the noise channel $\mathcal{B}$ acting independently on each physical
qubit.
Encoding. A naive extension of the classical repetition code would be to encode a ‘logical’ qubit state
$|\psi\rangle = a|0\rangle + b|1\rangle$ as a 3 physical qubits state

$$|\psi\rangle \longmapsto |\psi\rangle \otimes |\psi\rangle \otimes |\psi\rangle.$$

However, such a map is unphysical due to the no-cloning theorem.

Instead, we will encode the state as follows:

$$a|0\rangle + b|1\rangle \longmapsto a|000\rangle + b|111\rangle,$$

which means that the logical qubit state belongs to the 2-dimensional subspace of $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2$ spanned by
the vectors $|000\rangle$ and $|111\rangle$.
The encoding process can be implemented by the following 3-qubit circuit, whose input state is $|\psi\rangle \otimes |0\rangle \otimes |0\rangle$.

[Figure: encoding circuit with inputs $|\psi\rangle$, $|0\rangle$, $|0\rangle$ and two CNOT gates.]

Indeed, by applying the two CNOT gates we get

$$(a|0\rangle + b|1\rangle) \otimes |0\rangle \otimes |0\rangle \longmapsto a|0\rangle \otimes |0\rangle \otimes |0\rangle + b|1\rangle \otimes |1\rangle \otimes |0\rangle \longmapsto a|0\rangle \otimes |0\rangle \otimes |0\rangle + b|1\rangle \otimes |1\rangle \otimes |1\rangle.$$

We denote by $P_0$ the projection onto this subspace, $P_0 = |000\rangle\langle 000| + |111\rangle\langle 111|$.
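The encoding step can be simulated with explicit $8 \times 8$ matrices. This is a sketch assuming NumPy; it also assumes that both CNOT gates are controlled by the first qubit (the figure is not reproduced here, and controlling the second gate by qubit 2 would give the same output). The names `proj0`, `proj1`, `kron_all` and the amplitudes `a = 0.6`, `b = 0.8` are illustrative.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
proj0 = np.array([[1.0, 0.0], [0.0, 0.0]])  # |0><0| on one qubit
proj1 = np.array([[0.0, 0.0], [0.0, 1.0]])  # |1><1| on one qubit

def kron_all(*ops):
    """Tensor product of several operators, first factor leftmost."""
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

# CNOT gates controlled by qubit 1, targeting qubit 2 and qubit 3 (assumed wiring).
CNOT_12 = kron_all(proj0, I2, I2) + kron_all(proj1, X, I2)
CNOT_13 = kron_all(proj0, I2, I2) + kron_all(proj1, I2, X)

a, b = 0.6, 0.8
psi = np.array([a, b])
zero = np.array([1.0, 0.0])

# Input |psi> (x) |0> (x) |0>, then the two CNOT gates.
encoded = CNOT_13 @ CNOT_12 @ np.kron(np.kron(psi, zero), zero)
```

In the basis ordering $|000\rangle, |001\rangle, \dots, |111\rangle$ the vector `encoded` has amplitude `a` at index 0 and `b` at index 7, i.e. it equals $a|000\rangle + b|111\rangle$.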

Error process. Since each physical qubit is affected by the noise channel $\mathcal{B}$, the action on a block of 3 qubits
is $\mathcal{B}^{\otimes 3} : M(\mathbb{C}^2)^{\otimes 3} \to M(\mathbb{C}^2)^{\otimes 3}$. Similarly to the classical case, we can describe its action on the encoded
state probabilistically:

no error, with probability $(1-\epsilon)^3$:
    $a|000\rangle + b|111\rangle \mapsto a|000\rangle + b|111\rangle$

1 error, each case with probability $\epsilon(1-\epsilon)^2$:
    $a|000\rangle + b|111\rangle \mapsto a|100\rangle + b|011\rangle$
    $a|000\rangle + b|111\rangle \mapsto a|010\rangle + b|101\rangle$
    $a|000\rangle + b|111\rangle \mapsto a|001\rangle + b|110\rangle$

2 errors, each case with probability $\epsilon^2(1-\epsilon)$:
    $a|000\rangle + b|111\rangle \mapsto a|110\rangle + b|001\rangle$
    $a|000\rangle + b|111\rangle \mapsto a|101\rangle + b|010\rangle$
    $a|000\rangle + b|111\rangle \mapsto a|011\rangle + b|100\rangle$

3 errors, with probability $\epsilon^3$:
    $a|000\rangle + b|111\rangle \mapsto a|111\rangle + b|000\rangle$

Since we are only attempting to correct one error, we will focus on the first four state transformations above.
We find that the state either remains in the subspace with projection $P_0$ or it is mapped into one of the
subspaces with projections $P_1, P_2, P_3$, depending on which qubit flips:

$$P_1 := |100\rangle\langle 100| + |011\rangle\langle 011|$$
$$P_2 := |010\rangle\langle 010| + |101\rangle\langle 101|$$
$$P_3 := |001\rangle\langle 001| + |110\rangle\langle 110|$$

This is illustrated in the following figure.

[Figure: the code subspace $P_0$ (spanned by $|000\rangle, |111\rangle$) is mapped to $P_1$ (spanned by $|100\rangle, |011\rangle$) by a flip on qubit 1, to $P_2$ (spanned by $|010\rangle, |101\rangle$) by a flip on qubit 2, and to $P_3$ (spanned by $|001\rangle, |110\rangle$) by a flip on qubit 3.]

Error detection. In order to correct the bit flip error, we need to identify which, if any, of the 3 qubits has
flipped. From the figure above we see that this can be achieved by performing a projective measurement with
projections $\{P_0, P_1, P_2, P_3\}$. Indeed, it can be easily checked that the projections form a PVM, i.e. they are
orthogonal to each other and add up to the identity.
Note that while this measurement correctly identifies single qubit errors, it preserves the coherence of the state.

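Indeed, the conditional states of the 3 qubits are

\[
\begin{aligned}
\text{Outcome} = 0 &\longrightarrow a|000\rangle + b|111\rangle \\
\text{Outcome} = 1 &\longrightarrow a|100\rangle + b|011\rangle \\
\text{Outcome} = 2 &\longrightarrow a|010\rangle + b|101\rangle \\
\text{Outcome} = 3 &\longrightarrow a|001\rangle + b|110\rangle
\end{aligned}
\]
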
How can the measurement $\{P_0, P_1, P_2, P_3\}$ be implemented in practice? Let $Z_1 = Z \otimes I \otimes I$,
$Z_2 = I \otimes Z \otimes I$, $Z_3 = I \otimes I \otimes Z$ be the individual $Z$ operators, and consider the observables $Z_1 Z_2$ and $Z_2 Z_3$. We
show that measuring the two (commuting) observables simultaneously is equivalent to performing the desired
measurement $\{P_0, P_1, P_2, P_3\}$. Indeed, note that both observables have eigenvalues $\pm 1$ and eigenvectors

\[
\begin{aligned}
Z_1 Z_2 &: \text{eigenvalue} = +1 \longrightarrow |000\rangle, |001\rangle, |110\rangle, |111\rangle \\
Z_1 Z_2 &: \text{eigenvalue} = -1 \longrightarrow |010\rangle, |100\rangle, |011\rangle, |101\rangle \\
Z_2 Z_3 &: \text{eigenvalue} = +1 \longrightarrow |000\rangle, |100\rangle, |011\rangle, |111\rangle \\
Z_2 Z_3 &: \text{eigenvalue} = -1 \longrightarrow |001\rangle, |101\rangle, |010\rangle, |110\rangle
\end{aligned}
\]

It follows that ‘$Z_1 Z_2$ compares the first two bits’ while ‘$Z_2 Z_3$ compares the last two bits’. Indeed, the first two
bits of the eigenvectors of $Z_1 Z_2$ are equal if the eigenvalue is $+1$, and are different if the eigenvalue is $-1$.
Similarly, the last two bits of the eigenvectors of $Z_2 Z_3$ are equal if the eigenvalue is $+1$, and are different if
the eigenvalue is $-1$. Since the two observables commute, they can be measured simultaneously to produce 4
outcomes $(+1, +1), (+1, -1), (-1, +1), (-1, -1)$. The 4 corresponding projections are $P_0, P_3, P_1, P_2$.
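
The correspondence between syndrome outcomes and flipped qubits can be checked numerically. The following is a minimal NumPy sketch, not part of the original notes; the helper names (`kron3`, `basis`, `syndrome`) and the amplitudes `a, b` are illustrative assumptions. Because each error state is a joint eigenvector of $Z_1 Z_2$ and $Z_2 Z_3$, the expectation value equals the measurement outcome.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def kron3(a, b, c):
    """Tensor product of three single-qubit operators (qubit 1 is leftmost)."""
    return np.kron(a, np.kron(b, c))

Z1Z2 = kron3(Z, Z, I2)
Z2Z3 = kron3(I2, Z, Z)

def basis(bits):
    """Computational basis vector of 3 qubits, e.g. basis('011')."""
    v = np.zeros(8)
    v[int(bits, 2)] = 1.0
    return v

def syndrome(state):
    """Outcomes of (Z1Z2, Z2Z3); valid when `state` is a joint eigenvector."""
    return tuple(int(round(state @ op @ state)) for op in (Z1Z2, Z2Z3))

a, b = 0.6, 0.8  # example amplitudes with a^2 + b^2 = 1 (an assumption)
code_state = a * basis("000") + b * basis("111")

# Key k = index of the flipped qubit, 0 = no error.
flips = {0: kron3(I2, I2, I2), 1: kron3(X, I2, I2),
         2: kron3(I2, X, I2), 3: kron3(I2, I2, X)}
table = {k: syndrome(F @ code_state) for k, F in flips.items()}
commute = np.allclose(Z1Z2 @ Z2Z3, Z2Z3 @ Z1Z2)
print(table, commute)
```

The resulting table pairs each outcome of $(Z_1 Z_2, Z_2 Z_3)$ with one of the projections, in agreement with the ordering $P_0, P_3, P_1, P_2$ stated above.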

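Error correction. In each of the 4 cases above the conditional state is obtained by applying a unitary flip
operation to one of the qubits (or none at all). Therefore, the error can be corrected by flipping the concerned
qubit, which maps the state back to $a|000\rangle + b|111\rangle$:

\[
\begin{aligned}
\text{Outcome} = 0 &\longrightarrow \text{Apply } I \otimes I \otimes I \\
\text{Outcome} = 1 &\longrightarrow \text{Apply } X \otimes I \otimes I \\
\text{Outcome} = 2 &\longrightarrow \text{Apply } I \otimes X \otimes I \\
\text{Outcome} = 3 &\longrightarrow \text{Apply } I \otimes I \otimes X
\end{aligned}
\]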
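
The detection and correction steps can be combined into a short numerical check. This is a sketch under stated assumptions rather than the notes' own construction: the function names `op` and `correct`, the lookup table and the amplitudes are illustrative, and the syndrome is read off as an expectation value, which is legitimate only because states with at most one bit flip are joint eigenvectors of both observables.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def op(single, k):
    """Operator acting as `single` on qubit k (1, 2 or 3), identity elsewhere."""
    factors = [I2, I2, I2]
    factors[k - 1] = single
    return np.kron(factors[0], np.kron(factors[1], factors[2]))

# Syndrome (Z1Z2, Z2Z3) -> qubit to flip back (0 means no error).
LOOKUP = {(1, 1): 0, (-1, 1): 1, (-1, -1): 2, (1, -1): 3}

def correct(state):
    """Read the syndrome of a state with at most one bit flip and undo the flip."""
    s12 = int(round(state @ (op(Z, 1) @ op(Z, 2)) @ state))
    s23 = int(round(state @ (op(Z, 2) @ op(Z, 3)) @ state))
    k = LOOKUP[(s12, s23)]
    return state if k == 0 else op(X, k) @ state

a, b = 0.6, 0.8  # example amplitudes (an assumption)
encoded = np.zeros(8)
encoded[0b000], encoded[0b111] = a, b  # a|000> + b|111>

recovered = [correct(encoded)] + [correct(op(X, k) @ encoded) for k in (1, 2, 3)]
all_recovered = all(np.allclose(r, encoded) for r in recovered)
print(all_recovered)
```

In every case the amplitudes $a$ and $b$ are restored without ever being measured, which is the sense in which the syndrome measurement preserves coherence.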

Decoding. At this point we can apply the reverse of the encoding transformation to map the 3 qubit state to
the corresponding 1 qubit (logical) state.

[Figure: decoding circuit, with wire labels $|\psi\rangle$, $|0\rangle$, $|0\rangle$.]

3 Phase flip channel
Consider now the phase flip channel
\[ \mathcal{P} : \rho \mapsto (1 - \epsilon)\rho + \epsilon Z \rho Z. \]
The Pauli $Z$ is called the phase flip operator due to its action on the standard basis vectors
\[ Z : |0\rangle \mapsto |0\rangle, \qquad Z : |1\rangle \mapsto -|1\rangle. \]
A key observation is that $Z$ has a ‘bit flip’ action on the eigenbasis $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$ of the bit flip operator $X$:
\[ Z : |+\rangle \mapsto |-\rangle, \qquad Z : |-\rangle \mapsto |+\rangle. \]
Therefore, with respect to the $|\pm\rangle$ basis the phase flip error acts as a bit flip error! Consequently, the bit flip
error correction protocol can be applied to correct phase flips. The key steps are as follows.
Encoding. A logical qubit state $a|0\rangle + b|1\rangle$ is encoded as a 3 qubit state
\[ a|{+}{+}{+}\rangle + b|{-}{-}{-}\rangle \]
by using the following encoding circuit

[Figure: phase flip encoding circuit with input wires $|\psi\rangle$, $|0\rangle$, $|0\rangle$ and a Hadamard gate $H$ on each wire.]

where the Hadamard gates perform the rotation from $\{|0\rangle, |1\rangle\}$ to $\{|+\rangle, |-\rangle\}$.
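
The encoding can be simulated directly. The sketch below assumes the standard construction, which the extracted figure does not show in full: two CNOT gates controlled by the first qubit (the bit flip encoder) followed by a Hadamard gate on every qubit. The helper `cnot` and the amplitudes are illustrative.

```python
import numpy as np
from functools import reduce

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
ket0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)

def cnot(control, target, n=3):
    """Matrix of a CNOT on n qubits; qubit 1 is the leftmost tensor factor."""
    dim = 2 ** n
    U = np.zeros((dim, dim))
    for x in range(dim):
        bits = [(x >> (n - 1 - q)) & 1 for q in range(n)]
        if bits[control - 1]:
            bits[target - 1] ^= 1
        y = sum(bit << (n - 1 - q) for q, bit in enumerate(bits))
        U[y, x] = 1.0
    return U

# Assumed circuit: CNOT(1->2), CNOT(1->3), then a Hadamard on every qubit.
encoder = reduce(np.kron, [H, H, H]) @ cnot(1, 3) @ cnot(1, 2)

a, b = 0.6, 0.8  # example amplitudes (an assumption)
psi = np.array([a, b])
encoded = encoder @ reduce(np.kron, [psi, ket0, ket0])
expected = a * reduce(np.kron, [plus] * 3) + b * reduce(np.kron, [minus] * 3)
matches = np.allclose(encoded, expected)
print(matches)
```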
Error process. The pattern of errors is similar to the bit flip case and is summarised by the diagram

[Figure: error diagram for the phase flip code. The code space, with projection $P_0$, is spanned by $|{+}{+}{+}\rangle$ and $|{-}{-}{-}\rangle$. A phase flip on qubit 1 maps it to the subspace with projection $P_1$, spanned by $|{-}{+}{+}\rangle$ and $|{+}{-}{-}\rangle$; a phase flip on qubit 2 to the subspace with projection $P_2$, spanned by $|{+}{-}{+}\rangle$ and $|{-}{+}{-}\rangle$; a phase flip on qubit 3 to the subspace with projection $P_3$, spanned by $|{+}{+}{-}\rangle$ and $|{-}{-}{+}\rangle$.]

Error detection. The error can be identified by performing a measurement with projections $\{P_0, P_1, P_2, P_3\}$.
This can be implemented by comparing the signs of neighbouring qubits. Since $|\pm\rangle$ are eigenvectors of $X$, this
can be achieved by measuring $X_1 X_2$ and $X_2 X_3$, similarly to the bit flip case.
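
As a numerical illustration (a sketch with illustrative helper names `on`, `ket` and `syndrome`, and assumed amplitudes), one can check that $Z$ swaps $|+\rangle$ and $|-\rangle$ and that the outcomes of $X_1 X_2$ and $X_2 X_3$ identify which qubit suffered a phase flip.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
KETS = {"+": np.array([1.0, 1.0]) / np.sqrt(2),
        "-": np.array([1.0, -1.0]) / np.sqrt(2)}

def on(single, k):
    """Operator acting as `single` on qubit k (1, 2 or 3), identity elsewhere."""
    factors = [I2, I2, I2]
    factors[k - 1] = single
    return reduce(np.kron, factors)

def ket(signs):
    """Product state in the |+>, |-> basis, e.g. ket('+-+')."""
    return reduce(np.kron, [KETS[s] for s in signs])

X1X2 = on(X, 1) @ on(X, 2)
X2X3 = on(X, 2) @ on(X, 3)

def syndrome(state):
    """Outcomes of (X1X2, X2X3); valid when `state` is a joint eigenvector."""
    return tuple(int(round(state @ op @ state)) for op in (X1X2, X2X3))

a, b = 0.6, 0.8  # example amplitudes (an assumption)
code_state = a * ket("+++") + b * ket("---")

table = {0: syndrome(code_state)}          # key 0 = no error
for k in (1, 2, 3):                        # key k = phase flip on qubit k
    table[k] = syndrome(on(Z, k) @ code_state)

z_swaps_signs = np.allclose(Z @ KETS["+"], KETS["-"])
print(table, z_swaps_signs)
```

The syndrome table has the same pattern as in the bit flip case, with the roles of $X$ and $Z$ exchanged.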

Error correction. Once we have identified the qubit affected by the phase flip error (if any), we can apply
the corresponding phase flip unitary to correct the error.
Decoding. The state can be mapped back to $a|0\rangle + b|1\rangle$ by the inverse of the encoding circuit.

4 The Shor code


We are now ready to answer the general question: how can we perform error correction for arbitrary qubit
channels? We will describe a code proposed by Shor [1] which combines the techniques of the bit flip and
phase flip codes by employing the principle of concatenation. This means that the logical qubit will first be
encoded using the phase flip code
\[ |0\rangle \mapsto |{+}{+}{+}\rangle, \qquad |1\rangle \mapsto |{-}{-}{-}\rangle, \]
and then each of the 3 qubits will itself be encoded using the bit flip code
\[ |+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \mapsto \frac{1}{\sqrt{2}}(|000\rangle + |111\rangle), \qquad |-\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle) \mapsto \frac{1}{\sqrt{2}}(|000\rangle - |111\rangle). \]
Therefore, the overall code uses 9 qubits and the code space is spanned by the vectors
\[
\begin{aligned}
|0_L\rangle &:= \frac{1}{2\sqrt{2}}(|000\rangle + |111\rangle)(|000\rangle + |111\rangle)(|000\rangle + |111\rangle) \\
|1_L\rangle &:= \frac{1}{2\sqrt{2}}(|000\rangle - |111\rangle)(|000\rangle - |111\rangle)(|000\rangle - |111\rangle)
\end{aligned}
\]
The encoding circuit can be constructed by applying the phase flip encoding followed by the bit flip encoding
on each of the 3 qubits.

[Figure: encoding circuit for the Shor code, with input $|\psi\rangle$ and eight ancillas in state $|0\rangle$; a ‘Phase flip encoding’ stage (with Hadamard gates on three of the wires) is followed by a ‘Bit flip encoding’ stage.]
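
The concatenated encoding can be simulated on a 9 qubit state vector. The sketch below assumes the standard layout, which is not fully recoverable from the extracted figure: the phase flip encoder acts on qubits 1, 4 and 7, and each of these qubits then controls a bit flip encoder on its own block. The function names and amplitudes are illustrative.

```python
import numpy as np
from functools import reduce

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
N = 9  # number of physical qubits; qubit 1 is the leftmost tensor factor

def apply_h(state, q):
    """Apply a Hadamard gate to qubit q (1-based)."""
    psi = state.reshape([2] * N)
    psi = np.moveaxis(np.tensordot(H, psi, axes=([1], [q - 1])), 0, q - 1)
    return psi.reshape(-1)

def apply_cnot(state, control, target):
    """Apply a CNOT by flipping the target axis where the control bit is 1."""
    psi = state.reshape([2] * N).copy()
    idx = [slice(None)] * N
    idx[control - 1] = 1
    axis = target - 1 if target < control else target - 2
    psi[tuple(idx)] = np.flip(psi[tuple(idx)], axis=axis).copy()
    return psi.reshape(-1)

def shor_encode(a, b):
    """Encode a|0> + b|1> (assumed layout: blocks 123, 456, 789)."""
    state = np.zeros(2 ** N)
    state[0] = a                 # a|000000000>
    state[1 << (N - 1)] = b      # b|100000000>
    for t in (4, 7):             # phase flip encoding: CNOTs ...
        state = apply_cnot(state, 1, t)
    for q in (1, 4, 7):          # ... followed by Hadamards
        state = apply_h(state, q)
    for c in (1, 4, 7):          # bit flip encoding inside each block
        state = apply_cnot(state, c, c + 1)
        state = apply_cnot(state, c, c + 2)
    return state

e000, e111 = np.zeros(8), np.zeros(8)
e000[0], e111[7] = 1.0, 1.0
zero_L = reduce(np.kron, [(e000 + e111) / np.sqrt(2)] * 3)
one_L = reduce(np.kron, [(e000 - e111) / np.sqrt(2)] * 3)

a, b = 0.6, 0.8  # example amplitudes (an assumption)
matches = np.allclose(shor_encode(a, b), a * zero_L + b * one_L)
print(matches)
```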

Error detection and correction. We will show that the Shor code protects against both bit and phase flip
errors, and indeed any type of error, as long as it affects at most one of the qubits.
1) Consider first the case of a bit flip error. We identify the error syndrome by applying the same measurements
as before, but this time for each of the 3 blocks of 3 qubits, that is we measure 3 pairs of observables
\[ (Z_1 Z_2, Z_2 Z_3), \quad (Z_4 Z_5, Z_5 Z_6), \quad (Z_7 Z_8, Z_8 Z_9). \]
The first two outcomes identify whether one of the first 3 qubits has flipped (and which), and similarly for
the other pairs.
2) Consider now the case of phase flip errors. Note that a phase flip on any of the 3 qubits in a given block has
the same effect; for instance a phase flip on qubit 1, 2 or 3 is described by the following transformations
on the first 3 qubits (while ignoring the other qubits which are not affected)
\[
\begin{aligned}
|000\rangle + |111\rangle &\mapsto |000\rangle - |111\rangle \\
|000\rangle - |111\rangle &\mapsto |000\rangle + |111\rangle
\end{aligned}
\]
Instead of trying to identify which qubit has flipped phase, we construct a syndrome measurement to identify
in which block the error occurred. For this, we need to compare the signs of the different blocks in order to
identify which block has a different sign from the other two. The corresponding observables are

\[ S_{12} = X_1 X_2 X_3 X_4 X_5 X_6, \qquad S_{23} = X_4 X_5 X_6 X_7 X_8 X_9 \]

and can be measured simultaneously with the other 6 observables above (check that they indeed commute
with each other). To understand how this works, it is helpful to think in terms of blocks of 3 qubits rather than
individual qubits.
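Indeed let us denote the ‘block ± states’
\[ |+\rangle := \frac{|000\rangle + |111\rangle}{\sqrt{2}}, \qquad |-\rangle := \frac{|000\rangle - |111\rangle}{\sqrt{2}} \]
and further, the ‘block X operators’

\[ X^{(1)} = X_1 X_2 X_3, \qquad X^{(2)} = X_4 X_5 X_6, \qquad X^{(3)} = X_7 X_8 X_9 \]

and note that the latter act as ‘X operators’ on the block states:

\[ X^{(1)} |+\rangle = |+\rangle, \qquad X^{(1)} |-\rangle = -|-\rangle. \]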

With this notation the code states are $|0_L\rangle = |{+}{+}{+}\rangle$ and $|1_L\rangle = |{-}{-}{-}\rangle$, which brings us back to the phase
flip error setting of the previous section. As before, we can compare the signs of the blocks by measuring
$S_{12} = X^{(1)} X^{(2)}$ and $S_{23} = X^{(2)} X^{(3)}$. Indeed, suppose that a sign flip error occurred in the first block. Then
the code state changes as

\[ a|{+}{+}{+}\rangle + b|{-}{-}{-}\rangle \mapsto a|{-}{+}{+}\rangle + b|{+}{-}{-}\rangle. \]

The right-hand side is an eigenvector with eigenvalue $-1$ for $S_{12}$ and $+1$ for $S_{23}$. Therefore, the pattern of outcomes
identifies in which block the error occurred.
The error correction consists of flipping the sign of the identified block. This can be done by applying any of
the phase flip operators $Z$ within the affected block.
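
The block-level syndrome can be checked on the full 9 qubit space. The following NumPy sketch is an illustration only; the helpers `pauli`, `z_on` and `syndrome`, the lookup table `BLOCK` and the amplitudes are assumptions made here. It verifies that a phase flip on any qubit of a block produces the same syndrome, and that applying $Z$ to the first qubit of the identified block restores the code state.

```python
import numpy as np
from functools import reduce

PAULI = {"I": np.eye(2),
         "X": np.array([[0.0, 1.0], [1.0, 0.0]]),
         "Z": np.diag([1.0, -1.0])}

def pauli(label):
    """Tensor product of single-qubit operators, e.g. pauli('XXXXXXIII')."""
    return reduce(np.kron, [PAULI[c] for c in label])

def z_on(k):
    """Phase flip Z on qubit k (1..9) of the 9 qubit register."""
    return pauli("I" * (k - 1) + "Z" + "I" * (9 - k))

S12 = pauli("XXXXXXIII")
S23 = pauli("IIIXXXXXX")

e000, e111 = np.zeros(8), np.zeros(8)
e000[0], e111[7] = 1.0, 1.0
zero_L = reduce(np.kron, [(e000 + e111) / np.sqrt(2)] * 3)
one_L = reduce(np.kron, [(e000 - e111) / np.sqrt(2)] * 3)
a, b = 0.6, 0.8  # example amplitudes (an assumption)
code_state = a * zero_L + b * one_L

def syndrome(state):
    """Outcomes of (S12, S23); valid when `state` is a joint eigenvector."""
    return tuple(int(round(state @ (S @ state))) for S in (S12, S23))

table = {k: syndrome(z_on(k) @ code_state) for k in range(1, 10)}

# Syndrome -> affected block; correct with Z on the first qubit of that block.
BLOCK = {(-1, 1): 1, (-1, -1): 2, (1, -1): 3}
recovered_ok = all(
    np.allclose(z_on(3 * BLOCK[table[k]] - 2) @ (z_on(k) @ code_state), code_state)
    for k in range(1, 10)
)
print(table, recovered_ok)
```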
Correcting Y errors. Besides bit and phase flip errors on single qubits, the Shor code is also capable of
correcting individual $Y$ errors. Let us check how this works in the case of a $Y$ error on the first qubit.
Focusing on the first block, the error acts (up to an overall phase factor $i$) as

\[
\begin{aligned}
|000\rangle + |111\rangle &\mapsto |100\rangle - |011\rangle \\
|000\rangle - |111\rangle &\mapsto |100\rangle + |011\rangle
\end{aligned}
\]

We can see that the bit flip location is identified as usual from the fact that the measurement of $(Z_1 Z_2, Z_2 Z_3)$
gives outcomes $(-1, +1)$. Similarly, the fact that a phase flip has occurred in block 1 can be deduced from
the measurement of $(X^{(1)} X^{(2)}, X^{(2)} X^{(3)})$. After identifying the ‘faulty’ qubit we can apply $Y$ to correct the
error.
Correcting any errors. How about errors which are different from $X, Y, Z$? A general qubit channel has
the form
\[ \mathcal{E} : \rho \mapsto \mathcal{E}(\rho) = \sum_i K_i \rho K_i^* \]
where $K_i$ are arbitrary Kraus operators satisfying $\sum_i K_i^* K_i = I$. The space of all channels contains a
continuum of possible error patterns, e.g. including small rotations, and at first sight it seems unlikely that a
single code can correct all such error patterns. However, as we will see below, the Shor code can correct
individual qubit errors for an arbitrary noise model! The reason behind this surprising fact is that errors can be
discretised into the 3 patterns $(X, Y, Z)$ which are corrected by the Shor code.
For clarity, let us first assume that only the first qubit is affected by the noise $\mathcal{E}$. Since $I, X, Y, Z$ form a basis
in $M(\mathbb{C}^2)$, we can expand $K_i$ as
\[ K_i = c_i^I I + c_i^x X + c_i^y Y + c_i^z Z. \]
Now, the effect of the channel is to produce a mixture of states corresponding to applying different Kraus
operators on the code state $|\psi\rangle = a|0_L\rangle + b|1_L\rangle$:

\[ |\psi\rangle \mapsto K_i |\psi\rangle = c_i^I |\psi\rangle + c_i^x X_1 |\psi\rangle + c_i^y Y_1 |\psi\rangle + c_i^z Z_1 |\psi\rangle \]

Note that each of these transformed states is a superposition of components corresponding to the discrete
errors $X, Y, Z$ analysed earlier. Therefore, performing the syndrome measurement will project the superposition
onto one of the components, thus discretising the error. After this, the error correction procedure
transforms the state back to the initial one $|\psi\rangle$.
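
The discretisation argument can be illustrated numerically. In the sketch below the error is assumed, for the sake of example, to be a small rotation of the first qubit by angle `theta` about an axis `n`; neither choice comes from the notes. The code computes the Pauli expansion coefficients via $c^P = \mathrm{Tr}(P K)/2$ and checks that the four components $|\psi\rangle, X_1|\psi\rangle, Y_1|\psi\rangle, Z_1|\psi\rangle$ are mutually orthogonal, which is why the syndrome measurement can distinguish them.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = {"I": I2, "X": X, "Y": Y, "Z": Z}

def on_first(single):
    """Operator acting on qubit 1 of the 9 qubit register, identity elsewhere."""
    return np.kron(single, np.eye(2 ** 8))

e000, e111 = np.zeros(8), np.zeros(8)
e000[0], e111[7] = 1.0, 1.0
zero_L = reduce(np.kron, [(e000 + e111) / np.sqrt(2)] * 3)
one_L = reduce(np.kron, [(e000 - e111) / np.sqrt(2)] * 3)
psi = (0.6 * zero_L + 0.8 * one_L).astype(complex)

# Assumed example error: rotation by theta about the unit axis n on qubit 1.
theta = 0.3
n = np.array([1.0, 2.0, 2.0]) / 3.0
K = np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * (n[0] * X + n[1] * Y + n[2] * Z)

# Pauli expansion K = sum_P c_P P, using Tr(P Q) = 2 delta_PQ.
coeffs = {name: np.trace(P @ K) / 2 for name, P in PAULIS.items()}
branches = {name: on_first(P) @ psi for name, P in PAULIS.items()}

reconstructed = sum(coeffs[name] * branches[name] for name in PAULIS)
expansion_ok = np.allclose(on_first(K) @ psi, reconstructed)

vecs = list(branches.values())
gram = np.array([[np.vdot(u, v) for v in vecs] for u in vecs])
branches_orthonormal = np.allclose(gram, np.eye(4))

# Probability that the syndrome measurement selects each discrete error.
probs = {name: abs(c) ** 2 for name, c in coeffs.items()}
print(expansion_ok, branches_orthonormal, probs)
```

For this unitary example the four probabilities sum to one, and for small `theta` the ‘no error’ component dominates.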
More generally, we can consider channels acting on all qubits, but in such a way that the dominant effect
comes from single qubit errors. A possible model for this is that of independent errors, in which each qubit is
affected by the same noisy channel independently of the others, so that the ‘full’ channel is $\mathcal{E}^{\otimes 9}$. If $\mathcal{E}$ is
close to the identity transformation (which corresponds to the idea of small error probabilities), then a similar
argument can be applied to the Kraus operators of the full channel to show that the error state is a superposition
of $X, Y, Z$ errors on individual qubits, plus contributions from errors on two qubits or more which can be
ignored as they are of order $\epsilon^2$. Therefore, the syndrome measurement will project the superposition onto one
of the components, after which we can apply the usual error correction map.

5 Outlook
A real quantum computer is affected by quantum noise, whether in its memory, gate implementation or
measurement process. Therefore, the ability to perform quantum error correction is a key requirement for
successfully implementing quantum algorithms on physical quantum computers. Many other error correction
schemes have been developed uncovering strong connections with classical error correction but also shed-
ding light on deep structures within quantum theory [2,3]. In particular, the area of fault tolerant quantum
computation deals with the question of implementing error correction when all basic operations are noisy.
Quantum error correction has also found applications in quantum metrology, helping to improve estimation
accuracy by countering the effects of noise [4].
Currently, one of the most advanced quantum computers is Google’s Sycamore device which contains 54
qubits. With this device, Google claims to have reached ‘quantum supremacy’, a threshold in quantum
computation where a quantum computer can perform a task which is impossible on a classical computer
within a reasonable time [5]. However, Sycamore’s qubits are very noisy and no quantum error correction
is performed. Therefore, implementing quantum error correction on a large scale remains an immense
technological challenge.

References.
[1] P. W. Shor, Scheme for reducing decoherence in quantum computer memory, Phys. Rev. A 52 R2493(R)
(1995)
[2] M. A. Nielsen, I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University
Press (2010)
[3] T. A. Brun, D. A. Lidar (Editors), Quantum Error Correction, Cambridge University Press (2013)
[4] S. Zhou et al, Achieving the Heisenberg limit in quantum metrology using quantum error correction,
Nature Communications 9 78 (2018)
[5] F. Arute et al, Quantum supremacy using a programmable superconducting processor, Nature 574 505-
510 (2019)

Introduction to Quantum Information Science

Lecture 22: Quantum Statistics and Metrology

Abstract. In this lecture we give a quick introduction to the main problems in Quantum Statistics: state
estimation and state discrimination. We then discuss the use of correlated quantum systems as highly sensitive
sensors in metrology.

1 State estimation
In a previous lecture we have shown that a general measurement is a convex map from quantum states to
probability distributions over a space of outcomes $\{1, \dots, k\}$

\[ M : \rho \mapsto P_\rho. \tag{1} \]

In practice, this means the following. Suppose that we (as experimenters) can prepare a quantum system in
the state $\rho$ and know how the measurement device works, i.e. we know $M$. Then the direct map (1) predicts
that the outcome of the measurement $M$ is a random variable with probability distribution $P_\rho(1), \dots, P_\rho(k)$.
So by repeating the measurement $n$ times with the system prepared in the same state we obtain independent,
identically distributed (i.i.d.) outcomes $X_1, \dots, X_n$ with probability distribution $P_\rho$. The quantum mechanical
prediction can be verified by comparing the frequencies of the different outcomes with the probabilities,
by appealing to the law of large numbers which says that the two must be close to each other for large $n$

\[ f_n(i) := \frac{\#\{a : X_a = i\}}{n} \approx P_\rho(i), \qquad i = 1, \dots, k. \]
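
A short simulation illustrates the law of large numbers statement. The outcome distribution `probs`, the sample size and the random seed below are arbitrary choices made for illustration, and outcomes are labelled $0, \dots, k-1$ rather than $1, \dots, k$ to match array indexing.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed for reproducibility
probs = np.array([0.5, 0.3, 0.2])       # assumed outcome distribution P_rho
k = len(probs)
n = 100_000                             # number of repeated measurements

# i.i.d. outcomes X_1, ..., X_n drawn from P_rho
samples = rng.choice(k, size=n, p=probs)

# empirical frequencies f_n(i) = #{a : X_a = i} / n
freqs = np.bincount(samples, minlength=k) / n
max_dev = np.max(np.abs(freqs - probs))
print(freqs, max_dev)
```

The deviations are of order $1/\sqrt{n}$, so they shrink slowly as the number of repetitions grows.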
Until the 80’s the measurements were typically not performed on individual systems but on very large en-
sembles and the actual result of the experiment would have been the set of frequencies {fn (1), . . . , fn (k)}
rather than the sequence X1 , . . . , Xn . In spectroscopy for instance, we don’t observe the individual photons
emitted by each atom, but the intensity of the light coming from the whole ensemble. In fact Schrödinger [1]
was among the many physicists who did not even believe that we can talk about individual quantum systems:
“we never experiment with just one electron or atom or (small) molecule. In thought experiments we some-
times assume that we do; this invariably entails ridiculous consequences... we are not experimenting with
single particles any more than we can raise Ichthyosauria in the zoo. ”
Nevertheless, in recent decades it has become possible to prepare, manipulate and measure individual
quantum systems with a high degree of control. A great variety of quantum devices like ion traps, quantum
dots, optical cavities, nanomechanical systems, superconducting qubits, are engineered and used to systemat-
ically probe and exploit quantum theory. This paradigm shift raises new statistical problems stemming from
the probabilistic nature of quantum mechanics. One of these problems is that of state estimation or state
tomography which we briefly illustrate by the following example.

The ‘8 ions experiment’. In 2005 the ion trap group in Innsbruck succeeded in creating a ‘W’ entangled state
of 8 qubits
\[ |W\rangle = \frac{1}{\sqrt{8}} \left( |10000000\rangle + |01000000\rangle + \dots + |00000001\rangle \right) \]
in a landmark experiment aiming to demonstrate the ability of ion trap technology to implement scalable
quantum computation [2]. To prove their claim, the experimenters performed a large number of repeated
measurements of different observables, and from the measurement data they statistically reconstructed the
state and showed that it is close to the ideal target state $|W\rangle$. This amounts to solving the inverse problem of
estimating $\rho$ given the measurement data
\[ X_1, \dots, X_n \mapsto \hat{\rho}^{(n)} := \hat{\rho}^{(n)}(X_1, \dots, X_n) \approx \rho \]
The experiment highlighted the difficulty of statistical problems arising in quantum engineering as reflected
by the following facts
- the total number of parameters needed to characterise an 8 qubit density matrix is $4^8 - 1 = 65535$;
- the total number of different preparation-measurement repetitions is $100 \times 3^8 = 656100$;
- the computation of the estimator and of the statistical errors took weeks of computer time.
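
The two counts quoted above follow from simple arithmetic, sketched below. Reading $3^8$ as the number of measurement settings (one of three Pauli bases per ion) and 100 as the number of repetitions per setting is an interpretation assumed here, not something the text states explicitly.

```python
n_qubits = 8
dim = 2 ** n_qubits                 # Hilbert space dimension, 256

# A density matrix is a dim x dim Hermitian matrix with unit trace,
# hence dim^2 - 1 = 4^n - 1 real parameters.
n_params = dim ** 2 - 1

# Assumed reading: 3 Pauli bases per qubit, 100 repetitions per setting.
n_settings = 3 ** n_qubits
repetitions = 100
n_runs = repetitions * n_settings

print(n_params, n_settings, n_runs)
```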

Identifiability and mean square error. Let us return to the measurement $M$ described by the direct map (1)
and consider that the state $\rho$ is unknown. In order to estimate it we measure many identically prepared copies
of $\rho$ and collect the i.i.d. results $X_1, \dots, X_n$ having the distribution $P_\rho$. As discussed above, for very large $n$
the frequencies of the outcomes are roughly equal to the probabilities, so the data can be used to estimate the
distribution $P_\rho$. If the correspondence between $\rho$ and $P_\rho$ is one to one, then we say that $\rho$ is identifiable and
its estimation is essentially an analytic problem of inverting the direct map (1).
However, in many quantum engineering experiments the number of i.i.d. samples from $P_\rho$ may only be
moderately large, so that the frequencies $f_n(i)$ are close but not identical to the probabilities $\mathbb{P}(X = i)$. By
applying the inverse map to the frequencies, the small errors may be amplified to produce an unreliable estimator
$\hat{\rho}^{(n)}$. Hence, the inverse problem has changed from an analytic to a statistical one, and needs to be treated in
the framework of statistical inference. The focus is now to devise estimators $\hat{\rho}^{(n)} := \hat{\rho}^{(n)}(X_1, \dots, X_n)$ for
which the difference $\hat{\rho}^{(n)} - \rho$ is small and has a ‘predictable’ behaviour, e.g. it has a Gaussian distribution.
A simple measure of the ‘goodness’ of the estimator is the mean square error
\[ R_n(\hat{\rho}^{(n)}, \rho) = \mathbb{E}\left( \|\hat{\rho}^{(n)} - \rho\|_2^2 \right), \]
where $\|\hat{\rho}^{(n)} - \rho\|_2$ is the norm-two distance
\[ \|\hat{\rho}^{(n)} - \rho\|_2 := \left[ \mathrm{Tr}\left( (\hat{\rho}^{(n)} - \rho)^2 \right) \right]^{1/2} = \left( \sum_{ij} |\hat{\rho}^{(n)}_{ij} - \rho_{ij}|^2 \right)^{1/2}. \]

Typically $R_n(\hat{\rho}^{(n)}, \rho)$ decreases as $C/n$ where the constant $C$ depends on the state $\rho$, the measurement $M$,
and the estimator $\hat{\rho}^{(n)}$, and a large part of the state estimation literature deals with the optimisation of the
mean square error over measurements and estimators.

Qubit state estimation. Let us take a closer look at the state estimation problem in the simplest case of
two-dimensional systems. Recall that any qubit state can be written as
\[ \rho_r = \frac{1}{2} \left( I + r_x \sigma_x + r_y \sigma_y + r_z \sigma_z \right) \]
where $r = (r_x, r_y, r_z) \in \mathbb{R}^3$ is the Bloch vector which completely characterises $\rho$. The probability
distribution for the measurement of $\sigma_x$ is
\[ \mathbb{P}(S_x = +1) = \frac{1 + r_x}{2}, \qquad \mathbb{P}(S_x = -1) = \frac{1 - r_x}{2} \tag{2} \]
and similarly for $\sigma_y$ and $\sigma_z$. This means that measuring $\sigma_x$ gives information about the parameter $r_x$ but not
about $r_y$ and $r_z$ since any change in the latter two does not affect the distribution of $S_x$. In general, measuring
the spin in an arbitrary direction gives information about the component of $r$ along that particular direction,
so in order to identify $r$ we need to measure the spin component in (at least) three linearly independent
directions.
Based on these observations we devise the following simple state estimation procedure. Given n identically
prepared systems, we measure the observable x separately on a batch of m := n/3 systems and obtain the
results Sx,1 , . . . , Sx,m , and similarly for y and z . Then we construct estimators of the three parameters
rx , ry , rz :
m m m
1 X 1 X 1 X
r̂x = Sx,i , r̂y = Sy,i , r̂z = Sz,i .
m i=1 m i=1 m i=1
and define the estimator of ⇢ by
1
⇢ˆ := ⇢r̂ = (I + r̂x x + r̂y y + r̂z z) .
2

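Using the properties of the Pauli matrices, its square error is
$$\|\hat\rho - \rho\|_2^2 = \mathrm{Tr}\left[(\hat\rho - \rho)^2\right] = \frac{\|\hat r - r\|^2}{2} = \frac{1}{2}\left[(r_x - \hat r_x)^2 + (r_y - \hat r_y)^2 + (r_z - \hat r_z)^2\right].$$
To compute the mean square error, note first that $\hat r := (\hat r_x, \hat r_y, \hat r_z)$ is unbiased in the sense that its average is the true Bloch vector $r$, e.g.
$$E(\hat r_x) = \frac{1}{m}\sum_{i=1}^m E(S_{x,i}) = E(S_x) = (+1)\frac{1 + r_x}{2} + (-1)\frac{1 - r_x}{2} = r_x.$$

Then the mean square error of $\hat r_x$ is (exercise)
$$E\left((\hat r_x - r_x)^2\right) = \mathrm{Var}(\hat r_x) = \frac{1}{m}\left[E(S_x^2) - E(S_x)^2\right] = \frac{1}{m}(1 - r_x^2)$$
and, summing the three components with the factor $1/2$ from the square error and using $m = n/3$, the total mean square error of $\hat\rho$ is
$$E\|\hat\rho - \rho\|_2^2 = \frac{1}{2m}\left(3 - \|r\|^2\right) = \frac{3}{2n}\left(3 - \|r\|^2\right),$$
which decreases as $n^{-1}$ with a constant depending on the length of the Bloch vector.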
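The three-batch procedure can be checked by simulation. The following is a minimal sketch, not part of the original analysis: the Bloch vector, sample size, number of trials and seed are arbitrary illustrative choices, and the function names are hypothetical. It compares a Monte Carlo estimate of the mean square error with the value obtained by summing the three variances $(1 - r_i^2)/m$ and applying the factor $1/2$ from the square-error identity, i.e. $3(3 - \|r\|^2)/(2n)$.

```python
import random

def simulate_mse(r, n, trials, seed=0):
    """Monte Carlo estimate of E||rho_hat - rho||_2^2 for the three-batch scheme."""
    rng = random.Random(seed)
    m = n // 3                                   # copies measured per Pauli observable
    total = 0.0
    for _ in range(trials):
        sq = 0.0
        for comp in r:
            p_plus = (1 + comp) / 2              # P(S = +1) for this axis
            s = sum(1 if rng.random() < p_plus else -1 for _ in range(m))
            sq += (s / m - comp) ** 2            # (r_hat - r)^2 for this axis
        total += sq / 2                          # ||rho_hat - rho||_2^2 = ||r_hat - r||^2 / 2
    return total / trials

def predicted_mse(r, n):
    """Sum of the three variances (1 - r_i^2)/m, halved, with m = n/3."""
    return 3 * (3 - sum(c * c for c in r)) / (2 * n)

r = (0.3, -0.4, 0.5)     # illustrative Bloch vector with ||r|| < 1
n = 300                  # total number of copies, divisible by 3
empirical = simulate_mse(r, n, trials=2000)
theory = predicted_mse(r, n)
print(f"empirical MSE = {empirical:.5f}, predicted MSE = {theory:.5f}")
```

The two numbers should agree up to Monte Carlo fluctuations of a few percent; increasing `trials` tightens the agreement.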

Let us finish by briefly mentioning a few interesting questions which arise from this analysis.
1. Are the chosen estimators optimal (for the given measurement procedure)?
2. Can we devise better measurement procedures?
3. Do we gain in estimation precision if we measure the qubits collectively rather than individually?
4. Can the one-qubit measurement scheme be extended to the multiple-qubit set-up of the 8 ions experiment?
The answer to all questions is YES, and you can find more about it in the MAGIC lectures [3].

2 Quantum metrology
Another important topic at the intersection of quantum theory and statistics is Quantum Metrology. Generally speaking, metrology deals with devising techniques for the precise measurement of fundamental quantities such as time, distance, weight, fields, etc. High precision is very important in modern technology; for instance, the global positioning system (GPS) relies on the accurate measurement of time and takes into account relativistic effects due to the motion of the communication satellites. In this section we will show how the use of quantum correlations can greatly increase the precision of phase estimation as a function of the number of available probe systems.

2.1 The quantum Cramér-Rao bound for phase estimation

We will first establish an important bound on the precision obtained by measuring an observable of a system. The general set-up is the following. We would like to estimate an unknown parameter $\theta \in \mathbb{R}$ (e.g. a magnetic field) by using a quantum system whose dynamics is sensitive to $\theta$; for simplicity we take it to be given by the unitary $U_\theta = \exp(-i\theta G)$ where $G$ is a selfadjoint 'generator'. The initial state of the system is $|\psi_0\rangle \in \mathbb{C}^d$, so after applying the unitary, the parameter is 'imprinted' into the state which becomes
$$|\psi_\theta\rangle := U_\theta|\psi_0\rangle = \exp(-i\theta G)|\psi_0\rangle.$$

Without loss of generality we can assume that $G$ has zero expectation with respect to the initial state, $\langle\psi_0|G|\psi_0\rangle = 0$. Indeed this can always be achieved by changing $G$ by a constant, which only affects the phase of the state $|\psi_\theta\rangle$. Since most interesting cases involve the use of a large number $n \gg 1$ of probe systems, and the estimation error decreases with $n$, we assume for simplicity that $|\theta| \ll 1$ and we analyse the performance of measurements to first order in $\theta$. In particular, the state $|\psi_\theta\rangle$ can be expanded as
$$|\psi_\theta\rangle \approx |\psi_0\rangle - i\theta G|\psi_0\rangle.$$

Suppose now that we measure an observable $A$ which, like $G$, can be chosen such that $\langle\psi_0|A|\psi_0\rangle = 0$. Then, to first order the expectation of $A$ is
$$\langle\psi_\theta|A|\psi_\theta\rangle \approx \langle\psi_0|A|\psi_0\rangle + \theta\langle\psi_0|(-i[A, G])|\psi_0\rangle = -i\theta\langle\psi_0|[A, G]|\psi_0\rangle := \theta\mu.$$
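If $X_1, \dots, X_n$ are the results of repeated measurements of $A$ then we can construct the unbiased estimator (for small $\theta$)
$$\hat\theta_n := \frac{1}{n\mu}\sum_{i=1}^n X_i, \qquad E(\hat\theta_n) \approx \theta.$$
Its mean square error at $\theta = 0$ is then
$$E\left[(\hat\theta_n - \theta)^2\right] = \frac{1}{n}\frac{E(X^2)}{\mu^2} = \frac{1}{n}\frac{\langle\psi_0|A^2|\psi_0\rangle}{|\langle\psi_0|[A, G]|\psi_0\rangle|^2}. \tag{3}$$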
The following inequality, known as Heisenberg's uncertainty principle, is a direct consequence of the Cauchy-Schwarz inequality:
$$\langle\psi_0|A^2|\psi_0\rangle \cdot \langle\psi_0|G^2|\psi_0\rangle \geq \frac{1}{4}|\langle\psi_0|[A, G]|\psi_0\rangle|^2.$$
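For completeness, the Cauchy-Schwarz step can be spelled out as follows. This is a sketch that only uses the selfadjointness of $A$ and $G$, with $A\psi_0$ and $G\psi_0$ denoting the vectors $A|\psi_0\rangle$ and $G|\psi_0\rangle$:

```latex
\begin{align*}
|\langle\psi_0|[A,G]|\psi_0\rangle|
  &= |\langle A\psi_0|G\psi_0\rangle - \langle G\psi_0|A\psi_0\rangle|
   = 2\,|\mathrm{Im}\,\langle A\psi_0|G\psi_0\rangle| \\
  &\leq 2\,|\langle A\psi_0|G\psi_0\rangle|
   \leq 2\,\|A\psi_0\|\,\|G\psi_0\|
   = 2\sqrt{\langle\psi_0|A^2|\psi_0\rangle\,\langle\psi_0|G^2|\psi_0\rangle}.
\end{align*}
```

Squaring both sides and dividing by 4 gives the stated inequality.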
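By inserting in equation (3) we obtain the following lower bound on the MSE of any unbiased estimator, known as the quantum Cramér-Rao bound (C-R) [4]:
$$E\left[(\hat\theta_n - \theta)^2\right] \geq \frac{1}{n}\cdot\frac{1}{4\langle\psi_0|G^2|\psi_0\rangle}.$$
Note that
$$\langle\psi_0|G^2|\psi_0\rangle = \left\|\frac{d|\psi_\theta\rangle}{d\theta}\right\|^2_{\theta=0}$$
so the interpretation of the C-R bound is that the smallest possible MSE is inversely proportional to the generator's variance, which measures the speed with which the state changes with $\theta$; the quantity $4\langle\psi_0|G^2|\psi_0\rangle$ is known as the quantum Fisher information. Moreover, the bound is achievable (the inequality becomes an equality) for the observable
$$A := \left.\frac{d\,|\psi_\theta\rangle\langle\psi_\theta|}{d\theta}\right|_{\theta=0} = -i\left(G|\psi_0\rangle\langle\psi_0| - |\psi_0\rangle\langle\psi_0|G\right).$$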
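The achievability claim can be verified numerically in a small example. The sketch below rests on assumptions that are not in the text: a single qubit with $G = \sigma_z$ and probe state $|\psi_0\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$, and the convention $U_\theta = \exp(-i\theta G)$, for which the derivative of the state at $\theta = 0$ is $-i(G\rho_0 - \rho_0 G)$ with $\rho_0 = |\psi_0\rangle\langle\psi_0|$. The helper names are hypothetical. It evaluates $n$ times the right-hand side of equation (3) and compares it with $n$ times the C-R bound.

```python
# 2x2 complex matrices are stored as nested lists; all helpers are illustrative.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def expect(psi, a):
    """<psi|a|psi> for a 2x2 matrix a and a 2-component vector psi."""
    return sum(psi[i].conjugate() * a[i][j] * psi[j] for i in range(2) for j in range(2))

s = 2 ** -0.5
psi0 = [complex(s), complex(s)]                    # assumed probe state |+>
G = [[1, 0], [0, -1]]                              # assumed generator sigma_z, <psi0|G|psi0> = 0
rho0 = [[psi0[i] * psi0[j].conjugate() for j in range(2)] for i in range(2)]

# A = d(|psi_theta><psi_theta|)/d theta at theta = 0 for U_theta = exp(-i theta G)
A = scale(-1j, sub(matmul(G, rho0), matmul(rho0, G)))

comm = sub(matmul(A, G), matmul(G, A))             # commutator [A, G]
mse_times_n = expect(psi0, matmul(A, A)).real / abs(expect(psi0, comm)) ** 2
cr_bound_times_n = 1 / (4 * expect(psi0, matmul(G, G)).real)
print(mse_times_n, cr_bound_times_n)
```

In this example $A$ works out to $\sigma_y$, and both printed numbers equal $1/4$ up to rounding, so the bound is saturated.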

2.2 Correlated versus independent probes in phase estimation

The upshot of the C-R bound in the previous section is that in order to have high estimation precision, we need the generator $G$ to have a large variance with respect to the probe state $|\psi_0\rangle$. Let us consider now that instead of a single system undergoing the unitary transformation $U_\theta = \exp(-i\theta G)$, we have an ensemble of $N$ systems, each being rotated independently by the same unitary. If the initial state of the ensemble is $|\psi_0^{(N)}\rangle \in (\mathbb{C}^d)^{\otimes N}$, then the final state is
$$|\psi_\theta^{(N)}\rangle = U_\theta^{\otimes N}|\psi_0^{(N)}\rangle = \exp(-i\theta G^{(N)})|\psi_0^{(N)}\rangle,$$
where $G^{(N)} = \sum_{i=1}^N G_i$ is the 'total generator', i.e. the sum of the individual generators with $G_i$ acting on system $i$. By the C-R bound, the best possible precision is given by the inverse of the quantum Fisher information
$$F^{(N)} = 4\langle\psi_0^{(N)}|\left(G^{(N)}\right)^2|\psi_0^{(N)}\rangle.$$

We consider now the following two scenarios, with different initial states: product and entangled states. For concreteness we consider the case of a qubit ($d = 2$) with phase generator $G = \sigma_z$.
1) Standard (classical) scaling. In this case the input state is a product (uncorrelated) state $|\psi_0^{(N)}\rangle = |\psi_0\rangle^{\otimes N}$. The quantum Fisher information is
$$F^{(N)} = 4\langle\psi_0^{\otimes N}|\Big(\sum_{i=1}^N \sigma_{z,i}\Big)^2|\psi_0^{\otimes N}\rangle = 4\sum_{i=1}^N \langle\psi_0^{\otimes N}|\sigma_{z,i}^2|\psi_0^{\otimes N}\rangle + 4\sum_{i\neq j}\langle\psi_0^{\otimes N}|\sigma_{z,i}\sigma_{z,j}|\psi_0^{\otimes N}\rangle$$
$$= 4N\langle\psi_0|\sigma_z^2|\psi_0\rangle + 4\sum_{i\neq j}\langle\psi_0|\sigma_{z,i}|\psi_0\rangle\langle\psi_0|\sigma_{z,j}|\psi_0\rangle = 4N$$
where the second term is zero since $\langle\psi_0|\sigma_z|\psi_0\rangle = \langle\psi_0^{(N)}|G^{(N)}|\psi_0^{(N)}\rangle/N = 0$.
Therefore, the quantum Fisher information scales linearly in $N$, so the MSE scales like $1/N$ as in the case of classical estimation given $N$ independent data samples.
2) Heisenberg scaling. In this case the input state is an entangled state, for instance the GHZ state
$$|\psi_0^{(N)}\rangle := \frac{1}{\sqrt{2}}\left(|0\rangle\otimes\cdots\otimes|0\rangle + |1\rangle\otimes\cdots\otimes|1\rangle\right).$$
The total generator has variance
$$\langle\psi_0^{(N)}|\Big(\sum_{i=1}^N \sigma_{z,i}\Big)^2|\psi_0^{(N)}\rangle = \sum_{i=1}^N \langle\psi_0^{(N)}|\sigma_{z,i}^2|\psi_0^{(N)}\rangle + \sum_{i\neq j}\langle\psi_0^{(N)}|\sigma_{z,i}\sigma_{z,j}|\psi_0^{(N)}\rangle = N^2.$$

This can also be seen by noting that $|0\dots0\rangle$ and $|1\dots1\rangle$ are eigenvectors of $G^{(N)}$ with eigenvalues $N$ and $-N$ respectively; by measuring $G^{(N)}$ in the state $|\psi_0^{(N)}\rangle$ we obtain $\pm N$ with equal probabilities $1/2$, so that the variance of $G^{(N)}$ is $N^2$.
In conclusion, the MSE scales as $1/N^2$, which greatly outperforms the estimation precision achievable by product states, and classical sensing devices.
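The two scalings can be reproduced numerically for small $N$. The sketch below is illustrative only: it assumes $G = \sigma_z$ on each qubit and takes the product probe to be $|+\rangle^{\otimes N}$ (one choice with $\langle\sigma_z\rangle = 0$, not specified in the text), and the function names are hypothetical. Since the total generator is diagonal in the computational basis, its moments follow directly from the squared amplitudes.

```python
from itertools import product

def fisher_information(amplitudes, n_qubits):
    """F = 4 * Var(G) for G = sum_i sigma_z,i, with the state given in the computational basis."""
    mean = 0.0
    second = 0.0
    for bits, amp in zip(product((0, 1), repeat=n_qubits), amplitudes):
        g = n_qubits - 2 * sum(bits)       # eigenvalue of G on |bits>: +1 per 0, -1 per 1
        prob = abs(amp) ** 2
        mean += prob * g
        second += prob * g * g
    return 4 * (second - mean ** 2)

def product_plus_state(n_qubits):
    """Amplitudes of |+> tensored n_qubits times (uniform superposition)."""
    return [2 ** (-n_qubits / 2)] * (2 ** n_qubits)

def ghz_state(n_qubits):
    """Amplitudes of (|0...0> + |1...1>)/sqrt(2)."""
    amps = [0.0] * (2 ** n_qubits)
    amps[0] = amps[-1] = 2 ** -0.5
    return amps

fisher = {}
for n in (1, 2, 4, 8):
    fisher[n] = (fisher_information(product_plus_state(n), n),
                 fisher_information(ghz_state(n), n))
    print(n, fisher[n])
```

The first entry of each pair grows like $4N$ and the second like $4N^2$; for $N = 1$ the two states coincide.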
Quantum metrology is currently a very active research topic, both theoretically and experimentally, which
aims to exploit the quantum effects in high precision sensing devices. Although the above arguments indicate
that quantum correlations play an important role in enhancing the estimation precision, there are many open
problems which concern the practical implementation of these ideas. For instance, it is not entirely clear
which probe states are robust against various noises occurring in realistic experimental setups.

3 State discrimination
Another important quantum statistical problem is that of state discrimination or quantum hypothesis testing, which can be formulated as follows. Suppose that we are given a quantum system which is randomly prepared in one of two possible states $\rho_0$ and $\rho_1$, with probabilities $\pi_0$ and $\pi_1$. The expressions of $\rho_0$ and $\rho_1$ are known, but we don't know which of the two states has been prepared and we would like to make a 'good' guess at it. For that we perform a measurement $M$ whose outcome $H \in \{0, 1\}$ represents our guess. Its performance is measured by the average error probability

$$P_e = \pi_0 P(H = 1|\rho_0) + \pi_1 P(H = 0|\rho_1) = \pi_0\mathrm{Tr}(\rho_0 M_1) + \pi_1\mathrm{Tr}(\rho_1 M_0) \tag{4}$$

where $M_0, M_1$ are the POVM elements of $M$. The goal is to find the measurement with the smallest error probability.
Before stating the main result we need to introduce the concepts of positive and negative part of a selfadjoint operator, and that of the trace-norm. Let $A$ be such an operator having the spectral decomposition
$$A = \sum_i \lambda_i|e_i\rangle\langle e_i|,$$
where $\{|e_1\rangle, \dots, |e_d\rangle\}$ is an ONB of eigenvectors of $A$. By grouping the positive and negative eigenvalues into two separate sums we can write $A$ as the sum of its positive and negative parts
$$A = \sum_{i:\lambda_i\geq 0}\lambda_i|e_i\rangle\langle e_i| + \sum_{j:\lambda_j<0}\lambda_j|e_j\rangle\langle e_j| = P_+AP_+ + P_-AP_- := A_+ + A_-$$
where $P_+$ and $P_-$ are the projections onto the eigenspaces with positive and respectively negative eigenvalues
$$P_+ = \sum_{j:\lambda_j\geq 0}|e_j\rangle\langle e_j|, \qquad P_- = \sum_{j:\lambda_j<0}|e_j\rangle\langle e_j|.$$

From these definitions it follows that the absolute value of $A$ can be written as
$$|A| = \sum_i |\lambda_i|\cdot|e_i\rangle\langle e_i| = A_+ - A_-.$$

We define the trace-norm of $A$ by
$$\|A\|_1 = \mathrm{Tr}(|A|) = \sum_i |\lambda_i| = \mathrm{Tr}(A_+) - \mathrm{Tr}(A_-) = 2\mathrm{Tr}(A_+) - \mathrm{Tr}(A). \tag{5}$$

Theorem 1 (Helstrom). Let $\rho_0, \rho_1$ and $\pi_0, \pi_1$ be as above. Then the measurement with the smallest average error probability is given by the projective measurement $\{P_0, P_1\}$ where $P_0$ is the projection onto the eigenspace of positive eigenvalues of $\pi_0\rho_0 - \pi_1\rho_1$.
The optimal error probability is
$$P_e^* = \frac{1}{2}\left(1 - \|\pi_0\rho_0 - \pi_1\rho_1\|_1\right).$$

Proof. Let $M$ be an arbitrary measurement with POVM $\{M_0, M_1\}$. We rewrite its error probability as

$$P_e = \pi_0\mathrm{Tr}(\rho_0(I - M_0)) + \pi_1\mathrm{Tr}(\rho_1 M_0) = \pi_0 - \mathrm{Tr}\left[(\pi_0\rho_0 - \pi_1\rho_1)M_0\right].$$

Since $\pi_0$ is fixed, in order to minimise $P_e$ we have to find the maximum of

$$\mathrm{Tr}\left[(\pi_0\rho_0 - \pi_1\rho_1)M_0\right]$$

under the constraint that $0 \leq M_0 \leq I$. Let us decompose $\pi_0\rho_0 - \pi_1\rho_1$ into its positive and negative parts

$$\pi_0\rho_0 - \pi_1\rho_1 = P_0(\pi_0\rho_0 - \pi_1\rho_1)P_0 + P_1(\pi_0\rho_0 - \pi_1\rho_1)P_1.$$

Then

$$\mathrm{Tr}\left[(\pi_0\rho_0 - \pi_1\rho_1)M_0\right] = \mathrm{Tr}\left[(\pi_0\rho_0 - \pi_1\rho_1)(P_0 + P_1)M_0\right]$$
$$= \mathrm{Tr}\left[P_0(\pi_0\rho_0 - \pi_1\rho_1)P_0\cdot P_0M_0P_0\right] + \mathrm{Tr}\left[P_1(\pi_0\rho_0 - \pi_1\rho_1)P_1\cdot P_1M_0P_1\right]$$
$$\leq \mathrm{Tr}\left[P_0(\pi_0\rho_0 - \pi_1\rho_1)P_0\cdot P_0M_0P_0\right]$$
$$\leq \mathrm{Tr}\left[P_0(\pi_0\rho_0 - \pi_1\rho_1)P_0\right].$$

In the first line we have inserted a decomposition of the identity $I = P_0 + P_1$ and then used the fact that $P_0$ and $P_1$ commute with $\pi_0\rho_0 - \pi_1\rho_1$ to pass them on both sides of the product. The first inequality follows from the fact that the second term of the sum is negative, since $P_1(\pi_0\rho_0 - \pi_1\rho_1)P_1 \leq 0$ and $M_0 \geq 0$. The last inequality follows from $M_0 \leq I$.
Note that all the inequalities are saturated for $M_0 = P_0$, which means that $\{P_0, P_1\}$ is the optimal measurement. The optimal error is then

$$P_e^* = \pi_0 - \mathrm{Tr}\left(P_0(\pi_0\rho_0 - \pi_1\rho_1)P_0\right)$$
$$= \pi_0 - \frac{1}{2}\left(\|\pi_0\rho_0 - \pi_1\rho_1\|_1 + \pi_0 - \pi_1\right)$$
$$= \frac{1}{2}\left(1 - \|\pi_0\rho_0 - \pi_1\rho_1\|_1\right)$$

where in the second step we used (5) to express $\mathrm{Tr}(P_0(\pi_0\rho_0 - \pi_1\rho_1)P_0)$ in terms of the trace norm, and in the last step that $\pi_0 + \pi_1 = 1$.
In the exercises we will show that the probability of error is equal to zero if and only if the states $\rho_0$ and $\rho_1$ have orthogonal supports, or equivalently $\rho_0\rho_1 = 0$.
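To make the Helstrom formula concrete, here is a minimal sketch for qubit states with real entries. The example states, the equal priors and the function names are illustrative assumptions. It uses the closed-form eigenvalues of a real symmetric 2x2 matrix to evaluate the trace norm of $\pi_0\rho_0 - \pi_1\rho_1$ and then the optimal error $\frac{1}{2}(1 - \|\pi_0\rho_0 - \pi_1\rho_1\|_1)$.

```python
import math

def trace_norm_2x2(a):
    """Trace norm of a real symmetric 2x2 matrix, computed from its two eigenvalues."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    eigenvalues = ((tr + disc) / 2, (tr - disc) / 2)
    return sum(abs(x) for x in eigenvalues)

def helstrom_error(rho0, rho1, pi0):
    """Optimal average error probability for priors pi0 and 1 - pi0."""
    pi1 = 1 - pi0
    diff = [[pi0 * rho0[i][j] - pi1 * rho1[i][j] for j in range(2)] for i in range(2)]
    return 0.5 * (1 - trace_norm_2x2(diff))

rho_zero = [[1.0, 0.0], [0.0, 0.0]]     # |0><0|
rho_one = [[0.0, 0.0], [0.0, 1.0]]      # |1><1|
rho_plus = [[0.5, 0.5], [0.5, 0.5]]     # |+><+|

err_orthogonal = helstrom_error(rho_zero, rho_one, 0.5)
err_overlapping = helstrom_error(rho_zero, rho_plus, 0.5)
err_identical = helstrom_error(rho_zero, rho_zero, 0.5)
print(err_orthogonal, err_overlapping, err_identical)
```

Orthogonal states give zero error, identical states give the trivial guessing error $1/2$, and the pair $|0\rangle$, $|+\rangle$ lies in between at $\frac{1}{2}(1 - 1/\sqrt{2})$.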

REFERENCES
[1] E. Schrödinger, Are there quantum jumps?, British Journal of the Philosophy of Sciences, 3, 233-242 (1952).
[2] H. Häffner et al., Scalable multiparticle entanglement of trapped ions, Nature, 438, 643 (2005).
[3] Quantum Statistics course on MAGIC. http://maths.dept.shef.ac.uk/magic/course.php?id=181
[4] S. L. Braunstein and C. M. Caves, Statistical distance and the geometry of quantum states, Phys. Rev. Lett. 72, 3439 (1994).

Introduction to Quantum Information Science

Lecture 23: Entropy and entanglement

1 Source coding in classical information theory


Information theory deals with the general mathematical framework describing the quantification and manipulation of information, e.g. compression, encoding, transmission, and decoding. The subject was founded in 1948 by Claude Shannon with his seminal paper [1] "A Mathematical Theory of Communication", dealing with two key information theory problems: source coding and channel coding.
Coding means representing information (e.g. pictures, sounds, books) as a string of symbols from a finite set, e.g. $\{0, 1\}$ for binary representation. The source coding problem is how to code information in the most efficient way (i.e. using a small number of bits), while keeping the probability of error upon retrieving the information small. For instance one can convert an English text into binary by representing each of the 26 letters as a bit string of a fixed length. Since $\log_2 26 \approx 4.7$, we need 5 bits for each letter. However, this representation does not take into account the frequencies of different letters in an English text; while the letter 'e' has frequency 12.7%, the letter 'z' appears only 0.07% of the time. This difference in frequency can be exploited in a variable length code, where frequent letters are coded using a small number of bits (thus saving space) while infrequent ones are coded with more bits (which is fine since they occur rarely). Similarly, patterns in images can be exploited in order to significantly compress the information with limited quality loss.
But how do we quantify information, and what is the theoretical limit of information compression? Below we answer these questions using the key concepts of typical sequences and entropy.
A simple but efficient mathematical model is to consider that the messages to be encoded are random sequences of symbols emitted by a source (see Figure 1). We assume that the source output consists of a sequence of independent random variables $X_1, \dots, X_n$ taking values in a finite "alphabet", which for simplicity is chosen to be $\{0, 1\}$; we also assume that all variables have the same distribution $P(X_i = 0) = p(0) = p$ and $P(X_i = 1) = p(1) = 1 - p$. The problem is to find an encoding map $E_n : \{0, 1\}^n \to \{0, 1\}^m$ implementing source coding, and a decoding map $D_n : \{0, 1\}^m \to \{0, 1\}^n$ such that

$$P\left[D_n \circ E_n(X_1, \dots, X_n) \neq (X_1, \dots, X_n)\right] < \delta$$

for some fixed, small error probability $\delta > 0$.

Figure 1: Source coding problem (diagram: Information Source, Source Coding, Decoding). An information source produces a (random) message. The message is encoded using a source coding map $E_n$. The message can then be decoded with small probability of error using the decoding map $D_n$.

1.1 Typical sequences and entropy

A message of length $n$ is a sequence $x_1x_2\dots x_n \in \{0, 1\}^n$ of possible realisations of $X_1, \dots, X_n$, and its probability is
$$p(x_1, \dots, x_n) = p(x_1)\cdot p(x_2)\cdots p(x_n).$$

In order to compress the message, we need to know what a "typical" message looks like. If $p > 1/2$ then the message with the highest probability is $00\dots0$, but intuitively this is not what one would call a typical message. By the law of large numbers we rather expect that for large $n$ the proportions of different letters in a "typical" message should be close to their probabilities, i.e. the number of 0s is $n_0 \approx np$ and the number of 1s is $n_1 \approx n(1 - p)$.
For a typical sequence the probability is then approximated as

$$p(x_1, \dots, x_n) = p^{n_0}(1 - p)^{n_1} \approx p^{np}(1 - p)^{n(1-p)} = 2^{-nH(p)}.$$

On the right side the quantity $H(p)$ is the entropy of the probability distribution $(p, 1 - p)$

$$H(p) = -p\log p - (1 - p)\log(1 - p)$$

which can be seen to quantify the "degree of randomness" of the distribution. For more clarity we used the base 2 logarithm, but the same formula holds for the natural logarithm up to a constant factor. The entropy takes the value zero for the "deterministic" distributions ($p = 0$ and $p = 1$); its maximum is equal to 1 and occurs at $p = 1/2$, the most random distribution.
Assuming that the total probability of non-typical sequences is negligible, we then find that there must be about $2^{nH(p)}$ typical sequences, each with probability roughly $2^{-nH(p)}$ (see Figure 2). Unless $p = 1/2$ (completely random source), the number $2^{nH(p)}$ is exponentially smaller than the total number $2^n$ of sequences of length $n$. In terms of the number of bits needed to label these sequences, this is $m = nH(p)$, which means that the typical sequences can be compressed with a rate $H(p)$. The simple fact that many "natural" objects such as writing, speech, music, and images are not completely random but have a reduced entropy is at the basis of information compression techniques used for storing and transmitting information. A source coding map $E_n$ would be any map which puts the $2^{nH(p)}$ typical sequences in one-to-one correspondence with $m$-bit sequences, while the non-typical sequences are represented as a single special string. The decoding map $D_n$ is the inverse of $E_n$, which by definition recovers all typical sequences. The probability of error is then the probability that a sequence is non-typical.
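As a quick numerical illustration of the compression rate, the sketch below evaluates the binary entropy and the approximate number of bits $m = nH(p)$ needed to label the typical sequences. The message length and the values of $p$ are arbitrary illustrative choices, and the function name is hypothetical.

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with the convention 0 log 0 = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

n = 1000                                  # message length (illustrative)
for p in (0.5, 0.9, 0.99):
    h = binary_entropy(p)
    bits_needed = math.ceil(n * h)        # m = n H(p), rounded up to whole bits
    print(f"p = {p}: H(p) = {h:.3f}, about {bits_needed} bits for {n} symbols")
```

A fair source ($p = 1/2$) cannot be compressed at all, while a strongly biased source needs only a small fraction of the original length.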

Figure 2 (diagram: the set of typical sequences inside the much larger set of non-typical sequences): Typical sequences have approximately $np$ zeros and $n(1 - p)$ ones. The probability of a typical sequence is roughly $2^{-nH(p)}$ and the size of the set of typical sequences is roughly $2^{nH(p)}$. Although the number of non-typical sequences is much larger (if $p \neq 1/2$), the total probability of such sequences is negligible, at most $\delta$.

To better understand typical sequences, let us consider the logarithm of the probability of a sequence
$$\log p(x_1, \dots, x_n) = \sum_{i=1}^n \log p(x_i).$$

The law of large numbers says that for any $\epsilon, \delta > 0$ (which should be thought of as being small), and for $n$ large enough, the following inequality holds with probability at least $1 - \delta$ (which is close to one for small $\delta$)
$$\left|-\frac{1}{n}\sum_{i=1}^n \log p(X_i) - H(p)\right| \leq \epsilon. \tag{1}$$

The entropy $H(p)$ appears here as the expected value of $-\log p(X)$:
$$E\left(-\log p(X)\right) = -p\log p - (1 - p)\log(1 - p) = H(p).$$
We call a sequence $x_1, \dots, x_n$ typical if
$$2^{-n(H(p)+\epsilon)} \leq p(x_1, \dots, x_n) \leq 2^{-n(H(p)-\epsilon)}$$
or equivalently if
$$\left|-\frac{1}{n}\log p(x_1, \dots, x_n) - H(p)\right| \leq \epsilon.$$
By equation (1), the set $T(n, \epsilon)$ of typical sequences has total probability at least $1 - \delta$, which means that we can neglect the non-typical sequences as long as we choose an acceptable level of error probability $\delta$. Moreover, the number of typical sequences is bounded from above and below as
$$(1 - \delta)2^{n(H(p)-\epsilon)} \leq |T(n, \epsilon)| \leq 2^{n(H(p)+\epsilon)}.$$
This means that we can encode (label) all the typical sequences using $n(H(p) + \epsilon)$ bits. We say that we can achieve any rate of compression $R > H(p)$. On the other hand it can be shown that it is not possible to compress with a rate $R < H(p)$, i.e. there is no subset of sequences of size $2^{nR}$ and probability $1 - \delta$.
This is the content of Shannon's Source Coding Theorem [1]. Together with the Channel Coding Theorem, which describes the optimal encoding for messages passing through a noisy channel, it forms the foundation of classical Information Theory.
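The size and probability of the typical set can be computed exactly for moderate $n$ by grouping sequences according to their number of zeros. The sketch below is illustrative only: the values of $n$, $p$ and $\epsilon$ are arbitrary choices, the function name is hypothetical, and the lower bound is evaluated with $1 - \delta$ replaced by the actual probability of the typical set.

```python
import math

def typical_set_stats(n, p, eps):
    """Entropy, size and total probability of T(n, eps) for a source with P(0) = p."""
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    size, prob = 0, 0.0
    for k in range(n + 1):                        # k = number of zeros in the sequence
        log_prob = k * math.log2(p) + (n - k) * math.log2(1 - p)
        if abs(-log_prob / n - h) <= eps:         # typicality condition
            count = math.comb(n, k)               # number of sequences with exactly k zeros
            size += count
            prob += count * 2.0 ** log_prob
    return h, size, prob

n, p, eps = 200, 0.8, 0.1
h, size, prob = typical_set_stats(n, p, eps)
lower = prob * 2 ** (n * (h - eps))               # P(T) * 2^{n(H - eps)}
upper = 2 ** (n * (h + eps))                      # 2^{n(H + eps)}
print(f"H(p) = {h:.4f}, log2|T|/n = {math.log2(size) / n:.4f}, P(T) = {prob:.4f}")
print(lower <= size <= upper)
```

For these values the typical set carries most of the probability while containing only a tiny fraction of the $2^{200}$ sequences.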

2 Entanglement entropy
In previous lectures we have seen that in Quantum Information Theory entanglement is seen as an enabling resource for quantum protocols such as teleportation, superdense coding, cryptography, metrology, and quantum computation. Can this resource be quantified?
Before addressing this question we need to identify the key properties that entanglement is expected to satisfy, based on sound physical criteria. The general set-up is that of two parties (labs) Alice and Bob who share a pair of quantum systems with joint quantum state $|\psi_{AB}\rangle \in H_A \otimes H_B$. The two parties can only act locally on their system (e.g. they are spatially separated and can only perform unitary operations of the type $U_A \otimes U_B$), but they may use classical communication in their protocols (see Figure 3).
We would like to define the amount of entanglement $E(\psi_{AB}) \geq 0$ to quantify the strength of the quantum correlations between the two parties, which can then be used in implementing specific quantum protocols.
The following requirements are natural:

1. A state $|\psi_{AB}\rangle$ has zero entanglement, $E(\psi_{AB}) = 0$, if and only if it is a product state
$$|\psi_{AB}\rangle = |\psi_A\rangle \otimes |\psi_B\rangle.$$

2. Entanglement is invariant under local unitary transformations, i.e. $E(\psi_{AB}) = E((U_A \otimes U_B)\psi_{AB})$. If $|\psi_{AB}\rangle$ has Schmidt decomposition
$$|\psi_{AB}\rangle = \sum_{i=1}^r \sqrt{\mu_i}\,|e_i\rangle \otimes |f_i\rangle$$
this property implies that its entanglement depends only on its Schmidt coefficients
$$E(\psi_{AB}) = E(\{\mu_1, \dots, \mu_r\}).$$

3. The amount of entanglement cannot be increased by performing local operations and using classical communication (LOCC). In particular it is clear that LOCC operations cannot transform separable states into entangled states.

4. The maximally entangled qubit state (e.g. $|\psi_{AB}\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$) is by definition the unit of entanglement, i.e. $E(\psi_{AB}) = 1$.

5. Entanglement is additive. If the two labs share an additional pair of systems with state $|\phi_{AB}\rangle \in K_A \otimes K_B$ then the total entanglement of their two pairs is
$$E(\psi_{AB} \otimes \phi_{AB}) = E(\psi_{AB}) + E(\phi_{AB}).$$

Figure 3: Setup of local operations and classical communication (from [2]). Alice and Bob perform local quantum operations (LO) in their own laboratories and exchange classical communication (CC).

It turns out that there exists a unique function which satisfies the 5 properties. Moreover, there is a deep
connection with the source coding theory discussed above.

2.1 von Neumann entropy and entanglement entropy

Before defining our measure of entanglement we need to introduce the quantum analogue of the classical entropy.
Definition 1. Let $\rho$ be a density matrix on $\mathbb{C}^d$ with spectral decomposition $\rho = \sum_i \mu_i|e_i\rangle\langle e_i|$. The von Neumann entropy of $\rho$ is defined as
$$S(\rho) = -\mathrm{Tr}(\rho\log\rho) = -\sum_i \mu_i\log\mu_i = H(\{\mu_1, \dots, \mu_d\}).$$

The von Neumann entropy is thus the classical entropy of the probability distribution consisting of the eigenvalues of the density matrix. It indicates the degree of "mixedness" of the state: pure states have von Neumann entropy equal to zero, while the fully mixed state $I/d$ has maximum entropy $\log d$.
We can now define the entanglement measure of a pure bipartite state.
Definition 2. Let $|\psi_{AB}\rangle$ be a bipartite state on $H_A \otimes H_B$ with Schmidt decomposition
$$|\psi_{AB}\rangle = \sum_{i=1}^r \sqrt{\mu_i}\,|e_i\rangle \otimes |f_i\rangle.$$

The entanglement entropy of $|\psi_{AB}\rangle$ is defined as

$$E(|\psi_{AB}\rangle) = S(\rho_A) = S(\rho_B),$$

where $\rho_A$ and $\rho_B$ are the marginal (reduced) states of Alice and Bob, both of which have the Schmidt coefficients $\mu_1, \dots, \mu_r$ as their non-zero eigenvalues.
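The definition can be tried out on two-qubit pure states. The following is a minimal sketch under stated assumptions: the state is given by its amplitudes $c_{ij}$ in the product basis $|i\rangle_A|j\rangle_B$, the marginal state is computed as $\rho_A = CC^\dagger$, its two eigenvalues are obtained in closed form, and logarithms are taken in base 2. The function name and the example states are illustrative.

```python
import math

def entanglement_entropy_two_qubits(c):
    """c[i][j] is the amplitude of |i>_A |j>_B; returns S(rho_A) in bits."""
    # Marginal state of Alice: rho_A = C C^dagger (partial trace over Bob's qubit)
    rho = [[sum(c[i][k] * c[j][k].conjugate() for k in range(2)) for j in range(2)]
           for i in range(2)]
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    mus = [(tr + disc) / 2, (tr - disc) / 2]       # eigenvalues of rho_A
    return -sum(mu * math.log2(mu) for mu in mus if mu > 1e-15)

s = 2 ** -0.5
e_product = entanglement_entropy_two_qubits([[s, s], [0.0, 0.0]])       # |0> (|0> + |1>)/sqrt(2)
e_bell = entanglement_entropy_two_qubits([[s, 0.0], [0.0, s]])          # (|00> + |11>)/sqrt(2)
e_partial = entanglement_entropy_two_qubits([[0.9 ** 0.5, 0.0], [0.0, 0.1 ** 0.5]])
print(e_product, e_bell, e_partial)
```

The product state has zero entanglement entropy, the maximally entangled pair has entropy one, and the partially entangled state $\sqrt{0.9}\,|00\rangle + \sqrt{0.1}\,|11\rangle$ gives $H(0.9) \approx 0.469$.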

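Let us now investigate the properties of the entanglement entropy.
1. $E(|\psi_{AB}\rangle) = 0$ if and only if $\rho_A$ and $\rho_B$ are pure states, which means that $|\psi_{AB}\rangle$ is a product state $|\psi_A\rangle \otimes |\psi_B\rangle$.

2. By definition, the entanglement entropy depends only on the Schmidt coefficients, so it is invariant under local unitary transformations.

3. It is not directly evident from the definition that LOCC operations cannot increase the amount of entanglement, but below we come back to this when discussing how entanglement can be converted between multiple copies of states.
4. The maximally entangled qubit pair is the unit of entanglement with $E(|\psi_{AB}\rangle) = 1$, the entropy of a fair coin distribution. We will call such a bipartite state an ebit.
5. Entanglement is additive. If two pure bipartite states have Schmidt decompositions
$$|\psi_{AB}\rangle = \sum_{i=1}^r \sqrt{\mu_i}\,|e_i\rangle \otimes |f_i\rangle, \qquad |\phi_{AB}\rangle = \sum_{j=1}^s \sqrt{\nu_j}\,|a_j\rangle \otimes |b_j\rangle$$
then the tensor product has decomposition
$$|\psi_{AB}\rangle \otimes |\phi_{AB}\rangle = \sum_{i,j}\sqrt{\mu_i\nu_j}\,(|e_i\rangle \otimes |a_j\rangle) \otimes (|f_i\rangle \otimes |b_j\rangle)$$
where $|e_i\rangle \otimes |a_j\rangle$ and $|f_i\rangle \otimes |b_j\rangle$ are the Schmidt vectors on Alice's and respectively Bob's side. The marginal states are
$$\rho_A^{\psi\otimes\phi} = \rho_A^{\psi} \otimes \rho_A^{\phi}, \qquad \rho_B^{\psi\otimes\phi} = \rho_B^{\psi} \otimes \rho_B^{\phi}$$
where $\rho_A^{\psi}$ is the marginal of $|\psi_{AB}\rangle$ and $\rho_A^{\phi}$ is the marginal of $|\phi_{AB}\rangle$. Therefore, the entanglement entropy is
$$E(|\psi_{AB}\rangle \otimes |\phi_{AB}\rangle) = S(\rho_A^{\psi} \otimes \rho_A^{\phi}) = -\mathrm{Tr}\left(\rho_A^{\psi} \otimes \rho_A^{\phi}\,\log(\rho_A^{\psi} \otimes \rho_A^{\phi})\right)$$
$$= -\mathrm{Tr}(\rho_A^{\psi}\log\rho_A^{\psi}) - \mathrm{Tr}(\rho_A^{\phi}\log\rho_A^{\phi}) = E(|\psi_{AB}\rangle) + E(|\phi_{AB}\rangle).$$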
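Additivity can be checked directly at the level of Schmidt coefficients, since the coefficients of a tensor product are the pairwise products of the coefficients of the two factors. The sketch below uses two arbitrary illustrative coefficient sets and a hypothetical helper name.

```python
import math
from itertools import product

def shannon_entropy(probs):
    """Entropy in bits of a probability vector, ignoring zero entries."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

mu = [0.9, 0.1]              # Schmidt coefficients of the first state (illustrative)
nu = [0.5, 0.3, 0.2]         # Schmidt coefficients of the second state (illustrative)

joint = [m * v for m, v in product(mu, nu)]    # Schmidt coefficients of the tensor product
entropy_joint = shannon_entropy(joint)
entropy_sum = shannon_entropy(mu) + shannon_entropy(nu)
print(entropy_joint, entropy_sum)
```

The two printed values coincide up to rounding, as expected from $\log(\mu_i\nu_j) = \log\mu_i + \log\nu_j$.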

2.2 Entanglement distillation and dilution

Property 5 shows that the entanglement grows linearly with the number of copies of $|\psi_{AB}\rangle$. This means that even if a state has small entanglement, several copies of it can together have entanglement at least one, that is, the entanglement of a maximally entangled state. Does this mean that the multiple copies can be used to perform teleportation or some other protocol requiring a maximally entangled qubit pair?
The following Theorem shows that the answer is positive, at least in an asymptotic sense. It describes two key protocols of quantum information, entanglement distillation and entanglement dilution, dealing with the conversion of multiple copies of entangled states into maximally entangled qubits, and conversely.

Theorem 3. Let $|\psi_{AB}\rangle$ be a state with entanglement $E = E(\psi_{AB})$. Let $\epsilon, \delta > 0$ be small parameters.

1. For $n$ large enough there exists an LOCC operation mapping $n$ copies $|\psi_{AB}\rangle^{\otimes n}$ into $n(E - \epsilon)$ independent ebits (maximally entangled qubits), with probability $1 - \delta$.
2. For $n$ large enough there exists an LOCC operation mapping $n(E + \epsilon)$ ebits into $n$ copies of $|\psi_{AB}\rangle$, with probability $1 - \delta$.

Heuristic proof. A rigorous proof is beyond the scope of this lecture, and we refer to [3] for more details. However the main idea is quite simple. We explain this in the case when
$$|\psi_{AB}\rangle = \sqrt{p}\,|e_0\rangle \otimes |f_0\rangle + \sqrt{1 - p}\,|e_1\rangle \otimes |f_1\rangle$$
is a bipartite qubit state.
a) The marginal state of the tensor product $|\psi_{AB}\rangle^{\otimes n}$ is $\rho_A^{\otimes n}$ and has eigenbasis

$$|e_{x_1\dots x_n}\rangle := |e_{x_1}\rangle \otimes \cdots \otimes |e_{x_n}\rangle, \qquad x_1\dots x_n \in \{0, 1\}^n$$

with corresponding eigenvalues given by

$$p(x_1\dots x_n) = p(x_1)\cdots p(x_n).$$

b) At this point we make a connection with typical sequences and source coding. Recall that the set of typical sequences $T(n, \epsilon)$ has total probability at least $1 - \delta$ and contains about $2^{nE}$ sequences, each with probability roughly $2^{-nE}$.
Let
$$H(n, \epsilon) = \mathrm{Lin}\{|e_{x_1\dots x_n}\rangle : x_1\dots x_n \in T(n, \epsilon)\}$$
be the typical subspace spanned by vectors indexed by typical sequences, and let $H^\perp(n, \epsilon)$ be its orthogonal complement.
c) We perform a measurement consisting of the two projections onto the subspaces $H(n, \epsilon)$ and $H^\perp(n, \epsilon)$. By typicality, the state will be projected onto the typical subspace with probability at least $1 - \delta$.
d) Again by typicality, the conditional state is approximately equal to
$$|\tilde\psi_0\rangle = \sum_{x_1\dots x_n \in T(n,\epsilon)} \frac{1}{\sqrt{2^{nE}}}\,|e_{x_1\dots x_n}\rangle \otimes |f_{x_1\dots x_n}\rangle.$$

By local unitary transformations this state can be mapped to that of $m = nE$ ebits

$$|\psi_+\rangle^{\otimes m} = \sum_{i_1,\dots,i_m} \frac{1}{\sqrt{2^{nE}}}\,|i_1\dots i_m\rangle \otimes |i_1\dots i_m\rangle,$$

where $i_k \in \{0, 1\}$ is the index of the $k$th qubit.

This concludes the heuristic argument that $n$ copies of a bipartite state with entanglement $E$ can be converted into approximately $m = nE$ ebits. A similar argument can be used to show that $nE$ ebits can be "diluted" into $n$ copies of the entangled state.
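The role of the typical subspace in steps b) to d) can be illustrated numerically. The sketch below is a rough illustration rather than a distillation protocol: it assumes the qubit state above with $p = 0.8$ and $\epsilon = 0.05$ (arbitrary choices, with a hypothetical function name), and for growing $n$ it computes the dimension of the typical subspace, the probability of projecting onto it, and the resulting number of ebits per copy, $\log_2(\dim)/n$, which should approach $E = H(p)$ within $\epsilon$.

```python
import math

def typical_subspace(n, p, eps):
    """E = H(p), dimension of H(n, eps), and probability of projecting onto it."""
    e = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    dim, prob = 0, 0.0
    for k in range(n + 1):                      # k = number of indices equal to 0
        log_eig = k * math.log2(p) + (n - k) * math.log2(1 - p)   # log2 of p^k (1-p)^(n-k)
        if abs(-log_eig / n - e) <= eps:        # eigenvector indexed by a typical sequence
            mult = math.comb(n, k)              # multiplicity of this eigenvalue
            dim += mult
            prob += mult * 2.0 ** log_eig
    return e, dim, prob

results = []
for n in (50, 200, 800):
    e, dim, prob = typical_subspace(n, p=0.8, eps=0.05)
    rate = math.log2(dim) / n                   # ebits per copy from the typical subspace
    results.append((n, rate, prob))
    print(f"n = {n}: E = {e:.4f}, ebits per copy = {rate:.4f}, success probability = {prob:.3f}")
```

As $n$ grows the success probability increases towards one while the rate stays within $\epsilon$ of $E$, in line with the asymptotic statement of the theorem.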
A separate argument shows that if a state $|\psi_{AB}\rangle$ can be converted into $|\phi_{AB}\rangle$ by LOCC operations then $E(\psi_{AB}) \geq E(\phi_{AB})$ [3].

References:
[1] C. E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, 27 379-423
& 623-656 (1948)
[2] S. Virmani and M. Plenio, An introduction to entanglement measures, Quantum Information and Compu-
tation 7 1-51 (2007)
[3] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University
Press (2000)
