
IISER Pune

Introductory Quantum Physics, PHY 2113


August 2024 Semester

August 21, 2024


Contents

1 The Quantum World
  1.1 Classical mechanics
  1.2 Stability of atoms
  1.3 Double-slit experiments
  1.4 Photons and Planck's constant
  1.5 Matter waves and de Broglie relation
  1.6 Photoelectric effect
  1.7 Black-Body Radiation
  1.8 Contradiction?

2 Waves in classical and quantum physics
  2.1 Wave packets and uncertainty relation
  2.2 Stationary phase approximation
  2.3 Properties of waves
  2.4 Wave function and Schrödinger equation

1 The Quantum World

1.1 Classical mechanics

In classical mechanics we study the laws governing the motion of bodies. Experiments
indicate that the laws are the same for a wide range of bodies: masses from 1 gram to
thousands of kg, spatial sizes from 1 mm to 10^4 km, and velocities from 0 to around 10^4 m/s.
We call such numbers “macroscopic”. Roughly they are tied to what the human senses can
perceive.
The above numbers were chosen somewhat arbitrarily. The point is that there is a range of
masses, sizes and velocities both below and above the ranges given above, where the laws of
nature were not tested for most of history. We need powerful microscopes to study small
objects, of a size – say – 10−6 m, and telescopes to study the sizes and distances of objects
far from the earth. Similarly we require sophisticated instruments to weigh an object having
a mass around 10−6 gm, or study the motion of an object moving close to the speed of light.
It is when such instruments came into being, and were perfected over the centuries, that we
began to understand the world beyond what we can directly experience.
Such values, and smaller ones, can be called “microscopic”. Instruments to measure very
microscopic quantities accurately did not exist until the late 19th/early 20th century. Until
that time, all we could say was that classical mechanics seemed to be valid for all macroscopic
objects.
Mass, distance and velocity are not the only things we can study. We also measure momenta,
angular momenta, energies, charges, magnetic fields etc. The comments above are applicable
to all such quantities: there is a range of the quantity that is familiar to us, as well as a range
that lies outside personal experience. Note that this will always be true – even with the best
instruments today, there are limits to the values of observable quantities that we can study
with these instruments. Understanding Physics better requires better instruments.
Probing the laws of physics accurately beyond the macroscopic domain became possible
only by the start of the 20th century. Some of the developments that made this possible
took place in the previous century, the 19th. This century saw the gradual discovery of
atomic structure, indirectly through chemical reactions, Brownian motion etc. This was
confirmed in detail only in the 20th century, when the electron was discovered and scattering
experiments were done on materials. All these developments took place so late in history for
two reasons: (i) good instruments were lacking previously and technologies to make them
were continuously being developed, (ii) researchers took time to distinguish between many
different types of novel behaviour. For example, atoms exhibit “strange behaviour” like
fluorescence, phosphorescence, x-ray emission and radioactive decay. While the first two are
visible to the naked eye, “invisible” radiation was rendered visible much later by the invention
of photography. This itself developed slowly over centuries until it became a practical tool –
again, in the 19th century. And another key development in the 19th century, which played
a key role in 20th century developments, was the formulation of a complete set of laws of
electromagnetism.
When researchers started to think about atoms, they naturally assumed that classical me-
chanics applies. But in many different ways, experiment told them otherwise. Sometimes it
was a new experiment that flatly contradicted what was expected, and we will see examples.
Other times the experiment had in some sense already been done, and people knew what
the result was. Yet it took time to realise that this result disagreed with existing laws.
For concreteness let us recall some of the simplest laws of classical physics. We have Newton's
law:
    \vec{F} = m\vec{a} = m \frac{d^2 \vec{x}}{dt^2}     (1.1)
which, for constant external force, can be integrated to give:
    \vec{v} = \frac{\vec{F}}{m}\, t     (1.2)
for a particle starting at rest. This equation predicts that if we apply a constant force to
a particle then after a sufficiently long time, it will attain any velocity – however large.
However, in the 20th century it was realised that the speed of light is the maximum velocity
attainable by a particle (this is the subject of Special Relativity: the study of the mechanics
of particles at large velocities, which we can only discuss briefly here). Thus the above formula
must be wrong.
Indeed, it is wrong, or rather incomplete. There are corrections to it of order v^2/c^2 and higher,
which form an infinite series. The correct formula is the one from special relativity (we
assume motion in one dimension for simplicity):
    F = \frac{m}{\left(1 - \frac{v^2}{c^2}\right)^{3/2}} \frac{d^2 x}{dt^2}     (1.3)
where v = \frac{dx}{dt} and c is the speed of light. Notice that for v ≪ c, this formula reduces to the
familiar one from classical physics.
Integrating the above formula, we get:
    v(t) = \frac{1}{\sqrt{1 + \left(\frac{Ft}{mc}\right)^2}} \frac{F}{m}\, t     (1.4)
and we see that the velocity of a particle does not increase without limit. Instead, as v
increases, the increase in velocity is less and less rapid and at very large time the velocity
just approaches c. Notice that if Ft/m ≪ c, then the above formula reduces to the classical one
with corrections:
    v(t) = \frac{F}{m}\, t \left[ 1 - \frac{1}{2}\left(\frac{Ft}{mc}\right)^2 + \cdots \right]     (1.5)
These corrections are clearly small if Ft/m ≪ c. For example take a car of mass 1000 kg and a
force F = 10^3 N, which is the typical force generated by its engine. Then:
    \frac{Ft}{mc} \simeq 3 \times 10^{-9}\, t     (1.6)
where t is given in seconds. So, if such a car managed to keep up the acceleration continuously
for 10 years, and if friction, air resistance etc were completely negligible, only then would
the effects of relativity become important for it (see Footnote 1).
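If you want to play with this estimate, here is a small Python sketch using the same illustrative numbers as above (a 1000 kg car and a 10^3 N force); it simply evaluates the ratio Ft/(mc) and the leading correction term in Eq. (1.5):

# Rough numerical check of Eq. (1.6): how large does Ft/(mc) get for a car?
# Illustrative numbers only (the mass, force and durations are the ones quoted
# in the text, not measured values).
m = 1000.0        # mass of the car in kg
F = 1.0e3         # engine force in N
c = 3.0e8         # speed of light in m/s

seconds_per_year = 3.15e7
for years in (1, 10, 100):
    t = years * seconds_per_year
    x = F * t / (m * c)                 # the dimensionless ratio Ft/(mc)
    correction = 0.5 * x**2             # leading relativistic correction in Eq. (1.5)
    print(f"{years:4d} yr: Ft/(mc) = {x:.3e}, fractional correction ~ {correction:.3e}")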
The above discussion illustrates how physical laws can require modification when we go
outside familiar domains of parameters. However, certain experiments in the early 20th
century revealed that the laws of classical physics completely fail to give correct answers for
atoms and more generally for microscopic length/energy scales. Despite attempts, it was not
possible to “repair” this failure by modifying the existing equations. Thus we were forced
to introduce an entirely new set of laws of physics that replace classical laws. These are
the laws of quantum mechanics. It took many years (and many wrong steps) before this
conclusion was reached and the underlying structure of quantum mechanics was uncovered.
In quantum theory, one is forced to give up classical concepts like position and velocity of
a particle. These are replaced by new concepts. In this sense, QM is more radical than
special relativity. The structure of the quantum theory is very surprising and unfamiliar at
first, if one is trained in classical physics. But once we understand it well, it proves to be
a highly consistent structure with its own logical rules. And unlike the classical theory, it
gives excellent agreement with experiments in the microscopic domain.
There remains a puzzle. Imagine a bridge that was built in the 17th century using concepts of
Newtonian mechanics. It is still standing. But now we have rejected Newtonian mechanics,
so how do we understand the fact that the bridge is still standing? Another way to ask this
question is the following. Where is the exact point of transition between the macroscopic and
microscopic world, such that classical mechanics works in the first case but has to be replaced
by quantum mechanics in the second? The answer is that there is no such point of transition.
Instead, quantum mechanics gradually and effectively reduces to classical mechanics in the
macroscopic world. This is called the correspondence principle. The parameter that governs
the transition is called Planck's constant h (sometimes we use ℏ = h/2π). Quantum mechanics
is significant only when typical quantities are smaller than ℏ (see Footnote 2).
Footnote 1: This rough calculation teaches us something important. To observe relativistic effects such as the above, we need to consider very small objects and accelerate them in a vacuum to avoid any resistance. This is the seed of the idea of a particle accelerator! There is much more to it though, since for small objects quantum mechanics also plays a role.

Footnote 2: This is a subtle point since we need to know the dimensions of ℏ in order to compare it with typical quantities, so we will come back to it later.

Even basic quantum mechanics, the subject of this course, is not sufficient to solve all the
experimental puzzles of the 20th century. The formulae of quantum mechanics, including
the famous Schrödinger equation, do not agree with experiment when we deal with particles
moving at speeds close to the speed of light, or particles (like the photon) which always
travel at the speed of light. If we are studying a domain where both sizes are microscopic
and velocities are large, quantum mechanics has to be generalised to “Relativistic Quantum
Mechanics” and ultimately to “Quantum Field Theory”. The latter is the most accurate
class of physical theories known.

1.2 Stability of atoms

Our first example of a breakdown of classical mechanics comes from a simple observation.
Experiments in the early 20th century suggested a picture of an atom as having a heavy
nucleus at the core, with electrons whirling around it in orbits due to the central Coulomb
force. But it was soon realised that something is wrong with this picture. In circular motion,
or any motion with closed orbits, a particle is constantly being accelerated. This acceleration
is due to the central force, which curves the path into an orbit (without an acceleration, the
electrons would travel in straight lines rather than being in an orbit).
Now any charged particle, when accelerated, necessarily emits electromagnetic radiation. In
1897, Larmor used classical mechanics and electromagnetism to derive a formula for the
power radiated away by a particle of charge q subjected to an acceleration a:

    \frac{dE}{dt} = \frac{2}{3} \frac{q^2 a^2}{4\pi\epsilon_0 c^3}     (1.7)

Here, c is the speed of light and the permittivity of the vacuum is ϵ0 = 8.85×10−12 farads/m.
Let us accept this formula without proof here (the proof is taught in more advanced elec-
tromagnetism courses). Note that the above energy is being lost to radiation, so the rate of
change of energy of the electron is minus the above amount.

Exercise 1.1. Check the dimensions on both sides of this formula and verify that they
agree. Dimensional analysis is extremely important in Physics!

A convenient way to handle the annoying constant ϵ0 is to realise that:

    1 \text{ farad/metre} = 1 \text{ Coulomb}^2/\text{Joule-metre}     (1.8)

Exercise 1.2. Verify the above result.

Thus the ratio \frac{q^2}{4\pi\epsilon_0}, which occurs in many formulae, has dimensions of energy times distance,
and therefore can be expressed in units of J-m. So in the formulae we will discuss here, we
can just use the fact that:
    Q^2 \equiv \frac{e^2}{4\pi\epsilon_0} = \frac{(1.6 \times 10^{-19})^2}{4\pi \times 8.85 \times 10^{-12}} \simeq 2.3 \times 10^{-28} \text{ J-m}     (1.9)
where e = 1.6 \times 10^{-19} Coulombs is the charge of the electron.
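As a quick check of Eq. (1.9), one can evaluate Q^2 numerically; a minimal Python sketch:

# Numerical check of Eq. (1.9): Q^2 = e^2/(4*pi*eps0) in J-m.
import math

e = 1.6e-19        # electron charge in Coulombs
eps0 = 8.85e-12    # permittivity of vacuum in farad/m = C^2/(J-m)

Q2 = e**2 / (4 * math.pi * eps0)
print(f"Q^2 = {Q2:.2e} J-m")   # expect roughly 2.3e-28 J-m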


Hence for electrons we can replace Larmor’s formula by:

    \frac{dE}{dt} = \frac{2}{3} \frac{Q^2 a^2}{c^3}     (1.10)
and use the value of Q^2 given in the above equation.
Now if classical mechanics is correct, the Hydrogen atom consists of an electron in roughly
circular motion around the proton, bound by the Coulomb attraction. But then the Larmor
formula says that the electron has to continuously lose energy and spiral in to the centre of
the atom. This seems like a contradiction, since the Hydrogen atom is stable as far as we
know.
To understand whether this problem is significant, we need to estimate how long it would
take the electron to fall to the centre of the atom. For example if this “fall time” comes out
to be longer than the age of the universe then there is no problem! We could accept the
idea that all atoms slowly collapse over billions of years. However if the timescale for this
collapse is short, then the theory would be in direct conflict with the observed stability of
atoms. So let’s estimate how much energy an electron has in the Hydrogen atom, and how
much it loses due to radiation, according to classical mechanics.
The energy in a classical orbit is the sum of kinetic and potential energies:
    E = \frac{1}{2} m v^2 - \frac{e^2}{4\pi\epsilon_0 r} = \frac{1}{2} m v^2 - \frac{Q^2}{r}     (1.11)
where m is the mass of the electron. Now there is a relation between the two terms above.
In a stable orbit, the centrifugal force balances the Coulomb force:
    \frac{m v^2}{r} = \frac{Q^2}{r^2}     (1.12)
From this, we find:
    \frac{1}{2} m v^2 = \frac{1}{2} \frac{Q^2}{r}     (1.13)
So the kinetic energy is half the magnitude of the potential energy. Then the total energy
for an orbiting electron in classical physics is:
    E = -\frac{1}{2} \frac{Q^2}{r}     (1.14)
This energy is negative because the electron is bound – a bound particle requires energy to
unbind it and take it far away from the centre of force. We see, as expected, that the total
energy is more negative, i.e. the electron is more tightly bound, for tighter orbits (smaller r).
Let’s allow the radius to change with time due to radiation: r = r(t). Then, differentiating
the above formula, we find:
    \frac{dE}{dt} = \frac{1}{2} \frac{Q^2}{r^2} \frac{dr}{dt}     (1.15)
Next let us calculate the power radiated by the electron according to Larmor’s formula. The
acceleration experienced by an electron at a distance r from the proton is:
    a = \frac{F}{m} = \frac{Q^2}{r^2 m}     (1.16)
Inserting this into Larmor's formula, we get:
    \frac{dE}{dt} = -\frac{2}{3} \frac{Q^6}{c^3 r^4 m^2}     (1.17)

Now we equate Eq. (1.15) with Eq. (1.17):
    \frac{1}{2} \frac{Q^2}{r^2} \frac{dr}{dt} = -\frac{2}{3} \frac{Q^6}{c^3 r^4 m^2}     (1.18)
from which:
    r^2 \frac{dr}{dt} = -\frac{4}{3} \frac{Q^4}{c^3 m^2} \equiv -K     (1.19)
where K ≃ 3 × 10^{-21} m^3/s.

Exercise 1.3. By looking up the value of m (the electron mass) and c, the speed of light,
verify the value of K. Keep in mind that we are not looking for high precision but just for
an order of magnitude estimate.

When we integrate, we must set r = r0 at the initial time t = 0 where r0 ∼ 5.3 × 10−11 m.
This is the “Bohr radius”, whose value is deduced from experiment: the initial electron
energy is equal to the energy that has to be supplied to a Hydrogen atom to knock out the
electron (ionise it). This is 13.6 electron-volts, which translates to around 2 × 10−18 joules.
From Eq. (1.14) we then recover the above value of r0 .

Exercise 1.4. Calculate r0 as outlined above.

Integrating the above equation, we get:
    r(t) = \left( r_0^3 - 3Kt \right)^{1/3}     (1.20)
and the "fall time" for the electron to reach the origin is therefore:
    t_{\rm fall} = \frac{r_0^3}{3K} \simeq 1.6 \times 10^{-11} \text{ s}     (1.21)

If classical physics is true, the Hydrogen atom should collapse on this timescale! This clearly
contradicts experiment.

Exercise 1.5. Verify this calculation for the fall time.
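The following short Python sketch carries out the order-of-magnitude estimates asked for in Exercises 1.3–1.5, using rounded values of the constants:

# Order-of-magnitude check of K (Eq. 1.19), r0 (Exercise 1.4) and t_fall (Eq. 1.21).
Q2 = 2.3e-28      # e^2/(4*pi*eps0) in J-m, from Eq. (1.9)
m = 9.1e-31       # electron mass in kg
c = 3.0e8         # speed of light in m/s

K = (4.0 / 3.0) * Q2**2 / (c**3 * m**2)     # Eq. (1.19), in m^3/s
E_ion = 13.6 * 1.6e-19                      # ionisation energy of Hydrogen in J
r0 = Q2 / (2 * E_ion)                       # from |E| = Q^2/(2 r0), Eq. (1.14)
t_fall = r0**3 / (3 * K)                    # Eq. (1.21)

print(f"K      ~ {K:.1e} m^3/s")    # expect ~3e-21
print(f"r0     ~ {r0:.1e} m")       # expect ~5.3e-11 (Bohr radius)
print(f"t_fall ~ {t_fall:.1e} s")   # expect ~1.6e-11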

Exercise 1.6. (optional, for more advanced students.) Notice that in the above arguments, I
used the non-relativistic Larmor formula. This may not have been a good idea! The electron
inside a Hydrogen atom may, in the classical picture, be whirling at relativistic speeds; if so,
I should have used a relativistic generalisation of the formula. Look up this generalisation
and apply it to the present problem to see if it changes our conclusions.

Exercise 1.7. Consider the analogous phenomenon for a gravitational orbit such as that of
the moon around the earth. We might "guess" an analogue of Larmor's formula by replacing
q^2/(4\pi\epsilon_0) with G M_1 M_2, where G is Newton's constant and M_1, M_2 are the masses of the earth
and moon. We would replace r by the earth-moon distance. Now, how much is the power
radiated by the moon and how does it compare to the energy of the moon? How long would
it take the moon to spiral into the earth? (See Footnote 3.)

Footnote 3: Note that this is not a correct formula, since gravitational radiation differs in important ways from electromagnetic radiation. However it never hurts to make a guess – which is at least dimensionally correct – and thereby estimate a result.

To summarise this section, stability of the Hydrogen atom shows that the classical mechanics
picture of an atom must be wrong, and needs to be replaced with something else.

1.3 Double-slit experiments

A classic set of experiments beautifully highlights the departure of the real world from clas-
sical behaviour and the need to replace classical mechanics by a new theory at the subatomic
scale. These are the double-slit experiments. We will consider "idealised experiments" (the
real experiments have indeed been carried out) in four different situations. Conceptually
the setup is the same in all four cases: a source, a wall with two parallel “slits” A and B,
and a detector screen at the back. However, in different situations the actual experimental
apparatus used may be quite different, as historically these experiments were done at
different times and places.
These experiments are related to the question: is light a wave or a particle? Newton thought
it is made of particles, but later on Young argued it is a wave and showed interference fringes.
The experiments that we describe below indicate that it is both. More precisely, light shows
wave behaviour and particle behaviour in different regimes. On the other hand, electrons
are obviously particles and no one seriously thought they may be waves. Yet, electrons also
show both wave behaviour and particle behaviour in different regimes. Today we know that
both light and electrons, and indeed all fundamental objects, are quantum fields, which are
capable of exhibiting both particle-like and wave-like behaviour under different conditions.

Macroscopic world

Figure 1: Light incident on a double-slit.

1. Light at normal intensities. Consider a source that emits a monochromatic beam of


light. This is the celebrated experiment performed by Young in 1801. Let the source have
a normal intensity equivalent to, say, a light bulb. The experiment is performed for three
cases: (i) with only A open, (ii) with only B open, (iii) with both slits open.
As is well-known, we see normal intensity patterns in cases (i) and (ii), peaked at the points
behind A and B. But in case (iii) when both the slits are open, we see an interference pattern
as shown in the figure. This pattern is produced by alternating constructive and destruc-
tive interference between the light waves emanating from the two slits. The relative path
difference between them is responsible for the alternation, and the separation ∆x between
neighbouring fringes is given by the well-known formula:
    \Delta x = \frac{\lambda d}{a}     (1.22)
where a is the slit separation, d is the distance from the slits to the detector screen and λ is
the wavelength of light.
This formula is very familiar, but let's give a simple derivation. The path length of light going
through the lower slit and landing at some point marked x is L_B = \sqrt{d^2 + x^2}. Similarly,
the path length of light going through the upper slit and landing at the same point is
L_A = \sqrt{d^2 + (a - x)^2}. By assuming x, a ≪ d, we find that the difference L_A - L_B is
approximately \frac{a^2}{2d} - \frac{a}{d}x. Now suppose we shift x to x + \Delta x. The change in the path
difference is \frac{a}{d}\Delta x. When \Delta x is a fringe width, the path difference must change by precisely
one wavelength. This gives the above formula.
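As a feel for the numbers, here is a tiny Python sketch evaluating Eq. (1.22) for illustrative values of wavelength, slit separation and screen distance (these numbers are chosen for convenience, not taken from Young's actual setup):

# Numerical example of the fringe-width formula Delta_x = lambda*d/a (Eq. 1.22).
lam = 600e-9     # wavelength of light in m (orange light)
d = 1.0          # slit-to-screen distance in m
a = 0.5e-3       # slit separation in m

delta_x = lam * d / a
print(f"fringe spacing = {delta_x*1e3:.2f} mm")   # ~1.2 mm, easily visible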
Next let’s recall some facts about waves. The standard expression for the amplitude α of a
freely propagating light wave is:

α(t, ⃗x) = αmax cos(ωt − ⃗k · ⃗x) (1.23)

In a mechanical wave, the amplitude is the displacement of the oscillating medium from its
equilibrium position. For a light wave, it is the displacement of the electromagnetic field, a
more advanced concept. Note that we are taking the wave to be real; a sine function is also
allowed, but we omit it for simplicity.
Here, ω is called the angular frequency of the wave, which is the rate of change of the phase
per unit time. ⃗k is called the wave vector; it is the rate of change of the phase per unit
distance. The angular frequency is related (see Footnote 4) to the magnitude of the wave
vector by the speed of light c:
    \omega = c\, |\vec{k}|     (1.24)

Footnote 4: Keep in mind that this relation is specifically for light.
The maximum amplitude of the wave is determined by the coefficient αmax .
Some related quantities used in wave mechanics are:

    frequency:   \nu = \frac{\omega}{2\pi} = \frac{c\, |\vec{k}|}{2\pi}
    period:      T = \frac{2\pi}{\omega}                              (1.25)
    wavelength:  \lambda = \frac{2\pi}{|\vec{k}|} = \frac{c}{\nu}

Now let us keep only slit A open and let the amplitude of the wave falling on the screen be
αA . Then repeat keeping only B open, leading to an amplitude αB . We have:

αA = αmax cos(ωt − ⃗kA · ⃗xA )


αB = αmax cos(ωt − ⃗kB · ⃗xB )

(it is possible for ⃗kA and ⃗kB to differ in direction even though they have the same magnitude.)
The intensity of a beam is the power delivered per unit cross-sectional area. It is proportional
to the square of the amplitude: I(t, ⃗x) = Kα2 (t, ⃗x) where K is some constant which will not
be important for us. Thus the intensities of the beams going only through slit A or slit B
are:

    I_A = K \alpha_A^2 = K \alpha_{max}^2 \cos^2(\omega t - \vec{k}_A \cdot \vec{x}_A)
    I_B = K \alpha_B^2 = K \alpha_{max}^2 \cos^2(\omega t - \vec{k}_B \cdot \vec{x}_B)

If both slits A and B are kept open then the waves are superposed. This means the amplitudes
should be added. The amplitude of the superposed wave is:
    \alpha_{total} = \alpha_A + \alpha_B = \alpha_{max} \left[ \cos(\omega t - \vec{k}_A \cdot \vec{x}_A) + \cos(\omega t - \vec{k}_B \cdot \vec{x}_B) \right]     (1.26)
It follows that
    I_{total} = K \alpha_{total}^2 = I_A + I_B + 2K \alpha_A \alpha_B

The last term is responsible for interference. To see this, let's take time averages over a full
period T = 2\pi/\omega. Denoting the time average by a bar, we have:
    \overline{\cos^2(\omega t - X)} = \frac{\int_0^T dt\, \cos^2(\omega t - X)}{\int_0^T dt} = \frac{1}{2}     (1.27)
for any X. Thus
    \overline{I_A} = \overline{I_B} = \frac{1}{2} K \alpha_{max}^2     (1.28)
Then,
    \overline{I_{total}} = \overline{I_A} + \overline{I_B} + 2K\, \overline{\alpha_A \alpha_B} = K \alpha_{max}^2 + 2K\, \overline{\alpha_A \alpha_B}
To calculate the last term, we use:
    \overline{\alpha_A \alpha_B} = \alpha_{max}^2\, \overline{\cos(\omega t - \vec{k}_A \cdot \vec{x}_A) \cos(\omega t - \vec{k}_B \cdot \vec{x}_B)}     (1.29)

We have:
    \cos X \cos Y = \frac{1}{2} \left[ \cos(X + Y) + \cos(X - Y) \right]     (1.30)
from which:
    2\, \overline{\alpha_A \alpha_B} = \alpha_{max}^2 \left[ \overline{\cos(2\omega t - \vec{k}_A \cdot \vec{x}_A - \vec{k}_B \cdot \vec{x}_B)} + \cos(\vec{k}_A \cdot \vec{x}_A - \vec{k}_B \cdot \vec{x}_B) \right]     (1.31)
Using
    \overline{\cos(2\omega t - X)} = 0     (1.32)
for any X, and combining everything, we find:
    \overline{I_{total}} = K \alpha_{max}^2 \left[ 1 + \cos(\vec{k}_A \cdot \vec{x}_A - \vec{k}_B \cdot \vec{x}_B) \right]     (1.33)

We see that the intensity of the light beam obtained by superposing the beams through slit
A and slit B depends crucially on the cosine term above, which is time-independent but
space-dependent, and varies between −1 and +1. When it is +1, the total intensity is
    \overline{I_{total}} = 2 K \alpha_{max}^2     (1.34)

However when it is −1 the total intensity vanishes!
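If you wish to see Eq. (1.33) emerge numerically, the following Python sketch time-averages the superposed intensity directly and compares it with the formula; the constants K, α_max and ω are arbitrary choices:

# Time-average K*(alpha_A + alpha_B)^2 over one period and compare with
# K*alpha_max^2*(1 + cos(delta)), where delta is the phase difference (Eq. 1.33).
import numpy as np

K, alpha_max, omega = 1.0, 1.0, 2 * np.pi        # arbitrary units
t = np.linspace(0, 1.0, 20001)                   # one full period, finely sampled

for delta in (0.0, np.pi / 3, np.pi / 2, np.pi): # phase difference k_A.x_A - k_B.x_B
    alpha_A = alpha_max * np.cos(omega * t)
    alpha_B = alpha_max * np.cos(omega * t - delta)
    I_avg = np.mean(K * (alpha_A + alpha_B) ** 2)        # time-averaged total intensity
    formula = K * alpha_max**2 * (1 + np.cos(delta))     # Eq. (1.33)
    print(f"delta = {delta:.2f}: averaged = {I_avg:.3f}, formula = {formula:.3f}")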


Thus, interference arises when we add amplitudes and then square them to get intensities. It
is the cross term which arises on squaring that gives rise to interference. This behaviour is
normal for waves in classical optics. We can say that light of a moderate amplitude behaves
just like a classical wave (see Footnote 5).

Footnote 5: We didn't specify what a "normal" or "moderate" amplitude is. We will get a better feeling for this later on.
2. Bullets. Next we consider a source that emits identical “bullets” of some material
weighing, say, a few grams each. A gun emits these bullets at a steady rate in a random
direction towards the screen. We perform the experiment (i) with only slit A open, (ii) with
only B open, (iii) with both open.
The results are as follows. In case (i) the detector screen records a distribution of bullets
peaked at the point behind A and tapering off symmetrically on either side. In case (ii) we get
a similar distribution peaked behind B. In case (iii) we get the sum of the two distributions,
which is symmetrically placed around the point on the screen behind the mid-point of A and
B, as seen in the figure.

Figure 2: The intensity patterns for bullets incident on a double-slit.

Bullets are individual objects: each bullet can pass either through A or through B
but not both. In our system a bullet cannot "split" into smaller objects; therefore a given
bullet cannot go partially through one slit and partially through the other. Also, we can
adjust our gun to fire just one bullet at a time. The images on the detector screen for each
of the three cases will then slowly build up over time.
If the average intensity of the image is IA when slit A is open (shown in red) and IB when
slit B is open (shown in green), then the total intensity with both slits open is found to be:

Itotal = IA + IB (1.35)

as expected. It is just the sum of the red and green curves, with one peak behind A and
another behind B.

The behaviour we have seen is completely normal for particles in classical mechanics. Thus,
bullets weighing a few grams behave as classical particles.
The above two experiments involve the macroscopic world. Now we will do two more exper-
iments involving the microscopic world. They will exhibit very different behaviour.

Microscopic world

1’. Light at low intensities. We now repeat the previous experiment with light, but we
gradually reduce the intensity of the light beam while keeping its wavelength fixed. To our
surprise, as we decrease the intensity we find that the detector “clicks” once in a while instead
of getting illuminated smoothly and continuously. Moreover, as the intensity decreases, the
number of clicks per second decreases. But the energy deposited with each click does not
change.
If instead we keep the intensity fixed (and very low) and increase the frequency (decrease
the wavelength) of the light, we find that the energy per click increases. Thus we have:

    clicks per second ∝ intensity of light
    energy per click  ∝ frequency of light          (1.36)
This is very hard to explain in terms of waves, which are continuous. But even assuming the
detector needs a minimum amount of wave energy to activate its “click”, why do we not get
higher energy clicks by increasing the intensity of the wave? Why does intensity only affect
the number of clicks per second?
In fact this points to the fact that we are observing the particle nature of light. The clicks
represent the fact that it is made up of small units, or quanta, of energy. These are called
"photons". Recall that the relation between frequency ν and wavelength λ of light is:
    \nu = \frac{c}{\lambda}     (1.37)
The experiment indicates that as we change the intensity, the number of photons emitted
per second changes. However, each photon carries a fixed amount of energy and this energy
is proportional to the frequency of the corresponding light.
If we continue the experiment long enough, we discover another surprise. Though we are
sending low-intensity light through the double slit and it registers as separate “clicks” on
the screen, if we keep this up for long enough we still see an interference pattern! Thus light
on one hand exhibits interference, showing that it is a wave, but also exhibits a particle-like
nature by appearing as tiny lumps of energy. How can something be both a wave and a
particle?
2’. Electrons. For our final experiment, we replace the bullet gun by an electron gun. The
slits are replaced by an experimental setup called a biprism which has the same effect on
electrons as slits have on light: electrons can go through either A or B and we can open or
close slit A or B independently. The screen is an electron detector.
As the gun fires electrons, they are detected as tiny lumps of energy hitting the detector.
This is what we expect. When only slit A or slit B is open, the collection of spots where
the electrons fall steadily becomes more and more continuous and smooth. But if we keep
both slits A and B open, we get a surprise. Instead of a smooth distribution of spots, we
find that they line up into interference fringes!

Figure 3: Electrons incident on a double-slit.

One might guess that these fringes are due to a mutual interaction between the electrons
being fired. To eliminate this possibility, reduce the firing rate of the gun so that only one
electron emerges at a time, say one per second, and keep the experiment running over a
long period. By the end, interference fringes appear unmistakably. This is a spectacular
experiment! You can view it here:

https://www.youtube.com/watch?v=ZJ-0PBRuthc (Hitachi experiment,


1 minute 8 seconds)

https://www.youtube.com/watch?v=zc-iyjpzzGQ (Original Italian experiment


from 1974, 13 minutes 35 seconds)

This experiment clearly says that in the microscopic regime, electrons behave like “matter
waves”. Since we can fire electrons one at a time, it is literally possible for the wave of that
electron to pass half on one side of the biprism and half on the other. Yet, we never see half
an electron as a particle!
Now what is the wavelength of this wave? We can determine it from the fringe width formula
\lambda = \frac{a \Delta x}{d}. Now we vary the momentum \vec{p} of the incident electron and plot it against this
wavelength. We find this to be an inverse relation:
    |\vec{p}\,| \sim \frac{1}{\lambda}     (1.38)
So what do we make of this experiment? It is telling us that electrons, which are known to
be tiny particles, also behave like waves.
Note that for electrons we know p⃗ from the start, and λ is deduced from the fringes. For
light, we knew λ or equivalently ν from the start, and E was deduced from the clicks. We
conclude that light, which was earlier shown to be a wave, can also behave like a particle.
Electrons, which were originally understood to be particles, also behave like waves. This
cannot be explained in classical mechanics, where something has to be either a particle or a
wave but not both. Quantum mechanics must be a theory, not of particles or waves, but of
wave functions. These functions must come with rules to calculate physical behaviour, and
these rules should be able to predict both particle-like and wave-like behaviour.

1.4 Photons and Planck’s constant

The experiments we discussed suggest that what we normally think of as waves (such as
light) have a particle nature, and what we normally think of as particles (such as electrons)
have a wave nature. The consensus today is that every material object has both a wave and
a particle nature. This is called “wave-particle duality”. However it is extremely subtle to
figure out when they display one or the other nature, and they never display both at exactly
the same time.
The experiments suggested a linear relation between energy and frequency for particles of
light (which are called “photons”):
E∼ν (1.39)
This motivated Planck to propose the relation:

Ephoton = hν (1.40)
where h is a constant known as “Planck’s constant”. Higher frequency light is made up of
more energetic photons, while higher intensity light contains more photons of fixed energy.
We also found a relation between momentum and wavelength for matter particles such as
electrons:
    |\vec{p}\,| \sim \frac{1}{\lambda}     (1.41)
Let us calculate the dimensions of the proportionality constant in each case. Since frequency
has units of s^{-1}, the first constant has dimensions of J-s (Joule-seconds) which is the same as
kg-m^2/s. In the second case the LHS has dimensions of kg-m/s and the wavelength has dimensions
of m, so the constant must have dimensions kg-m^2/s, which is the same! This does not mean the proportionality constants
are equal. But it does suggest they are related, and we will see that they are actually the
same. There is only one constant, denoted h and called “Planck’s constant”. It is understood
today that h is a fundamental constant of nature. It arises in all situations where quantum
physics is relevant. The experimentally determined value of Planck’s constant is:

h = 6.626 × 10−34 joule-sec (1.42)

How do we measure this? There are many different ways, but one of the simplest (in terms
of formulae) is to use a light-emitting diode or LED. The internal mechanism of this relies
on semiconductor physics, but we don’t need those details. All we need to know is that there
is a minimum (“threshold”) voltage Vt to make it emit light. We gradually increase V from
0 until it reaches the value Vt that is just enough to cause emission. Then we study the light
coming out and measure its frequency ν (in practice there is a range of frequencies, so we
look at the highest frequency in the spectrum). The energy we have supplied is eVt where e
is the electron charge, and this must be equal to the energy of an emitted photon, which is
hν. Thus we have:
    h = \frac{e V_t}{\nu}     (1.43)
Exercise 1.8. What is the range of energies, in Joules, that a single photon of visible light
can have?
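Here is a small Python sketch of the LED estimate of h described above; the threshold voltage and peak wavelength are assumed, illustrative values for a red LED, and the last lines address Exercise 1.8:

# Sketch of the LED estimate of Planck's constant (Eq. 1.43), h = e*V_t/nu.
e = 1.6e-19          # electron charge in C
V_t = 1.8            # assumed threshold voltage of a red LED, in V
lam = 660e-9         # assumed peak emission wavelength, in m
c = 3.0e8

nu = c / lam
h_estimate = e * V_t / nu
print(f"estimated h ~ {h_estimate:.2e} J-s")   # should come out near 6.6e-34

# Exercise 1.8: photon energies across the visible range (~400-700 nm)
h = 6.626e-34
for lam_nm in (400, 700):
    E = h * c / (lam_nm * 1e-9)
    print(f"lambda = {lam_nm} nm -> E = {E:.2e} J")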

In terms of the angular frequency ω = 2πν, the relation Eq. (1.40) is:
    E = \frac{h}{2\pi}\, \omega = \hbar \omega     (1.44)
where \hbar = \frac{h}{2\pi} is called "h-bar". Nowadays one tends to use ℏ more commonly than the
original h.

Note that a light wave as in Eq. (1.23) depends on both the angular frequency ω and the wave
vector ⃗k. We have seen that ω determines the energy of the associated particle. Then what
does ⃗k determine? We can obtain some information using the fact that special relativity
gives the relation:
    E = \sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}     (1.45)

for a particle of momentum p⃗ and mass m. This theory also says that only massless particles
can travel at the speed of light. It follows that for photons, m = 0 and hence:

E = c|⃗p | (1.46)

Exercise 1.9. What is the range of momenta, in MKS units, for a photon of visible light?

Now using E = ℏω = cℏ|⃗k|, we find that |⃗p | = ℏ|⃗k|. It is reasonable to suppose that this
relation between magnitudes extends to a relation between the corresponding vectors:

p⃗ = ℏ⃗k (1.47)

So far we have only justified this relation for the case of light.

1.5 Matter waves and de Broglie relation

Now let's consider the relation between momentum and wavelength for the objects that we
traditionally consider to be particles – such as electrons. De Broglie made this proposal in
his Ph.D. thesis. Following Planck’s proposal that light is made of photons of energy E = hν,
he proposed that electrons and all other matter particles can behave as “matter waves”. He
proposed that the frequency of these waves is related to the energy of the particles by the
same relation E = hν (see Footnote 6). However the frequency of an electron is not something we observe
directly; rather, from the interference fringes we obtain the wavelength. So we need to do a
little more work. On the way, a new insight will emerge.

Footnote 6: This was a huge assumption! But it has turned out to be well-supported by experiment.

Electrons have a rest mass m ≠ 0 and therefore can never travel at the speed of light. Let us
calculate the velocity v of a free electron in terms of its energy E and momentum p. Using
the following formulae from special relativity:

    E = \frac{m c^2}{\sqrt{1 - \frac{v^2}{c^2}}}
    p = \frac{m v}{\sqrt{1 - \frac{v^2}{c^2}}}                    (1.48)
    E = \sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}

we get:
    \frac{dE}{dp} = \frac{d}{dp} \sqrt{p^2 c^2 + m^2 c^4}
                  = \frac{p c^2}{\sqrt{p^2 c^2 + m^2 c^4}}        (1.49)
                  = \frac{p c^2}{E}
                  = v

Next we compare this with the group velocity of a wave. This is defined by:
    v_g = \frac{d\omega}{dk} = \frac{d\nu}{d(1/\lambda)}     (1.50)

Assuming that the velocity of a particle and the group velocity of its wave are the same,
v = v_g, we have:
    \frac{dE}{dp} = \frac{d\omega}{dk} = \frac{d\nu}{d(1/\lambda)}     (1.51)
Now E = \hbar\omega = h\nu immediately implies the de Broglie relation:
    p = \hbar k = \frac{h}{\lambda}     (1.52)
This nicely explains the results of the electron interference experiment.
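For those who prefer a symbolic check, the following sympy sketch verifies the key step of Eq. (1.49), namely that dE/dp = pc^2/E:

# Symbolic check of Eq. (1.49): dE/dp equals p c^2 / E, which is the particle velocity.
import sympy as sp

p, m, c = sp.symbols('p m c', positive=True)
E = sp.sqrt(p**2 * c**2 + m**2 * c**4)

dEdp = sp.diff(E, p)
print(sp.simplify(dEdp - p * c**2 / E))   # prints 0, confirming dE/dp = p c^2 / E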

Exercise 1.10. Show that the first and second lines of Eq. (1.48) imply the third line.

Exercise 1.11. Recall the definition of phase velocity and group velocity of a wave. Why
did we choose to equate the latter and not the former to the particle velocity?

Exercise 1.12. What is the typical electron energy and fringe width in electron interference
experiments? You will need to look this up.

Exercise 1.13. What would be the wavelength of matter waves made up of tennis balls
moving at 10 m/sec? What about pollen particles at the same speed?
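A rough numerical answer to Exercise 1.13 (the masses used are assumed, typical values), with an electron added for comparison:

# De Broglie wavelength lambda = h/(m*v) for a few objects.
h = 6.626e-34   # Planck's constant in J-s

objects = [
    ("tennis ball", 0.057, 10.0),       # ~57 g at 10 m/s
    ("pollen grain", 1e-12, 10.0),      # ~1 nanogram, an assumed typical mass
    ("electron", 9.1e-31, 1e7),         # electron at ~10^7 m/s
]
for name, m, v in objects:
    lam = h / (m * v)
    print(f"{name:12s}: lambda ~ {lam:.1e} m")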

The relation p = ℏk is the same as we found for light, and we now see that it also holds for
matter waves! This makes it very natural to propose the vector relation Eq. (1.47) for all
material objects. This relation will be very useful to us in what follows. In the next Section
we will re-derive some standard results for classical waves and then, using Eq. (1.47) in these,
we will discover very important properties of quantum physics.

Exercise 1.14. Based on the above considerations, what is the relation between frequency
and wavelength for the matter wave of a particle of mass m? Notice that h drops out only
in the limit m → 0, the case of light. What does this result teach us?

To summarise, we have arrived at one of the basic postulates of quantum mechanics:


All material objects are both particle-like and wave-like. Their momentum p⃗ (a particle-like
property) is related to the wave vector ⃗k (a wave-like property) by:

p⃗ = ℏ⃗k (1.53)

where ℏ is a universal constant of nature.


So far, we have discussed the problem of atomic stability and the double-slit experiment in
order to give simple, clean expositions of how the laws of classical physics break down in the
microscopic world. For these experiments we do not need to use concepts like temperature
or thermodynamics, and we did not need any specific materials.
However, historically the experiments that gave rise to the birth of quantum theory were
different from the ones that we have discussed. These were: (i) the problem of black-body
radiation (which involves temperature), and (ii) the photo-electric effect (which involves
materials). We now summarise the key ideas of both these experiments.

1.6 Photoelectric effect

This section is for self-study.


Here we summarise the actual experiment that was done to demonstrate the particle nature
of light. It leads to exactly the same conclusions as our double-slit experiment 1’ with light
at low intensities, but in a different experimental situation.

When light shines on a metal, electrons are ejected. This is reasonable since light waves
carry energy, so they could transfer this energy to electrons which are moving around in the
metal and cause electrons to be ejected. But while the fact is not surprising, the behaviour
of this effect as we change (i) the intensity, (ii) the frequency of light, is puzzling if we use
classical physics.
In classical physics, a wave imparts energy to a material and if we crank up the intensity
of the wave then the energy imparted is increased. So we could keep the frequency and
intensity both low, such that no electrons are emitted. Now increase only the intensity of
the wave. We should eventually reach enough energy to start ejecting electrons, but this
does not happen! However large the intensity, emission does not take place at low frequency.
Thus we conclude that:
(i) Electrons are ejected only if the frequency of the light exceeds a certain "threshold frequency".
In the next example we again start with low intensity, but this time the frequency is high
enough that ejection of electrons takes place. Now keeping the frequency at this value, we
increase the intensity. We find that this increases the number of electrons ejected per unit
time. However in this way we do not increase the kinetic energy of any individual electron.
This leads to the second conclusion:
(ii) Above threshold, increasing the intensity increases only the number of electrons ejected
but not the energy of each electron.
Finally, we fix the intensity and keep increasing the frequency “above threshold”. Now we
find that the number of electrons ejected per unit time remains small, but the kinetic energy
of each electron keeps increasing.
(iii) Above threshold, increasing the frequency increases only the kinetic energy of each
electron and not the number of electrons ejected.
These three properties do not seem to make sense in a classical wave picture of light! But
if we assume light is made up of photons of energy hν, and that each photon individually
interacts with an electron in the metal, then everything falls into place. To eject the electron,
a photon has to have a large enough frequency so that hν > W , the “work function” of the
metal, which is the minimum energy to knock out an electron. Increasing the intensity only
increases the number of photons, not the energy of each one which is always hν. This in turn
increases the number of photon-electron interactions, which in turn increases the number of
electrons ejected. The last point is explained by the fact that a photon transfers all its energy
to the electron, so a large hν results in electrons of higher kinetic energy.
If W is the amount of energy needed to eject an electron from the metal (called the “work
function"), then we have:
    \frac{\vec{p}^{\,2}}{2m} = h\nu - W     (1.54)
Everything in this equation can be experimentally measured, and experiments soon verified
this simple equation due to Einstein.
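A small numerical illustration of Eq. (1.54); the work function below (2.3 eV, roughly that of sodium) is an assumed value chosen only for illustration:

# Maximum kinetic energy of photoelectrons, Eq. (1.54), for a few wavelengths.
h, c, e = 6.626e-34, 3.0e8, 1.6e-19
W = 2.3 * e                              # assumed work function in J

for lam_nm in (650, 500, 400, 250):
    nu = c / (lam_nm * 1e-9)
    KE = h * nu - W                      # max kinetic energy
    if KE > 0:
        print(f"{lam_nm} nm: KE_max = {KE/e:.2f} eV (electrons ejected)")
    else:
        print(f"{lam_nm} nm: below threshold, no electrons ejected")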

1.7 Black-Body Radiation

This section is for self-study.


For black-body radiation, the problem is the following. A black body is an idealised object
that can absorb all radiation (at all frequencies). The same object, when heated, will emit
radiation at all frequencies. Such a body can be modelled as a cavity inside a solid material
held at a temperature T . The radiation in the cavity will be in thermal equilibrium at this
temperature.
The question is, what is the energy density of radiation of frequency ν emitted by a black
body heated to a temperature T ? Since frequencies are continuous, we should ask how much
energy density is contained in the frequency range ν to ν + dν. We denote this as uν (T )dν.
Let us calculate it using classical physics.
We model the cavity as a cubical hole of side L inside a conductor, inside which we fit
electromagnetic waves. These standing waves must vanish on the walls, otherwise energy
will flow and we will not have standing waves. Let us first consider waves in one dimension
that vanish at x = 0, L. Such a wave vanishes twice within each full wavelength (think of a
sine wave: sin θ vanishes at both θ = 0 and θ = π). Thus with these boundary conditions, the
length L can hold any integral number n of half-wavelengths:
    L = \frac{n\lambda}{2}     (1.55)
The corresponding wave number is k = \frac{2\pi}{\lambda} = \frac{n\pi}{L}. Note that n > 0 since a standing wave
remains unchanged up to a sign when n is replaced by −n.
Going from the one-dimensional to the three-dimensional case of interest, we see that the
wave number is replaced by a wave vector ⃗k, whose components are:
    (k_x, k_y, k_z) = \frac{\pi}{L} (n_x, n_y, n_z)     (1.56)
where n_x, n_y, n_z are integers ≥ 1. Hence \vec{k}^{\,2} = \frac{\pi^2}{L^2} (n_x^2 + n_y^2 + n_z^2). Now the frequency is related
to the magnitude of the wave vector, |\vec{k}|, by \nu = \frac{c\, |\vec{k}|}{2\pi}. So we get:
    n_x^2 + n_y^2 + n_z^2 = \frac{4L^2}{c^2}\, \nu^2 = \frac{4L^2}{\lambda^2}     (1.57)
In the above derivation we went from the one-dimensional to the three-dimensional case
without a detailed justification. Here is a short and precise derivation starting directly from
the wave equation for the electric field, which is:
    \vec{\nabla}^2 \vec{E} = \frac{1}{c^2} \frac{\partial^2 \vec{E}}{\partial t^2}     (1.58)
The boundary conditions for a cubical cavity are:
    \vec{E}(x=0, y, z) = \vec{E}(x=L, y, z) = \vec{E}(x, y=0, z) = \vec{E}(x, y=L, z)
        = \vec{E}(x, y, z=0) = \vec{E}(x, y, z=L) = 0     (1.59)
These boundary conditions are satisfied by the following function:
    E(\vec{x}, t) = E_0 \sin\frac{n_x \pi x}{L} \sin\frac{n_y \pi y}{L} \sin\frac{n_z \pi z}{L} \sin\frac{2\pi c t}{\lambda}     (1.60)
where E is any component of the field. Inserting this into the field equation, we get:
    n_x^2 + n_y^2 + n_z^2 = \frac{4L^2}{\lambda^2}     (1.61)
Eq. (1.57) defines a sphere of radius R = \frac{2L\nu}{c} in the abstract space of n_x, n_y, n_z (just like the
equation x^2 + y^2 + z^2 = R^2 in ordinary space). Now let N(\nu)\, d\nu be the number of allowed
modes in the frequency range \nu to \nu + d\nu. This is equal to the number of allowed values of the
integers n_x, n_y, n_z in the range R = \frac{2L}{c}\nu to R + dR = \frac{2L}{c}(\nu + d\nu). If L ≫ λ (as is the case for
light in a macroscopic cavity) this can be equated to the volume of this shell, 4\pi R^2 dR (see Footnote 7).
However since n_x, n_y, n_z are all ≥ 1 they only fall in one octant of 3d space, which has \frac{1}{8} of
this volume, so we must divide by 8. Also there are two polarisations of light, so we have to
multiply by 2. Thus we get:
    N_\nu\, d\nu = \frac{4\pi R^2 dR}{8} \times 2 = \pi R^2 dR = 8\pi \frac{L^3}{c^3}\, \nu^2 d\nu     (1.62)
We see that the volume of the cavity, L^3, appears on the RHS. Bringing it to the left we get
the density of radiation, the number of modes per unit volume, to be:
    n_\nu\, d\nu = \frac{N_\nu\, d\nu}{L^3} = 8\pi \frac{\nu^2}{c^3}\, d\nu     (1.63)

Footnote 7: Note that the radius R is a dimensionless number, hence so is the volume of the shell.
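One can check the mode-counting argument by brute force: count the lattice points (n_x, n_y, n_z), all ≥ 1, inside a sphere of radius R and compare with one octant of the sphere volume. A short Python sketch (the agreement improves as R grows, as expected for L ≫ λ):

# Brute-force check of the counting behind Eq. (1.62): cumulative number of
# modes up to radius R versus (1/8)*(4/3)*pi*R^3 (one octant of the sphere).
import math

def count_modes(R):
    Rmax = int(R) + 1
    count = 0
    for nx in range(1, Rmax + 1):
        for ny in range(1, Rmax + 1):
            for nz in range(1, Rmax + 1):
                if nx * nx + ny * ny + nz * nz <= R * R:
                    count += 1
    return count

for R in (10, 30, 60):
    exact = count_modes(R)
    estimate = (1 / 8) * (4 / 3) * math.pi * R**3
    print(f"R = {R}: counted = {exact}, octant volume = {estimate:.0f}")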
To get the energy density, we multiply this number density by the average energy of each
mode at temperature T . The probability P (E) of having energy E is given by the Boltzmann
distribution:
    P(E) = \frac{e^{-E/k_B T}}{\int_0^\infty dE\, e^{-E/k_B T}}     (1.64)
where k_B is the Boltzmann constant. Hence the average energy is:
    \langle E \rangle = \int_0^\infty dE\, E\, P(E) = \frac{\int_0^\infty dE\, E\, e^{-E/k_B T}}{\int_0^\infty dE\, e^{-E/k_B T}} = k_B T     (1.65)

Exercise 1.15. Calculate the above integrals and verify that their ratio is kB T .

Thus finally the energy density of radiation is given by the Rayleigh-Jeans law:
    u_\nu(T) = n_\nu\, k_B T = 8\pi k_B T \frac{\nu^2}{c^3}     (1.66)
This agrees very well with experiments at low frequencies, but it cannot be correct at very
high frequencies. The formula says that the density grows without limit as a function of ν,
which in particular means the total emitted energy density (integrated over all frequencies)
would be infinite! This was called the ultraviolet catastrophe.
In 1900, Planck found a simple solution. Everything in the above derivation is correct
until we reach Eq. (1.65). This probability distribution assumes that the allowed energies of
radiation form a set of continuous values E, which is a basic feature of classical mechanics.
Planck proposed that radiation of a given frequency ν is made of photons of energy hν.
In that case, the possible energies of radiation are nhν where n is the integer number of
photons. So we must replace Eq. (1.65) with:
    \langle E \rangle = \sum_{n=0}^{\infty} n h\nu\, P(n h\nu) = \frac{\sum_{n=0}^{\infty} n h\nu\, e^{-n h\nu / k_B T}}{\sum_{n=0}^{\infty} e^{-n h\nu / k_B T}} = \frac{h\nu}{e^{h\nu / k_B T} - 1}     (1.67)
Exercise 1.16. Perform the sum in Eq. (1.67) and verify the result.
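A quick numerical check of Eq. (1.67): perform the sums directly and compare with the closed form, and with k_B T in the classical limit. The temperature and frequencies below are illustrative choices:

# Direct evaluation of the sums in Eq. (1.67) versus the closed form.
import math

h, kB = 6.626e-34, 1.381e-23
T = 300.0                                  # room temperature, as an illustration

for nu in (1e12, 1e13, 1e14):              # h*nu/(kB*T) ranges from ~0.16 to ~16
    x = h * nu / (kB * T)
    num = sum(n * h * nu * math.exp(-n * x) for n in range(2000))
    den = sum(math.exp(-n * x) for n in range(2000))
    closed = h * nu / math.expm1(x)        # right-hand side of Eq. (1.67)
    print(f"nu = {nu:.0e}: sum = {num/den:.3e}, closed form = {closed:.3e}, kB*T = {kB*T:.3e}")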

Putting this into the formula we get Planck’s radiation law:


    u_\nu(T) = 8\pi \frac{\nu^2}{c^3} \frac{h\nu}{e^{h\nu/k_B T} - 1}     (1.68)
The last factor is very interesting. In the limit when hν ≪ kB T , we can expand the denom-
inator as:


    e^{h\nu/k_B T} - 1 \simeq \frac{h\nu}{k_B T}     (1.69)
and we immediately see that Rayleigh-Jeans law is reproduced. However if hν > kB T then
we cannot make this approximation. As hν becomes large, the denominator grows very fast
and the energy density uν (T ) falls rapidly to 0 for large ν. One can integrate this over all
ν and the answer turns out to be finite. So there is no ultraviolet catastrophe. Everything
about the behaviour at high frequencies has changed after we assumed that the energy of
radiation comes in quanta!
We can also now understand what is the range of frequencies for which the Rayleigh-Jeans
formula agrees with experiment. At any given temperature, if we perform experiments at
frequencies ν such that hν ≪ kB T then Planck’s formula reduces to that of Rayleigh-Jeans.
It is only when hν > kB T that Rayleigh-Jeans law (derived from classical physics) breaks
down and has to be replaced by Planck’s law. For a fixed frequency, this corresponds to
the limit of low temperature. Thus, here it is temperature that tells us when classical
physics works and when it doesn’t. Experiments with high frequency radiation, or at low
temperature, confirm Planck’s formula perfectly.
Exercise 1.17. Calculate the total radiated energy density \int_0^\infty u_\nu\, d\nu for a black body using
Planck's formula.
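For Exercise 1.17, here is a numerical sketch (it assumes scipy is available): after substituting x = hν/k_B T, the total energy density is a prefactor times ∫ x³/(e^x − 1) dx = π⁴/15, which the code checks by direct integration:

# Total black-body energy density from Planck's law (Eq. 1.68).
import math
from scipy.integrate import quad

h, c, kB = 6.626e-34, 3.0e8, 1.381e-23
T = 5800.0                                   # roughly the surface temperature of the Sun

# substitute x = h*nu/(kB*T); then u = 8*pi*(kB*T)^4/(h^3*c^3) * integral of x^3/(e^x - 1)
integrand = lambda x: x**3 / math.expm1(x)
I, _ = quad(integrand, 1e-8, 100.0)          # exact value of the integral is pi^4/15
prefactor = 8 * math.pi * (kB * T)**4 / (h**3 * c**3)
print(f"integral = {I:.5f} (pi^4/15 = {math.pi**4/15:.5f})")
print(f"total energy density u = {prefactor * I:.3e} J/m^3")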

1.8 Contradiction?

This section is optional, for interested students.


Our common sense tells us it is not possible for something to be both a particle and a wave.
For example a wave has both crests (high points) and troughs (low points). When the crest
of a wave meets the trough of another, the result is zero net disturbance and therefore zero
energy. Thus, two waves can cancel each other out at some places, which is basically the
interference phenomenon. But two particles surely cannot cancel each other! For example
two electrons have double the electric charge of one of them. Where would the charge go if
the particles cancel?
By carefully watching a beam of light undergoing interference, or a beam of electrons un-
dergoing interference, one might hope to pin down the contradiction. For example we can
identify a spot on the detector corresponding to a dark fringe. With only slit A open this
spot would be bright which means some photons landed there through A. Also with only B
open the same spot would again be bright, meaning some photons passing through B landed
there. Now with both slits open, photons must be landing there through both A and B.
Then how can they possibly cancel each other out?
We can try to resolve this apparent paradox by watching the photons. Place a camera near
slit A and observe photons going through the slit. Unfortunately to observe a photon, this
camera has to absorb that photon! Thus, observing all those photons that pass through A
will prevent any of them from reaching the screen. Then only photons from slit B will reach
the screen and there will be no interference pattern.
A similar situation holds for electrons. To observe which slit any electron passed through,
we have to shine light on it. This causes photons to be incident on the electrons and impart
momentum to them in random directions. By a calculation we can show that, if the photons
are sufficiently energetic to extract precise information about which slit the electron came
through, then they also impart so much random momentum to the electrons that the fringe
pattern is destroyed. If the photons are less energetic, the fringes may survive but then it
will not be possible to tell which slit a given electron passed through. In this way, taking
into account the in-principle aspects of interactions between the observer and the apparatus,
we see that a contradiction is avoided.
This discussion has been qualitative, and obviously it is essential to know details of the
“calculation” described above if we are to believe that a contradiction is avoided. However,
going into these details will take us a little too far afield at this stage. We have not yet
learned anything significant about quantum mechanics in this course. Once we learn some
more mathematics, namely the equations of quantum mechanics, and also some concepts
like “quantum entanglement”, we will be in a better position to understand the resolution of
this puzzle. Let us just mention that it is closely related to the general study of “quantum
information” – a subject of great current interest, which we will introduce at some point of
this course.
There have also been direct experiments to test what happens when one observes which slit
the photon went through. These are called “quantum eraser” experiments. A respected
paper on this is "Quantum optical tests of complementarity" by Marlan O. Scully, Berthold-
Georg Englert and Herbert Walther, Nature 351, 111–116 (1991).
I should mention here that there is a large amount of misinformation on this subject on the
internet! So be careful not to believe every crank who has his/her own theory or experiment
about this.

2 Waves in classical and quantum physics

2.1 Wave packets and uncertainty relation

To understand photons or electrons in quantum theory, we will first revise some properties of
general waves. We will make use of the wave number k = 2π/λ that was introduced earlier
that was introduced earlier
and derive some interesting relations between the position and wave-number representations
of waves. At this stage our waves have nothing to do with quantum mechanics. Once
we understand waves of wave number k, we will then use de Broglie’s relation p = ℏk to
understand matter waves of momentum p. The mathematics will be almost the same, but
the physics will be quite different – as it will teach us about the quantum behaviour of matter
particles like electrons.
For simplicity we work with one-dimensional plane waves and take their coefficient αmax = 1.
Also we work at a fixed time, say t = 0. Thus, the wave is just

cos kx = Re eikx (2.1)

where Re means “real part”. In what follows, we will drop the Re and restore it whenever
needed.
What is the nature of a plane wave? It extends over all x, and in fact it looks the same for

all regions of x (repeating itself each time x is shifted by a wavelength, λ = 2π/k). So this
does not look even a little bit like a particle. If we create an ideal plane wave within the
boundaries of a lab, it will oscillate between the same maximum and minimum amplitudes
everywhere in the lab and its intensity will be independent of location.
However we can also use waves to create lumps of energy that are somewhere rather than
everywhere. This is achieved by taking a superposition of plane waves of different frequen-
cies/wave numbers. If we do this suitably then we obtain something called a “wave packet”
which is somewhat localised. Such superpositions are used in optics when we want to create
a “sharp pulse”. Accordingly, we consider:
    \alpha(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, \beta(k)\, e^{ikx}     (2.2)
The above integral is called a Fourier transform. For us, it is just a linear combination of
plane waves of different wave numbers. Each one is weighted by an amount β(k).
Now we focus on a special class of Fourier transforms where β(k) is a function (in general
complex) of k that is peaked around a definite value k0 , with a "spread" from k0 − ∆k to
k0 + ∆k. We normalise it by requiring:
    \int_{-\infty}^{\infty} dk\, |\beta(k)|^2 = 1     (2.3)

It can be shown that the above relation implies the analogous result:
    \int_{-\infty}^{\infty} dx\, |\alpha(x)|^2 = 1     (2.4)

Exercise 2.1. Derive the above result.

Given any shape for β(k), what can we say about the shape of α(x)? We first consider the
case when β is a real Gaussian:
    \beta(k) = \frac{1}{(\pi a^2)^{1/4}}\, e^{-\frac{(k - k_0)^2}{2 a^2}}     (2.5)

This satisfies the normalisation condition.

Exercise 2.2. Show that Eq. (2.5) satisfies Eq. (2.3).

We see that β(k) is peaked about k = k0 . By plotting the function, we can easily convince
ourselves that its spread ∆k is of order a, since a Gaussian vanishes rapidly outside this
range (soon we will define and calculate ∆k more precisely).
Now what about α(x)? Is this function peaked too, and if so what is its spread? To answer
this let us insert Eq. (2.5) into Eq. (2.2) and evaluate the integral. The result is:
    \alpha(x) = \left( \frac{a^2}{\pi} \right)^{1/4} e^{-\frac{1}{2} a^2 x^2}     (2.6)
We have already seen that if β(k) is normalised then α(x) is also normalised: \int_{-\infty}^{\infty} dx\, |\alpha(x)|^2 = 1.
This is easy to verify in the present example. Moreover, it is peaked about x = 0 and has
a spread ∆x of order 1/a.

Exercise 2.3. Verify Eq. (2.6) as well as the fact that α(x) is normalised.

The object we have constructed is called a “Gaussian wave packet”. It is Gaussian both in
x-space and in k-space. Moreover, the spread of β(k) in k and the spread of α(x) in x are
inversely related. We may write:
∆k∆x ≃ 1 (2.7)

This is an “uncertainty relation” for waves, stating that the spread of the function β(k) in
the wave number k is inversely related to that of α(x) in the position x. It is worth recalling
that wave number has dimensions L−1 so the above relation makes sense.
There is nothing surprising about this relation. It tells us something very reasonable about
waves: if we try to localise them in wave number then they are very de-localised in space
(the plane wave is an extreme example). The converse is also true. Suppose we choose
β(k) = δ(k − k0 ) where δ(k) is the Dirac δ-function, satisfying:
    \int_{-\infty}^{\infty} dk\, \delta(k - k_0)\, f(k) = f(k_0)     (2.8)

Then the Fourier transform in x-space is:
    \alpha(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, \delta(k - k_0)\, e^{ikx} = \frac{1}{\sqrt{2\pi}}\, e^{i k_0 x}     (2.9)
which is a completely de-localised plane wave of wave number k0 .
The Gaussian manages to do better, localising both k and x to some extent. But there is a
limit: if the parameter a is small then we have more localisation in k but less in x. If a is
large, the opposite is true. It is impossible to localise a wave in terms of wave number and
also in terms of position.
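To see this localisation concretely, here is a minimal numerical sketch (our own illustration, assuming Python with numpy is available; the grids and the values of a and k0 are arbitrary choices). It builds α(x) as a discretised version of Eq. (2.2), using the Gaussian β(k) of Eq. (2.5), and checks that the result is normalised and peaked near x = 0:

import numpy as np

# A wave packet as an explicit superposition of plane waves (discretised Eq. (2.2)).
a, k0 = 1.0, 5.0                                  # width parameter and central wave number (arbitrary)
k = np.linspace(k0 - 6.0, k0 + 6.0, 2001)
dk = k[1] - k[0]
beta = (np.pi * a**2) ** (-0.25) * np.exp(-(k - k0) ** 2 / (2 * a**2))    # Eq. (2.5)

x = np.linspace(-10.0, 10.0, 1001)
dx = x[1] - x[0]
# alpha(x) ~ (1/sqrt(2 pi)) * sum over k of beta(k) e^{ikx} dk
alpha = (dk / np.sqrt(2 * np.pi)) * (beta[None, :] * np.exp(1j * np.outer(x, k))).sum(axis=1)

print((np.abs(alpha) ** 2).sum() * dx)     # close to 1: the packet is normalised
print(x[np.argmax(np.abs(alpha))])         # close to 0: |alpha(x)| is peaked near x = 0

Increasing a widens β(k) in k but narrows |α(x)| in x, and decreasing a does the opposite, in line with Eq. (2.7).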
So far we only estimated the uncertainty of position/wave number in a Gaussian wave packet.
Let us define and calculate these quantities precisely. The function |β(k)|2 , being normalised,
can be thought of as a probability measure. The average of any function f (k) with respect
to this measure is defined as:
\langle f(k) \rangle = \int_{-\infty}^{\infty} dk\, f(k)\, |\beta(k)|^2    (2.10)

Thus ⟨k⟩ gives the average wave vector, and ⟨k 2 ⟩ is the average of the square of the wave
vector. A standard measure of the uncertainty/spread of |β(k)|2 around a point is given by
the standard deviation:
\Delta k = \sqrt{\langle (k - \langle k \rangle)^2 \rangle} = \sqrt{\langle k^2 \rangle - \langle k \rangle^2}    (2.11)

We can simplify our calculations if we take β(k) to be symmetric about k = 0 rather than
k = k0 . Thus ⟨k⟩ = k0 = 0. So we only have to calculate:
\langle k^2 \rangle = \int_{-\infty}^{\infty} dk\, k^2\, |\beta(k)|^2 = \frac{a^2}{2}    (2.12)
from which we get ∆k = a/√2. Similarly we can calculate the standard deviation for x using
α(x) in Eq. (2.6), to get ∆x = 1/(√2 a).

Exercise 2.4. Verify Eq. (2.12) where β(k) is given by Eq. (2.5).

We thus find, in this example, that:


\Delta k\, \Delta x = \frac{1}{2}    (2.13)
Notice that this exact answer is completely independent of the parameter a, but each of the
factors on the LHS does depend on a. This proves our statement about the inverse relation
between the uncertainties in wave vector and position.
The special choice of wave packet in Eq. (2.5) is known as a “minimum uncertainty wave
packet”. It is a mathematical theorem (which is left for more advanced courses) that Gaussian
wave packets minimise the uncertainty product ∆k ∆x.
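The exact result Eq. (2.13) can also be checked numerically using the standard-deviation definition Eq. (2.11). The sketch below (again assuming Python with numpy; the helper function spread and the grids are our own choices) computes ∆k and ∆x directly from Eq. (2.5) and Eq. (2.6):

import numpy as np

def spread(u, f):
    # Standard deviation of the normalised density |f|^2 on the uniform grid u, as in Eq. (2.11).
    du = u[1] - u[0]
    w = np.abs(f) ** 2
    w = w / (w.sum() * du)
    mean = (u * w).sum() * du
    return np.sqrt((((u - mean) ** 2) * w).sum() * du)

a = 0.7                                            # any positive value will do
k = np.linspace(-12.0, 12.0, 4001)
x = np.linspace(-12.0, 12.0, 4001)
beta = (np.pi * a**2) ** (-0.25) * np.exp(-k**2 / (2 * a**2))     # Eq. (2.5) with k0 = 0
alpha = (a**2 / np.pi) ** 0.25 * np.exp(-0.5 * a**2 * x**2)       # Eq. (2.6)

dk_, dx_ = spread(k, beta), spread(x, alpha)
print(dk_, a / np.sqrt(2))                 # both close to a/sqrt(2)
print(dx_, 1 / (np.sqrt(2) * a))           # both close to 1/(sqrt(2)*a)
print(dk_ * dx_)                           # close to 0.5, independent of a

Changing a changes ∆k and ∆x individually, but their product stays at 1/2.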

Calculation of relevant integrals

The basic equation for integrating a Gaussian is:


\int_{-\infty}^{\infty} dy\, e^{-Ay^2} = \left(\frac{\pi}{A}\right)^{1/2}    (2.14)
We also have the slightly more general form:
\int_{-\infty}^{\infty} dy\, e^{-Ay^2 + By} = \left(\frac{\pi}{A}\right)^{1/2} e^{\frac{B^2}{4A}}    (2.15)
which is easily derived from the previous one by completing the square:
-Ay^2 + By = -A\left(y - \frac{B}{2A}\right)^2 + \frac{B^2}{4A}    (2.16)
and then shifting y − B/(2A) to y in the integral.
Let us take the function β(k) from Eq. (2.5), square it and use Eq. (2.14):
\int dk\, |\beta(k)|^2 = \frac{1}{(\pi a^2)^{1/2}} \int_{-\infty}^{\infty} dk\, e^{-k^2/a^2}
                       = \frac{1}{(\pi a^2)^{1/2}} \times (\pi a^2)^{1/2}
                       = 1    (2.17)

Next, let us calculate α(x) when β(k) is a Gaussian and verify Eq. (2.6). For this we insert
Eq. (2.5) into Eq. (2.2) to get:
\alpha(x) = \frac{1}{\sqrt{2\pi}}\, \frac{1}{(\pi a^2)^{1/4}} \int_{-\infty}^{\infty} dk\, e^{-\frac{k^2}{2a^2}}\, e^{ikx}    (2.18)

From Eq. (2.15) this becomes:
\alpha(x) = \frac{1}{\sqrt{2\pi}}\, \frac{1}{(\pi a^2)^{1/4}}\, (2\pi a^2)^{1/2}\, e^{-\frac{1}{2} a^2 x^2} = \left(\frac{a^2}{\pi}\right)^{1/4} e^{-\frac{1}{2} a^2 x^2}    (2.19)

Thus Eq. (2.6) is verified.
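If sympy is accessible, the Gaussian integrals used above can also be reproduced symbolically. The following is only an optional cross-check of our own (the symbol names are our choices; a and A are declared positive so that the integrals converge):

import sympy as sp

y, x, k, B = sp.symbols('y x k B', real=True)
A, a = sp.symbols('A a', positive=True)

print(sp.integrate(sp.exp(-A * y**2), (y, -sp.oo, sp.oo)))           # expect sqrt(pi/A), Eq. (2.14)
print(sp.integrate(sp.exp(-A * y**2 + B * y), (y, -sp.oo, sp.oo)))   # expect sqrt(pi/A) exp(B^2/(4A)), Eq. (2.15)

beta = (sp.pi * a**2) ** sp.Rational(-1, 4) * sp.exp(-k**2 / (2 * a**2))   # Eq. (2.5) with k0 = 0
print(sp.integrate(beta**2, (k, -sp.oo, sp.oo)))                     # expect 1, as in Eq. (2.17)

alpha = sp.integrate(beta * sp.exp(sp.I * k * x), (k, -sp.oo, sp.oo)) / sp.sqrt(2 * sp.pi)
print(sp.simplify(alpha))                  # expect (a^2/pi)^(1/4) exp(-a^2 x^2 / 2), Eq. (2.6)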

2.2 Stationary phase approximation

This subsection is optional, for interested students.


For the general (non-Gaussian) case, let’s consider a function β(k) which is real and peaked
around k = 0 with a spread −∆k < k < ∆k, but not necessarily Gaussian. Also it need not
be symmetric under k → −k. We now try to estimate the range beyond which α(x) tends
to 0.
The integral Eq. (2.2) which gives us α(x) is (like all single-variable integrals) given by the
area under the curve of the integrand (remember that the area counts as positive above the
x-axis and negative if it is below).
Now, the integrand is the product of a peaked function β(k) and an oscillating function
e^{ikx} (since we are going to take the real part, we replace the latter by cos kx for practical
calculations). So we are going to understand the properties of α(x) not by doing any integral,
but simply by looking at the function:
\frac{1}{\sqrt{2\pi}}\, \beta(k) \cos kx    (2.20)

and estimating the area under it pictorially. We will discover two key properties of this area
for different values of x:
(i) The area under the curve is largest for x = 0,
(ii) The area under the curve has a spread in x such that ∆k ∆x ≳ 1/2. Once x goes much
beyond this ∆x, the area goes to zero.
In the first figure we see a sketch of some peaked function β(k) (which is not a Gaussian).
We have scaled the area under it to 1 for convenience. This area is the same as α(0). The
spread ∆k has also been chosen to be of order 1. In subsequent figures we see the same
function, but multiplied by cos(kx) for the values x = 1, 2, 5, 10, 50. Thus the area under
the curve is proportional to the value of α(x) for x = 1, 2, 5, 10, 50.

[Figure 4 panels: β(k), Area = 1 | β(k) cos(k), Area = 0.37 | β(k) cos(2k), Area = 0.14 | β(k) cos(5k), Area = 0.007 | β(k) cos(10k), Area = 0.00005 | β(k) cos(50k), Area = −0.000001]

Figure 4: The integrand of Eq. (2.2) for a given peaked function β(k) and x = 0, 1, 2, 5, 10, 50.
We can visually estimate that the area under the curve is maximum at the beginning, and
goes down to nearly 0 by the time x = 5 (fourth figure).

We see that the total area goes down steadily as x increases, due to cancellations between
neighbouring regions where the function is positive and negative, but has a similar magnitude.
This tells us that the peak of α(x) lies at x = 0. The spread of α(x) will be the range
of x beyond which these cancellations are very effective.
For each case, the area under the integrand has been numerically evaluated and is shown in
the figure. It confirms our visual estimates. We also see that α(x) decreases to around half
its value even before x reaches 1. So we can estimate ∆x ∼ 0.75. Together with ∆k = 1,
this gives us the right order of magnitude of the uncertainty relation. We will of course get
different values for the spread depending on how we define ∆x, ∆k. As we saw, the standard
deviation is a more precise definition.

Exercise 2.5. Try replacing the peaked function in these plots with a square wave, let’s say
β(k) = 1 for −1 < k < 1 and β(k) = 0 otherwise. Verify that this analysis works in this
case too. Try non-symmetric functions. For all this, you may use software if it is accessible
to you. But a lot can be learned just by playing with functions with pen and paper. Graph
paper is helpful to estimate areas under curves that you have drawn, but if you don’t have
that then try to devise some other way.
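If software is accessible to you, the areas in Exercise 2.5 can be estimated with a few lines of Python (a sketch with our own choice of grid; the 1/√(2π) prefactor of Eq. (2.20) is dropped since it only rescales every area by the same factor):

import numpy as np

# Square-wave beta: beta(k) = 1 for -1 < k < 1 and 0 otherwise.
k = np.linspace(-1.0, 1.0, 20001)
dk = k[1] - k[0]
for x in [0, 1, 2, 5, 10, 50]:
    area = np.cos(k * x).sum() * dk        # area under beta(k) cos(kx)
    print(x, area)

For this sharp-edged β the area works out to 2 sin(x)/x for x ≠ 0, so it decays only like 1/|x| while oscillating in sign: the cancellations still set in once x exceeds roughly 1/∆k, but more slowly than for the smooth bump of Figure 4.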

Now let us state the general theory behind these cancellations. The phase of the integrand
in Eq. (2.2) is kx. The stationary phase approximation tells us that the area under the curve
given by the integrand is maximum for that value of x that makes the phase stationary in k
(i.e. have vanishing derivative), when evaluated at the peak value of k:
\frac{d}{dk}(kx)\Big|_{k=0} = 0 \;\Longrightarrow\; x = 0    (2.21)

Moreover, if ∆k is the spread in k of the multiplying function, then the phase starts to
oscillate rapidly once x takes a value x ∼ 1/∆k around the stationary point. Beyond this value
of x, rapid phase oscillations are effective in cancelling out the area under the curve. So this
value can be identified with ∆x, the spread of α(x), and the uncertainty relation is satisfied.
It is quite easy to generalise this to complex functions β(k). In this case we write β(k) in
terms of its magnitude and phase:

\beta(k) = |\beta(k)|\, e^{i\theta(k)}

Then,
\alpha(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dk\, |\beta(k)|\, e^{i(kx + \theta(k))}

This time, assuming |β(k)| is peaked near k = k0 , the stationary phase occurs at a position
x satisfying:
\frac{d}{dk}\Big( kx + \theta(k) \Big)\Big|_{k=k_0} = 0    (2.22)

If we call this position x0 then we have:


x_0 = -\frac{d\theta}{dk}\Big|_{k=k_0}    (2.23)

Previously when we took β(k) to be real, we found that the peak of α(x) was always at
x = 0. We now see that allowing β(k) to have a phase part has the effect of shifting the peak
position x0 away from 0.
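This shift is easy to see numerically. In the sketch below (Python with numpy; the values of a, k0 and x0 are arbitrary choices), the Gaussian β(k) of Eq. (2.5) is multiplied by the linear phase e^{iθ(k)} with θ(k) = −k x0, and the peak of |α(x)| indeed sits at x0, as Eq. (2.23) predicts:

import numpy as np

a, k0, x0 = 1.0, 3.0, 4.0
k = np.linspace(k0 - 8.0, k0 + 8.0, 2001)
dk = k[1] - k[0]
# Gaussian magnitude times the phase factor e^{i theta(k)} with theta(k) = -k*x0
beta = (np.pi * a**2) ** (-0.25) * np.exp(-(k - k0) ** 2 / (2 * a**2)) * np.exp(-1j * k * x0)

x = np.linspace(-10.0, 10.0, 2001)
alpha = (dk / np.sqrt(2 * np.pi)) * (beta[None, :] * np.exp(1j * np.outer(x, k))).sum(axis=1)
print(x[np.argmax(np.abs(alpha))])        # close to x0 = 4, not to 0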
Now we can expand the function θ(k) near k = k0 to get:


\theta(k) \simeq \theta(k_0) + \frac{d\theta}{dk}\Big|_{k=k_0} (k - k_0) = \theta(k_0) - (k - k_0)\, x_0    (2.24)

In the last step we used Eq. (2.23). So we can write the phase in the integrand as follows:

kx + \theta(k) = k_0 x + (k - k_0)x + \theta(k_0) - (k - k_0)x_0
              = k_0 x + \theta(k_0) + (k - k_0)(x - x_0)    (2.25)

The first two terms are independent of k and are not of interest to us. The last term gives
us the phase factor:
e^{i(k-k_0)(x-x_0)}    (2.26)

Notice that k0 is present because we explicitly allowed β(k) to be peaked there, while x0
came from the phase part of β as mentioned above.
We see that as the product (k − k0 )(x − x0 ) varies over 2π, the above phase oscillates from
+1 to −1. This gives us a measure of where the cancellations start to set in. In general,
we have not guaranteed that cancellations do occur beyond this point! Rather, we found a
minimum amount of phase change that needs to take place before cancellations can occur.
Thus we only find a lower bound:
∆x ∆k ≳ 1

This is the uncertainty relation for waves. As we said at the beginning of this section, it has
nothing to do with quantum mechanics. It merely says that if we try to localise a wave to
a finite region ∆x, there must be a minimum wave number spread ∆k given by the above
relation.

The relation is completely symmetrical between x and k. If we try to make x very sharp,
then the spread of k is quite large. If we keep k sharp, then cancellations do not start till x
has gone quite a large distance, so the wave packet is very spread out in space.

2.3 Properties of waves

We now discuss some general features of the formulae above.


(i) The relation Eq. (2.2) is symmetrical between β(k) and α(x), in the sense that we can
invert it:
\beta(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, \alpha(x)\, e^{-ikx}    (2.27)
Thus, we can specify a wave packet by either specifying β(k) or specifying α(x). We cer-
tainly cannot specify both independently, as they are related to each other by Eq. (2.2) and
Eq. (2.27). We refer to α(x) as describing a wave in the “x-representation” and β(k) as
describing the same wave in the “k-representation”.
Exercise 2.6. Show that Eq. (2.27) follows from Eq. (2.2).
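A quick numerical round trip α → β → α (a sketch assuming Python with numpy; the grids and the value of a are arbitrary) illustrates that Eq. (2.27) does invert Eq. (2.2), although it is of course not a substitute for the derivation asked for in Exercise 2.6:

import numpy as np

a = 1.0
x = np.linspace(-12.0, 12.0, 2001)
k = np.linspace(-12.0, 12.0, 2001)
dx, dk = x[1] - x[0], k[1] - k[0]
alpha = (a**2 / np.pi) ** 0.25 * np.exp(-0.5 * a**2 * x**2)                      # Eq. (2.6)

beta = (dx / np.sqrt(2 * np.pi)) * (np.exp(-1j * np.outer(k, x)) @ alpha)        # discretised Eq. (2.27)
alpha_back = (dk / np.sqrt(2 * np.pi)) * (np.exp(1j * np.outer(x, k)) @ beta)    # discretised Eq. (2.2)
print(np.max(np.abs(alpha_back - alpha)))  # should be very small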

(ii) Notice that even though our discussion started with wave packets, Eq. (2.33) does not
really require that we have a wave-packet. Even if β(⃗k, t) is not peaked with a finite spread
at any given time, we can still write this equation to relate the “x-representation” and “k-
representation” of any given wave. These are just two different ways of representing the
same information. However the two representations will only exist if the Fourier integrals
are well-defined. A minimum requirement for the integrals to be finite is that the function
inside the integral should go to 0 at ±∞ 8 .
(iii) If we multiply β(k) by k, using Eq. (2.27) we see that the same result can also be achieved
by differentiating α(x) with respect to x:
k\beta(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, k\, \alpha(x)\, e^{-ikx}
          = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, \alpha(x) \left( i\frac{d}{dx} e^{-ikx} \right)
          = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, i\frac{d}{dx}\left( \alpha(x)\, e^{-ikx} \right) + \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx \left( -i\frac{d\alpha}{dx} \right) e^{-ikx}
          = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx \left( -i\frac{d\alpha}{dx} \right) e^{-ikx}    (2.28)
8
This is only a necessary but not sufficient condition. The precise conditions for the Fourier transform to
exist are important, but beyond the scope of the present course.

The first step comes by taking k inside the integral (allowed because the integral is over
x). In the second step, we used the fact that k e^{-ikx} = i (d/dx) e^{-ikx}. In the third step we
performed “integration by parts”. This says that for any two functions A(x), B(x):
A\,\frac{dB}{dx} = \frac{d}{dx}(AB) - \frac{dA}{dx}\,B    (2.29)
Finally, in the fourth and last step we dropped the first term of the third line, because it is
a total derivative:
\int_{-\infty}^{\infty} dx\, \frac{d}{dx}(AB) = \Big[ AB \Big]_{-\infty}^{\infty} = 0    (2.30)

This is true for any function AB that falls off at infinity.


The above manipulation tells us that “the effect of multiplying β(k) by k is the same as the
effect of acting with the differential operator −i d/dx on α(x)”. By repeated action, we find
the more general rule:
f(k)\, \beta(k) \;\rightarrow\; f\!\left(-i\,\frac{d}{dx}\right) \alpha(x)    (2.31)
where f (k) is some suitable function, such as a polynomial in k 9 .

Exercise 2.7. Verify all the steps leading to Eq. (2.31).

Because Eq. (2.27) is the inverse of Eq. (2.2), we similarly have the rule that multiplying
α(x) by x is equivalent to acting on β(k) with i d/dk. Note the plus sign in this case, which is
due to the plus sign in the exponent in Eq. (2.2).
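The derivative rule can be checked numerically as well. The sketch below (Python with numpy; the parameters are arbitrary and the derivative is approximated by a finite difference) compares k β(k), obtained via Eq. (2.27), with the Fourier transform of −i dα/dx for a Gaussian packet:

import numpy as np

a, k0 = 1.0, 2.0
x = np.linspace(-20.0, 20.0, 8001)
dx = x[1] - x[0]
alpha = (a**2 / np.pi) ** 0.25 * np.exp(-0.5 * a**2 * x**2) * np.exp(1j * k0 * x)   # a packet centred at k0
dalpha = np.gradient(alpha, dx)                                                     # numerical d alpha / dx

for kk in np.linspace(k0 - 4.0, k0 + 4.0, 9):
    k_beta = kk * (dx / np.sqrt(2 * np.pi)) * np.sum(alpha * np.exp(-1j * kk * x))       # k beta(k), via Eq. (2.27)
    transform = (dx / np.sqrt(2 * np.pi)) * np.sum(-1j * dalpha * np.exp(-1j * kk * x))  # transform of -i d alpha/dx
    print(kk, abs(k_beta - transform))    # differences should be small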
(iv) The relation between α and β can be extended to time-dependent waves. We will also
go to three dimensions since that will be more useful for what follows. We simply allow
time-dependent linear combinations of the time-independent plane wave:

e^{i\vec{k}\cdot\vec{x}}    (2.32)

So we again introduce a β to make linear superpositions of plane waves, however we now


allow it to be time-dependent, so it is β(⃗k, t). Then we consider:
\alpha(\vec{x}, t) = \frac{1}{(2\pi)^{3/2}} \int d^3k\; \beta(\vec{k}, t)\, e^{i\vec{k}\cdot\vec{x}}    (2.33)
For the inverse, we have:
\beta(\vec{k}, t) = \frac{1}{(2\pi)^{3/2}} \int d^3x\; \alpha(\vec{x}, t)\, e^{-i\vec{k}\cdot\vec{x}}    (2.34)

9
Technically, if f(k) has a power-series expansion in powers of k then we can replace every k^n in that
power series by (−i ∂/∂x)^n.

(v) The relation Eq. (2.31) and its inverse are easily generalised to three dimensions:
\vec{k} \;\rightarrow\; -i\vec{\nabla}_x, \qquad \vec{x} \;\rightarrow\; i\vec{\nabla}_k    (2.35)
where ∇ is the gradient operator, namely:
\vec{\nabla}_x = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right)    (2.36)
To be precise, the above equation means:
\vec{k}\, \beta(k) \;\leftrightarrow\; -i\,\vec{\nabla}_x\, \alpha(x), \qquad \vec{x}\, \alpha(x) \;\leftrightarrow\; i\,\vec{\nabla}_k\, \beta(k)    (2.37)
where ↔ means the two sides are related by a Fourier transform, assuming that α and β
were already related in this way.
Finally, consider the special case where β(⃗k, t) = β(⃗k) e^{-iωt} for some ω. Then α(⃗x, t) has the
same time dependence (the Fourier transform does not affect the time dependence). Thus if
we differentiate we find:
\omega\, \beta(\vec{k}, t) = i\,\frac{\partial \beta(\vec{k}, t)}{\partial t}, \qquad \omega\, \alpha(\vec{x}, t) = i\,\frac{\partial \alpha(\vec{x}, t)}{\partial t}    (2.38)
Thus, multiplying either β or α by ω is the same as acting on it with i ∂/∂t.
This relation has a different status from the previous ones involving ⃗x and ⃗k. The position
and wave vector are alternate representations of the same information, while that is not true
of time and frequency. Also we had to use a special form of β(⃗k, t) to get this result.
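As a one-line check of Eq. (2.38), one can differentiate the single-frequency form symbolically (a sketch assuming sympy; the symbol names are our own):

import sympy as sp

t, k = sp.symbols('t k', real=True)
w = sp.symbols('omega', positive=True)
f = sp.Function('f')                       # stands for the time-independent part beta(k)
beta = f(k) * sp.exp(-sp.I * w * t)
print(sp.simplify(sp.I * sp.diff(beta, t) - w * beta))   # expect 0, i.e. i d(beta)/dt = omega * beta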
(vi) During the above discussions about waves, we did not specify whether the wave is
relativistic (travels at the speed of light) or non-relativistic. However, equations like Eq. (2.2),
Eq. (2.27), Eq. (2.31) and Eq. (2.38) actually do not care about this. They are equally valid
for relativistic or non-relativistic waves.

2.4 Wave function and Schrödinger equation

We now consider a non-relativistic particle of mass m, such as an electron, with velocity
v ≪ c 10 . As we explained at the outset, classical physics tells us that Newton’s equations
10
For particles that move at velocities close to c we need to use special relativistic quantum mechanics
which is much more difficult than the non-relativistic version we are discussing here. In particular this
means it is harder to study the quantum physics of light itself, although light was the first object for which
wave-particle duality was proposed!

(in their regime of validity) determine the dynamics of objects. But now we want to describe
the wave-like nature of the same objects. But what equations determine the wave? We must
make a guess that makes intuitive sense, and also leads to definite experimental predictions
that can be verified. Our guess should explain, for example, the behaviour of electrons
in a harmonic trap, electrons in an atom, electrons scattering from a target and even free
electrons.
The key is de Broglie’s hypothesis, which says that there is a universal proportionality
between the momentum of a particle and the wave vector of its associated wave, p⃗ = ℏ⃗k.
This implies that a wave having wave number:

\vec{k} = \frac{\vec{p}}{\hbar}    (2.39)

describes a quantum particle of momentum p⃗ 11 . For non-relativistic particles the kinetic
energy T is given by:
T = \frac{\vec{p}^{\,2}}{2m} = \frac{\hbar^2 \vec{k}^2}{2m}    (2.40)
where in the second step we used Eq. (2.39).
Recall that a wave can be described by a function α(⃗x) of position or a function β(⃗k) of wave
number. In quantum theory we will write α(⃗x) as ψ(⃗x) and call it the “wave function” of
the associated particle. For β(⃗k) we write ψ̃(⃗p ) 12 . We can also allow the wave function to
have a time dependence.
Thus ψ(⃗x, t) is the wave function of a particle in the position basis, and ψ̃(⃗p , t) is the wave
function of the same particle in the momentum basis. Note that in general the wave function
is complex. The relation between ψ(⃗x, t) and ψ̃(⃗p , t) is:
\tilde{\psi}(\vec{p}\,, t) = \frac{1}{(2\pi\hbar)^{3/2}} \int d^3x\, \psi(\vec{x}, t)\, e^{-i\vec{p}\cdot\vec{x}/\hbar}    (2.41)
Exercise 2.8. The appearance of ℏ in the exponential is easy to understand. But can you
figure out why it also appears in the pre-factor? (Hint: Consider the inverse transform giving
ψ(⃗x, t) in terms of ψ̃(⃗p , t) and require that the same pre-factor should appear there.).

It should be clear by now that the concept of “wave function” is not special to quantum
physics – we have seen such functions even when discussing classical waves. However, “wave
function of a particle” is special and arises only in quantum physics.
11
There is also a relation E = hν = ℏω relating energy and frequency, but it is less useful for matter waves
since their frequency is not linearly related to momentum and less straightforward to measure.
12
We have switched to using p⃗ = ℏ⃗k rather than ⃗k as the variable.

Let us consider wave functions in the position basis. How should the momentum p⃗ be
represented in this basis? For this, we use Eq. (2.39) and Eq. (2.35) to write:
\vec{p} \;\rightarrow\; \hbar\vec{k} \;\rightarrow\; -i\hbar\vec{\nabla}_x    (2.42)


Thus we say that in the position basis, p⃗ is represented on the wave function ψ(⃗x) by −iℏ∇.
This is not a new assumption, but follows directly from the analogous relation for classical
waves together with the de Broglie hypothesis.
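A tiny symbolic check (a sketch assuming sympy is available) makes this concrete: acting with −iℏ d/dx on the de Broglie plane wave e^{ipx/ℏ} simply multiplies it by p:

import sympy as sp

x = sp.symbols('x', real=True)
p, hbar = sp.symbols('p hbar', positive=True)
psi = sp.exp(sp.I * p * x / hbar)                            # plane wave of momentum p
print(sp.simplify(-sp.I * hbar * sp.diff(psi, x) / psi))     # expect p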

Exercise 2.9. How is the position ⃗x represented in the momentum basis?

Now let us try to build up the properties of this wave function using all the above results.
This was the insight of Erwin Schrödinger. Note that this is not a derivation. Rather, like
all new ideas in physics, it is a guess – a proposal – inspired by known experimental facts
and then confirmed by more experiments. The starting point is to use the relation for the
total energy of a classical particle as the sum of kinetic plus potential energy:

E = \frac{\vec{p}^{\,2}}{2m} + V(\vec{x})    (2.43)
where V (⃗x) is some arbitrary potential. We would like to treat this equation as a condition
imposed on the wave function, which hopefully is powerful enough to determine the wave
function in any given situation.
So let us first require:
\left( \frac{\vec{p}^{\,2}}{2m} + V(\vec{x}) \right) \psi(\vec{x}) = E\, \psi(\vec{x})    (2.44)
where ψ(⃗x) is a time-independent wave function. The first term on the left-hand side
involves the particle momentum p⃗ . We explained above that in the position basis, this gets
replaced by −iℏ∇⃗. Thus we get:

\left( -\frac{\hbar^2}{2m} \vec{\nabla}^2 + V(\vec{x}) \right) \psi(\vec{x}) = E\, \psi(\vec{x})    (2.45)

This is called the “time-independent Schrödinger equation”.



What about the time-dependent case? For this, we replace E by ℏω and then ω by i ∂/∂t
(following Eq. (2.38)). The final result is then:

i\hbar\, \frac{\partial}{\partial t}\, \psi(\vec{x}, t) = \left( -\frac{\hbar^2}{2m} \vec{\nabla}^2 + V(\vec{x}) \right) \psi(\vec{x}, t)    (2.46)

This is known as the “time-dependent Schrödinger equation” 13 . It is worth repeating that
this equation cannot be derived. It is one of the basic postulates of quantum mechanics.
However it is extremely well-motivated once we know the basic properties of waves, as well
as the experimental fact that p⃗ = ℏ⃗k.
We know that solutions of differential equations are determined once suitable boundary
conditions are imposed. So we may hope that this equation is enough to determine ψ(⃗x) in
every physical situation, and that turns out to be true – Schrödinger’s equation tells us how
to find the wave function associated to a particle. But it does not tell us how to interpret
it. We will come to that soon.
To summarise, the wave-function ψ(⃗x, t) of a non-relativistic particle satisfies the Schrödinger
equation. The equation depends on (i) the mass of the particle and the potential in which it
is moving, (ii) Planck’s constant (through its scaled version ℏ = h/2π). We can imagine solving
the equation in specific cases to find the wave function.
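As a preview of what “solving the equation in specific cases” can look like in practice, here is a minimal numerical sketch (our own illustration, not part of the formal development; it assumes Python with numpy and uses units in which ℏ = m = ω = 1). It discretises the time-independent equation (2.45) for the harmonic potential V(x) = ½ mω²x² on a grid, replacing the second derivative by a finite-difference matrix, and then diagonalises the resulting matrix:

import numpy as np

hbar = m = w = 1.0                         # convenient units
x = np.linspace(-8.0, 8.0, 1000)
dx = x[1] - x[0]
n = len(x)

# Second derivative as a tridiagonal finite-difference matrix
lap = (np.diag(np.ones(n - 1), -1) - 2.0 * np.eye(n) + np.diag(np.ones(n - 1), 1)) / dx**2
H = -(hbar**2 / (2 * m)) * lap + np.diag(0.5 * m * w**2 * x**2)

energies = np.linalg.eigvalsh(H)[:4]
print(energies)                            # expect roughly hbar*w*(n + 1/2) = 0.5, 1.5, 2.5, 3.5

The lowest eigenvalues come out very close to ℏω(n + 1/2), the well-known harmonic oscillator spectrum.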
But this brings up a major question: what exactly does the wave function tell us about the
physics of the particle?

13
In principle, ∇⃗ should be written as ∇⃗ ⃗x to indicate that it is a differential operator with respect to ⃗x,
but this will be understood from the fact that it acts on a function of ⃗x.

