Notes
J. de Graaf
September 12, 2024
Introductory Remarks
The lecture notes for the course Advanced Statistical Physics (Voortgezette Statistische Fysica; NS-
370B) were originally put together by R. van Roij and later extended and reworked by L. Filion. The
version you have before you today has been subsequently modified by J. de Graaf to account for changes
in the content and structure of the course. This includes the repartitioning of material between academic
years 2021-22 and 2022-23 to accommodate the topic of ideal quantum gases.
While care has been taken to avoid mistakes, some issues and typos may be present. The authors
apologize in advance for any inconvenience this may cause. All feedback regarding mistakes, confusing
text, typos, etc. is sincerely appreciated and will be used to improve future versions. Throughout the
course, electronic versions of the updated notes will be made available on a weekly basis, which should eliminate
any remaining problems. Please direct feedback or other questions regarding the course to the course
coordinator J. de Graaf via the e-mail address j.degraaf@uu.nl.
Course Approach
At Utrecht University, Advanced Statistical Physics is the gateway to both the theoretical and exper-
imental physics Master’s programs. In addition, knowledge of statistical physics and thermodynamics
will also prove extremely useful for climate physics. We therefore recommend that you try your best to
grasp the concepts presented in this course, as you will encounter them throughout your career as a
physicist. That is, the methodologies discussed here appear across all length and time scales that are
considered in physical theories — from cosmic events to quantum systems — and beyond, e.g., when
considering the concept of entropy in biology, general complex systems, and even computer science.
Advanced Statistical Physics is probably the first third-year physics course that you encounter during
your studies. As such, the approach taken during this course will be a bit different from what you
are used to thus far. Lectures will focus more on conceptual aspects, less on working out problems or
showcasing derivations from the notes. You are expected to work these out on your own, before the
lecture. The problem classes will cover more material and the average student is not expected to be able
to finish the exercises during the regular contact hours; more so than was the case for your second-year
courses. In addition, you will also be asked to independently work through one of the chapters. This
will help prepare you for the level of independence that you will need to show during any of the Master’s
programs, to which your Bachelor’s grants access.
The intent is that any student who has worked through all the exercises independently should not be
able to fail the exam, barring exceptional circumstances. This is because 70% of the exam will comprise
(parts of) example derivations and exercises that appear in the notes, which are for the most part
covered by your homework. The reason for this approach is to incentivize you to complete all the
problems before taking the exam. Have a look at some of the exams of the past years in the first few
weeks of the course and you will see how this is implemented. This will also help prepare you (in terms
of mindset) for the speed at which you are expected to be able to move through the problem sets. The
self-study chapter will be tested during the exam, so studying it well is important. The
final exam covers all the homework, while the midterm only tests the homework and
sections covered up to that point.
No solutions to problems will be provided. This is done to force you to engage with the problems and get
stuck. Being stuck on a problem is quite natural and it is crucial to develop experience in becoming
unstuck on your own. A physicist is expected to gain confidence in tackling problems to which there
are no set answers, be that in academia or industry. Thus, how you handle a no-answer course is telling
of your ability to engage with (physics) Master’s programs, especially the tougher ones. As a class, you
are, however, encouraged to cooperate to complete as many of the exercises as is possible, as physicists
also work together. Lastly, it is strongly recommended that you prepare the exercises before the problem
classes to ensure that you can make the most out of these. We appreciate your effort in this regard.
Grading Scheme
Course Content
This course explores the principles and applications of thermodynamics and statistical physics, empha-
sizing the description of classical many-body systems and touching upon a few simple quantum gases.
We cover the following topics: phase transitions (gas-liquid condensation, magnetic ordering, crystal-
lization, phase separation, and liquid-crystalline order), critical phenomena (exponents, divergent length
scales, and fluctuations), and the structure and thermodynamic properties of non-ideal gases, classical
fluids, and liquid crystals. The theoretical framework comprises mean-field theory, a simple renormal-
ization group of spin systems, Landau theory for first- and second-order phase transitions, nucleation
theory, the virial expansion for non-ideal atomic gases, and Onsager theory for anisotropic particles. In
addition, the various thermodynamic potentials (energy, free energy, enthalpy,
Gibbs free energy, and grand potential) are related to each other via Legendre transformations; universal
thermodynamic identities are also derived.
Course Goals
By the end of this course, you will have learned and may be tested on the following aspects of (advanced)
statistical physics:
• Associate (generalized) partition functions to thermodynamic potentials and derive corresponding
thermodynamic identities. Connect potentials to each other via Legendre transforms.
• Formulate a (grand) canonical partition function for a simple model. Know the classical and
quantum mechanical distributions for ideal gases. Determine thermodynamic properties
from these.
• Describe the phase transitions of (an)isotropic particles and study these using mean-field theory
and renormalization-group calculations.
• Identify scalar order parameters and apply these to phase transitions. Use such parameters to
describe first- and second-order phase transitions within the Landau theory.
• Compute thermodynamic quantities for classical non-ideal gases and apply thermodynamic perturbation
theory to a known reference system to chart the behavioral changes.
Further Reading
The contents of these lecture notes heavily draw upon statistical physics textbooks and third-party
lecture notes; too many to list here. However, the format of and approach taken in these lecture notes
may not necessarily suit everyone. We therefore suggest the following extra reading material:
1. Introduction to Modern Statistical Mechanics (1987) by D. Chandler
2. Statistical Mechanics, Third Edition (2011) by P.K. Pathria & P.D. Beale
3. Statistical Mechanics, Second Edition (2016) by K. Huang
4. Theory of Simple Liquids (2006) by J.-P. Hansen & I.R. McDonald
5. Statistical Mechanics: Entropy, Order Parameters, and Complexity (2006) by J.P. Sethna
The books by Chandler and Pathria & Beale are more mathematically focussed. The former is more
appropriate to a graduate level and might not be the easiest to get into without a solid understanding
of statistical physics already. The book by Hansen & McDonald is also more appropriate for graduate
students, while Sethna gives a quite different perspective on the material and provides a lot of interesting
exercises. Lastly, Huang might be a good source for additional information on ideal quantum gases.
Contents

1 Thermodynamics
    1.2.1 Work
    1.6 Exercises
2 Statistical Mechanics
    2.6 Exercises
    3.6 Exercises
    4.8 Exercises
5 Chemical Equilibria
    5.5 Exercises
6 Phase Diagrams
    7.5 Transfer Matrix Method for 1D Ising Model with Field
    11.5 Correlations and Mean-Field Theory for the Ising Model
Chapter 1
Thermodynamics
Before proceeding, it is important to once and for all make very clear the distinction between ther-
modynamics, statistical physics, and statistical mechanics. Thermodynamics describes the behavior of
many-body systems using bulk quantities, such as pressure, temperature, etc. These quantities describe
the macroscopic behavior of the system. In contrast, statistical mechanics starts with a microscopic
picture of the system and uses this to determine macroscopic properties. Thus, thermodynamics follows
from statistical mechanics in the limit of an infinite number of particles, which effectively holds for ev-
eryday situations. Statistical physics is a branch of physics that employs methods of probability theory
and statistics, but does not focus exclusively on the derivation of thermodynamics from microscopic
considerations. Statistical mechanics is therefore a subset of statistical physics, though the two terms
are often loosely used, which has muddied the distinction.
In this chapter, we review the thermodynamics that we will need for this course. The topics herein are
covered very briefly, as we assume familiarity with basic thermodynamics. Any reader who struggles
with these concepts or requires a more in-depth refresher is recommended to refer to a basic textbook
on the subject (e.g., Blundell & Blundell, the book used in our Statistical Physics course [NS-204B]).
1.2.1 Work
Given a generalized applied ‘force’ f and an associated (conjugate) extensive mechanical variable X,
the work done on a system is defined to be
δW = f · dX. (1.2)
We can exert many different types of f on a system: pressure, an external field, a chemical potential,
etc. In the case of an externally applied pressure, the conjugate variable would be the volume, leading
to the expression for mechanical work: δW = −pdV . Here, the minus sign comes from the external
application of the force on the system: for positive applied work the system’s volume shrinks as a
consequence and its pressure goes up, hence the sign convention. Similarly, we can exert a mechanical
force with magnitude f on a block resting on a floor, which moves as a consequence thereof. When the
force is directed parallel to the floor, the distance x over which the block moves is the conjugate variable; this
gives us: δW = f dx. For chemical work, the force is the chemical potential µ of the system, and the
associated ‘distance’ is the change in the number of particles N . That is, N is the conjugate variable to
µ and this results in the work: δW = µdN . Positive chemical work on the system increases the number
of particles in it. Lastly, we should note that electric and magnetic work can be performed, but that
the latter is not without controversy [Macdonald, Am. J. Phys. 67, 613 (1999)].
Here, we will consider the paths shown in Fig. 1.1 to provide a minimal intuition into path dependence
and hence the nature of exact and inexact differentials. In path I on the left, the volume V first increases
Figure 1.1: Comparison of two paths in a pV -diagram. The paths connect the points A and B that
have identical coordinates on both diagrams. These respectively indicate the initial and final states of
a closed system, for which work must be exerted to change the volume V . The hashing indicates the
work performed during this process.
at constant pressure p (∆V > 0, external work done), then the pressure decreases at constant volume
(∆V = 0, no work done). In path II on the right, these operations are performed in reverse order. In
the pV -diagram, the area under the curve represents the work done during the transition from initial
state to final state. This is represented by the hashed area. Clearly, the work done in passing from A
to B is greater for path I than it is for path II. We conclude that unless the path is specified, the work
done in the process A → B cannot be determined. The associated differential δW must therefore be
inexact. Referencing Eq. (1.1) and noting that UB − UA is the same for both paths in a closed system, we
know that δQ must also be path dependent, in such a way that the sum of δQ and δW forms the exact differential dU .
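To make this path dependence concrete, the short sketch below evaluates the work for the two paths of Fig. 1.1 using assumed, illustrative values for the pressures and volumes at A and B; it is not part of the original notes, just a numerical restatement of the argument above.

```python
# Illustrative values (assumed) for the end points A and B in a pV-diagram.
V_A, p_A = 1.0, 2.0   # initial volume and pressure (arbitrary units)
V_B, p_B = 2.0, 1.0   # final volume and pressure

# Work done on the system is W = -integral of p dV; only the isobaric legs contribute.
# Path I: expand at constant p = p_A first, then lower the pressure at constant V.
W_I = -p_A * (V_B - V_A)

# Path II: lower the pressure at constant V = V_A first, then expand at constant p = p_B.
W_II = -p_B * (V_B - V_A)

print(W_I, W_II)   # -2.0 versus -1.0: same end points, different work, so dW is inexact
```

The two results differ even though the end points coincide, which is exactly the statement that δW is an inexact differential.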
We note that the equality in Eq. (1.3) holds only when the heat δQ is taken up reversibly by the system.
In this case, the entropy is a state function! A process is reversible when the forward path
between two states A and B can be traversed in reverse to arrive back at A. An example of a reversible
process in thermodynamics is a slow compression of a gas (at fixed pressure and fixed number of particles)
without going through a phase transition. Examples of irreversible processes are: plastic mechanical
deformation under an applied force, heat flow from hot to cold with a finite temperature difference, and
mixing of gases. In general, reversible processes are idealizations: in physical reality, friction and
dissipation are present. It is prudent to revisit your basic thermodynamic lecture notes and materials,
if you struggle with the concepts of path dependence and entropy, and how these relate to heat engines
and the laws of thermodynamics.
Thus, the conjugate — in this case intensive — variables may be obtained from the natural variables of
the thermodynamic potential through partial differentiation. This implies that thermodynamic poten-
tials are determined up to a constant, effectively irrelevant offset. That is, this offset drops out of any
physical quantity that results from imposing the value of the natural variables.
Assume we have a real-valued function f which is a natural function of a set of n variables {xi }. Then
the differential may be written as
\[
  df = \sum_{i=1}^{n} u_i \, dx_i , \qquad \text{with} \qquad u_i = \left( \frac{\partial f}{\partial x_i} \right)_{\{x_{j \neq i}\}} . \tag{1.6}
\]
Here, the ui serve as the conjugate variables to the xi . Now, let us define a new function g such that
\[
  g = f - u_k x_k , \tag{1.7}
\]
meaning that g is a natural function of variables x1 , x2 , · · · , xk−1 , uk , xk+1 , · · · , and xn . Now g is the
Legendre transform of f .
Note that we have swept two rather important aspects of f under the rug in our above derivation.
In order to be able to Legendre transform f : (i) The function must be strictly convex. For a single-
variable function, this implies that its second derivative is strictly positive. This, in turn, implies strict
monotonic increase of the first derivative, i.e., it may be inverted. (ii) The function must be sufficiently
smooth. The latter is casually assumed to be true for most physical systems.
𝑓(𝑥)
0 𝑥 0 𝑥
𝑓 𝑥 − 𝑢𝑥
Figure 1.2: Graphical representation of the Legendre transformation. (left) A convex function f (x)
(solid blue curve) and a non-convex function (dotted red curve). We also indicate a family of functions
f˜ (solid light-blue curves) that follow from f . If we assign u = ∂f /∂x for the slope, then each member
of the family gives rise to the same value of f (u), where we have used inversion to write x(u). (right)
The tangent line (dashed red) construction for the Legendre transform.
Taking the tangent line to a specific point of f , we can construct the intercept with the vertical axis, see
the right-hand panel to Fig. 1.2. Note that this intercept is g(x) as defined in Eq. (1.7). However, this
is still not sufficient, because we require g(u), which is where we rely on the convexity condition on f to
invert u(x) to x(u). Why is this an acceptable definition of a Legendre transform? The intercept seems
an arbitrary choice. However, note that this is a number for which a line with slope u(x) is tangent to
f (x). That is, given g(u), u, and x, you will uniquely define f (x), rather than a family of curves as on
the left-hand side of Fig. 1.2. We refer the interested (or confused) reader to the pedagogical discussion
of the Legendre transform by Zia et al. [Am. J. Phys. 77, 614 (2009)] for further information.
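As a minimal worked example (not taken from the notes), consider the strictly convex function f(x) = x²/2, for which the inversion step is trivial:
\[
  u = \frac{\partial f}{\partial x} = x \quad \Rightarrow \quad x(u) = u \quad \Rightarrow \quad g(u) = f\big(x(u)\big) - u\,x(u) = \frac{u^2}{2} - u^2 = -\frac{u^2}{2} .
\]
This matches the intercept interpretation given above: the tangent to f at the point x has slope u = x and intersects the vertical axis at f(x) − u x = −u²/2.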
We can now use the Legendre transformation to switch between different thermodynamic potentials, i.e.,
ones that depend on natural variables. Specifically, at constant N , V , and T , the relevant free energy
is called the Helmholtz free energy and it is constructed by
Legendre transforming the internal energy U with respect to the entropy S. Hence, F = U − T S, and
\[
  S = -\left( \frac{\partial F(N,V,T)}{\partial T} \right)_{N,V}, \qquad
  p = -\left( \frac{\partial F(N,V,T)}{\partial V} \right)_{N,T}, \qquad
  \mu = \left( \frac{\partial F(N,V,T)}{\partial N} \right)_{V,T}. \tag{1.10}
\]
Lastly, we remark that in literature, the Helmholtz free energy is often simply referred to as the free
energy and can be denoted by A rather than F .
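As a short consistency check (a standard step, spelled out here because Eq. (1.10) is quoted without its intermediate step), combine F = U − T S with the first law dU = T dS − p dV + µ dN :
\[
  dF = dU - T\,dS - S\,dT = -S\,dT - p\,dV + \mu\,dN ,
\]
so that the coefficients of dT , dV , and dN are precisely the partial derivatives listed in Eq. (1.10).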
Similarly, we can construct a free energy at constant N , p, and T by Legendre transforming with respect
to the volume, such that G = U − T S + pV . This is called the Gibbs free energy and has the following derivatives
\[
  S = -\left( \frac{\partial G(N,p,T)}{\partial T} \right)_{N,p}, \qquad
  V = \left( \frac{\partial G(N,p,T)}{\partial p} \right)_{N,T}, \qquad
  \mu = \left( \frac{\partial G(N,p,T)}{\partial N} \right)_{p,T}. \tag{1.12}
\]
The case with independent variables (T, V, µ) corresponds to the grand potential Ω, which is given by Ω = U − T S − µN = F − µN , with
\[
  S = -\left( \frac{\partial \Omega(T,V,\mu)}{\partial T} \right)_{V,\mu}, \qquad
  p = -\left( \frac{\partial \Omega(T,V,\mu)}{\partial V} \right)_{T,\mu}, \qquad
  N = -\left( \frac{\partial \Omega(T,V,\mu)}{\partial \mu} \right)_{T,V}. \tag{1.14}
\]
1.6 Exercises
Q1. Intensive Variables
Consider constructing a thermodynamic potential for an ensemble where all the intensive variables
(T , p, and µ) are fixed. Would this be possible? Explain.
(a) 1/T , V , N
(b) 1/T , V , µ/T
(a) If we define
\[
  C_V = T \left( \frac{\partial S}{\partial T} \right)_{V,N}, \tag{1.18}
\]
assuming
\[
  C_p = T \left( \frac{\partial S}{\partial T} \right)_{p,N}. \tag{1.21}
\]
dE = T dS − P dV + µdN. (1.22)
Using this formula, one can derive the definition for, e.g., the pressure
\[
  P = -\left( \frac{\partial E}{\partial V} \right)_{S,N}. \tag{1.23}
\]
(a) Rewrite the above equation to the form dS = · · · , and find expressions for
\[
  \left( \frac{\partial S}{\partial V} \right)_{N,E} \qquad \text{and} \qquad \left( \frac{\partial S}{\partial N} \right)_{V,E}. \tag{1.24}
\]
(b) Now, consider the more general formula of this type Adx + Bdy + Cdz = 0. By finding
expressions for
\[
  \left( \frac{\partial z}{\partial x} \right)_{y}, \qquad \left( \frac{\partial z}{\partial y} \right)_{x}, \qquad \text{and} \qquad \left( \frac{\partial y}{\partial x} \right)_{z}, \tag{1.25}
\]
derive the triple product relation
\[
  \left( \frac{\partial x}{\partial y} \right)_{z} \left( \frac{\partial y}{\partial z} \right)_{x} \left( \frac{\partial z}{\partial x} \right)_{y} = -1. \tag{1.26}
\]
(c) You can also derive the triple product relation in a more intuitive way using geometry, see
Fig. 1.3. Imagine a surface given by z(x, y) with a triangle drawn on it, see the figure below.
This particular triangle is special: when you go from point 1 to point 2, y is kept constant.
Similarly, z is kept constant going from 2 to 3, and x is kept constant going from 3 to 1.
Starting at point 1, which has the position (x0 , y0 , z0 ), we follow the line from 1 to 2 to find
an expression for the position of point 2
\[
  \left( x_0 + \Delta x, \; y_0, \; z_0 + \left( \frac{\partial z}{\partial x} \right)_{y} \Delta x \right), \tag{1.27}
\]
where ∆x is the distance traveled in the x-direction going from point 1 to 2. In a similar
fashion, continue following the line to point 3, and find an expression for the position of point
3. Then, follow the line from point 3 to point 1, and find an expression for the position of
point 1. Compare this expression to (x0 , y0 , z0 ) to derive the triple product relation.
(d) Use the triple-product rule to show that entropy maximization implies internal energy mini-
mization for an isolated system. Start by writing down the conditions for having a maximum
in S in terms of U and the other thermodynamic variables.
Figure 1.4: Schematic of a machine capable of taking up an amount of heat Q from a heat bath at
temperature T and converting this into work W .
does not change during the process, and that the entropy of the heat bath decreases. By how
much? Is the theorem a version of the second law of thermodynamics?
(b) Although it is not possible to convert all heat from a heat bath to work, it is certainly possible
to convert a part of the heat to work. Figure 1.5 shows the machine M that takes up an
Figure 1.5: A more involved machine that not only takes up heat Q1 from a bath at temperature T1 ,
but also puts some of it Q2 back into another bath at temperature T2 , while converting some of the heat
taken up into work W .
amount of heat Q1 from a heat bath (“boiler”) at temperature T1 , conducts work W and
deposits heat Q2 in a cold heat bath (“condenser”) at temperature T2 . What is the relation
between Q1 , Q2 and W according to the first law of thermodynamics?
We next assume that M is reversible, i.e., the machine can go backwards through reversing the
inputs and outputs (the machine hence takes up work W from the environment, takes up heat
Q2 from the cold heat bath and deposits heat Q1 in the warm heat bath) [think about a fridge].
We consider now another machine M ′ , not necessarily reversible, that also operates between the
Figure 1.6: Representation of coupled heat engines, of which M is reversible and M ′ is not necessarily
reversible, also see the description in the question.
temperatures T1 and T2 . This machine takes up an amount of heat Q′1 from the warm bath,
conducts work W ′ , and is constructed such that precisely Q2 is dumped into the cold bath (not
Q′2 ). As indicated in the sketch below, work W ′ is partly used to turn back machine M, and the
rest W ′ − W can be used for other purposes; this situation is represented in Fig. 1.6.
(c) Show that the theorem applied to the combined machine M+M’ forbids that W ′ − W > 0. It
follows that W ′ ≤ W . Check that this implies that no machine can be more efficient than a
reversible machine! You should define efficiency as delivered work per amount of heat taken
up from the boiler.
(d) Assume now that M’ is also a reversible machine. Show that then W = W ′ . The brilliant
conclusion drawn by Carnot is that every reversible machine regardless of the design delivers
the same amount of work per unit heat for fixed T1 and T2 . The function W (Q1 , T1 , T2 ) is
therefore a property of nature, not of the machine!
(e) Since every reversible machine delivers the same amount of work W (Q1 , T1 , T2 ), it can be
calculated by considering a very simple machine, e.g., the Carnot cycle of a classic mono-
atomic ideal gas. Use that Q1 /T1 = Q2 /T2 for a Carnot engine (this can be shown, but
you do not need to do that). Combine this with your answer to part (b) and calculate the
universal maximum amount of work W (Q1 , T1 , T2 ).
(f) Now consider an infinitesimal reversible change in the state of a certain quantity of matter
by means of an amount of work dW done by the system and/or receiving an amount of
heat dQ. We know already that dE = dQ − dW does not depend on the path chosen from
the initial to the final state, but dQ and dW do. Why was this again? Consider the quantity
\[
  \int_{b}^{e} \frac{dQ}{T}
\]
over a reversible path from an initial state b to another final state e. The question is now
whether this integral depends on the chosen path. The answer is no. Argue this based on the fact
that each reversible cycle can be built up from a large number of small Carnot cycles, such
that it follows for each reversible loop process that ∮ dQ/T = 0. It follows then that dS = dQ/T is
a total differential, and thus S is a function of state (that is hence independent of how the
system arrived at a certain state). To understand the meaning of S requires a microscopic
theory (such as statistical mechanics), but the existence of a state function S follows from purely
macroscopic (thermodynamic) considerations, as you have seen in this exercise.
Chapter 2
Statistical Mechanics
In thermodynamics, we describe the behavior of many-body systems using bulk quantities, such as
pressure, temperature, etc. These quantities describe the macroscopic behavior of the system. In
contrast, in statistical mechanics, also often referred to as statistical thermodynamics for this purpose,
we start with a microscopic picture of the system and use it to derive the macroscopic properties. To
this end, we often distinguish between microstates and macrostates. In a microstate, we specify the
value of every degree of freedom available to our system. For instance, the position and momentum of
all particles in a gas, or the direction of all spins in a magnet. A macrostate, however, contains many
microstates, and classifies systems based on macroscopically measurable quantities. The information
that is lost by translating between a microstate and macrostate description of a system gives insight
into the concept of entropy, as we shall see.
Formulating the link between microscopic realizations of the system and macroscopic observables is one of
the major success stories in modern physics. It is particularly appealing that the theory hinges on a single
fundamental assumption, which leads to thermodynamics after some (at times laborious) mathematical
derivations. Throughout this chapter, we will assume discrete (countable) microstates and we refer to
Chapter 3 for a discussion of classical continuous ensemble theory.
The fundamental assumption of statistical mechanics states that: When a system is in equilibrium, all
microstates of a closed system are equally likely. In other words, if we have some way of counting the
number of possible states Ω(N, V, U ), then the probability to find a closed system in a microstate m is
given by
\[
  P(m) =
  \begin{cases}
    \dfrac{1}{\Omega(N,V,U)} & \text{if } U(m) = U, \\[6pt]
    0 & \text{if } U(m) \neq U.
  \end{cases} \tag{2.1}
\]
Here, U (m) is the total energy of microstate m. Note that the above expression conditions microstates
m to satisfy the internal energy criterion, i.e., only states with internal energy U are permitted. This
Figure 2.1: Visual representation of the ensemble construction that is discussed in this chapter. (left)
The microcanonical ensemble. This ensemble is characterized by an isolated system with fixed volume
V , number of particles N , and energy U . (middle) The canonical ensemble. We consider a region of
interest with fixed particle number N and volume V (black rectangle) that is embedded in a much
larger microcanonical volume. This volume serves as a reservoir with effective temperature T , which
can exchange energy with the canonical region of interest. This canonical region is thus maintained at
temperature T , but has fluctuations in U . (right) The grand canonical ensemble. We now consider a
region with fixed volume V (dashed black rectangle) that can exchange energy and particles with its
embedding microcanonical reservoir with associated temperature T and chemical potential µ.
type of, admittedly slightly convoluted, notation will be useful in discussing the microcanonical ensemble
in continuum phase space in Chapter 3.
Consider a subsystem with a fixed number of particles N and fixed volume V that is sitting in a large reservoir, see Fig. 2.1(middle). Assume further that
the total system (subsystem plus reservoir) is closed, and that the subsystem can exchange energy with
the reservoir. Let Ωr (Ur ) be the number of microstates of the reservoir with energy Ur , and let ϵm
be the energy of the subsystem in microstate m. The total energy of the closed system is given by
Utot = Ur + ϵm . The total number of microstates of the closed system can be expressed as
\[
  \Omega_{\mathrm{tot}} = \sum_{m} \Omega_r \left( U_{\mathrm{tot}} - \epsilon_m \right), \tag{2.4}
\]
where the sum over m runs over all microstates of the subsystem. From the fundamental assumption
of statistical physics, every microstate of the closed (total) system is equally likely. Then, if Pm is the
probability of finding the subsystem in state m, we have
\[
  P(m) = \frac{\Omega_r \left( U_{\mathrm{tot}} - \epsilon_m \right)}{\Omega_{\mathrm{tot}}}. \tag{2.5}
\]
That is, P (m) is the fraction of the total number of microstates, for which the subsystem is in state m.
We will now rewrite the probability using the mathematical identity (“log” denotes the natural logarithm
throughout):
Ωr (Utot − ϵm ) = exp [log [Ωr (Utot − ϵm )]]. (2.6)
If we assume the reservoir is very large, then Utot ≫ ϵm . Next, we Taylor expand Eq. (2.6) around Utot
to obtain
Ωr (Utot − ϵm ) ≈ exp [log [Ωr (Utot )] − βϵm ], (2.7)
where we have used
\[
  \beta \equiv \frac{\partial \log \Omega_r (U_{\mathrm{tot}})}{\partial U_{\mathrm{tot}}}, \tag{2.8}
\]
which follows from the definition of temperature given in Eq. (2.3). Plugging this into the probability
P (m) we obtain
\[
  P(m) = \frac{\Omega_r (U_{\mathrm{tot}}) \, e^{-\beta \epsilon_m}}{\Omega_r (U_{\mathrm{tot}}) \sum_m e^{-\beta \epsilon_m}} \equiv \frac{e^{-\beta \epsilon_m}}{Z(N,V,T)}, \tag{2.9}
\]
where Z(N, V, T ) is the partition function, which is given by
\[
  Z(N,V,T) = \sum_m e^{-\beta \epsilon_m}. \tag{2.10}
\]
The distribution function e−βϵm is called the Boltzmann distribution and indicates the probability
of encountering a state with energy ϵm given a fixed temperature T . The associated Helmholtz free
energy, i.e., the free energy in the N , V , and T ensemble, is given by βF (N, V, T ) = − log(Z(N, V, T )).
Now that we have access to the probability distribution, we can determine averages, such as the average
energy
\[
  E \equiv \langle \epsilon_m \rangle = \sum_m \epsilon_m P(m) = \frac{1}{Z} \sum_m \epsilon_m e^{-\beta \epsilon_m} , \tag{2.11}
\]
which can equivalently be written as E = −∂ log Z(N, V, T )/∂β, where the chain rule is applied to the logarithm. In statistical mechanics, the partition functions contain
the properties of the system within them. Thus, once we know the partition function — which may in
practice be very hard to compute — we can derive from it the macroscopic physical quantities of interest.
Here, one must be careful in recognizing that the thermodynamic relations between averaged quantities,
such as U , S, and F , hold only in the thermodynamic limit, but they can be extended to effectively hold
for systems containing N ≫ 1 particles. Thus, it is commonplace to make the identification U = E, but
that technically requires our small subsystem to be sufficiently large that the energy fluctuations on E
are negligible. You will encounter several examples of this in the exercises associated with this chapter.
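As a concrete illustration (a hypothetical two-level system, not an example from the notes), the sketch below builds the partition function, the Boltzmann probabilities, and the average energy, and numerically checks that E = −∂ log Z/∂β.

```python
# Sketch: canonical ensemble of an assumed two-level system with energies 0 and eps.
import numpy as np

eps = 1.0                                     # level spacing (arbitrary energy units)
beta = 2.0                                    # inverse temperature 1/(k_B T)
energies = np.array([0.0, eps])

Z = np.sum(np.exp(-beta * energies))          # partition function, cf. Eq. (2.10)
P = np.exp(-beta * energies) / Z              # Boltzmann probabilities, cf. Eq. (2.9)
E = np.sum(energies * P)                      # average energy, cf. Eq. (2.11)

# Finite-difference check of E = -d log Z / d beta.
db = 1e-6
logZ = lambda b: np.log(np.sum(np.exp(-b * energies)))
E_check = -(logZ(beta + db) - logZ(beta - db)) / (2 * db)

print(P, E, E_check)                          # the two energy estimates agree closely
```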
and the probability for each of these states is 1/Ωtot as they are all equally probable according to the
fundamental assumption. Equation (2.13) leads to the grand-canonical probability for a microstate
(m, N ) that reads
\[
  P(m, N) = \frac{e^{-\beta (\epsilon_m - \mu N)}}{\Xi(\mu, V, T)}, \tag{2.14}
\]
which is exact. Note that the existence of the reservoir only appears through the parameters β and µ. Lastly,
above we have defined
\[
  \Xi(\mu, V, T) = \sum_{N=0}^{N_{\mathrm{tot}}} \sum_m e^{-\beta (\epsilon_m - \mu N)} = \sum_{N=0}^{N_{\mathrm{tot}}} e^{\beta \mu N} Z(N, V, T), \tag{2.18}
\]
which is called the grand-canonical partition function. Note that often the number of particles in the
reservoir may be assumed to be infinite, so that the above sum runs to infinity. However, this need not
always be the case in practical examples, as we will encounter in the exercises.
All microstates of the subsystem with volume V are allowed: high-energy states, low-energy states, and
states with small and large numbers of particles. Clearly, these states are not all equally probable as
their statistical weight depends on ϵm and N through Eqs. (2.14) and (2.18) for given T and µ. One
can show, however, that the average energy U = ⟨ϵm ⟩ and the average number of particles ⟨N ⟩ are
at the maximum of strongly peaked Gaussian distributions for macroscopically large systems, allowing
for fluctuations that scale with ⟨N ⟩1/2 . These fluctuations can thus be ignored in the thermodynamic
limit, where we can speak of the number of particles N and the energy U of the system. The grand
potential, i.e., the free energy in the µ, V , and T ensemble is given by
\[
  \beta \Omega(\mu, V, T) = -\log \Xi(\mu, V, T). \tag{2.19}
\]
For a two-dimensional function f (x, y) subjected to a constraint g(x, y) = c, a new function L(x, y, λ) =
f (x, y) − λ (g(x, y) − c) may be introduced, where, λ is called the Lagrange multiplier. The idea is to
minimize L so that the minimum of f is found under the constraint g = c. Geometrically the picture
is as follows. The constraint g = c maps out a surface in (x, y) space. For a value in the range of the
function f , say q, the equation f = q maps out another surface. If q is chosen much smaller than the
value for which f also satisfies the constraint, then the surface of f = q is disjoint from that generated
by g = c. By slowly increasing q, the point at which the surfaces first touch is found. This is the desired
value of q (say q ∗ ) with associated contact point (x∗ , y ∗ ).
When two surfaces just touch, their tangent spaces coincide, so the gradient of f with respect to the
space coordinates (this maps out the tangent space) is some multiple of that of g (λ of course). In
algebraic form this condition is obtained by ∇L(x, y, λ) = 0, which gives the extremal points (x∗ , y ∗ )
for which ∇f = λ∇g, i.e., exactly the necessary condition for the tangent requirement. Note that
(x∗ , y ∗ ) are dependent on λ and by plugging these solutions into g = c, an equation for λ is obtained,
which may be solved for to eliminate λ from the problem. Note that this condition is readily recovered
from the Lagrangian by minimization
\[
  \frac{\partial L(x, y, \lambda)}{\partial \lambda} = 0. \tag{2.20}
\]
We can also provide a proof of the formalism. Let p(t) be a path, which lies on the n-dimensional
constraint surface imposed by g(r) = c, where r denotes the spatial coordinates. Suppose that the
function f (r) has an extremum at the point P on the constraint surface and that p(0) = P . Let
h(t) = f (p(t)). Then our setup guarantees that h(t) has a maximum at t = 0. Taking the derivative of
h(t) and using the chain rule, we find
\[
  \frac{dh}{dt}(t) = \left. \nabla_{\boldsymbol{r}} f(\boldsymbol{r}) \right|_{\boldsymbol{r} = \boldsymbol{p}(t)} \cdot \frac{d\boldsymbol{p}}{dt}(t), \tag{2.21}
\]
Figure 2.2: Illustration of constraint minimization via the Lagrange-multiplier formalism. We have
a function f that takes two-dimensional coordinates (x, y) to a real number. In this case f can be thought
of as creating a landscape, to which the blue (dashed) curves are iso-height contours. The function g
— also taking coordinates (x, y) — defines a constraint that our minimization must satisfy, through the
contour g(x, y) = c, which is illustrated by the thick red curve. The f contour marked with c3 (solid
blue) just touches the g contour, in the point marked with a black cross. At this point, the tangent
space (black, dotted line) is shared by both curves and their gradients point in the same direction (up
to a sign), as illustrated by the use of the arrows and labels ∇f and ∇g, respectively.
where ∇r denotes the gradient with respect to the variable r. We have that t = 0 is a local maximum,
therefore
\[
  0 = \frac{dh}{dt}(0) = \left. \nabla_{\boldsymbol{r}} f(\boldsymbol{r}) \right|_{\boldsymbol{r} = P} \cdot \frac{d\boldsymbol{p}}{dt}(0). \tag{2.22}
\]
Thus, ∇r f (r) is perpendicular to any curve on the constraint surface through P . This implies that
∇r f (r) is perpendicular to the surface. Since ∇r g(r) is also perpendicular to the surface defined
by g(r) = c, we have that ∇r f (r) is parallel to ∇r g(r) in the extremum. This implies that there
exists a real-valued λ ̸= 0, such that ∇r f (r) = λ∇r g(r). We may therefore introduce the function
L(r, λ) = f (r) − λ(g(r) − c), which equals the original function f (r) at each point of the level set and
which is extremized at the point P .
Lastly, we should note that multiple (say N ) constraints may be introduced via a set of Lagrange
multipliers
\[
  L(\boldsymbol{r}, \boldsymbol{\lambda}^N) = f(\boldsymbol{r}) - \sum_{i=1}^{N} \lambda_i \left( g_i(\boldsymbol{r}) - c_i \right), \tag{2.23}
\]
where λ^N = {λ1 , · · · , λN } and the constraints are specified by level sets gi (r) = ci .
an extremal point found by using Lagrange multipliers is a maximum, minimum, or saddle point along
the subspace defined by the constraints is not trivial. This involves properties of the bordered Hessian,
with the Hessian referring to the matrix of second derivatives. We refer to a course on multivariate calculus
or classical mechanics if you wish to learn more about the use of Lagrange multipliers. In this chapter,
we will give a few exercises, by which you can familiarize yourself with the concept, so that you can
work with it throughout these lecture notes.
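To make the formalism above tangible, here is a minimal sketch with an assumed toy problem (extremize f(x, y) = xy on the line x + y = 1); the problem is purely illustrative and is not one of the exercises.

```python
# Sketch: Lagrange-multiplier condition for an assumed constrained extremization.
import sympy as sp

x, y, lam = sp.symbols("x y lambda", real=True)
f = x * y              # function to extremize
g = x + y - 1          # constraint written as g(x, y) = 0
L = f - lam * g        # Lagrangian L(x, y, lambda)

# Setting the full gradient of L to zero enforces grad f = lambda * grad g
# together with the constraint itself, cf. Eq. (2.20).
eqs = [sp.diff(L, v) for v in (x, y, lam)]
print(sp.solve(eqs, (x, y, lam), dict=True))   # [{x: 1/2, y: 1/2, lambda: 1/2}]
```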
2.6 Exercises
Q6. A Refresher on Notation
In the following exercises, we will be working with exponents, logarithms, multivariate integrals
and products thereof. Here, we have a brief look at some of the more common notations and we
will highlight some of the convention differences between physics and mathematics.
(a) We write e^{a+b} = exp(a + b) = exp(a) exp(b) = e^a e^b . The N -variable generalization of this
expression is written
\[
  \exp \left( \sum_{i=1}^{N} a_i \right) = \prod_{i=1}^{N} \exp(a_i), \tag{2.24}
\]
where we have introduced the product notation and the ai are some (real or complex) num-
bers. Use this relation to write down the analogous form for the natural logarithm — com-
monly denoted by “log” in statistical physics — of a product of N variables ai .
(b) In statistical physics, it is commonplace to write integrals in the left-acting operator form
\[
  \int_{-1}^{1} dx \, x^2 = \left[ \frac{1}{3} x^3 \right]_{x=-1}^{x=1} = \frac{2}{3}. \tag{2.25}
\]
where r N is the set of all r i , i.e., r N = {r 1 , · · · , r N }, and Φ is some function that depends
on all of these variables. Now assume that
\[
  \Phi(\boldsymbol{r}^N) = \exp \left( - \sum_{i=1}^{N} a_i |\boldsymbol{r}_i|^2 \right), \tag{2.27}
\]
where all ai > 0 are real valued, and that the integrals in Eq. (2.26) are all over R3 . Make
optimal use of the sum and product notation to evaluate the integral, providing the requisite
intermediate steps.
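As background for this exercise (and not a substitute for working it out), the single-particle building block can be checked numerically: for one coordinate, the integral of exp(−a|r|²) over R³ equals (π/a)^{3/2}. The sketch below verifies this for one assumed value of a.

```python
# Sketch: numerical check of the 3D Gaussian integral for an assumed parameter a > 0.
import numpy as np

a = 0.7                              # assumed positive parameter
x = np.linspace(-10.0, 10.0, 2001)   # the integrand is negligible beyond |x| of a few
dx = x[1] - x[0]

one_d = np.sum(np.exp(-a * x**2)) * dx   # 1D integral; the 3D integral factorizes
numeric = one_d**3
exact = (np.pi / a) ** 1.5

print(numeric, exact)                # the two values agree to high precision
```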
The Gibbs entropy is defined as
\[
  S = -k_B \sum_s p_s \log p_s, \tag{2.29}
\]
where the sum is over all states and ps is the probability of being in that state.
(a) Show that the Gibbs definition reduces to that of Boltzmann in the microcanonical ensemble.
(b) Show that in the canonical ensemble, the Gibbs definition of entropy gives rise to a well-known
thermodynamic identity.
(c) Argue in a few words why the Gibbs definition is more general.
(d) Show that the Gibbs definition leads to an extensive entropy.
For an enlightening and surprisingly recent discussion of the difference between the two definitions
we refer the interested reader to [Frenkel and Warren, Am. J. Phys. 83, 163 (2015)]. Next, we will
consider unifying the various definitions of probability in the ensembles using the Gibbs entropy
via a variational principle.
(e) Use the fact that Σ_s p_s = 1 as a constraint to maximize Eq. (2.29). Why should you maximize
rather than minimize the equation? You should find ps = exp(λ/kB −1), with λ the Lagrange
multiplier. Why is this reasonable? Explain in a few words.
(f) Assume that there are Ω states in total and use this to solve for the Lagrange multiplier. Is
the answer you obtain the one you expect?
(g) Introduce a second condition that sets the average energy to a constant E = ⟨ϵs ⟩. Call the
Lagrange multiplier β and maximize Eq. (2.29) with respect to these two constraints. What
do you find for the probability now? Show that one of the constraints naturally leads to the
canonical partition function Z.
This approach can also be extended to include other quantities that are conserved on average and
we will see this in Chapter 4 in the context of the grand-canonical probability.
(a) What do the microstates in this system look like? Sketch at least two.
(b) Calculate the canonical partition function Z for this system.
(c) What would happen to Z if the end was no longer pinned to the origin, but instead in its
center? Explain in a few words.
• Only one piece (black or white) can occupy a square, be that a black or white square.
• Black pieces have −ϵ energy on a white square.
• White pieces do not interact with either type of square.
We will now determine the canonical partition function Z(Nw , Nb , L, T ) for this system, with T
the temperature.
(a) Assume that there is no preference for black pieces to be on white squares (set ϵ = 0 for now)
and that Nw + Nb ≤ L2 .
We now assume an equimolar mixture (50:50) of black and white pieces occupying all squares.
Next, we switch the interaction between the black pieces and the white squares back on (i.e.,
ϵ > 0).
(b) Will there be a phase transition in this system when switching on ϵ? Explain why.
(c) Provide the canonical partition function for the new situation.
(a) What do the microstates of this system look like? Draw a few.
(b) What is the internal energy of this system as a function of β = 1/kB T , H, and N (use the
ensemble characterized by these variables).
(c) Determine the entropy of this system as a function of β, H, and N .
(d) Determine the behavior of the energy and entropy for this system as T → 0.
Combine this result with extensiveness arguments to show that the standard deviation of the
energy is much smaller than the average energy in thermodynamically large systems. Use this to
show that the constant-volume heat capacity
\[
  \left( \frac{\partial \langle \epsilon_s \rangle}{\partial T} \right)_{V} \tag{2.32}
\]
can be written as
\[
  C_V = \frac{\langle \epsilon_s^2 \rangle - \langle \epsilon_s \rangle^2}{k_B T^2}. \tag{2.33}
\]
Q15. Model for the Adsorption of a Gas at a Crystal Surface from Daan Frenkel (Lecture Notes)
Consider a simple model for a surface which consists of M adsorption sites which are arranged on a
square lattice. Assume that the molecules of a gas are indistinguishable and can be adsorbed onto
one of the lattice sites, but that each lattice site can only adsorb a single molecule. Further assume
that an occupied site has binding energy ϵ and that the adsorbed gas is in thermal equilibrium at
temperature T . Note that for a two-dimensional system (such as an adsorbed layer) the area A
and surface pressure Π play the same role as the volume V and pressure P in a 3D system.
(a) If there are N molecules adsorbed onto the surface, what do the microstates for this system
look like? Calculate the canonical partition function.
(b) Compute the grand partition function. Express Π as a function of z, A, and T , where z = eβµ
is the fugacity, with β = 1/kB T .
(c) Calculate ⟨N ⟩.
(d) Calculate the occupied fraction of sites f = ⟨N ⟩/M .
(e) Use the quantities you just derived to calculate the equation of state Π(f, T ). What is the
surface pressure in the limit of low coverage? Is this what you would expect? Explain.
Q16. Defects in a Crystal
Generally when we consider a crystal, we ignore the possibility of defects like vacancies and in-
terstitials. A vacancy corresponds to an empty lattice site, while an interstitial corresponds to an
“extra” particle without its own lattice site. In reality, however, all crystals have a finite concen-
tration of such defects, even in equilibrium. In this exercise, we explore the equilibrium vacancy
concentration of the hard-sphere crystal.
Assume that we know the Helmholtz free energy of a perfect crystal F perfect (N = M, V, T ), where
N is the number of particles, M is the number of lattice sites, V is the volume, and T is the
temperature. Note that in a perfect crystal, the number of lattice sites M equals the number
of particles N . Additionally, assume that the Helmholtz free energy associated with changing a
specific particle into a vacancy, i.e., removing that specific particle, is given by f vac (ρM , T ), where
ρM = M/V , with M the number of lattice sites. Define N vac as the number of vacancies, such
that M = N + N vac . Finally, we assume that the vacancies do not interact with each other.
(a) What is the Helmholtz free energy of a crystal with M lattice sites, N particles, and temper-
ature T ? Hint: Note that since the vacancies do not interact, they are randomly distributed
through the crystal; you will have to incorporate this into the free energy.
(b) Assume that the equation of state (the pressure as a function of density) is not affected by
the presence of vacancies. In other words, the pressure P (M, N, V, T ) is not dependent on
the number of particles. Show that the Gibbs free energy is given by
\[
  \beta G(M, N, P, T) = \beta N \mu^{\mathrm{perfect}}(P, T) + \beta (M - N) \mu^{\mathrm{vac}}(P, T)
  + N \log \frac{N}{M} + (M - N) \log \frac{M - N}{M}, \tag{2.34}
\]
where µperfect (P, T ) is the chemical potential of the perfect crystal, µvac = µperfect (P, T ) +
f˜vac (P, T ), with f˜vac (P, T ) equal to f vac (ρM , T ), evaluated at the density corresponding to
pressure P in a perfect crystal. Hint: Use that G = µN for a single-component system.
(c) Simulations have found that (close to the melting point of the crystal) the chemical potential
of a vacancy µvac is approximately 8.7kB T . Assuming that the concentration is very small,
show that the equilibrium vacancy concentration at this point is approximately (M − N )/M =
1 · 10−4 . Hint: This will require minimizing the free energy in Eq. (2.34) with respect to one
of its variables; which one can be determined by thinking about what is fixed in the ensemble.
Chapter 3
Classical Ensemble Theory
In this chapter, we make the transition from partition functions and thermodynamic potentials for
countable states to continuous distributions. State-counting arguments follow by adopting an atomic
or even quantum view of the world. However, we have yet to touch upon situations where there is a
natural and continuous microscopic dynamics to the system. For example, in a gas, the trajectories
of the molecules are described by Newton’s equations of motion. Through collisions, the gas molecule
may (eventually) explore all possible positions and momenta in the system. This complicates our state-
counting procedure, at the very least in terms of normalization. The macroscopic properties of a system
may be derived from the microscopic dynamics by means of kinetic theory, as was for instance considered
by Maxwell and Boltzmann. However, it is possible to recover a ‘state-counting’ formalism as well, which
substantially cuts down on the mathematical manipulation required to obtain equilibrium quantities.
We will examine aspects of both formalisms in this chapter.
The time evolution of the system can be seen as a motion of the phase point along its phase trajectory.
This motion follows from the Hamiltonian H(r N , pN ) of the system, which is written here as
\[
  H(\boldsymbol{r}^N, \boldsymbol{p}^N) = \sum_{i=1}^{N} \frac{\boldsymbol{p}_i^2}{2m} + \Phi(\boldsymbol{r}^N), \tag{3.1}
\]
where m is the mass of the particles and Φ(r N ) the potential energy, which includes the external potential
that defines the volume V . The Hamiltonian thus depends parametrically on the number of particles
1 In general, the dimension of phase space can be determined by counting degrees of freedom. For example, if each
particle has n other internal degrees of freedom (e.g., vibrations), then the dimension becomes (6 + 2n)N .
Figure 3.1: The phase portrait of a simple pendulum. (left) A sketch of the system. (middle) The x
component of the position (blue) and the associated momentum p (red). Note that these are out of
phase by a quarter period, i.e., the momentum is zero at the maximal extension and maximal at zero
extension. (right) The phase-space representation of this simple oscillatory motion.
N and the volume V (or parameters that describe the volume V ). This form of the Hamiltonian is not
suitable for electromagnetic systems with velocity-dependent forces, but the formalism can be extended
to include these as well. The Hamilton equations
\[
  \dot{\boldsymbol{r}}_i = \frac{\partial H}{\partial \boldsymbol{p}_i} \qquad \text{and} \qquad \dot{\boldsymbol{p}}_i = -\frac{\partial H}{\partial \boldsymbol{r}_i}, \tag{3.2}
\]
together with 6N initial conditions determine the trajectory uniquely and completely. It follows that
trajectories in phase space do not intersect. This is a consequence of the dynamics being conservative.
For example, in Fig. 3.1(right), oscillations with greater amplitude would correspond to an origin-centered
circle with a larger radius; the various concentric circles clearly do not intersect.
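The non-crossing of trajectories can be visualized with a small numerical sketch. The example below (an assumed harmonic oscillator with unit mass and unit spring constant, integrated with a symplectic Euler step; not part of the original notes) integrates Eq. (3.2) for several amplitudes and traces the concentric curves of Fig. 3.1(right).

```python
# Sketch: Hamilton's equations for an assumed Hamiltonian H = p^2/2 + x^2/2.
import numpy as np
import matplotlib.pyplot as plt

def trajectory(x0, p0, dt=0.01, steps=700):
    x, p = x0, p0
    xs, ps = [x], [p]
    for _ in range(steps):
        p -= x * dt          # dp/dt = -dH/dx = -x  (symplectic Euler update)
        x += p * dt          # dx/dt = +dH/dp = +p
        xs.append(x)
        ps.append(p)
    return np.array(xs), np.array(ps)

for amplitude in (0.5, 1.0, 1.5):
    xs, ps = trajectory(amplitude, 0.0)
    plt.plot(xs, ps)          # larger amplitude gives a larger, non-intersecting circle

plt.xlabel("x")
plt.ylabel("p")
plt.show()
```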
that advects the phase-space probability f (in much the same way as a fluid velocity u moves suspended
particles j = uc with c the particle concentration). The associated flux in this picture is j = Γ̇f (Γ, t).
The change of f with time this produces is then f˙ = −(∂/∂Γ) · j, where we imply conservation of the
phase space probability (the particle analogy is ċ = −∇ · j). Imposing the continuity equation makes
sense, as the total integral over f should always be equal to 1 and thus the dynamics of f should conserve
probability. Or in other words, phase-space points can neither be created nor destroyed as time evolves.
From the Hamilton equations (3.2) it follows directly that (∂/∂Γ) · Γ̇ = 0, which allows us to write
Eq. (3.3) as
\[
  \frac{\partial f(\Gamma, t)}{\partial t} = -\dot{\Gamma} \cdot \frac{\partial f}{\partial \Gamma}
  = \sum_{i=1}^{N} \left[ \frac{\partial H}{\partial \boldsymbol{r}_i} \cdot \frac{\partial f}{\partial \boldsymbol{p}_i} - \frac{\partial H}{\partial \boldsymbol{p}_i} \cdot \frac{\partial f}{\partial \boldsymbol{r}_i} \right] \equiv \{H, f\}, \tag{3.4}
\]
where {, } denotes the Poisson bracket. If you have not seen this notation yet, you may think of it as a
classical analogue to the commutator that you know from quantum mechanics (the Poisson bracket was
obviously introduced first historically speaking). The Liouville equation is the starting point of most
theories of non-equilibrium statistical mechanics, e.g., kinetic theory.
It is common experience, however, that repeating the measurement on the same equilibrium system at
later times t0 yields an indistinguishable answer. Apparently, most values of a phase function (with a
macroscopic meaning) are close to their average value along a particular trajectory. Moreover, repeating
the measurement on a replica of the original system, i.e., following a different trajectory through phase
space, often also yields the same answer for A. This suggests an alternative microscopic description of
a macroscopic equilibrium state: instead of time averaging over a single phase trajectory, as proposed
in Eq. (3.5), we can average over a suitably constructed equilibrium ensemble with a corresponding
equilibrium probability density f (Γ) that does not depend on time explicitly. The ensemble average is
now defined as
\[
  \langle A \rangle = \int d\Gamma\, f(\Gamma) A(\Gamma), \tag{3.6}
\]
where the normalization ∫ dΓ f (Γ) = 1 is understood.
Systems for which the time average of Eq. (3.5) equals the ensemble average ⟨A⟩ for all continuous phase functions A(Γ) are called ergodic. Although
ergodicity can almost never be proven, it is often assumed to hold. There are, however, manifestly
nonergodic systems. Nonergodicity results if trajectories are restricted, for macroscopically long times,
to a subspace. This can be caused by the presence of other conserved quantities besides energy (e.g., angular
momentum), or due to spontaneous symmetry breaking (e.g., in antiferromagnets). Also systems with
an extremely slow dynamics compared to the observation time, e.g., glasses, are nonergodic.
It should now be obvious that our common experience is based around fully ergodic systems in equi-
librium. For such systems, f (Γ, t) is stationary, i.e., (∂f /∂t) = 0 in Eq. (3.4). Equilibrium ensembles
are therefore characterized by phase-space distributions f (Γ) that satisfy {H, f } = 0. That is, the
distribution Poisson-commutes with the Hamiltonian. This implies that the Γ-dependence of f can only
involve conserved quantities: typically the mass, momentum, and energy. In most cases of interest here,
the energy is the only conserved quantity2 , so that then f (Γ) = f˜ (H(Γ)) for some function f˜. We will
see how this translates to the functional form of the ensembles shortly.
Before we continue on, we should remark that the two types of average discussed here are also manifest
in present-day computer simulations of model systems for condensed matter. In Molecular Dynamics
simulations the equations of motion, Eqs. (3.2), are integrated numerically for typically N = 100 - 10,000
particles, starting from some initial configuration. This generates a phase trajectory over which time
averages are taken. Conversely, in Monte Carlo simulations, configurations are randomly generated, and
then accepted or rejected in such a way that configurations (and hence observables) are sampled with
the correct statistical weight f (Γ).
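To connect the ensemble average of Eq. (3.6) to the Monte Carlo approach just mentioned, the following minimal sketch (a single coordinate with an assumed potential, purely illustrative) uses the Metropolis rule to sample configurations with weight proportional to exp(−βΦ) and estimates ⟨x²⟩.

```python
# Sketch: Metropolis Monte Carlo for one coordinate with an assumed potential Phi(x) = x^4.
import numpy as np

rng = np.random.default_rng(0)
beta = 1.0
phi = lambda x: x**4

x = 0.0
samples = []
for step in range(200_000):
    x_trial = x + rng.uniform(-0.5, 0.5)                     # propose a small move
    if rng.random() < np.exp(-beta * (phi(x_trial) - phi(x))):
        x = x_trial                                           # Metropolis acceptance
    samples.append(x * x)

# Ensemble average <x^2>; for beta = 1 the exact value is Gamma(3/4)/Gamma(1/4) ~ 0.338.
print(np.mean(samples[10_000:]))
```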
It is tempting to identify ω as a measure for the number of states available to the system by making
the analogy to the way microcanonical probability was normalized in Chapter 2. That is, Eq. (3.7) has
the appearance of a continuous-space extension of the 1/Ω, where Ω is the total number of states in a
finite, discrete system. Thus, surely, ω should relate to the total number of states. This way of thinking
is, however, not entirely without risk, as it is difficult to assign a number to the amount of states that
ω represents, i.e., we require a form of normalization to recover our analysis from Chapter 2.
We will shortly demonstrate that a natural unit of measure for phase-space volume is given by the Planck
constant h, but for now we will assume this is the case. Thus, one could write ω/h3N for the number
2 Linear and angular momentum are not conserved due to collisions with the wall of the container that specifies the volume V .
of states. However, this is where the issues come to the fore. A hypersurface has dimension 6N − 1
and is a set of measure zero3 in the embedding 6N -dimensional phase-space volume, e.g., a sphere has
a 2D area but no volume in 3D. This means that the effective number of states on the sphere should
be counted using a phase-space volume element that has the appropriate dimension. In our example of
a sphere, the elements should have dimension of area, rather than of volume. For measuring a (6N − 1)-
dimensional hypersurface, one would need to introduce some fraction or residual of h.
One can bypass this dimensionality issue by defining a thin shell with energies between E − ∆E and
E for some small, yet finite ∆E > 0. In this case, the probability of being in a small volume dΓ
around a specific phase space point Γ can be normalized correctly, i.e., using h3N . Nonetheless, the
introduction of an arbitrary ∆E is not entirely satisfactory. This issue has led to controversy in the
statistical physics community even up to the 2000s. Presently, the preferred resolution to the issue is
the introduction of a ∆E that can be made arbitrarily small.
A system of N particles in a volume V in contact with a heat bath at temperature T can change its
energy by exchanging heat with the reservoir. For this reason, the system is no longer restricted to the
constant-energy hyper surface. Instead, one can prove that a phase point Γ with energy H(Γ) has a
probability distribution fc (Γ) ∝ exp(−H(Γ)/kB T ). It turns out to be convenient to write the canonical
distribution function as
\[
  f_c(\Gamma) = \frac{\exp[-\beta H(\Gamma)]}{N! \, h^{3N} Z(N, V, T)}, \tag{3.9}
\]
The factor h3N is included to make Z dimensionless (Exercise Q19 will clarify the choice), and the factor
N ! to make log Z extensive. We (re-)introduced the short-hand notation β = 1/(kB T ).
where F is the Helmholtz free energy, as you have derived in the exercises to Chapter 2. We know from
thermodynamics that F (N, V, T ) generates the full thermodynamics of systems with fixed (N, V, T ), just
as S(E, V, N ) does for systems with fixed (E, V, N ); this can be readily shown using Legendre transforms.
We already saw that the energy of the system at temperature T fluctuates. In the thermodynamic limit,
however, the relative fluctuations become vanishingly small as they are of order N −1/2 , and the average
energy ⟨E⟩ therefore plays the role of the internal energy U in thermodynamics, also see Chapter 2.
This also implies that the microcanonical and the canonical ensemble are equivalent, for most purposes,
in the thermodynamic limit.
3 It has measure zero in this and most physical situations. However, one can construct pathological mathematical counterexamples.
Note that the Maxwell-Boltzmann velocity distribution is easily obtained in the canonical ensemble, viz., integrating fc (Γ) over all positions and all but one momentum shows that
\[
  \left( 2 \pi m k_B T \right)^{-3/2} \exp \left[ -\frac{\boldsymbol{p}^2}{2 m k_B T} \right]
\]
is the probability density that a given particle of mass m has momentum p at temperature T .
The canonical average of momentum independent observables, i.e., observables described by phase
functions A(Γ) = A(r N ), can be written as
\[
  \langle A \rangle = \frac{1}{N! \, h^{3N} Z(N, V, T)} \int d\Gamma\, \exp[-\beta H(\Gamma)] A(\boldsymbol{r}^N)
  = \frac{1}{Q(N, V, T)} \int d\boldsymbol{r}^N \exp[-\beta \Phi(\boldsymbol{r}^N)] A(\boldsymbol{r}^N), \tag{3.13}
\]
where Q(N, V, T ) = ∫ dr N exp[−βΦ(r N )] is the configuration integral. This integral will turn out to play a much more important role in systems that have interactions between
the particles, see Chapter 13. Note that
\[
  Z(N, V, T) = \frac{Q(N, V, T)}{N! \, \Lambda^{3N}}, \tag{3.15}
\]
where the thermal (De Broglie) wavelength is defined by
\[
  \Lambda = \frac{h}{\sqrt{2 \pi m k_B T}}. \tag{3.16}
\]
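For a sense of scale (an illustrative calculation; argon and room temperature are chosen arbitrarily), Eq. (3.16) gives a wavelength far below typical interparticle distances, which is one way to see why a classical description of such a gas is adequate.

```python
# Sketch: thermal de Broglie wavelength of argon at room temperature, Eq. (3.16).
import numpy as np
from scipy.constants import h, k, u   # Planck constant, Boltzmann constant, atomic mass unit

m = 39.95 * u    # mass of an argon atom in kg
T = 300.0        # temperature in K

Lambda = h / np.sqrt(2 * np.pi * m * k * T)
print(Lambda)    # roughly 1.6e-11 m, i.e. about 0.16 Angstrom
```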
The probability distribution to find the system in a state with N particles at a phase space point Γ is
given by the grand-canonical distribution function
\[
  f_g(\Gamma, N) = \frac{1}{N! \, h^{3N} \Xi(\mu, V, T)} \exp[-\beta H(\Gamma) + \beta \mu N], \tag{3.17}
\]
The grand-canonical ensemble can be regarded as a linear combination of canonical ensembles with
different numbers of particles. Similarly, one can regard the canonical ensemble as a linear combination
of microcanonical ensembles with different energies.
In analogy with Eq. (2.19), βΩ(µ, V, T ) = − log Ξ(µ, V, T ), where Ω is the grand potential, i.e., the thermodynamic potential of a system with µ, V , and T as
independent variables. For a homogeneous system we have Ω = −p(µ, T )V with p the pressure. The
number of particles fluctuates in the grand-canonical ensemble, therefore the probability W (N ) to find
exactly N particles in the system (regardless of the state point Γ of these N particles) is obtained by
integrating out the phase-space coordinates,
\[
  W(N) = \int d\Gamma\, f_g(\Gamma, N)
  \overset{(3.17)}{=} \frac{\exp[\beta \mu N]}{N! \, h^{3N} \Xi(\mu, V, T)} \int d\Gamma\, \exp[-\beta H(\Gamma)]
  \overset{(3.18)}{=} \frac{\exp[\beta \mu N] \, Z(N, V, T)}{\Xi(\mu, V, T)}. \tag{3.20}
\]
\[
  \frac{\langle N^2 \rangle - \langle N \rangle^2}{\langle N \rangle} = k_B T \left( \frac{\partial \rho}{\partial p} \right)_{T}. \tag{3.23}
\]
z = exp[βµ], (3.24)
where si is a generalized coordinate of the system, e.g., a position coordinate or a momentum coordinate,
while ai are positive parameters, and M is the number of degrees of freedom. Examples of degrees
of freedom are provided in Fig. 3.2. The left-hand panel shows several kinetic and one possible vibra-
tional degree of freedom for a dumbbell-shaped molecule. The right-hand panel shows that many-body
interaction potentials Φ may admit a quadratic expansion about some thermodynamically favored
minimum. This fact makes the result we outline below much more generally applicable than is perhaps
suggested by the shape of Eq. (3.25).
Figure 3.2: Interpretation of the degrees of freedom used in an equipartition argument. (left) A rigid dumbbell-shaped molecule has 3 translational degrees of freedom and 2 rotational ones (assume point masses). If there is a harmonic bond between the two spheres, then it gains 1 potential-energy degree of freedom from vibrations and 1 degree of freedom from the kinetic-energy term associated with the two parts of the dumbbell moving closer together and further apart. (right) A general interaction potential Φ (blue curve) may admit a quadratic expansion about a minimum in some generalized coordinate s_i (dashed red curve); we will see an example of this in Exercise Q25. N.B., here we plot the multi-dimensional potential landscape only along one coordinate. When this local quadratic nature is pronounced compared to the thermal energy k_B T (green dotted line), this degree of freedom contributes \frac{1}{2}k_B T to the total energy of the system.
Due to the quadratic nature of the variables in the Hamiltonian in Eq. (3.25), the expectation value of
the Hamiltonian can be written
\langle H \rangle = \frac{1}{2} \sum_i \left\langle s_i \frac{\partial H}{\partial s_i} \right\rangle. \qquad (3.26)
Now, if we assume we are in the canonical ensemble, then the expectation value on the right-hand side
where dΓ(sj ) indicates an integral over dΓ excluding dsj and (sj )1 and (sj )2 are the extreme values of
the variable sj . At the extreme values, the Hamiltonian of the system becomes infinite. Specifically,
if s represented a position coordinate, then the extreme values of the coordinate correspond to the
boundary, and hence the potential energy at these points becomes infinite as the system must be
bounded. Similarly, if s corresponded to the momentum, then the extreme values are ±∞, and the
kinetic energy becomes infinite. In either case, the Hamiltonian becomes infinite, and the contribution
from the first term on the right-hand side of Eq. (3.27) goes to zero. This leaves
\left\langle s_j \frac{\partial H}{\partial s_j} \right\rangle = \frac{\dfrac{1}{\beta} \displaystyle\int d\Gamma^{(s_j)} \int ds_j\, e^{-\beta H}}{\displaystyle\int d\Gamma\, e^{-\beta H}} = \frac{1}{\beta}. \qquad (3.28)
Substituting this result back into Eq. (3.26) and summing over all M degrees of freedom yields

\langle H \rangle = \frac{M}{2\beta}. \qquad (3.29)
This result is typically referred to as the equipartition theorem and implies that each quadratic degree
of freedom contributes (1/2)kB T to the internal energy of the system.
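A minimal numerical sketch of this result, in reduced units with k_B T = 1: the thermal average of a single quadratic term a s^2 equals 1/2 regardless of the prefactor a.

```python
import numpy as np
from scipy.integrate import quad

beta = 1.0  # 1/(k_B T) in reduced units

for a in [0.1, 1.0, 25.0]:
    boltzmann = lambda s: np.exp(-beta * a * s**2)            # Boltzmann weight for H = a s^2
    Z, _ = quad(boltzmann, -np.inf, np.inf)                   # normalization
    avg, _ = quad(lambda s: a * s**2 * boltzmann(s), -np.inf, np.inf)
    print(f"a = {a:5.1f}:  <a s^2> = {avg / Z:.6f}   (expected 0.5)")
```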
It should be clear that the above argument holds in a classical system. However, our modern understand-
ing tells us that matter is quantum mechanical in nature with discretized energy levels. In Exercise Q19,
you will demonstrate that a quantum harmonic oscillator will behave classically when the temperature
T ≫ ℏω/kB , where ω is the frequency and h = 2πℏ. This transition between a quantum description
and an effective classical one has consequences for the behavior of molecules. The interpretation one
can give to the transition is that at sufficiently low temperatures (compared to ℏω/kB ) a bond is in
its ground state or some of the lower excited states. However, there is nothing quadratic about this degree of freedom; it is discrete and not excited, which means it will not contribute k_B T/2 to the internal energy of the system. Raising the temperature (sufficiently) above ℏω/k_B will make the bond behave classically and harmonically, meaning that it now contributes k_B T/2 to the system. Be careful, we
only speak of the bond here, but when a bond becomes classical, the system will typically also gain an
additional (quadratic) momentum degree of freedom, which needs to be taken into account. Let us go
into this point a bit further next.
You will have encountered a change in degrees of freedom in studying a classical system, namely, when
you considered the ratio of the constant-pressure Cp and constant-volume heat capacity CV of diatomic
gases. One can readily show that CV = (n/2)kB N , where the number of degrees of freedom per particle
is given by n. Similarly, Cp = (n/2 + 1)kB N, which means that γ ≡ Cp/CV = (n + 2)/n. For a diatomic
gas like nitrogen, we have n = 5 if the bond between the two atoms is close to the ground state: three
translational degrees and two rotational (because of symmetry); we will come back to this shortly. If,
however, the bond is vibrating, i.e., temperatures are sufficiently high for it to behave like an effective
harmonic spring, then n = 7. These seven degrees come from the 5 original degrees of freedom of the
ground-state molecule, plus 1 for the classically-behaving (harmonic) bond, and 1 additional one for the
associated kinetic degree of freedom. That is, the two atoms move with respect to each other along the bond, and this relative motion of the two masses gives rise to a kinetic energy. This means that upon increasing the temperature, γ is expected to decrease from 7/5 to 9/7.
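A one-line check of the corresponding ratios for the counts n = 3, 5, and 7 used in this discussion:

```python
from fractions import Fraction

for n, label in [(3, "monoatomic"), (5, "rigid diatomic"), (7, "vibrating diatomic")]:
    gamma = Fraction(n + 2, n)          # gamma = Cp/CV = (n + 2)/n
    print(f"n = {n} ({label}): gamma = {gamma} = {float(gamma):.3f}")
```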
Figure 3.3: Effect of temperature on the degrees of freedom in pure nitrogen gas. The ratio γ of
the constant-pressure Cp and constant-volume heat capacity CV as a function of the temperature T in
Kelvin. The experimental data (red points) was converted from the Cp values listed in the table prepared
by M.W. Chase [NIST-JANAF Thermochemical Tables, Fourth Edition, J. Phys. Chem. Ref. Data,
Monograph 9, 1-1951 (1998)]. The magenta data is from table A-4M in [K. Wark, Thermodynamics,
4th ed., p. 783 (New York: McGraw-Hill, 1983)], but only goes up to 1000 K. The dashed blue line
shows the value γ = 7/5, while the dashed green line shows the ratio γ = 9/7 ≈ 1.29.
Figure 3.3 illustrates the change in heat-capacity ratio for pure nitrogen as a function of temperature.
Here, we can clearly see that there is indeed a well-defined plateau value of γ = 7/5 at low temperatures,
when the vibrational degrees of freedom are effectively frozen out. Between 400 and 500 Kelvin, we
start seeing departures from γ = 7/5 and the value of the ratio decreases. However, there is no plateau
at γ = 9/7. This is due to the fact that the molecules begin to ionize at sufficiently high temperatures.
Intriguingly, upon lowering the temperature sufficiently, it can be shown that there is an additional plateau
at γ = 5/3, which corresponds to the angular momentum degrees of freedom becoming quantum-
mechanically discretized. This plateau is not shown in Fig. 3.3.
In the above example we commented on there being 5 degrees of freedom for the low-temperature state,
while there are 7 degrees of freedom in the high-temperature state. This might not be entirely obvious,
nor is it simple to intuit how many states a molecule has when it comprises many atoms and bonds that
become harmonic at different temperatures. We will explore this matter further in Exercise Q22. Let
us briefly illustrate a degree-counting principle in preparation for this. Assume we have a monoatomic
gas, like argon. Then in 3D, an atom can move in the x, y, and z direction of some chosen coordinate
frame. Each of these directions has associated with it a kinetic term p2i /(2m) in the Hamiltonian
(i ∈ {x, y, z}), which contributes kB T /2. Thus, for an ideal gas in three dimensions, the internal energy
is U = 3N kB T /2, as you are familiar with.
Implicitly, we have assumed that rotations of the atom do not contribute to the Hamiltonian. Each valid
axis of rotation would contribute its own kinetic energy term ∝ Iω 2 , where I is the relevant moment
of inertia and ω the angular velocity about that axis. Why then do we not get U = 3N kB T for argon
gas? You could think that we should treat the argon atom essentially like a point particle,4 for which
the rotation is irrelevant. However, this is not the right way of looking at the problem. Classically,
argon has a covalent radius of 106 pm (\sim 10^{-10} m), so it could be seen as a spinning billiard ball, which
would have rotational-energy contributions. This is really not how you should think of argon, though.
Firstly, the mass of this atom is mostly concentrated in the nucleus, which has a very small moment
of inertia (if we were to view argon as a classical object)5 , so it is really not a ball. Secondly, this line
of argument is not entirely satisfactory in itself, because nowhere in our derivation of Eq. (3.28) did we consider the relative magnitude of the prefactors in front of the harmonic terms in the Hamiltonian.
Therefore, these should not matter! The fact that there is a term quadratic in the generalized degrees
of freedom is sufficient to obtain factors of kB T /2, irrespective of the spring strength (magnitude of the
prefactor). We conclude that our thinking in terms of spinning balls (and rigid dumbbells for N2 ) fails
us in predicting the right number of quadratic degrees of freedom, at least if we had naively written
down all possible quadratic contributions to the Hamiltonian.
Clearly, some degrees of freedom matter, while others do not! The decision of which ones matter and
which ones do not, must be made at the point of writing down the (classical) Hamiltonian describing
the system. But it is not clear how to do that yet. The resolution should be sought in the fact that
the physics of the system does not change with rotations of the argon molecules about their centers,
nor does it with rotations about the connecting axis of the N2 molecules. This makes sense, because
classically, any spinning of the nucleus is not going to be a detectable quantity to begin with, meaning
that argon only has classical translational degrees of freedom that lead to quadratic expressions in the
momenta. However, spinning about axes orthogonal to the connecting one in an N2 molecule can be observed. Thus, we need to write down two angular-momentum terms associated with this in our Hamiltonian. Let us make this more explicit next; also see Chapter 17.
The rotation of any general shape in 3D can be prescribed using three angles, referred to as Euler angles.
Water molecules do not possess a convenient rotational symmetry, meaning that their motion must be
described using three momenta and three angular momenta that each contribute a harmonic kinetic
term to the Hamiltonian. So for H2 O we expect 6 degrees of freedom, provided the bonds between the
oxygen and hydrogen remain in or close to their ground state. An N2 molecule has one rotational axis
about which it is rotationally symmetric, hence it only has 5 total degrees of freedom (when the bond is not vibrating). Note that this would also hold for a CO molecule, which is not mirror symmetric,
but is rotationally symmetric. In colloidal particles, e.g., micron-sized objects suspended in a fluid,
the friction with the surrounding fluid removes the inertia from the dynamics of the colloids. Still, an
effective description in terms of a Hamiltonian with quadratic degrees of freedom of only the colloids can
be put forward. The way in which to arrive at that description will be covered in Chapter 16. In such
a case, symmetry arguments may be similarly invoked to establish which effective quadratic degrees of
freedom should appear in the Hamiltonian. Rotations of objects about symmetry axes that leave the system invariant should not contribute to measurable thermodynamic properties in this case. At least,
this should be the intuition that we have on the basis of our understanding of statistical mechanics. It
4 Or the nitrogen molecule as an infinitely thin rod. Note that this is what is being ‘suggested’ by the data, as we get a γ
factor of 1.4 in Fig. 3.3 at low temperature. Had rotations about the axis connecting the two atoms mattered, we would
have expected a ratio of 8/6 ≈ 1.33, which we clearly do not find!
5 Similarly for the nitrogen molecule, we know that the (classical) covalent radius of a single nitrogen is 71 pm (0.71 \times 10^{-10} m), while the (triply bonded) bond length is 1.09 Å (1.09 \times 10^{-10} m). However, here too we must realize that this extent is representative of the electron cloud, while most of the molecule’s mass is in the nuclei. Thinking classically, we could argue that the moment of inertia about the axis connecting the two nuclei will be very small compared to the one about the orthogonal axes.
Vibrational degrees of freedom are more tricky still. If we consider our example of N2 again, then the
implicit assumption behind a count of n = 5 is that the entire motion can be described by referencing
the center of mass and two angles. When the bond becomes effectively harmonic at high temperatures, we
have that the potential connecting the two N atoms gives a quadratic contribution to the Hamiltonian.
However, this contribution is only meaningful if we account for the relative motion of the two bonded
atoms. There is a kinetic energy associated with this relative motion as well, which provides another
kB T /2 contribution. Hence, there are n = 7, rather than n = 6 relevant quadratic degrees of freedom.
Note that in general, equipartition is powerful, but it should be used sensibly. We can count degrees of
freedom that are quadratic. However, it can be that the Hamiltonian contains non-quadratic contribu-
tions. In such cases, the internal energy cannot be computed using equipartition arguments alone. For
example, in non-dilute gases, we have contributions coming from the interactions between the molecules.
3.6 Exercises
Q17. The Stirling Formula
The approximative expression for the factorial
\log N! = N \log N - N + \log\sqrt{2\pi N} \qquad (N \to \infty), \qquad (3.30)
Provide the expression for gN (x). Note: You should not get hung up on the pedagogically
motivated reuse of variables.
(c) Calculate x0 and give a Taylor expansion of gN (x) about x0 up to O (x − x0 )2 . Argue that
exp[N gN (x)] is extremely peaked about x = x0 if N ≫ 1 and gN (x0 ) is a maximum.
(d) Now derive the Stirling formula, and calculate or estimate the (relative) contribution from
the third term for N = 2, 69, and 1020 . Conclude that the third term is utterly irrelevant for
many applications in statistical physics.
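For orientation (not a substitute for the derivation asked for above), a short numerical comparison of the exact log N! with the third Stirling term, using Python's lgamma:

```python
import math

for N in [2, 69, 1e20]:
    exact = math.lgamma(N + 1)                   # log N! via the Gamma function
    third = 0.5 * math.log(2.0 * math.pi * N)    # the log sqrt(2 pi N) term
    print(f"N = {N:g}: log N! = {exact:.6g}, third term / log N! = {third / exact:.2e}")
```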
Q18. Phase-Space Trajectories
Consider a classical point particle on a (1D) line. We denote its position by x and the momentum
by px ; the total energy of the particle is E and is assumed fixed — the system is assumed closed.
The mass of the particle is m.
Sketch the phase-space trajectory of this particle in the case that it is confined to a “box” with
two hard walls, one at x = 0 and the other at x = L, with L the size of the 1D box. Collisions
with the walls are assumed purely elastic.
Q19. Quantum and Classical Harmonic Oscillators
The eigenstates of a 1D harmonic oscillator, denoted by the quantum number n = 0, 1, 2, \cdots, have energies \epsilon_n = \hbar\omega(n + 1/2). Here \omega is the frequency. At temperature T, the quantum mechanical canonical partition function is defined by Z_q(T) = \sum_{n=0}^{\infty} \exp[-\beta\epsilon_n], with \beta = 1/(k_B T).
(a) Calculate Zq (T ).
(b) Calculate Zq (T ) in the high-temperature limit T ≫ ℏω/kB .
The same oscillator is classically described by a Hamiltonian of the form H(x, p_x) = p_x^2/(2m) + m\omega^2 x^2/2, with m the mass, p_x the momentum in the x-direction, and x the amplitude of the
oscillator. The classical partition function is given by
Z_c(T) = \frac{1}{Y} \int_{-\infty}^{\infty} dp_x \int_{-\infty}^{\infty} dx\, \exp[-\beta H(x, p_x)], \qquad (3.33)
with 1/Y a prefactor that we will determine by imposing Zc (T ) to be equal to the high-temperature
limit of Zq (T ), i.e., where we expect the classical behavior should be recovered.
(c) Use your knowledge of Gaussian integrals to calculate Z_c(T) in terms of Y\omega/(k_B T). Does
Zc (T ) depend on m?
(d) Compare (c) with (b) and conclude that Y = h, with h = 2πℏ Planck’s constant.
(a) Show that within the grand-canonical ensemble at fixed T and V the following relation holds
k_B T \left(\frac{\partial \langle N\rangle}{\partial \mu}\right)_{V,T} = \langle N^2\rangle - \langle N\rangle^2. \qquad (3.34)
where rc is the range of the interaction between the particles and the well, and V0 = (4/3)πrc3 .
Show that the free energy can be written
\beta F/N = \log\left(\rho\Lambda^3\right) - 1 - \log\left[1 - \frac{V_0}{V} + 3\frac{V_0}{V}\int dx\, x^2\, e^{-\lambda(x-1)}\right], \qquad (3.38)
where Λ is the de Broglie wavelength. Hint: Use a transformation to spherical coordinates, assum-
ing that the total volume V is roughly spherical to obtain the result. A ‘cleaner’ transformation
that does not make this assumption will be treated in a later exercise.
Q22. Equipartition
What is the correct value of the internal energy per particle for the following situations? Provide a
physical argument to support your calculation for each, referencing properties of the Hamiltonian,
using only a few words.
(f) A gas of rigid (infinitely thin) rods in 2D, assume that they are sufficiently small that you
can use symmetry arguments.
(g) A gas of rigid, thin triangular prisms in 3D; make the same assumptions as in (f).
For each of these, identify what the degrees of freedom are that contribute to the Hamiltonian and
provide the proper symmetry arguments, if applicable. For (e) it may be convenient to graph the
γ value as in Fig. 3.3.
Q23. Refresher on Matrix Properties
In this exercise, we refamiliarize ourselves with some basic matrix properties.
(a) Let A be a square, real-valued matrix. Is A diagonalizable? And what if A is also positive
definite? Show this for a 2 × 2 variant. Does this give you any insight into the eigenvalues?
(b) Let B be a matrix. Show that the determinant is preserved under orthogonal transformations.
What is the physical interpretation of this? If B is additionally invertible, then what is the
determinant of the inverse?
(c) Let C be a square matrix. Show that the trace is preserved under orthogonal transformations.
Unlike the determinant, the trace of a product of equal-sized matrices is not always the
product of the traces. What is the physical interpretation of the trace?
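A quick numerical illustration of the invariances in (b) and (c), using a random matrix and a random orthogonal transformation built from a QR decomposition; this is of course not a proof:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=(3, 3))                   # an arbitrary real matrix
O, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # a random orthogonal matrix from QR

B_rot = O @ B @ O.T
print(np.isclose(np.linalg.det(B), np.linalg.det(B_rot)))  # True: determinant preserved
print(np.isclose(np.trace(B), np.trace(B_rot)))            # True: trace preserved
```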
Q24. Equipartition in a Solid
Assume we have an N -particle system with Hamiltonian
H(\boldsymbol{r}^N, \boldsymbol{p}^N) = \sum_{i=1}^{N} \frac{\boldsymbol{p}_i^2}{2m} + \Phi(\boldsymbol{r}^N), \qquad (3.39)
where pi is the translational momentum of particle i, m is its mass, and r i indicates the position
coordinates. Here, we assume that the potential energy Φ depends on all particle coordinates, but
may be complicated. For example, the potential may not be decomposable in pair interactions and
some external field may be encompassed within it. Additionally, we assume that the temperature
is sufficiently high that the particles interact classically. The system is, however, still in a solid
phase. Let Ri denote the equilibrium positions of the i-th atom in this phase and ui ≡ r i − Ri
the (instantaneous) deviation away from this. Provided the temperature is sufficiently low —
meaning far away from the melting temperature in this context — we expect |ui | ≪ σi , where σi
is a measure for the i-th particle’s extent.
(a) Argue in a few words that in this case the Hamiltonian may be written as a truncated Taylor
expansion
H(\boldsymbol{r}^N, \boldsymbol{p}^N) \approx \sum_{i=1}^{N} \frac{\boldsymbol{p}_i^2}{2m} + \Phi(\boldsymbol{R}^N) + \boldsymbol{U}^{\mathsf{T}} M \boldsymbol{U}, \qquad (3.40)
where U = (u1 , · · · , uN ) is the 3N -dimensional vector that contains all deviations, the
superscript T indicates transposition, and M is a complicated 3N × 3N matrix.
(b) Explain in a few words why there are no linear terms in U or even any of the ui .
(c) Similarly explain why the matrix M must be both real-valued and symmetric. What does
this imply for the eigenvalues of the matrix?
(d) Show that the average energy U = Φ(R^N) + 3N k_B T, irrespective of the exact shape of the original
potential. Hint: You are making a physical assumption here, on top of the mathematical
requirements imposed by (c). These are not sufficient to complete the argument, as per Q23.
\frac{dp(z)}{dz} = -g\,\rho_m(z), \qquad (3.41)
where p(z) and ρm (z) denote the height-dependent pressure and mass density, respectively, and g
denotes the gravitational acceleration. Note that z denotes the height in the vertical direction, it
has nothing to do with the fugacity.
(a) Derive the hydrostatic equilibrium condition by considering an infinitesimally thin slab in
between height z and z + dz. Consider the balance of the gravitational downward force on
the mass in the slab and the upward force due to a different pressure at different heights.
(b) Solve the force balance for a dilute (ideal) gas of particles of mass m, density ρ(z), and
constant temperature T such that p(z) = kB T ρ(z).
(c) Solve the force balance for an incompressible molecular solvent of mass density δs , and denote
the resulting pressure profile by ps (z).
(d) Calculate the upward force, exerted by an incompressible fluid with the pressure profile ps (z)
calculated in (c), on a sphere of radius a with center-of-mass at height z. Hint: the force on
a surface element dS with outward normal n̂ is −ps (z + an̂ · ẑ)dS n̂, and the result is named
after an ancient Greek natural philosopher who experimented in his bath tub.
(e) If the number density profile of the colloids is denoted by ρ(z), argue that the mass density
of the system can be written as ρm (z) = δs + (δc − δs )η(z), with η(z) = (π/6)σ 3 ρ(z) the
colloidal packing fraction at height z.
(f) Show that hydrostatic equilibrium reduces to dΠ(z)/dz = −mgρ(z) with m = (π/6)σ 3 (δc −δs )
the so-called buoyant mass of a colloidal particle. Note (i) that m can be negative (this
leads to so-called creaming, the opposite of settling), and (ii) that so-called density-matching
allows for m = 0 to study gravity-free systems on Earth (which is much cheaper than space
experiments). Is the definition of m consistent with your finding in (d)?
(g) If the suspension is extremely dilute, its osmotic pressure satisfies Van ’t Hoff’s law Π = ρkB T .
Calculate the resulting profile ρ(z) for the case that ρ(z = 0) = ρ0 .
(h) It turns out that the profile ρ(z) can be measured in dense as well as dilute suspensions at
fixed temperature T , e.g. by confocal microscopy or by light scattering. Explain how the
complete equation of state Π(ρ) can be obtained from this single measurement of ρ(z). This
method is very efficient to obtain information about effective colloidal interactions.
Chapter 4
In this chapter, we describe ideal gases, i.e., gases which (classically) have no intrinsic interactions in their Hamiltonian and are instantaneously thermalized. Ideal gases admit straightforward analysis
using the techniques introduced in Chapter 3 and allow us to build additional intuition for some of the
more abstract quantities that we have covered thus far. In the context of real systems — atomic or
molecular gases — the ideal description should be a limiting case of a model accounting for interactions.
That is, infinitely dilute gases tend to behave in a (close-to) ideal manner, provided that the particle
interactions decay sufficiently fast, also see Chapter 13. Before we turn to interacting systems, we should
understand the dilute, non-interacting limit first.
There is another way in which the ideal-gas description can break down. In a quantum-mechanical
picture, particles can have a wave-like character with an associated ‘extent’ that is captured by the De
Broglie wavelength Λ (3.16). Such quantum particles are only dilute (and weakly interacting) whenever the particle separation d is significantly larger than Λ. This can be expressed as follows
\Lambda = \frac{h}{\sqrt{2\pi m k_B T}} \ll d \propto \left(\frac{V}{N}\right)^{1/3} = \rho^{-1/3}. \qquad (4.1)
Clearly, this approximation fails at low temperatures (and obviously at high density ρ = N/V , where we
do not expect ideality). In this limit, we will need to take into account quantum-mechanical effects, such
as quantization. This quantization is a positive aspect from the perspective of a statistical description
of matter. It allows us to justify the (somewhat arbitrary) cutoff h3 for the continuum phase-space
volume of a single state, which we used in Chapter 3 to obtain a countable number of classical states.
Here, we will delve slightly deeper into the justification of this result than we did in Exercise Q19.
Another important difference between classical and quantum-mechanical systems is how to deal with
identical particles. In quantum mechanics, identical particles of half-integer spin (fermions), like the
electron, can never occupy the same state by the Pauli exclusion principle. The associated wave-function
is antisymmetric under particle exchange and we will show that this naturally leads to Fermi-Dirac
statistics. Identical particles of integer spin, like the photon, have symmetric exchange properties and
can therefore occupy the same state. This will lead bosons to obey Bose-Einstein statistics. These two
are distinct from Maxwell-Boltzmann statistics, which describes the distribution of classical particles
over various energy states in thermal equilibrium. We will cover how these statistics impact the behavior
of ideal (quantum) gases. Specifically, we will consider what this implies for photons emitted from a black-body radiator and for electrons that move in solid-state materials like semiconductors, among others.
Here, we have used that the configurational integral of a non-interacting system is simply Q(N, V, T ) =
V N and we have accounted for indistinguishability through the N ! term. We will return to the implica-
tions of this term later, when we contrast the classical and quantum statistics. The ideal-gas Helmholtz
free energy follows from Eq. (4.2) and reads
\beta F_{\rm id}(N, V, T) = N\left[\log\left(\frac{N\Lambda^3}{V}\right) - 1\right], \qquad (4.3)
using the Stirling approximation in the thermodynamic (large-N ) limit. One should expect systems
with finite-range interactions to behave ideal-gas-like at sufficiently low densities. That is, the above
expression is a limiting form1 for ρ = N/V → 0.
S_{\rm id} = -\left(\frac{\partial F_{\rm id}}{\partial T}\right)_{N,V} = -N k_B\left[\log\left(\frac{N\Lambda^3}{V}\right) - \frac{5}{2}\right]. \qquad (4.4)
This result is known as the Sackur-Tetrode equation and it is extensive by virtue of introducing indis-
tinguishability of the particles. That is to say, the factor N ! is crucial to obtain the scaling S ∝ N at
constant density for particles that are free to move. Note that the appearance of a (quantum-mechanical)
factor Λ in the expression for the entropy is not problematic, because in practice, we are only able to
compute entropy differences. The Λ factor will drop out of any entropy difference, meaning that any
classically measurable or derivable value is independent of Λ. Such a difference can be equivalently
computed by integrating over the heat capacity weighted by the inverse temperature (convince yourself
that this is accurate by referencing exercise Q3).
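A small numerical sketch of Eq. (4.4), taking argon at 298.15 K and 1 atm as an example; the result lies close to the measured molar entropy of argon (about 155 J/(mol K)):

```python
import numpy as np

h, kB, NA, u = 6.62607015e-34, 1.380649e-23, 6.02214076e23, 1.66053907e-27
m, T, p = 39.948 * u, 298.15, 101325.0        # argon at 1 atm (example choice)

rho = p / (kB * T)                            # ideal-gas number density
Lam = h / np.sqrt(2.0 * np.pi * m * kB * T)   # thermal wavelength, Eq. (3.16)
S_per_NkB = 2.5 - np.log(rho * Lam**3)        # Eq. (4.4) rewritten per particle

print(f"S/(N kB)      = {S_per_NkB:.2f}")                      # about 18.6
print(f"molar entropy = {S_per_NkB * kB * NA:.1f} J/(mol K)")  # about 155
```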
You will compute the grand-canonical partition function Ξ in Exercise Q26. In the µV T ensemble
described by Ξ, the chemical potential µ imposes the average value of the number of particles in the
system. Let us use the ideal-gas system to gain additional intuition for this quantity. In the N V T
1 Often, the limit ρ → 0 leads to confusion among more mathematically inclined students. It simply means that there
is a small value of ρ greater than zero, below which the difference between Eq. (4.3) and the free-energy of the interacting
system is small and vanishing in a relative sense as ρ approaches 0. The limit is not explicitly taken, as the logarithm
evidently diverges. Expansions around the ideal-gas limit involving the free energy or chemical potential expressions are
therefore always about some finite, but small density ρ0 that can be made arbitrarily small.
ensemble, µ can also be computed from the free energy F and the corresponding relation between µ and
N (thermodynamically identical to the µV T result) reads
\mu_{\rm id} = \left(\frac{\partial F_{\rm id}}{\partial N}\right)_{T,V} = k_B T \log\left(\frac{N\Lambda^3}{V}\right). \qquad (4.5)
Note that µ is negative whenever Λ3 N/V < 1, which is what we demanded earlier for the system to
behave like an ideal gas. This might be considered peculiar, as we had understood µ to be the energy
cost associated with adding a particle to a subsystem (at least in the picture that we had of the µV T
ensemble). Why should µ be negative, especially in a regime where the classical description (4.1) should hold? Examining the relation in the SVN ensemble, we have the identity
\mu = \left(\frac{\partial U}{\partial N}\right)_{S,V}. \qquad (4.6)
Thus, the interpretation we could give µ is that of an energy cost associated with adding an extra
particle at fixed entropy and volume. However, adding a particle provides the system with an increased
number of ways in which to share the energy amongst the particles, thereby increasing the entropy. For
a fixed entropy, the energy must therefore be reduced to add an additional particle, hence µ < 0.
E\psi(\boldsymbol{r}) = -\frac{\hbar^2}{2m}\nabla^2\psi(\boldsymbol{r}) + V(\boldsymbol{r})\psi(\boldsymbol{r}), \qquad (4.9)
with E the energy. For our choice of V this gives rise to standing-wave solutions
\psi(\boldsymbol{r}) = \sqrt{\frac{8}{L^3}}\, \sin\!\left(\frac{\pi n_x}{L}x\right) \sin\!\left(\frac{\pi n_y}{L}y\right) \sin\!\left(\frac{\pi n_z}{L}z\right), \qquad (4.10)
where the prefactor ensures normalization and the quantum numbers ni with i ∈ {x, y, z} are elements
of N (the positive integers). Clearly, the system is quantized through the introduction of the boundary
condition, see Fig. 4.1, which shows the examples of the wave solutions for a 1D system that is slightly
easier to visualize.
Figure 4.1: (left) The first few one-dimensional (1D) standing waves in a box. The three-dimensional
(3D) system considered in this section is a multiplicative composition of these 1D systems. (right) The
quadratic relation between energy E and wave number k is provided on the right. The red curve is
drawn to guide the eye along a quadratic, but note that the energy levels are discrete!
Negative ni are not considered, as changing ni to −ni merely changes the phase of the wavefunction
by a factor of π, which is expressed by its sign changing from + to −. That is, sign inversion does not
produce a function describing a new state of the particle. It also does not affect the probability density
for the position of the particle, as this is given by |ψ|2 . The wave vectors associated with the quantum
numbers are given by
k_i = \frac{\pi}{L} n_i, \qquad (4.11)
and we recall that the particle momentum is given by p = ℏk. This implies that changing the sign is
equivalent to having the particle travel in the opposite direction. However, the waves we obtain for
the particle in the box are standing waves, i.e., they are a superposition of waves traveling in opposite
directions. That can be appreciated by recognizing that 2i\sin(k_i x) = \exp(ik_i x) - \exp(-ik_i x).
The permissible energy levels for the particle in our 3D box are
E_{\boldsymbol{n}} = \frac{\hbar^2 k^2}{2m} = \frac{\pi^2\hbar^2}{2mL^2}\left(n_x^2 + n_y^2 + n_z^2\right), \qquad (4.12)
where k = |k| and n = (nx , ny , nz ). The above expression is also referred to as a dispersion relation as it
connects momentum to frequency (recall E = ℏω according to the De Broglie relations). The dispersion
relation takes a quadratic form. In general, such a nonlinear relation implies that the propagation speed of any ‘particle pulse’ will not be equal to its phase velocity. Additionally, a nonlinear dispersion relation
typically leads to the spreading of the pulse with time, which is what is referred to as dispersion. This
will prove relevant later.
The quantum-mechanical partition function for a single particle can now be written as
Z_1 = \sum_{\boldsymbol{n}} \exp(-\beta E_{\boldsymbol{n}}). \qquad (4.13)
This sum can be evaluated by making use of an integral, see Exercise Q19 for the 1D variant of this
argument; Exercise Q28 covers the general case. This is a reasonable approximation at high temperature,
where quantum effects are limited. The approximative integral is given by
\sum_{\boldsymbol{n}} \approx \int d\boldsymbol{n} = \frac{L^3}{\pi^3}\,\frac{1}{8}\int d\boldsymbol{k} = \frac{4\pi L^3}{8\pi^3}\int dk\, k^2, \qquad (4.14)
where in the last step we have transitioned to spherical coordinates (hence the factor 4πk 2 ) and the
factor 1/8 accounts for the octant to which n is bounded. It will be useful to further convert the integral
to one over energy E instead
E = \frac{\hbar^2 k^2}{2m} \quad\rightarrow\quad dE = \frac{\hbar^2}{m}\, k\, dk. \qquad (4.15)
This results in
\frac{L^3}{2\pi^2}\int dk\, k^2 = \frac{\sqrt{2m}\, m L^3}{2\pi^2 \hbar^3}\int dE\, \sqrt{E} = \int dE\, \frac{L^3}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\sqrt{E} \equiv \int dE\, g(E), \qquad (4.16)
where g(E) is the density of states (DoS). That is, g(E)dE counts the number of states with an energy
between E and E + dE. This is simply a measure for integration, which we can apply to any function,
including exp(−βE) to arrive at the partition function.
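The counting interpretation of g(E) can be checked directly; a minimal sketch in reduced units (\hbar = m = L = 1), comparing a brute-force count of box states below an energy E with the integrated density of states (2E)^{3/2}/(6\pi^2):

```python
import numpy as np

n_max = 64
n = np.arange(1, n_max + 1)
nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
E_n = 0.5 * np.pi**2 * (nx**2 + ny**2 + nz**2)       # Eq. (4.12) with hbar = m = L = 1

for E in [500.0, 5000.0, 20000.0]:
    counted = np.count_nonzero(E_n <= E)             # brute-force count of states below E
    predicted = (2.0 * E) ** 1.5 / (6.0 * np.pi**2)  # integral of g(E') from 0 to E
    print(f"E = {E:8.0f}: counted = {counted:7d}, integrated DoS = {predicted:9.0f}")
```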
Note that while we used approximations to arrive at the DoS, in practice, it is possible to replace any
sum over states by an integration over a suitable density of states without significant loss of precision.
Direct application of the above integral form to a bosonic quantum system would lead to an oversight.
This will not turn out to be of relevance to the calculations that we perform in this chapter. However,
it will be relevant for the formation of a low-temperature Bose-Einstein condensate, which we will cover
in Chapter 6. For the sake of completeness, we also provide a more mathematical route toward defining
a density of states. This starts from the partition function and makes the formal rewrite
Z_1 = \sum_{\boldsymbol{n}} \exp(-\beta E_{\boldsymbol{n}}) = \sum_{\boldsymbol{n}} \int dE\, \exp(-\beta E)\, \delta(E - E_{\boldsymbol{n}}), \qquad (4.17)
where the Dirac δ was introduced. This allows us to identify the density of states as
g(E) = \sum_{\boldsymbol{n}} \delta(E - E_{\boldsymbol{n}}) \approx \frac{L^3}{8\pi^3}\int d\boldsymbol{k}\, \delta(E - E(\boldsymbol{k})) = \frac{L^3}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\sqrt{E}. \qquad (4.18)
with c the speed of light. Note that we have used the De Broglie relation to express the momentum of
the particle in terms of ℏk and m should be interpreted as the rest mass.
For particles with a finite rest mass, the system is said to be ‘gapped’. Repeating the basic manipulation
above, we obtain
g(E) = \frac{V E}{2\pi^2 \hbar^3 c^3}\sqrt{E^2 - m^2 c^4}, \qquad (4.20)
for the DoS with V the volume of the system. It is clear that there is a band of energies 0 ≤ E < mc2 for
which g(E) is imaginary. This is an important result, as we will discuss shortly. For massless particles
the DoS reduces to
g(E) = \frac{V E^2}{2\pi^2 \hbar^3 c^3}, \qquad (4.21)
which is real for all E ≥ 0. The DoS in dimensions lower than 3 is different and will be considered in
Exercise Q27. This difference will turn out to have strong implications for the phase behavior of bosons.
Figure 4.2: Representation of the relativistic dispersion relation for a particle with rest mass m (blue)
and without (red). (left) The particle wave frequency ω is given as a function of the wave number k.
For the particle with a rest mass, the low-k limit approaches a constant ω0 . The presence of a constant
ω0 at k = 0 makes the system gapped. (right) The curves approach each other in the high-k limit, i.e.,
a particle with rest mass m will tend toward the massless dispersion ω = ck.
Turning Eq. (4.19) into a dispersion relation, we find ω 2 = c2 k 2 + ω02 , with E = ℏω via the Planck
relation and \omega_0 = mc^2/\hbar. Figure 4.2 shows this result. Here, we see that for large wavelengths, or small k, \omega \approx \omega_0, while for short wavelengths, or large k, \omega \approx ck. Waves in the former regime possess
a vanishing group velocity and diverging phase velocity2 . In other words, for frequencies below the
cutoff ω0 , the wave number is imaginary. In practical terms, this implies that such matter waves do not
propagate; propagation implies a finite real value of k. Any wave that does form with a lower energy is evanescent, with an amplitude characterized by decaying exponentials. The interpretation of the above
result is that there are no ‘free’ states between zero energy and the gap.
Turning to solid-state electronic systems (e.g., metals, semiconductors, and superconductors3 ), a gapped
dispersion relation implies there is an energy range where no electronic states (free electrons) can exist.
An intuitive example is found in a semiconductor: the ‘band gap’ in such a material generally refers to
the energy difference between the top of the valence band and the bottom of the conduction band. The
energy difference is what is required to promote a valence electron bound to an atom to a conduction
electron, which is free to move and serves as a charge carrier. You already considered this scenario
in an approximative manner in Chapter 5. The interpretation of evanescence in this context is, e.g.,
that a thermal fluctuation kicks an electron into the conduction band briefly. However, because there is
insufficient energy available for it to propagate over the lattice, eventually it falls back into the ground
state, in which it is again bound to its original lattice site.
Given two complex numbers with identical modulus, these must differ only by a phase eiθ with θ some
real phase-angle value. Thus we have that
which implies e^{2i\theta} = 1 and4 e^{i\theta} = \pm 1. The positive value is associated with bosons (ψ is symmetric
under particle exchange) and the negative one with fermions (ψ behaves antisymmetrically). Quantum
2 A divergent phase velocity that can easily exceed c might be somewhat concerning. However, this is a completely
artificial limit that has no bearing on information transport and does not induce causality violations.
3 In a superconductor, emergence of the gap at the critical temperature marks the superconducting transition, but
discussing this transition in detail goes beyond the scope of these notes.
4 N.B. There is a subtlety that we have swept under the rug here: this result holds for systems with three or more spatial
dimensions. In 2D, there is a manner in which to construct phase factors with values eiθ ̸= ±1 that lead to identities
upon double exchange. This is because in 2D clockwise and counter-clockwise rotations are well-defined concepts. The
particles that are associated with this more general interchange rule are called anyons and these possess a statistics that
continuously ‘interpolates’ between that of bosons and fermions. They play an important role in the fractional quantum
Hall effect. A detailed discussion of anyons goes beyond the scope of these lecture notes and we refer the interested reader
to a textbook on solid-state quantum statistics for further information.
field theory will tell you that bosons are those particles with integer spin values s = 0, 1, 2, . . . and
fermions are the ones with half-integer spins s = 1/2, 3/2, 5/2, . . . .
The main implication of the (anti)symmetry under exchange is the Pauli exclusion principle. It can be
readily shown from the above that
1
ψ(r 1 , s1 ; r 2 , s2 ) = √ (ψ(r 1 , s1 )ψ(r 2 , s2 ) ± ψ(r 2 , s1 )ψ(r 1 , s2 )) , (4.25)
2
where the plus symbol represents two bosons, the minus symbol two fermions, and the two-entry ψ are
normalized single-particle wave functions.5 Assume now that we have fermions and that both particles
are in the same state, as represented here by the identical quantum number sk , then ψ(r 1 , sk ; r 2 , sk ) =
2−1/2 (ψ(r 1 , sk )ψ(r 2 , sk ) − ψ(r 2 , sk )ψ(r 1 , sk )) = 0. This shows that two fermions cannot occupy the
same single-particle state, i.e., have all identical quantum numbers. In other words, identical fermions
must occupy different states. When multiple fermions are added to a system, they start filling energy levels from
the bottom up, as we will see later. Note that it is crucial to realize that an energy level may be multiply
degenerate, so that there may still be many fermions with the same energy. Having multiple bosons
in the same single-particle state, is, however, completely acceptable. This allows bosons to undergo
Bose-Einstein condensation, which we will cover in Chapter 6.
Whether classical particles must be treated as distinguishable or as indistinguishable for the purpose
of statistics, depends on the specifics of the system. If, for example, such identical particles are bound
to lattice sites and we can see this bond, we may distinguish them by labeling them according to the
associated site, which does not change. If they are, however, free to move, then it will be impossible to
tell which particle is which, when contrasting two snapshots of the system. The particles are identical,
after all. In this case, we must treat the particles as indistinguishable and correct for configurational
overcounting. This is, for example, the case for a classical ideal gas, as you have seen.
Figure 4.3: Visualization of the permissible state occupation for, from left to right, two (identical)
fermionic, bosonic, and classical particles, respectively, having three possible internal states each, labelled
according to their (in this case unique) energy levels ϵ0 , ϵ1 , and ϵ2 . Two quantum-mechanical particles
cannot be distinguished and we therefore duplicate the label ‘A’ for these, whilst two classical particles
can be ‘distinguished’, hence we label them ‘A’ and ‘B’. (left) The two fermions cannot be in the same
single-particle state, hence we have exactly three possible combinations. (middle) Two bosons can
occupy the same single particle state, leading to 6 possible combinations. Here, the indistinguishable
character of quantum particles is expressed: AA = AB = BA is one state! (right) We treat classical
particles as distinguishable, i.e., AB ↔ BA ̸= AA, and these can occupy the same single-particle state,
leading to a total of 3^2 = 9 occupation combinations. However, in writing down a partition function we must introduce a factor 1/2! to account for the fact that exchange of A and B leaves the system invariant.
The above discussion is perhaps a bit abstract, even potentially confusing. Let us therefore illustrate the
consequences of the classical, bosonic, and fermionic conditions on state occupation. Figure 4.3 shows
state-occupation tables for two identical particles, labeled A (and B), which can each (separately) be in
states with energy ϵ0 , ϵ1 , and ϵ2 . These energy levels are associated with some single-particle quantum
number that distinguishes the states; in practice the energies can be the same. Consulting Fig. 4.3, we can write down the corresponding two-particle partition functions.
6 Including the factor N ! ensures the extensiveness of (certain) physical quantities, as we have argued previously.
However, this does not imply that there is an issue with extensiveness of classical distinguishable particles.
Z_{\rm clas.} = \frac{1}{2!}\left(e^{-2\beta\epsilon_1} + e^{-2\beta\epsilon_2} + e^{-2\beta\epsilon_3} + 2e^{-\beta(\epsilon_1+\epsilon_2)} + 2e^{-\beta(\epsilon_1+\epsilon_3)} + 2e^{-\beta(\epsilon_2+\epsilon_3)}\right); \qquad (4.26)
Z_{\rm bos.} = e^{-2\beta\epsilon_1} + e^{-2\beta\epsilon_2} + e^{-2\beta\epsilon_3} + e^{-\beta(\epsilon_1+\epsilon_2)} + e^{-\beta(\epsilon_1+\epsilon_3)} + e^{-\beta(\epsilon_2+\epsilon_3)}; \qquad (4.27)
Z_{\rm ferm.} = e^{-\beta(\epsilon_1+\epsilon_2)} + e^{-\beta(\epsilon_1+\epsilon_3)} + e^{-\beta(\epsilon_2+\epsilon_3)}. \qquad (4.28)
We reiterate that here we have treated the ϵi as distinct, even though it should be emphasized that
these are simply proxies for distinct states and may represent equal energy. We will capture this later
using the DoS. In the table, we already see the significant differences between bosonic, fermionic, and
classical systems. Examining the first energy level, we see that for fermions, 2/3 of the possible states
have occupation of this state, for bosons the fraction is 1/2, and for classical particles the ratio goes up to 5/9. More importantly, among these states, for fermions there is no double occupation, for bosons 1/3 of them are doubly occupied, and for classical particles the double-occupation fraction is 1/5. We might, therefore, expect the quantum departures from the classical ideal gas at low temperature to be in opposite directions.
A more involved calculation will show that this is indeed the case.
Extending this to k possible energy levels per particle for N particles, there are
\Omega_{\rm clas.} = k^N; \qquad (4.29)
\Omega_{\rm bos.} = \frac{(N + k - 1)!}{N!\,(k - 1)!}; \qquad (4.30)
\Omega_{\rm ferm.} = \binom{k}{N}, \qquad (4.31)
states, respectively. Note that for fermions, this implies k ≥ N to accommodate all particles. For
bosons, the combinatorics is that of placing N balls in k baskets; Exercise Q30 works this out.
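These counts are easy to verify by brute-force enumeration for small systems; a minimal sketch for N = 2 and k = 3, reproducing the 9, 6, and 3 combinations of Fig. 4.3:

```python
from itertools import product, combinations_with_replacement, combinations
from math import comb, factorial

N, k = 2, 3
levels = range(k)

n_clas = len(list(product(levels, repeat=N)))                # distinguishable: k^N
n_bos = len(list(combinations_with_replacement(levels, N)))  # bosons: (N+k-1)!/(N!(k-1)!)
n_ferm = len(list(combinations(levels, N)))                  # fermions: k choose N

print(n_clas, n_bos, n_ferm)                                 # 9 6 3
print(k**N, factorial(N + k - 1) // (factorial(N) * factorial(k - 1)), comb(k, N))
```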
Next, we turn to the question of computing the mean occupation number ⟨ni⟩ for a state ϵi (accounting for
degeneracy). This will give the classical Maxwell-Boltzmann statistics and the two quantum statistics:
Fermi-Dirac for fermions and Bose-Einstein for bosons. The latter two will prove instrumental for
computing quantities in quantum systems. However, we shall first deal with a classical system to gain
intuition for these statistics and how the DoS comes into play.
Figure 4.4: Representation of a system with N identical classical particles, which each have k discrete
energy levels, as illustrated by the black lines. These levels are iterated over using index j. The red
coloring indicates the current level at which a particle is. An energy level may be degenerate, as indicated
by the small blue arrows and the value of gj . The numbers nj count the number of particles in the
system that are in a specific (single-particle) energy level ϵj . Note that we do not show all particles, so
that n3 = 3 is not an error. The three particles that are at ϵ3 are located in the · · · region.
We will now consider the total number of ways in which N particles may be distributed over the k
possible levels. Let us assume that we know the value of the nj . Then the various ways in which this
subdivision can be picked out of the N particles available is given by the multinomial expression
w = \frac{N!}{n_1!\, n_2! \cdots n_k!} = N! \prod_{j=1}^{k} \frac{1}{n_j!}. \qquad (4.33)
Let us now further assume that for each energy level ϵj , there are gj states that lead to this energy.
Then the total number of combinations w̃ may be written as
\tilde{w} = N! \prod_{j=1}^{k} \frac{g_j^{n_j}}{n_j!}. \qquad (4.34)
The multiplication accounts for each particle in the j-th level being in any of the g_j states, leading to g_j^{n_j} different possible state combinations. The total number of states w in equation (4.33) is thus a special case of \tilde{w} that describes a system where each energy level corresponds to exactly one state. Finally, we
can assume that the particles are indistinguishable and arrive at the total number of combinations Ω
for our k-energy-level gas
\Omega = \prod_{j=1}^{k} \frac{g_j^{n_j}}{n_j!}. \qquad (4.35)
The idea that we should now have is that we should pick our n_j sensibly, such that the number of states Ω is maximized. For example, if we choose n_1 = N, then the total number of states is \Omega = g_1^N/N!.
However, by choosing a different distribution of nj , we can increase the value of Ω substantially. Your
statistical mechanics intuition should inform you that the choice we want is the one that maximizes
the entropy in the system. However, this maximization must be carried out under the constraints in
Eq. (4.32), for which we turn to the Lagrange-multiplier formalism of Chapter 2.
where we ignore higher-order terms. We now introduce the Lagrange multipliers to account for the
constraints of Eq. (4.32) and find the extremum
L = \log\Omega + \lambda_1\Big(N - \sum_{j=1}^{k} n_j\Big) + \lambda_2\Big(U - \sum_{j=1}^{k} n_j \epsilon_j\Big);
  = \lambda_1 N + \lambda_2 U + \sum_{j=1}^{k}\Big[n_j \log g_j - n_j \log n_j + n_j - (\lambda_1 + \lambda_2 \epsilon_j)\, n_j\Big]. \qquad (4.37)
Note that according to our constraint minimization, we still need to determine the values of λ1 and
λ2 . If we have that N ≫ 1, then we can extract the result by manipulating Eq. (4.38). Multiplying
Eq. (4.38) by nj , summing over j, and substituting the expression for log Ω, we obtain
\log\Omega = \lambda_1 N + \lambda_2 U, \qquad (4.40)
and subsequently
dU = \frac{1}{\lambda_2}\, d\log\Omega - \frac{\lambda_1}{\lambda_2}\, dN. \qquad (4.41)
But this may be identified with the relation dU = T dS − p dV + µ dN that holds in the thermodynamic
limit. It should be clear that log Ω serves the role of S/kB and thus we arrive at λ2 = β and λ1 = −βµ.
Substituting these relations back into our equation for nj , we arrive at
n_j = \frac{g_j}{\exp[\beta(\epsilon_j - \mu)]}, \qquad (4.42)
and the expression
N = \sum_{j=1}^{k} n_j = \sum_{j=1}^{k} \frac{g_j}{\exp[\beta(\epsilon_j - \mu)]}. \qquad (4.43)
Here, we recognize that the gj serve as a (discrete) density of states! That is, the number of states
present for a given energy level ϵj . Generally, as we have seen in this chapter, the density of states can
vary depending on the nature of the particle (relativistic or not). It is therefore desirable to separate
the DoS from occupation of a single-particle energy level.
The energy-level statistics, corrected for the DoS, has the following form
f_{\rm MB}(\epsilon) = \frac{1}{\exp[\beta(\epsilon - \mu)]}, \qquad (4.44)
which is referred to as Maxwell-Boltzmann statistics. Quantities such as energy and particle number
for general classical gases can be derived from this statistics and an appropriate DoS. For example the
number of particles N and energy E in the system are given by
N = \int dE\, g(E)\, f_{\rm MB}(E); \qquad (4.45)
E = \int dE\, g(E)\, E\, f_{\rm MB}(E), \qquad (4.46)
respectively. Here, we have made the necessary transformation to go from a sum over energy levels to
an integral representation.
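As a sanity check of Eqs. (4.45) and (4.46), one can verify numerically that the classical ideal gas obeys E/N = (3/2)k_B T for any µ; a minimal sketch in reduced units (k_B T = 1), absorbing all constant prefactors of the DoS:

```python
import numpy as np
from scipy.integrate import quad

beta, mu = 1.0, -2.0                      # arbitrary mu; it cancels in the ratio E/N
g = lambda E: np.sqrt(E)                  # g(E) proportional to sqrt(E); prefactors cancel
f_MB = lambda E: np.exp(-beta * (E - mu))

N, _ = quad(lambda E: g(E) * f_MB(E), 0.0, np.inf)
U, _ = quad(lambda E: E * g(E) * f_MB(E), 0.0, np.inf)
print(U / N)                              # 1.5, i.e. (3/2) k_B T per particle
```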
Let us consider the grand-canonical ensemble, where we have — referencing Eq. (2.18) — the following
equalities
\Xi(\mu, V, T) = \sum_{N=0}^{\infty} \sum_{s} \exp[\beta\mu N - \beta E_s] = \sum_{N=0}^{\infty} \sum_{E} \exp[\beta\mu N - \beta E], \qquad (4.47)
with s indexing individual states of the entire system with N particles in it. We assign an energy Es to
such an N -particle state s; the particle and thermal reservoir are assumed infinitely large. In the second
equality, we have replaced the sum over states by a sum over the various energies E that are present in a
system with N particles. It is important to understand that each E may be multiply degenerate; both in
terms of ways in which to have single-particle levels come together to obtain E and internal degeneracy
of a single-particle level (multiple quantum-number combinations give the same single-particle energy).
To make this mathematically clear, we introduce the single-particle energy levels ϵi ; the Es are thus
a sum over the relevant values ϵi contributing to a given state s. However, E is also a sum over ϵi ,
7 It should be noted that the Maxwell-Boltzmann statistics can equally be derived by considering the quantum gases
and taking the appropriate high-temperature limit, in which their statistics both reduce to the result derived above.
8 We assume that the particles are non-interacting, which is an approximation that allows us to write the total energy
as the sum of the individual particle energies. For dilute, free electrons in a (semi-)conductor, a fermionic particle, the
system turns out to satisfy this condition, though this is far from obvious in view of the long-ranged nature of Coulomb
interactions. Clearly, for photons, a bosonic particle, the approximation is excellent, because the electromagnetic field
does not interact with itself, due to the linearity of Maxwell’s equations.
but a simpler one, since there are n_i particles that have this energy level \epsilon_i; N = \sum_i n_i. This means that the occupancy of this level (throughout the system) contributes \epsilon_i n_i and that the total energy is E = \sum_i \epsilon_i n_i. Note that (again) we must be careful, as the energy levels \epsilon_i may be degenerate in terms of quantum numbers, but this will not impact E = \sum_i \epsilon_i n_i.
Introducing the expressions for N and E into the equation for Ξ, allows us to break up the exponent of
a sum into a product of the individual terms. We can then exchange the sums over N and E with the
product to obtain
\Xi(\mu, V, T) = \prod_{i} \sum_{n_i} \exp[\beta n_i(\mu - \epsilon_i)] \equiv \prod_{i} Z_i, \qquad (4.48)
where now we sum over all the various occupancies of single-particle energy levels that can occur
in the system. Here, we define the factor Zi to be the effective single-level ‘grand’-canonical partition
function. Sketching the situation out might help in your understanding as to how to arrive at this result.
We have purposefully kept the indexing of the sum over ni ambiguous, as this will be where we will
impose the fermionic and bosonic character of our particles; we will come back to this shortly.
A key point in interpreting the exchange of the product and sum is that in Eq. (4.47) we sum over
all possible numbers of particles. The infinity in the sum over potential particle numbers allows us to consider instead all possible occupations of the single-particle levels. This is because which particles carry
a certain energy level does not matter, because they are intrinsically indistinguishable in quantum
mechanics! For example, let us assume that we have fermions; then energy level j ∈ {0, . . . , k} may be assumed by one particle or by none of the particles. Suppose we have one of the particles carry that energy level; we cannot know which one it is. So, unlike classical, distinguishable objects (which we
may label, but for which the labelling may not be unique), there is only one state corresponding to that
level being occupied. This means that we do not get unpleasant combinatorial factors from exchanging
the sum and the product, as would be the case classically, since putting that level on, e.g., particle 1
out of 10 is different from putting it on particle 3 out of 10, and so on. Likewise, for bosons, there
might be 25 particles with the single-particle energy level ϵj . Again, however, which particles those are,
does not matter. This is also expressed in the fact that the summand of Z_i factorizes as \exp[\beta n_i(\mu - \epsilon_i)] = (\exp[\beta(\mu - \epsilon_i)])^{n_i}, where
we recognize that all the occupations (that are permissible) are independent of each other. This last
observation is relevant for bosons, as they can be in the same single-particle state multiple times.
The result in Eq. (4.48) is quite powerful, because each energy level can be dealt with independently,
meaning that finding ni particles in state i is independent of what is happening in the other states. For
example, it is now straightforward to compute ⟨ni ⟩ using the grand-canonical partition function, from
which we will extract Bose-Einstein and Fermi-Dirac statistics next. These computations will assume
no degeneracy factor (a single state per energy level), but a DoS may be readily added after completing
the calculation, in complete analogy to the Maxwell-Boltzmann scenario treated earlier.
Z_i = \sum_{n_i=0}^{\infty} \exp[\beta n_i(\mu - \epsilon_i)] = \frac{1}{1 - \exp[\beta(\mu - \epsilon_i)]}, \qquad (4.49)
using the geometric series. We find for the grand-canonical partition function
\Xi(\mu, V, T) = \prod_{i} \frac{1}{1 - \exp[\beta(\mu - \epsilon_i)]}, \qquad (4.50)
Taking a derivative of Ω with respect to µ (keep track of prefactors!), the energy-level occupancy is
found to read
\langle n_i \rangle = \frac{1}{\exp[\beta(\epsilon_i - \mu)] - 1}, \qquad (4.52)
where you should note the inversion of the elements in the exponent. When \epsilon_i < \mu the number of particles would be negative, which is unphysical; this means that \mu < \epsilon_i for all i. Because the ground state is often chosen9
to have energy ϵ0 = 0, the chemical potential must be negative. This is consistent with our earlier
observations on the classical ideal gas.
f_{\rm BE}(\epsilon) = \frac{1}{\exp[\beta(\epsilon - \mu)] - 1}. \qquad (4.53)
This result has similarities to the expression we found for the Maxwell-Boltzmann statistics. However,
there is now the addition of a ‘−1’ in the denominator. Since we have just argued that µ < 0, this
implies that as the chemical potential approaches the energy of the ground state from below, the number
of particles in the ground state diverges at fixed temperature. Similarly, the ground-state occupation
diverges when the temperature goes to zero at fixed µ. In practice, we will need to be a lot more subtle,
as we will see next.
where V is the system’s volume. The factor gs = 2s + 1 accounts for the degeneracy of an energy level
according to the number of spin states s that it represents. Using the above DoS, we can compute the
total number of particles in the gas
N = \int dE\, g(E)\, f_{\rm BE}(E) = \int dE\, \frac{g(E)}{z^{-1}\exp[\beta E] - 1}. \qquad (4.55)
9 The specific choice of the ground-state energy will impact the integration boundaries and expression for the DoS.
Note that because technically we are in the grand-canonical ensemble, we compute an average. By
inverting this relation, we obtain µ as a function of N for a given T , which we will come back to shortly.
The average energy is given by
E = \int dE\, g(E)\, E\, f_{\rm BE}(E) = \int dE\, \frac{E\, g(E)}{z^{-1}\exp[\beta E] - 1}. \qquad (4.56)
Finally, in the grand-canonical ensemble we can compute the pressure from the grand potential, as you
will do yourself in Exercise Q32. Identifying the energy via partial integration gives us pV = 2E/3.
This is referred to as the caloric equation of state — we will see this quantity again in Chapter 14, when
we cover structure factors and pair-correlation functions in dense fluids — and it holds in general for
bosonic, fermionic, and classical ideal gases. Note that E is still a function of z and that this depends
on temperature as well10 .
At high temperatures, the fugacity z ≪ 1. At first glance this result seems strange: z = exp(βµ), so naively z → 1 as the temperature goes to infinity. What we are missing is that simultaneously we
must guarantee that ρΛ3 ≪ 1, such that we can employ the ideal-gas approximation. This implies
that as the temperature goes up, the chemical potential must tend toward −∞ faster. In practice,
we want to use a constant density, which implies that to leading order z/Λ3 should be constant. The
underlying issue is that we derived our statistics in the grand canonical ensemble, but we are interested
in quantities that conform to the canonical ensemble. This problem will crop up again when we study
Bose-Einstein condensation in Chapter 6. Unfortunately, there is no straightforward manner in which
to remedy this situation. Deriving the statistics in the N V T ensemble is not pleasant and we are better
off compromising here by introducing a temperature dependence to µ.
From the above discussion, we conclude that we must have z ∝ T^{−3/2} ≪ 1 as T ↑ ∞. Using this information, we can now expand Eq. (4.55) to
\[
\rho = \frac{N}{V} = g_s \frac{z}{\Lambda^3}\left(1 + \frac{z}{2\sqrt{2}} + \cdots\right), \tag{4.57}
\]
and we can perform a similar expansion on Eq. (4.56) to obtain
\[
\frac{E}{V} = g_s \frac{3z}{2\beta\Lambda^3}\left(1 + \frac{z}{4\sqrt{2}} + \cdots\right). \tag{4.58}
\]
We can insert the inverted form of Eq. (4.57) into Eq. (4.58) to obtain an expression for E in terms of ρ. That is, z is obtained as a function of ρ by truncating the series, solving the resulting quadratic equation for z, picking the correct root, and Taylor expanding it. The more elegant way of inverting Eq. (4.57) relies on the concept of series reversion, where z is expanded as a series in ρ, i.e., z = \sum_{n=1}^{\infty} c_n \rho^n with the cn unknown coefficients. Plugging this series into Eq. (4.57) and matching orders of ρ allows one to identify the values of the cn sequentially. If you use series reversion, you will need to go up to second order in ρ to find the final result. As an intermediate step you will obtain
\[
z(\rho) = \frac{\Lambda^3}{g_s}\rho - \frac{\Lambda^6}{2^{3/2} g_s^2}\rho^2 + \frac{\left(9 - 4\sqrt{3}\right)\Lambda^9}{36\, g_s^3}\rho^3 - \cdots \tag{4.59}
\]
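For readers who want to check the reversion, the following minimal sketch (not part of the notes) uses the sympy computer-algebra package and the scaled variable r = ρΛ³/gs, so that Eq. (4.57) reads r = z + z²/2^{3/2} + z³/3^{3/2} + ⋯; the coefficient names c1, c2, c3 are purely illustrative.

```python
# Minimal sketch (not from the notes): series reversion of Eq. (4.57) with sympy,
# in the scaled variables r = rho*Lambda^3/g_s and the fugacity z.
import sympy as sp

z, r = sp.symbols('z r')
c1, c2, c3 = sp.symbols('c1 c2 c3')

forward = z + z**2 / (2 * sp.sqrt(2)) + z**3 / (3 * sp.sqrt(3))  # r(z), three terms
ansatz = c1 * r + c2 * r**2 + c3 * r**3                          # reverted series z(r)

# Substitute the ansatz into r(z) and demand the result equals r order by order.
expanded = sp.expand(forward.subs(z, ansatz))
c1_val = sp.solve(sp.Eq(expanded.coeff(r, 1), 1), c1)[0]
expanded = sp.expand(expanded.subs(c1, c1_val))
c2_val = sp.solve(sp.Eq(expanded.coeff(r, 2), 0), c2)[0]
expanded = sp.expand(expanded.subs(c2, c2_val))
c3_val = sp.solve(sp.Eq(expanded.coeff(r, 3), 0), c3)[0]

print(c1_val, sp.simplify(c2_val), sp.simplify(c3_val))
# Expected: 1, a value equivalent to -1/(2*sqrt(2)), and 1/4 - sqrt(3)/9,
# i.e., (9 - 4*sqrt(3))/36, in agreement with Eq. (4.59).
```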
The expression E(ρ) can subsequently be substituted in the equation for the pressure pV = 2E/3 to achieve the final result
\[
\beta p = \rho - \frac{\Lambda^3}{4\sqrt{2}\, g_s}\rho^2, \tag{4.60}
\]
10 The relevant expressions are provided at the end of Chapter 6, should you want to verify your results.
which reproduces the equation of state for the classical ideal gas in the limit T ↑ ∞. Note that the relation pV = 2E/3 is called the caloric equation of state; in the classical ideal gas it can be readily derived using equipartition and the ideal-gas law, while for quantum systems one can show that it holds for both fermions and bosons using properties of the grand-canonical partition function.
Note that the second (quadratic in ρ) term in Eq. (4.60) is a low-density correction that accounts for quantum effects. We will encounter this type of expansion again when we examine the classical virial coefficients in Chapter 13. In this case, however, the term emerges solely from quantum statistics, as we had explicitly removed any interactions from our description. At high temperature, the effect of this term is to reduce the pressure. This makes sense, because more bosons can occupy the same single-particle state quantum mechanically than classical particles can. Referencing Fig. 4.3, we indeed see this reflected in the difference in the fraction of doubly occupied energy levels between the boson and classical systems. What is perhaps more disturbing is the appearance of the De Broglie wavelength, which always drops out of any classically measurable quantity. However, this expression charts how the quantum system approaches the classical limit, and in any non-ideal gas other effects will dominate over this term in the classical limit. We will consider the low-temperature limit of the Bose gas in Chapter 6,
where we discuss Bose-Einstein condensation.
We now consider a gas of photons in a box of volume V, for which the density of states in terms of the angular frequency ω reads
\[
\tilde{g}(\omega)\,\mathrm{d}\omega = \frac{V\omega^2}{\pi^2 c^3}\,\mathrm{d}\omega, \tag{4.61}
\]
with g̃ the photon-frequency DoS11. Following our Bose-gas analysis, the partition function is written
\[
Z_\omega = \frac{1}{1 - \exp(-\beta\hbar\omega)}, \tag{4.62}
\]
where we use that photons have chemical potential µ = 0. This is because the chemical potential is the (rest-mass) energy cost for inserting a photon into the system, but photons are massless particles. That is to say, for any vanishingly small energy, a photon can be created, albeit with a very long wavelength.
Evaluating the grand potential, referencing Eq. (4.51) and switching to an integral representation, we thus have
\[
\beta\Omega = -\log\Xi = \frac{V}{\pi^2 c^3}\int \mathrm{d}\omega\, \omega^2 \log\left(1 - \exp[-\beta\hbar\omega]\right). \tag{4.63}
\]
We know the relation between the average energy and the grand potential, which leads to
\[
E = -\frac{\partial}{\partial\beta}\log\Xi = \frac{V\hbar}{\pi^2 c^3}\int \mathrm{d}\omega\, \frac{\omega^3}{\exp[\beta\hbar\omega] - 1}, \tag{4.64}
\]
11 N.B. We might expect a factor of 2 in the denominator of our density of states, referencing Eq. (4.21). We also need to multiply this result by a factor of gs, as in the case of Eq. (4.54), to account for ‘spin’ degeneracy, which we had not accounted for in obtaining Eq. (4.21). However, we do not find a degeneracy factor of gs = 3, as one might expect for a spin-1 particle like the photon, based on our previous discussion. Instead, we have accounted for a factor of 2 degeneracy corresponding to the two classical states of polarization. Why is this permitted? Photons are massless bosons and their spin algebra works differently from that of bosons with mass; a photon turns out to have helicity rather than spin. A full discussion of this goes beyond the scope of the notes, but we refer the interested reader to Wigner’s original paper on unitary representations of the inhomogeneous Lorentz group. N.B. This is a rather advanced text; other treatments of helicity may be found.
Figure 4.5: Visualization of the measured cosmic microwave background radiation spectrum (red dots,
in MJy/sr, where “Jy” is Jansky and “sr” stands for steradian) fitted with a Planck distribution (blue)
that leads to an effective temperature T ≈ 2.725 K. The error bars (green) are expanded by a factor of
200 to make them visible. Data was obtained from [D.J. Fixsen et al., Astrophys. J. 473, 576 (1996)],
who processed the COBE satellite results.
from which the Planck distribution E(ω)dω can instantly be recognized. Note that this is the spectrum
that belongs to a thermalized photon gas!
This fact should strike a chord with any physicist: the (nearly perfectly uniform) cosmic microwave
background (CMB) radiation has the Planck spectrum, see Fig. 4.5. The implication is that the CMB is
thermalized across the entire observable universe! But this is strange. Suppose we observe light from the
CMB coming from two opposite directions. Then, by Einstein’s theory of relativity, the matter emitting this light cannot have interacted, as it belongs to causally disconnected regions of the universe. How can it then be at the same temperature? Or in the language of statistical mechanics,
how can it be that these regions have come into thermal equilibrium with each other? Discussion of the
potential solution to this problem (inflation) belongs to the realm of cosmology and will not be covered
in this course. However, you should take away from this small aside that statistical mechanics finds
application throughout physics.
Next, we use the Planck spectrum to relate temperature to the color of an object, as you may recall from
courses that discuss black-body radiation. First we can make a rough estimate for the relation between
temperature and the wavelength of the emitted light. Assume that the dominant emission comes from
the peak of the distribution, then we see that the associated angular frequency is given by
\[
\omega_{\max} = \zeta\,\frac{k_B T}{\hbar}, \tag{4.65}
\]
where ζ ≈ 2.822 solves 3 − ζ = 3e^{−ζ}. Rewriting this expression, the wavelength of maximal emission is given by
\[
\lambda_{\max} = \frac{hc}{\zeta k_B T} \approx \frac{5.1\times 10^{-3}\ \mathrm{m\,K}}{T}, \tag{4.66}
\]
where T is in Kelvin. For visible light, we have wavelengths of 380 nm to 700 nm, meaning that we require temperatures of around 10^4 K. Care has to be taken in using this result, as there is an entire spectrum of frequencies being emitted for a given temperature by a black-body radiator. Our approximation generally overestimates the temperature required to produce a given color. In addition, our eyes interpret the spectrum, leading to unexpected results. For example, a black-body radiator at 7 × 10^3 K will be perceived to emit white light, rather than the expected color green, if we were to use the maximum-wavelength result.
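As a quick numerical check (not part of the notes), the sketch below solves 3 − ζ = 3e^{−ζ} for ζ and inverts Eq. (4.66) for an assumed peak wavelength of 500 nm; the scipy routines and the 500 nm choice are merely illustrative.

```python
# Minimal sketch (not from the notes): find zeta of Eq. (4.65) numerically and use
# Eq. (4.66) to estimate the temperature whose emission peaks at 500 nm (assumed).
import math
from scipy.optimize import brentq
from scipy.constants import h, c, k   # Planck constant, speed of light, k_B

# Solve 3 - zeta = 3 exp(-zeta) on an interval that brackets the nontrivial root.
zeta = brentq(lambda x: 3.0 - x - 3.0 * math.exp(-x), 1.0, 5.0)
print(f"zeta = {zeta:.4f}")                      # approximately 2.822

lam = 500e-9                                     # assumed peak wavelength [m]
T = h * c / (zeta * k * lam)                     # invert Eq. (4.66)
print(f"T(500 nm) = {T:.0f} K")                  # on the order of 10^4 K
```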
The total energy E emitted at a given temperature T can also be straightforwardly determined from the Planck spectrum. We will outline the results here and leave the algebra to Exercise Q33. The full calculation involves a Gamma function and leads to the energy-density expression
\[
\mathcal{E} \equiv \frac{E}{V} = \frac{\pi^2 k_B^4}{15\,\hbar^3 c^3}\, T^4. \tag{4.67}
\]
From this the Stefan-Boltzmann law for the energy emitted by an object at temperature T can be obtained, where j is the energy flux
\[
j = \frac{\mathcal{E} c}{4} \equiv \sigma T^4. \tag{4.68}
\]
Here, σ is the Stefan constant and the first identity follows from a geometric consideration: at the surface only photons moving outward contribute to the flux, and their velocity must be projected onto the surface normal; averaging over the outward hemisphere yields the factor 1/4.
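The following minimal sketch (again not part of the notes) evaluates σ = π²kB⁴/(60ℏ³c²), which follows from combining Eqs. (4.67) and (4.68), and compares it to the tabulated value shipped with scipy; using the CMB temperature of Fig. 4.5 as input is only an illustration.

```python
# Minimal sketch (not from the notes): Stefan constant from Eqs. (4.67) and (4.68),
# sigma = pi^2 k_B^4 / (60 hbar^3 c^2), and the flux of a black body at T_CMB.
import math
from scipy.constants import hbar, c, k, Stefan_Boltzmann

sigma = math.pi**2 * k**4 / (60.0 * hbar**3 * c**2)
print(f"sigma          = {sigma:.4e} W m^-2 K^-4")
print(f"scipy reference = {Stefan_Boltzmann:.4e} W m^-2 K^-4")

T_cmb = 2.725                                    # effective CMB temperature of Fig. 4.5
print(f"j(T_cmb) = {sigma * T_cmb**4:.3e} W m^-2")
```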
Lastly, we consider the high-temperature limit for the photon system. In this case, we mean ℏω ≪ kB T. Then Planck’s distribution reduces to
\[
E(\omega) \approx \frac{V}{\pi^2 c^3}\,\omega^2 k_B T. \tag{4.69}
\]
Note that all hints of the quantum-mechanical character of light, as described by ℏ, have disappeared from this expression. You may have encountered it before as the Rayleigh-Jeans law for the distribution of classical radiation. This law had a serious problem, arguing from a pre-quantum perspective, as the total energy diverges: an effect referred to as the ultraviolet catastrophe. The quantization of light eliminated this catastrophe, as for ℏω ≫ kB T there is simply not enough energy to create a single photon. This implies that the high-frequency modes remain unpopulated12.
12 A similar analysis can be applied to phonons (quantized lattice vibrations); the expressions are roughly the same, except that the speed of light is replaced by the speed of sound cs and that phonons can possess three polarizations, rather than two. However, a lengthy discussion of this topic goes beyond the scope of these notes and we refer you to a basic course on solid-state physics for a discussion of the Einstein and Debye models for phonons and their respective shortcomings.
Following the steps from the section on bosons, it is straightforward to compute the particle number,
average energy, and pressure, using the DoS. Here too, the factor gs = 2s + 1 needs to be accounted
for. The equation of state in the high-temperature limit now becomes (following steps analogous to the
ones taken for the Bose gas)
\[
\beta p = \rho + \frac{\Lambda^3}{4\sqrt{2}\, g_s}\rho^2 + \cdots. \tag{4.74}
\]
Thus, the correction to the classical result is an increase in the ideal-gas pressure by the fermionic
character of the quantum particles. That the pressure should increase also makes sense, as each state
can only be occupied by a single fermion.
Let us start by considering the zero-temperature limit of an ideal fermion gas. This would seem like a terrible approximation to the physics of actual electrons in a metal, as these experience Coulomb interactions. However, trying this simplest of approximations will turn out to yield a surprisingly accurate result. We note that taking the limit T ↓ 0 reduces the Fermi-Dirac distribution to
\[
f_{\mathrm{FD}}(\epsilon) \to \begin{cases} 1 & \text{if } \epsilon < \mu \\ 0 & \text{if } \epsilon > \mu \end{cases}. \tag{4.75}
\]
The interpretation is that each fermion added to the system settles into the lowest available energy state, raising the value of µ. As more particles are added, the states are successively filled. The Fermi energy is the energy of the highest occupied state13, i.e., EF = µ(T = 0). The value of EF is computed by determining the relation between N and EF from
\[
N = \int \mathrm{d}E\, g(E) f_{\mathrm{FD}}(E) = g_s \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2} \int_0^{E_F} \mathrm{d}E\, \sqrt{E}. \tag{4.76}
\]
13 This definition works well for metals, but can lead to confusion for semi-conductors and insulators, for which the chemical potential at T = 0 lies inside the band gap.
and inverting the expression to obtain EF as a function of ρ = N/V. This gives
\[
E_F = \frac{\hbar^2}{2m}\left(\frac{6\pi^2}{g_s}\rho\right)^{2/3}. \tag{4.77}
\]
Using the Boltzmann constant, an equivalent Fermi temperature TF ≡ EF/kB may be defined. This temperature divides what is considered the high-T and low-T limits. Note, however, that it is not typically a low temperature in an absolute sense: the Fermi temperature for electrons in a metal is typically around 10^4 K. It also does not define a phase transition; we will find a transition for the ideal Bose gas in Chapter 6, though that too will be more subtle than the classical examples that follow later.
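A quick numerical illustration (not part of the notes) of Eq. (4.77): the sketch below assumes a conduction-electron density of about 8.5 × 10^28 m⁻³, roughly that of copper, and gs = 2 for spin-1/2 electrons; both inputs are assumptions made here purely for illustration.

```python
# Minimal sketch (not from the notes): Fermi energy and Fermi temperature from
# Eq. (4.77) for an assumed, typical metallic conduction-electron density.
import math
from scipy.constants import hbar, m_e, k, eV

gs = 2                                           # spin-1/2 electrons
rho = 8.5e28                                     # assumed electron density [m^-3]

E_F = hbar**2 / (2.0 * m_e) * (6.0 * math.pi**2 * rho / gs)**(2.0 / 3.0)
T_F = E_F / k

print(f"E_F = {E_F / eV:.2f} eV")                # a few eV
print(f"T_F = {T_F:.2e} K")                      # of order 10^4 to 10^5 K
```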
Figure 4.6: Perturbations of the Fermi-Dirac distribution fFD(ϵ) away from temperature T = 0. At absolute zero, the distribution is Heaviside-like (blue; T = 0). For small departures (red), i.e., kB T much smaller than the Fermi energy EF, the profile is smoothed out.
When T = 0, it is easy to compute that the average energy is E = (3/5)N EF and the pressure is given by pV = (2/5)N EF, again using the caloric identity. This implies that there is a residual pressure at T = 0 that comes from the exclusion principle, as both the classical and Bose ideal gas have vanishing pressure at this temperature. The story becomes more complicated when we perturb away from T = 0. The usual approach is to intuit that only those states which are within kB T of the Fermi surface are affected by temperature, see Fig. 4.6. This implies that the Heaviside-like form of Eq. (4.75) is only weakly affected. Insisting that the number of particles remains the same when the temperature is changed implies ∂N/∂T = 0. This leads to an expression for the heat capacity of the system, see Exercise Q34.
Here, we will instead use the more mathematical Sommerfeld expansion, which provides the prefactors to the above argument. This analysis will also set the stage for our analysis of Bose-Einstein condensation in Chapter 6, wherein we will make extensive use of families of functions. Note that the particle and energy densities can both be expressed in terms of the Fermi-Dirac integrals
\[
f_n(z) = \frac{1}{\Gamma(n)}\int_0^{\infty} \mathrm{d}x\, \frac{x^{n-1}}{z^{-1}e^{x} + 1},
\]
with x = βE. To evaluate these at low temperature, the integration range is split at x = βµ and the identity 1/(e^{x−βµ} + 1) = 1 − 1/(e^{βµ−x} + 1) is used on the first part.
Making the substitutions y1 = βµ − x for the first integral and y2 = x − βµ for the second, we arrive at
\[
\Gamma(n) f_n(z) = \frac{(\log z)^n}{n} - \int_0^{\beta\mu} \mathrm{d}y_1\, \frac{(\beta\mu - y_1)^{n-1}}{1 + e^{y_1}} + \int_0^{\infty} \mathrm{d}y_2\, \frac{(\beta\mu + y_2)^{n-1}}{1 + e^{y_2}}, \tag{4.82}
\]
where we used the relation between βµ and z. The integrals can be evaluated analytically by identifying a series or by using computer-algebra software. To the order we need, the final result reads
\[
f_n(z) \approx \frac{(\log z)^n}{\Gamma(n+1)}\left[1 + \frac{\pi^2\, n(n-1)}{6\,(\log z)^2} + \cdots\right].
\]
The above result can be plugged back into our original expressions for ρ and E/V. The former leads to a relation between µ and ρ, which can be expressed in terms of the Fermi energy as
\[
\mu = E_F\left[1 - \frac{\pi^2}{12}\left(\frac{k_B T}{E_F}\right)^2 + \cdots\right]. \tag{4.86}
\]
The chemical potential is maximal at T = 0 and decreases when T is raised. This matches our intuition from the classical ideal gas, where the chemical potential is at most zero and becomes increasingly negative as the temperature is raised at fixed density. Using the expression for E and a few lines of algebraic manipulation, we arrive at a heat capacity of
\[
C_V = \frac{\pi^2}{2}\, N k_B\, \frac{T}{T_F}. \tag{4.87}
\]
The above heat-capacity form finds use in the study of metals. The conduction electrons, i.e., the ones that carry current and are free to move, can be approximated as an ideal gas. This is slightly surprising, because one might naively think that the long-ranged Coulomb potential would interfere with such a description. However, the approximation turns out to work remarkably well. This ∝ T scaling is experimentally recovered at very low temperatures, and for many metals the prefactor is reasonably close to the one we have just computed, to within about 20%!
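As a rough numerical illustration (not part of the notes), the sketch below evaluates Eq. (4.87) per particle at room temperature for an assumed Fermi temperature of 8 × 10^4 K, of the order found in the previous sketch.

```python
# Minimal sketch (not from the notes): Sommerfeld heat capacity of Eq. (4.87),
# C_V/(N k_B) = (pi^2/2) T/T_F, at room temperature for an assumed T_F.
import math

T, T_F = 300.0, 8.0e4                            # room temperature and assumed T_F [K]
cv_per_particle = 0.5 * math.pi**2 * (T / T_F)   # C_V / (N k_B)
print(f"C_V / (N k_B) = {cv_per_particle:.3f}")  # ~0.02, far below the classical 3/2
```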
We refer the curious reader to a standard textbook on solid-state physics for more information on the quality of the ideal-gas approximation, the Drude model by which the linear scaling can also be computed, and why these approximations work to begin with. The concepts set out above, and more formally handled in a course on solid-state physics, may also be applied to more esoteric scenarios. You will encounter the statistical mechanics of fermions in the study of white dwarfs and the Chandrasekhar limit. Additionally, the theory in this chapter can be used to treat para- and diamagnetism with some modifications; all of this, unfortunately, goes beyond the scope of these notes.
4.8 Exercises
Q26. Classical Ideal Gas
Calculate the grand-canonical partition function Ξ(µ, V, T) of an ideal gas, and show that it can be written as Ξ = exp(zV) with z(µ, T) = e^{βµ}/Λ^3. Calculate the pressure p(z, T) and the density ρ(z, T) of the classical ideal gas from Ξ. (Assume three dimensions.)
where Ωd represents the surface area of the d-dimensional unit sphere: Ω1 = 2, Ω2 = 2π, Ω3 = 4π, etc.
(b) Sketch the behavior of the three cases.
(c) Comment on the implication of this result for the possibility of having a Bose-Einstein con-
densate in dimensions d = 1 and 2, respectively. Hint: Page through the final section of
Chapter 6 to get an idea.
Q28. Particle in a Box
A quantum particle in a one-dimensional box of size L along the x-axis, say with x ∈ [0, L], has wavefunctions ψk(x) ∝ sin(kx) with a discrete set of allowed wavenumbers k = πn/L with quantum numbers n = 1, 2, 3, · · ·, i.e., such that ψ(0) = ψ(L) = 0. The energy with quantum number n is given by ϵn = ℏ^2 k^2/(2m) = h^2 n^2/(8mL^2), such that the partition function at temperature T equals Zq(T) = \sum_{n=1}^{\infty} \exp(−βϵn).
(a) At sufficiently low T, the partition sum contains only a few terms that contribute substantially to its value. Define the temperature T∗ as the temperature below which the system is essentially in the ground state. In the case that T ≫ T∗ there are many contributing terms to Zq(T), and hence Zq(T) ≃ \int_0^{\infty} \mathrm{d}n\, \exp(−βϵn) is an accurate approximation. Evaluate this high-temperature limit of Zq(T).
The classical Hamiltonian of this particle is H(x, px) = p_x^2/(2m) + V(x) with V(x) = 0 for x ∈ (0, L) and V(x) = ∞ otherwise. The classical partition function is therefore
\[
Z_c = \frac{1}{Y}\int_{-\infty}^{\infty} \mathrm{d}p_x \int_{-\infty}^{\infty} \mathrm{d}x\, \exp\left(-\beta H(x, p_x)\right), \tag{4.89}
\]
with 1/Y a prefactor such that the classical and high-temperature quantum result agree.
(b) Calculate Zc (T ) and choose Y such that it agrees with your answer in (a). Compare your
result with the one you obtained for the harmonic oscillator in Chapter 3.
(a) Use this to calculate the canonical partition function Z1 of a single classical point particle in a 1D box of length L (see exercise Q19) at temperature T, where Z1 is defined as
\[
Z_1(T, L) = \frac{1}{h}\int_{-\infty}^{\infty} \mathrm{d}p_x \int_0^{L} \mathrm{d}x\, \exp\left[-p_x^2/(2mk_B T)\right], \tag{4.90}
\]
Q30. Combinatorics
Suppose you have n indistinguishable balls and k distinguishable baskets. Enumerate the ways of distributing the balls over the baskets; some baskets may be empty. To start, represent each distribution
as n stars and k − 1 vertical lines (the stars and bars representation). For example, for n = 7 and
k = 3, a valid distribution is: 3 + 2 + 2 = ∗ ∗ ∗| ∗ ∗| ∗ ∗. Convince yourself that in general this
representation enumerates the various configurations for a bosonic system. Argue how this leads
to the expression found in Eq. (4.30).
(a) Start with the particle number N in integral representation. For high temperatures T the chemical potential µ becomes large in magnitude and negative, such that z = e^{βµ} ≪ 1. Expand the particle number (or particle density ρ = N/V) in orders of z. Hint: It is convenient to express your result in terms of the De Broglie wavelength Λ and you might want to look up the Gamma function Γ(x).
(b) Now, determine the energy E or the energy density ϵ = E/V analogously to the particle
number/density, i.e., via expansion in terms of z.
(c) Finally, determine an expression for the pressure p via pV = 2E/3, use the expression for the
particle number/density to eliminate z, and write your result analogous to the classical ideal
gas, such that the leading-order correction can be easily identified. Hint: an expansion up to
second order in z is sufficient.
This will require a Taylor expansion to capture the linear term in the DoS. Using the argument from (b) to eliminate the constant term, argue that the integral takes on the form
\[
C_V \approx \frac{3}{2}\, g(E_F)\, k_B^2\, T \int_{-\infty}^{\infty} \mathrm{d}x\, \frac{x^2}{4\cosh^2(x/2)}, \tag{4.95}
\]
where x = β(E − EF).
(d) Consider the following intuitive argument. At low temperatures, only fermions with energies within kB T of the Fermi energy participate in the physics. What is (roughly) the number of such fermions? Estimate the energy that these carry in total. Argue from the scaling of your result with T that this implies CV ∝ T.
Chapter 5
Chemical Equilibria
In this chapter, we examine the equilibrium behavior of a system, where the elements can react with
each other via a chemical reaction. In this situation, chemical species are continuously exchanged, but
the macroscopic concentrations are unaffected. That is, the net reactive flux of species is zero. To
emphasize that processes are taking place in this equilibrium, the term dynamic equilibrium is often
used. We shall see that in chemical equilibrium, the fraction of each chemical species involved in the
reaction is controlled by the chemical potential of that species. Using statistical mechanics to describe
this situation leads to the equilibrium law of mass action: the ratio between the concentration of
reactants and products is constant. This is an extension of Le Châtelier’s principle: The equilibrium
in a chemical system responds to a change in concentration, temperature, or pressure by shifting in
the direction which partially counteracts the imposed perturbation. We will also briefly touch upon
the relation between the constant ratio and the processes that underlie a return to equilibrium upon
perturbing the system. We close this chapter by giving another example of the use of chemical equilibria
in a doped semiconductor.
Consider a general chemical reaction between M reactant species Xi and M′ product species Xj′,
\[
\sum_{i=1}^{M} v_i X_i \;\rightleftharpoons\; \sum_{j=1}^{M'} v_j' X_j', \tag{5.1}
\]
where vi is the stoichiometric coefficient for species Xi and vj′ is the stoichiometric coefficient for species Xj′. The expression essentially says that each individual reaction that takes place changes the number of molecules of each species according to their respective stoichiometric coefficients. For example, the combustion of methane into carbon dioxide and water would be written as
\[
\mathrm{CH}_4 + 2\,\mathrm{O}_2 \;\rightleftharpoons\; \mathrm{CO}_2 + 2\,\mathrm{H}_2\mathrm{O}. \tag{5.2}
\]
Of course, the equilibrium concentrations of these molecules may be shifted far to one side of the reaction
or the other depending on the ambient pressure and temperature.
Assume that the reaction is taking place in a closed, isothermal system, for which the pressure is
fixed, i.e., at constant T and P . In this ensemble, the free energy that will govern the equilibrium
behavior is the Gibbs free energy. We can gain insight into the reactions taking place by isolating a
single species Nr and changing the system by dNr . Then the change in the Gibbs free energy due to
the dNr chemical reactions is
\[
v_r\, \mathrm{d}G = \left(-\sum_{i=1}^{M} v_i \mu_i + \sum_{j=1}^{M'} v_j' \mu_j'\right) \mathrm{d}N_r, \tag{5.3}
\]
where µi is the chemical potential of species Xi , and µ′j is the chemical potential of species Xj′ . Here, it
is important to note that the chemical reactions lead to interdependencies between the various species
participating in the reactions: changing the amount of species Nr modifies all other reactant species Xi
and product species Xj′ . Thus, the expression in Eq. (5.3) is obtained by applying the chain rule to G
and rearranging. Since our choice of species Nr was arbitrary, the result holds in general. As we will
discuss further in Chapter 6, when a system is in equilibrium, the relevant free energy is at a minimum.
Hence, at equilibrium, where we expect dG = 0, we have
\[
\sum_{i=1}^{M} v_i \mu_i = \sum_{j=1}^{M'} v_j' \mu_j'. \tag{5.4}
\]
Equation (5.4) encodes how matter is conserved in the reaction and is the condition for chemical equilibrium, but its practical use is limited, as experimentally one can access the density or pressure, but not the chemical potential.
For each species we write the (ideal-gas-like) Helmholtz free energy as
\[
\beta F_i = \beta N_i f_i + N_i \log\left(\rho_i \Lambda_i^3\right) - N_i, \tag{5.5}
\]
where fi is the free energy of a molecule of species Xi including its ground-state energy and any internal degrees of freedom. Note that the last two terms of Eq. (5.5) correspond to the ideal-gas free energy, and thus arise from the molecule’s (translational) degrees of freedom. That is, we have made an expansion of the free energy around the ideal-gas free energy; this approach to dealing with free energies is quite natural and we will pursue it further in Chapter 13. Similarly, the chemical potential of species Xi is
\[
\beta\mu_i = \beta\left(\frac{\partial F}{\partial N_i}\right)_{T,V} = \beta f_i + \log\left(\rho_i\Lambda_i^3\right). \tag{5.6}
\]
Combining Eqs. (5.4) and (5.6), we find
\[
\frac{\prod_{i=1}^{M}\left(\rho_i\Lambda_i^3\right)^{v_i}}{\prod_{j=1}^{M'}\left(\rho_j'\Lambda_j'^3\right)^{v_j'}} = \exp\left[-\beta\left(\sum_{i=1}^{M} v_i f_i - \sum_{j=1}^{M'} v_j' f_j'\right)\right], \tag{5.7}
\]
where the right-hand side of the equation has the features of an equilibrium constant for the concentration, if you recall your high-school chemistry; but it is not quite the constant you encountered there.
Unfortunately, the density in Eq. (5.7) is measured in units of the individual De Broglie wavelengths (Λi )
of the molecules. In practice, it is more convenient to instead define the density in terms of a ‘standard’
reference density. Typically, for gases, this reference density is chosen to be that of an ideal gas at
temperature T and a pressure of 1 atmosphere. For aqueous solutions, the reference is a concentration
of one mole per liter at the same pressure of 1 atmosphere. Denoting this standard density by ρ0 , we
can rewrite Eq. (5.7) as
\[
\frac{\prod_{i=1}^{M}[X_i]^{v_i}}{\prod_{j=1}^{M'}[X_j']^{v_j'}} = \exp\left(-\beta\Delta\mu^{(0)}\right), \tag{5.8}
\]
where
\[
[X_i] \equiv \frac{\rho_i}{\rho_0}; \tag{5.9}
\]
\[
\Delta\mu^{(0)} \equiv \sum_{i=1}^{M} v_i \mu_i^{(0)} - \sum_{j=1}^{M'} v_j' \mu_j'^{(0)}, \tag{5.10}
\]
and
\[
\beta\mu_i^{(0)} = \beta f_i + \log\left(\rho_0\Lambda_i^3\right). \tag{5.11}
\]
Note that µ_i^{(0)} is the chemical potential of species i at the standard density ρ0. Equation (5.8) is known
as the law of mass action; technically speaking, it is the equilibrium consequence of the more generalized
form of this law. The implication is that whenever you adjust the concentration of one of the species,
the reactions drive the system to such a state that the ratio again satisfies Eq. (5.8). The law of mass
action does not only apply to chemical reactions, but can be applied to a range of systems that form
bonds. For instance, some nuclear reactions can be explained using this simple expression.
Returning to our methane-combustion example of Eq. (5.2), the expression takes the familiar form
\[
\frac{[X_{\mathrm{CH}_4}][X_{\mathrm{O}_2}]^2}{[X_{\mathrm{CO}_2}][X_{\mathrm{H}_2\mathrm{O}}]^2} = K_c^{-1}, \tag{5.12}
\]
with Kc the equilibrium constant for the concentration. It is important to keep in mind that many
equilibrium constants can be created by choosing the reference differently, e.g., instead of a reference
density ρ0 , a partial pressure could have been chosen. These equilibrium constants can be related to
each other using the equation of state. Lastly, it is clear that predicting Kc from microscopic theory is
complicated, as it requires knowledge of the relevant chemical potentials.
A ⇌ B, (5.13)
where the forward reaction A → B has reaction rate kf and the backward reaction B → A has rate
kb . Near equilibrium, most reactions are well described by linear proportionalities between the rates
Figure 5.1: Sketch of the time evolution of a two-species system tending toward the chemical equilibrium A ⇌ B. We start with a situation where there is only species A; the respective densities are given by ρA (red) and ρB (blue). (left) The effective conversion rates: A → B has reaction rate kf and the backward reaction B → A has rate kb. The cut marks indicate the behavior as t → ∞, where the two rates are equal (purple horizontal line). (right) The evolution of the densities during the equilibration.
and the concentrations of the reactants:
ρ̇A (t) = −kf ρA (t) + kb ρB (t); (5.14)
ρ̇B (t) = kf ρA (t) − kb ρB (t). (5.15)
Here, ρA and ρB are the number densities of the two species, t is time, and the dot denotes the time derivative. The equilibration process from a state of pure A is illustrated in Fig. 5.1. Clearly, ρA(t) + ρB(t) is constant for all time and the equilibrium concentrations satisfy
\[
K_c' \equiv \frac{\rho_B^{\mathrm{eq}}}{\rho_A^{\mathrm{eq}}} = \frac{k_f}{k_b}. \tag{5.16}
\]
Here, Kc′ is the equilibrium constant associated with the kinetic picture.
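To make the kinetic picture concrete, the following minimal sketch (not part of the notes) integrates Eqs. (5.14) and (5.15) for assumed rate constants kf = 2 and kb = 0.5 (in arbitrary inverse time units), starting from pure A as in Fig. 5.1, and checks Eq. (5.16).

```python
# Minimal sketch (not from the notes): integrate the rate equations (5.14)-(5.15)
# for assumed rates k_f and k_b, starting from pure A, and verify Eq. (5.16).
from scipy.integrate import solve_ivp

kf, kb = 2.0, 0.5                                # assumed rate constants [1/time]

def rhs(t, y):
    rho_A, rho_B = y
    return [-kf * rho_A + kb * rho_B,            # Eq. (5.14)
            +kf * rho_A - kb * rho_B]            # Eq. (5.15)

sol = solve_ivp(rhs, (0.0, 10.0), [1.0, 0.0])    # start from pure A
rho_A, rho_B = sol.y[:, -1]

print(f"rho_A + rho_B = {rho_A + rho_B:.3f}")    # total density is conserved
print(f"rho_B / rho_A = {rho_B / rho_A:.3f}")    # tends to k_f/k_b = 4 at long times
```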
It is tempting to identify Kc′ = Kc for a given reaction; however, one has to be very careful. This simplified approach does not take into account the conceptual differences between the true thermodynamic equilibrium constant and the ratio of rate constants that is the kinetic equilibrium constant. These constants are generally not equal, except at chemical equilibrium, for ideal systems, and elementary reactions; and even then with some caveats. In particular, the kinetic reaction rates cannot simply be obtained by exponentiating the βµ_i^{(0)}, and the law of mass action only delimits the ratio of these rates in general. Computing the rates requires the notion of out-of-equilibrium thermodynamics, which goes beyond the scope of these notes.
instead occupied by a P, As, or Sb atom. These dopants have an additional electron and nuclear charge, which is ‘left over’ when the four covalent bonds have formed with neighboring Si atoms. The electron can be localized around the dopant nucleus, with a binding energy of about I = 30 to 50 meV, depending on the specific dopant. Chemical equilibria can be used to establish which fraction of the dopant electrons is bound and which is freed into the conduction band. When an electron is in the conduction band, it is able to traverse the Si lattice, thereby allowing for charge transport. Note that this is also a case of simple adsorption dynamics.
Let us consider a single dopant atom and treat that as a system in contact with the reservoir formed
by the Si crystal. The dopant is assumed to be in thermal and chemical equilibrium with the Si crystal,
exchanging electrons and energy. There are now three possible configurations: (i) The dopant is ionized
and the electron is in the reservoir, (ii) the electron localized at the dopant with spin up, and (iii) the
same as (ii) but with spin down. Configuration (i) corresponds to (N = 0, ϵ = 0), while (ii) and (iii) both correspond to (N = 1, ϵ = −I), where we use that there is no energy difference between the two bound
states in the absence of a magnetic field. The Gibbs sum is given by Ξ = 1 + 2 exp(β(I + µ)), so that the
probability of being ionized is given by Pion = 1/Ξ. For small temperatures Pion ≈ exp(−β(I + µ)) ≪ 1,
such that the charge carriers from doping freeze out. This argument gives a simple relation for the
scaling of conductivity with temperature. Unfortunately, the real temperature dependence measured in
a semiconductor is more complicated. The model is thus in need of some refinement, as we will gain an
impression of in Chapter 4.
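A minimal sketch (not part of the notes) of the freeze-out behavior: it evaluates Pion = 1/Ξ for an assumed binding energy I = 45 meV and an assumed, temperature-independent chemical potential µ = −10 meV; both values are purely illustrative, since in a real semiconductor µ itself depends on temperature and doping.

```python
# Minimal sketch (not from the notes): ionization probability of a single dopant,
# P_ion = 1 / (1 + 2 exp(beta (I + mu))), for assumed, illustrative I and mu.
import numpy as np
from scipy.constants import k, eV

I, mu = 45e-3 * eV, -10e-3 * eV                  # assumed energies [J]

for T in (30.0, 100.0, 300.0):                   # temperatures in K
    beta = 1.0 / (k * T)
    P_ion = 1.0 / (1.0 + 2.0 * np.exp(beta * (I + mu)))
    print(f"T = {T:5.0f} K   P_ion = {P_ion:.3e}")   # carriers freeze out at low T
```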
5.5 Exercises
Q35. Small Clusters
Consider a square-lattice model with A sites and n occupied sites (or particles). If two particles
are adjacent, they bond with bonding energy −ϵ (with ϵ > 0). However, a particle that is already
bonded cannot bond with any other particles. Thus, the system consists of monomers and dimers.
Here, we develop an approximate solution to the problem for low densities.
Assume that there is an average monomer density ρ1 = n1 /A and dimer density ρ2 = n2 /A, with
n1,2 the number of monomers and dimers. Also assume that the monomers and dimers do not
interact, so both behave like an ideal gas.
(a) What is the canonical partition sum Q1 (n1 , A, T ) for the monomers? And for the dimers?
Remember that the dimers can have two orientations, and assume periodic boundary condi-
tions.
(b) Calculate the Helmholtz free energy F (n1 , n2 , A, T ) of the system, using Stirling’s approxi-
mation log n! = n log n − n.
(c) Show that in equilibrium the chemical potentials of monomers and dimers should obey:
2µ1 = µ2 . Then show that this requires:
\[
\frac{\rho_2}{\rho_1^2} = 2\exp(\beta\epsilon). \tag{5.17}
\]
(d) What is the relationship between ρ1 , ρ2 , and ρ? Calculate ρ1 and ρ2 in terms of ρ and ϵ.
What happens in the limit where ϵ → 0 (assuming low density)? And ϵ → ∞?
Q36. Gas Adsorbing to an Interface
Consider a system of N ideal gas particles with mass m at a temperature T . The mean pressure
is P and Ng gas particles move freely in a volume V = L^3. Another Ns (identical) particles are adsorbed onto a surface with area L^2 (forming one of the faces of our cubic volume), where they also behave as a two-dimensional ideal gas. The total number of particles in this system is N = Ng + Ns. The total energy of an adsorbed particle is given by E = p^2/(2m) − ϵ0, where p is a 2D momentum and ϵ0 is the binding energy per particle.
(a) Calculate the partition functions of the free and adsorbed gas, and label these Zg and Zs, respectively. The particles are to be treated as indistinguishable.
(b) Determine the Gibbs free energies Gg and Gs using the partition functions.
(c) Use this to derive the respective chemical potentials µg and µs.
(d) The two systems are in chemical equilibrium at temperature T. Compute the mean number of gas particles adsorbed per unit area in terms of the given variables.
(e) Keeping temperature T and total number of particles N the same, the volume in which the free gas particles can move is now increased. What consequences does this have for the equilibrium between the free gas particles and the gas particles adsorbed on the surface, i.e., are more or fewer particles adsorbed compared to the case with a smaller volume? Justify your answer using only a few words.
Q37. Saha’s Equation
In this exercise, we will use the law of mass action to examine the number of atoms that are
ionized in a stellar atmosphere. This is captured by the Saha equation, which was used to explain
the relatively rapid decay of Balmer-absorption-line intensity with increasing temperature. The
spectroscopy details are not relevant here1 , we merely aim to illustrate how chemical equilibrium
can be used to describe physical phenomena across a wide variety of systems.
(a) Consider the reaction e− + p+ ⇋ H + γ, where e− is a free electron, p+ a proton (ionized
hydrogen), H is a neutral hydrogen atom, and γ a photon. Argue why chemical equilibrium
for this reaction satisfies µe− + µp+ = µH .
(b) Assume that all involved particles behave like an ideal gas and write their respective number
densities as ρe− , ρp+ , and ρH . Argue using a few words under what conditions we may assume
ρe− = ρp+ .
(c) Derive the equality βµi = βfi + log ρi Λ3i from the expression βFi = βNi fi + Ni log ρi Λ3i −
Ni . Here, Fi is the free energy of species i, β = 1/(kB T ) with T the temperature and kB
Boltzmann’s constant, Ni is the number of particles of species i, µi the associated chemical
potential, Λi the thermal wavelength, and fi the ‘excess’ (non-ideal) free-energy per particle,
which includes contributions from the ground-state energy and internal degrees of freedom.
(d) Show using (c) and properties of partition functions that
\[
\frac{\rho_{e^-}\,\rho_{p^+}}{\rho_{\mathrm{H}}} = \frac{\Lambda_{\mathrm{H}}^3}{\Lambda_{e^-}^3\,\Lambda_{p^+}^3}\exp\left[\beta\left(f_{\mathrm{H}} - f_{p^+} - f_{e^-}\right)\right] \tag{5.18}
\]
\[
\approx \left(\frac{2\pi m_{e^-} k_B T}{h^2}\right)^{3/2}\exp\left[\beta\left(f_{\mathrm{H}} - f_{p^+} - f_{e^-}\right)\right], \tag{5.19}
\]
with me− the electron mass and h Planck’s constant. Explain in a few words why the approximation in the second line is justified.
(e) Now define the ionization energy χi to be that part of fp+ + fe− − fH that comes from the ionization energy only. Assume that the remainder of the degrees of freedom can be captured by an excess partition function Zi for each respective term. Show that this leads to the Saha equation
\[
\frac{\rho_{e^-}\,\rho_{p^+}}{\rho_{\mathrm{H}}} = \left(\frac{2\pi m_{e^-} k_B T}{h^2}\right)^{3/2}\frac{2 Z_{p^+}}{Z_{\mathrm{H}}}\exp\left(-\beta\chi_i\right). \tag{5.20}
\]
Explain in a few words what the physical interpretation of the factor 2 is in Eq. (5.20).
(f) It turns out that to good approximation in a star, ZH = 2 (ground state) and Zp+ = 1. For hydrogen we have that χi ≈ 13.6 eV. Estimate the temperature at which you would expect half the hydrogen atoms to be ionized and comment, using only a few words, on whether this situation is applicable to the surface of a star, which you can assume is at 5000 K (our sun).
(g) The above basic estimate is quite inaccurate. You can refine it by assuming charge conservation ρe− = ρp+ and particle conservation, defining the total density as ρ = ρH + ρp+ + ρe−. Show that this leads to
\[
\frac{\rho_{e^-}^2}{\rho - 2\rho_{e^-}} = \left(\frac{2\pi m_{e^-} k_B T}{h^2}\right)^{3/2}\exp\left(-\beta\chi_i\right). \tag{5.21}
\]
This can be solved for ρe− in general. Here, you may consider ρe− = ρ/3 with ρ ≈ 2.3 × 10^16 m^−3 (photosphere) and me− ≈ 9.1 × 10^−31 kg. Determine the temperature at which the photosphere assumes this degree of ionization. You should find about 10^4 K, which is much smaller than our estimate in (f) and readily attainable for stars hotter than our sun.
1 We refer the interested reader to Chapter 8 of “An Introduction to Modern Stellar Astrophysics” by Carroll and Ostlie.
Chapter 6
Phase Diagrams
A phase diagram depicts the limits of stability of the various stable phases in a thermodynamic system at equilibrium, with respect to variables such as temperature, pressure, density, and composition.
For example, on the phase diagram of water, we would expect to see regions of a solid (ice), liquid
(water), and gas (water vapor). The former phase is characterized by a crystalline (periodic) positional
ordering of the constituents, while the latter two have an amorphous structure. This observation can
be generalized to most single-component simple elements or compounds, which typically exhibit these
three characteristic phases of matter.
A typical phase diagram in the pressure-temperature (pT ) representation is shown in the left-hand
panel to Fig. 6.1. The phase diagram depicts regions of stability of a solid phase, a liquid phase, and
a gas phase. Additionally, two extra points are indicated on this plot: (i) The triple point, where all
three phases occur at the same state point and (ii) the critical point, where the distinction between a
liquid and gas disappears. At the moment, all possible solid phases are grouped together, although in
practice a system may also have phase transitions between various solid phases characterized by different
underlying crystal structures. For instance, ice is one of 18 (presently) known forms of crystalline water.
A question that arises when examining Fig. 6.1 is: What is the distinction between a liquid and a gas?
Both phases are characterized by an irregular positioning of the constituents with the difference between
the two being simply the density; a liquid has a higher density than a gas. It will turn out that it is
not possible to determine with certainty in which of the two phases a system is, without knowing more
about the phase diagram, as we will argue in this chapter and return to in Chapter 13. The critical
point on the phase diagram is the point where the density difference between the liquid phase and the
gas phase disappears.
We can see this more clearly in the density-temperature (ρT ) representation, see the right-hand panel to
Fig. 6.1. Note that in this representation, large parts of the phase diagram are occupied by coexistence
regions, i.e., places where two phases occur simultaneously. For example, a large gas-liquid coexistence is
seen, and there are also respective gas-solid, solid-liquid, and solid-fluid regions. A well-known example
of such coexistence is that of water and ice at 0 ◦ C. We have indicated a fourth phase in this plot: a
fluid. The fluid appears above the critical temperature, where the presently rather nebulous distinction
between a liquid and a gas disappears. The curves enclosing the phase coexistence region are called
binodals; within this region, spinodal curves are also found, which we will return to in this chapter.
Figure 6.1: Phase diagram of a typical single-component system, i.e., for an element or a simple com-
pound. The temperature-pressure (left) and density-temperature (right) representation. In the former
representation, we see that: at low temperature, the solid phase is stable; for moderate temperature
and pressure, the liquid phase is stable; and for high temperature, the gas phase is stable. The lim-
its of stability of the three phases are shown using red curves. Next to these lines the type of phase
transformation is listed, with “Sublim.” standing for sublimation. Two extra points of interest are
marked in blue: the triple point and the critical point. In the density-temperature representation of
the phase diagram (right), the phase which appears above the critical point is typically called a fluid.
Abbreviations for solid (Sol.), liquid (Liq.), and fluid (Flu.) appear in some places on this plot. The regions marked with two phases are unstable; here the system phase separates.
The remainder of this set of lecture notes will deal mostly with predicting and understanding phase
diagrams, i.e., predicting which phases are stable for a given state point. We will also describe in detail
the phase transformations between these stable phases. To give a first taste of the consequences of a
phase transition, we will conclude our excursion into the realm of quantum mechanics with a discussion
of Bose-Einstein condensation.
From the second law of thermodynamics, we have that the total entropy (S = S1 + S2 ) is a maximum
at equilibrium. At a maximum, we have dS = 0. Hence, dS/dU1 = 0, dS/dV1 = 0, and dS/dN1 = 0.
Rewriting the internal energy differential to one for the entropy,
\[
\mathrm{d}S = \frac{1}{T}\mathrm{d}U + \frac{p}{T}\mathrm{d}V - \frac{\mu}{T}\mathrm{d}N, \tag{6.1}
\]
Figure 6.2: Sketch of two coexisting phases 1 and 2 in a closed system. In this case the total number of
particles N = N2 + N1 , the volume V = V2 + V1 , and energy U = U2 + U1 are fixed.
allows us to obtain
\[
0 = \left(\frac{\partial S}{\partial U_1}\right)_{V_1,V_2,N_1,N_2,U} = \left(\frac{\partial S_1}{\partial U_1}\right)_{V_1,N_1} + \left(\frac{\partial S_2}{\partial U_1}\right)_{V_2,N_2} = \left(\frac{\partial S_1}{\partial U_1}\right)_{V_1,N_1} - \left(\frac{\partial S_2}{\partial U_2}\right)_{V_2,N_2} = \frac{1}{T_1} - \frac{1}{T_2}, \tag{6.2}
\]
where we have used the fact that dU = dU1 + dU2 = 0. From the last line of Eq. (6.2) we have T1 = T2 ,
which is commonly called thermal equilibrium.
Similarly, we can show that the condition dS/dV1 = 0 implies that p1 = p2 (mechanical equilibrium) and
that dS/dN1 = 0 implies that µ1 = µ2 (chemical equilibrium; sometimes called mass equilibrium). Note
that following a similar argument, we can show that (dU )S,V,N = 0 in equilibrium. These conditions
also hold for any number of phases which are in equilibrium, i.e., T1 = T2 = T3 = · · · , etc.
We will be studying coexistence between two phases that are in contact, therefore, we know a priori
that they must have the same temperature. Furthermore, we assume that at a number density ρ = N/V
the system divides into two phases with number densities ρ1 and ρ2 , respectively, also see Fig. 6.3. Let
us denote the fraction of the system at a number density ρ1 as x, i.e., if V is the total volume of the
system and V1 is the volume at number density ρ1 , then x = V1 /V . If V2 is the volume of the system
with a density of ρ2 , then we also have V2 /V = (1 − x). Furthermore, if N1 and N2 are the number of
particles at densities ρ1 and ρ2 respectively, then the total density of the system is given by
\[
\rho = \frac{N}{V} = \frac{N_1 + N_2}{V} = \frac{N_1}{V_1}\frac{V_1}{V} + \frac{N_2}{V_2}\frac{V_2}{V} = \rho_1 x + \rho_2(1 - x). \tag{6.3}
\]
Figure 6.3: (left) Approximate free energy per volume F/V as a function of the density ρ. The free
energy has two convex regions corresponding to two phases with different densities. (right) Cartoon of
a system with two coexisting phases that this F/V curve can result in. One phase has density ρ1 in a
volume V1 , e.g., a gas, and one has density ρ2 in a volume V2 , which could, e.g., be a coexisting liquid.
Note that we have made no comment regarding the minimum free energy at ρ thus far. From thermo-
dynamics, we know that the equilibrium phase will have the lowest free energy. Hence, at a given ρ,
Figure 6.4: Red curves indicate the approximate free energy per volume F/V as a function of density
ρ for a homogeneous system. By homogeneous system we mean that the entire system has a single
density given by ρ. Solid blue lines indicate F/V as a function of ρ assuming that the system consists
of a coexistence between phases with densities ρ1 and ρ2 (indicated by the dashed blue lines). The six
plots correspond to different choices of ρ1 and ρ2 .
Figure 6.5: Common-tangent construction (solid blue line) for the approximate free energy per volume
F/V as a function of density ρ (red curve). (left) The common tangent does not connect the two
minima in the free energy! The coexistence densities ρ1 and ρ2 are indicated using the dashed blue
lines. In equilibrium, we have coexistence for ρ1 < ρ < ρ2 . The dashed green lines indicate the
inflection points to F/V , where the free energy per volume transitions from convex to concave. (right)
The same construction, now corrected for the ‘forbidden’ concave region to the F/V . This highlights the
separation between phase 1 and phase 2, as we will label the low- and high-density phase, respectively.
we are interested in determining how to choose ρ1 and ρ2 such that the free energy (fco ) is minimized.
One way to solve this problem would be to write the free energy per volume as a function of ρ, ρ1 , and
ρ2 , and find the minimum for each value of ρ as a function of ρ1 and ρ2 . However, we can also examine
Fig. 6.4 to intuit a graphical solution to this problem.
Figure 6.5 shows that below ρ1 , the free energy is always minimized if the entire system is in phase 1.
Above ρ2, the solution also appears trivial: the free energy is minimized when the entire system is in phase 2.
Between these two points, however, the free energy per volume is minimized by the line which is tangent
to both regions. Hence, the system is in coexistence when we can draw a line tangent to the free energy
at two different densities. This construction is often referred to as a common-tangent construction.
Interestingly, from this common tangent, we also recover mechanical and chemical equilibrium. It will
be left as an exercise to show that the common-tangent construction here ensures that (i) the pressures
in both phases are equal and (ii) that the chemical potentials in the two phases are equal. The points where the common tangent touches the free energy give the coexistence densities and are also referred
to as binodal points. When such binodal points are obtained for multiple temperatures, the coexistence
region in a ρT phase diagram may be mapped out. This coexistence region is delimited by two curves,
which are appropriately called binodals.
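To illustrate how such a construction can be carried out in practice, the following minimal numerical sketch (not part of the notes) uses an assumed toy free-energy density, f(ρ) = ρ(ln ρ − 1) − 3ρ² + 1.2ρ³ in arbitrary units, and finds the binodal densities by imposing equal chemical potentials and pressures, which is equivalent to the common tangent.

```python
# Minimal numerical sketch (not from the notes): common-tangent construction for an
# assumed toy free-energy density f(rho) = rho (ln rho - 1) - 3 rho^2 + 1.2 rho^3.
# Coexistence follows from mu(rho1) = mu(rho2) and p(rho1) = p(rho2), with
# mu = df/drho and p = rho*mu - f.
import numpy as np
from scipy.optimize import fsolve

def f(rho):
    return rho * (np.log(rho) - 1.0) - 3.0 * rho**2 + 1.2 * rho**3

def mu(rho):                                     # chemical potential, df/drho
    return np.log(rho) - 6.0 * rho + 3.6 * rho**2

def p(rho):                                      # pressure, rho*mu - f
    return rho * mu(rho) - f(rho)

def conditions(x):
    rho1, rho2 = x
    return [mu(rho1) - mu(rho2), p(rho1) - p(rho2)]

rho1, rho2 = fsolve(conditions, [0.1, 0.85])     # guesses on either side of the loop
print(f"binodal densities: rho1 = {rho1:.3f}, rho2 = {rho2:.3f}")
print(f"common slope: mu(rho1) = {mu(rho1):.3f}, mu(rho2) = {mu(rho2):.3f}")
```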
Note that in our derivation we have ignored contributions from the interface between phases 1 and 2.
For the purpose of this analysis we are, in fact, free to ignore the interface, since we are only interested
in the thermodynamic limit (i.e., N → ∞). We will explore in an exercise why we are allowed to do
this. There is additional richness to the representation in Fig. 6.5, which we will turn to next.
Turning back to the general free energy A, we thus require that around a stable equilibrium
\[
(\Delta A)_x = \left(\mathrm{d}^2 A\right)_x + \left(\mathrm{d}^3 A\right)_x + \cdots > 0. \tag{6.11}
\]
For small enough displacements, the quadratic term will dominate; hence, in addition to the requirement (dA)_x = 0, for an equilibrium to be stable we must also have (d^2 A)_x ≥ 0. This is simply a higher-dimensional, differential-form analogue of the standard requirements on the minimum of a scalar function.
where F is the Helmholtz free energy. Next, we assume that we have a system such that the number of particles N, volume V, and temperature T are fixed. Now, let us split the system into two subsystems (1) and (2), in a fashion reminiscent of the one shown in Fig. 6.2. Here, we split the system in half for the number of particles, such that N1 = N2 = N/2 and dN1 = dN2 = 0. For the volumes, we apply a small perturbation, such that V1 = V/2 + dV1 and V2 = V/2 + dV2. Clearly, we must have that
\[
\left(\mathrm{d}^2 F\right)_{N,V,T} = \mathrm{d}^2 F_1 + \mathrm{d}^2 F_2 = \frac{1}{2}\left[\left(\frac{\partial^2 F_1}{\partial V_1^2}\right)_{N,T}\mathrm{d}V_1^2 + \left(\frac{\partial^2 F_2}{\partial V_2^2}\right)_{N,T}\mathrm{d}V_2^2\right] = \frac{1}{2}\,\mathrm{d}V_1^2\left[\left(\frac{\partial^2 F_1}{\partial V_1^2}\right)_{N,T} + \left(\frac{\partial^2 F_2}{\partial V_2^2}\right)_{N,T}\right], \tag{6.14}
\]
where we used that dV2 = −dV1, since the total volume is fixed.
We can use standard thermodynamic identities to rewrite the terms appearing in the above as
\[
\left(\frac{\partial^2 F}{\partial V^2}\right)_{N,T} = -\left(\frac{\partial p}{\partial V}\right)_{N,T} = \frac{1}{K_T V}, \tag{6.15}
\]
Note that our favorite approximate free energy — the one plotted in Figs. 6.3, 6.4, and 6.5 — does not
satisfy this criterion. This is why we have made careful use of the term “approximate”. The well-defined
variant of this free energy is shown in the right-hand panel to Fig. 6.5 and removes the concave part.
This region (negative compressibility) is bounded by two dashed green lines, and the associated densities
are typically referred to as spinodal points. These necessarily lie between the binodal points. Similarly
they trace out curves called spinodals in a ρT phase diagram, which delimit the region of unconditional
instability for the system to phase separate.
In the coming chapters, we will encounter many free energies that are locally non-convex, which derive from approximative methods for determining the free energy of the physical system being modeled. An additional concept arises in these approximate theories: Is there a difference between the locally convex and the concave parts of the unstable region of the free energy? It turns out that the curvature of the free energy gives rise to a change in the system’s dynamics1. Between the two spinodals, even the smallest of fluctuations will destabilize a homogeneous system and cause it to instantaneously phase separate. Because thermal fluctuations occur throughout space, spinodal decomposition of the system leads to distinct, often interconnected patterns. However, outside the spinodal region (between ρ1 and the left-most green line in Fig. 6.5, or between the right-most green line and ρ2 in the same figure), i.e., in the region between the binodal and the spinodal curve, the system can remain in a metastable equilibrium for some time. That is, a sufficiently large fluctuation is required to cause the system to reach the equilibrium state. In this part of the phase diagram and in the metastable state, small nuclei of the stable phase constantly form and disappear. Only when these grow to sufficient size, by a favorable arrangement of particles, can the stable phase nucleate. You are undoubtedly familiar with this phenomenon from the way supercooled water rapidly freezes when a needle is inserted. Here, the needle is the source of the substantial (non-thermal) fluctuation that drives the water into its thermodynamically preferred state. We will revisit this situation in Chapter 15.
1 The dynamics can be described by the Cahn-Hilliard formalism, but a full discussion thereof unfortunately goes beyond the scope of these notes.
The common-tangent construction we derive in this section is for the free energy F as a function of volume V, rather than F/V as a function of ρ. However, these two constructions are completely equivalent, as can be readily shown using basic calculus.
Let us assume that we know the Helmholtz free energy in both phases, i.e., the F at constant T, N, and V. From the definition of F, we have in phase 1 that
\[
\frac{\partial F_1(N, V, T)}{\partial V} = -p_1, \tag{6.18}
\]
and similarly in phase 2
\[
\frac{\partial F_2(N, V, T)}{\partial V} = -p_2. \tag{6.19}
\]
However, in equilibrium we also have p1 = p2, implying
\[
\frac{\partial F_1(N, V, T)}{\partial V} = \frac{\partial F_2(N, V, T)}{\partial V}. \tag{6.20}
\]
Consequently, the slope of the free energy F as a function of V must be equal in coexisting phases. This is the first part of obtaining a common-tangent construction. The slopes are equal, but we still need to show that coexisting phases are described by a single or “common” tangent line. This means that the F-axis intercepts of the two tangent lines through the respective coexistence points must be the same, as they then define the same line.
The chemical potential in phase 1 (for a single-component system) is simply the Gibbs free energy in phase 1 divided by the number of particles in the system, i.e., µ1 = G1/N. Hence
\[
\mu_1(N, V, T) = \frac{1}{N}\left(F_1(N, V, T) + p_1 V\right) = \frac{1}{N}\left(F_1(N, V, T) - V\frac{\partial F_1(N, V, T)}{\partial V}\right), \tag{6.21}
\]
and similarly
\[
\mu_2(N, V, T) = \frac{1}{N}\left(F_2(N, V, T) + p_2 V\right) = \frac{1}{N}\left(F_2(N, V, T) - V\frac{\partial F_2(N, V, T)}{\partial V}\right), \tag{6.22}
\]
Figure 6.6: Common-tangent constructions for (left) the Helmholtz free energy F as a function of volume
V and (right) the Gibbs free energy per particle g as a function of the composition x in a binary mixture.
The free energy curves are indicated in red, the common tangent in blue, and the coexistence points
using the dashed lines. In the left panel, the slope of the common tangent gives the pressure in both
phases. The chemical potential is also the same in both phases and this is related to the F -axis intercept.
Setting µ1 = µ2 and multiplying by N yields
\[
F_1(N, V, T) - V\frac{\partial F_1(N, V, T)}{\partial V} = F_2(N, V, T) - V\frac{\partial F_2(N, V, T)}{\partial V}. \tag{6.23}
\]
Note that either side of Eq. (6.23) represents exactly the intercept of the tangent line on the F-axis.
Namely, the function value minus the slope times the position with respect to V = 0. Hence, µ1 = µ2
implies that the intercepts of tangents to F1 and F2 as a function of V must be equal for phases 1 and
2 to be in coexistence, which in turn implies the existence of a common tangent.
To summarize, the two tangent lines to the free energy as a function of volume at the two coexistence
points should have both the same slope (due to equal pressures) and the same intercept (due to equal
chemical potentials). As a result, the coexistence points can be found by a common-tangent construction,
which is demonstrated in the left-hand panel to Fig. 6.6. When we know a range of isotherms similar
to the one shown in Fig. 6.6, we can determine the phase diagram associated with this system.
Let us consider the simplest situation first. Assume that we have a two-component mixture and that we
have N i particles of type i ∈ {a, b}, as we will indicate using superscripts. The entire system can be in
one of two phases, 1 and 2, which we will indicate using subscripts (to avoid confusion with exponents).
Thus, the number of particles of species a in phase 2 is denoted N2a and so on. For example, we
could have a homogeneous distribution of two types of gas molecule throughout space above the critical
temperature. By lowering the temperature, our example system phase separates, due to the interaction
rules between the molecular species. This leads to a phase that is rich in species a and poor in species
b (phase 1), coexisting with a phase that is rich in b and poor in a (phase 2).
The conditions for thermodynamic equilibrium of the two phases require that the temperature, pressure,
and chemical potentials for the two species are equal in both phases. Specifically, we have that (assuming
a homogeneous temperature T ):
\[
p(\rho_1^a, \rho_1^b, T) = p(\rho_2^a, \rho_2^b, T); \tag{6.24}
\]
\[
\mu^a(\rho_1^a, \rho_1^b, T) = \mu^a(\rho_2^a, \rho_2^b, T); \tag{6.25}
\]
\[
\mu^b(\rho_1^a, \rho_1^b, T) = \mu^b(\rho_2^a, \rho_2^b, T), \tag{6.26}
\]
where ρ^i_j is the density of species i in phase j, p is the pressure, and µ^i is the chemical potential of species i. It should be clear that we do not work with partial pressures (the effective pressure of a molecular species in a mixture, if it instead occupied the volume by itself), as the presence of both species contributes to the pressure in each phase. Think in terms of non-ideal gases, for instance, to see why such a cross coupling should be present.
separated by permeable membranes, which means that particles of species a and b can be exchanged
across the boundary. This implies that the chemical potentials of the individual species must be equal
to have coexistence between two phases. The interaction between the species is accounted for in the
dependencies of these parameters on the phase composition.
To determine phase coexistence, we turn to the Gibbs free energy. This is an extensive variable and so
it scales with the total number of particles in the system (N ) as
G(N a , N b , p, T ) = N g(x, p, T ), (6.27)
where g(x, p, T ) is the Gibbs free energy per particle, and we have introduced the compositional param-
eter x = N^a/N. Since the chemical potentials are related to the Gibbs free energy by
\[
\mu^i(x) = \left(\frac{\partial G}{\partial N^i}\right)_{N^{j\neq i},\, p,\, T}, \tag{6.28}
\]
they can be written
\[
\mu^a(x) = g(x) + (1 - x)\,g'(x); \tag{6.29}
\]
\[
\mu^b(x) = g(x) - x\,g'(x), \tag{6.30}
\]
with the prime indicating the partial derivative with respect to x. Applying the chemical-equilibrium relations expressed in Eqs. (6.25) and (6.26), we obtain
\[
g'(x_1) = g'(x_2); \tag{6.31}
\]
\[
g'(x_1) = \frac{g(x_1) - g(x_2)}{x_1 - x_2}. \tag{6.32}
\]
Summarizing, phase coexistence occurs when, for some x1 and x2, the slopes of the Gibbs free energy per particle for both phases (at constant temperature and pressure) are equal, and the tangent lines to the free energy through these two points coincide. That is, there is a common tangent to g, see the right-hand panel to Fig. 6.6. Another way to think about this common-tangent construction is that the system
as a whole always tries to minimize the total free energy. Thus, for a given composition x, the system
will choose a linear combination of phases 1 and 2 such that the total Gibbs free energy is minimized.
Graphically, such a minimum corresponds to a common-tangent construction.
Figure 6.7: Typical equation of state, i.e., pressure p as a function of density ρ, for a system with (left) a
first-order phase transition and (right) a continuous phase transition. Blue and red indicate the different
phases and green dashed lines coexistence densities (left), the transition point (right).
It should be further noted that a jump in a derivative of the free energy is not the only signature of a continuous phase transition. A divergence of a quantity, such as the heat capacity, can also signal a continuous phase transition. We will see that continuous phase transitions generally have associated with them a divergence of a length scale in the system, i.e., local correlations become long-ranged.
2 There are quite a few subtleties here that you will encounter as you progress in your study of physics. In particular,
the role that symmetries and topology play in classifying a phase transition. It is therefore, at least for now, best to stick
to the convention of continuous versus discontinuous / first-order in characterizing transitions.
Figure 6.8: Coexistence between a fluid and solid phase in hard cubes for a range of densities in the
coexistence region. Solid-like cubes are rendered large, whilst liquid-like cubes are rendered small. The
system is, however, monodisperse. Note that this system does not have attractions, yet it still phase
separates. We will return to this point in Chapter 14. Data kindly provided by F. Smallenburg.
Close to a continuous phase transition, the system appears the same on all length scales. The meaning of this will become apparent over the course of the
following chapters. This scale-invariant property of continuous phase transitions makes them amenable
to a mathematical technique referred to as renormalization group theory. This is one of the cornerstones
of modern theoretical physics and a small inroad toward this theory will be made in Chapter 10.
6.7 A Phase Transition Involving Quantum Statistics
We pick up from our analysis of the Bose gas in Chapter 4. Recall that a non-interacting boson gas obeys Bose-Einstein statistics. The chemical potential satisfies µ < 0, which implies that the
fugacity z ∈ [0, 1). Let us first consider a limitation of using the density of states as defined in Eq. (4.54).
In Chapter 4, we only examined the high-temperature limit, in which it is reasonable to expect that the
bosons are distributed over many energy levels, as we indeed found. However, due to their ability to occupy the same state, many bosons will fall into the ground state as the temperature is lowered. The DoS provided in Eq. (4.54) scales as √E. This scaling seems to imply that there is no state with energy E = 0. However, there is a crucially important state at E = 0, namely the (single-particle) ground state. This ground state is not accounted for in our density-of-states expression, due to the approximations we used in deriving it. Fortunately, because z ≪ 1 in the high-temperature limit, the results in Chapter 4 remain valid despite this omission. Here, we correct for it. Bose-Einstein
statistics inform us that the occupancy of the ground state is
n_0 = g_s (z^{−1} − 1)^{−1}. (6.33)
Taking the ground state into account in our analysis at low temperatures, the total number of
particles, as originally described in Eq. (4.55), becomes
N = g_s (V/Λ³) (2/√π) ∫_0^∞ dx x^{1/2}/(z^{−1}e^x − 1) + g_s z/(1 − z) = g_s (V/Λ³) g_{3/2}(z) + g_s z/(1 − z). (6.34)
Here, we have introduced x = βE and defined a family of functions that have the form
g_n(z) = (1/Γ(n)) ∫_0^∞ dx x^{n−1}/(z^{−1}e^x − 1). (6.35)
These are the Bose-gas equivalent of the fn that we found for the Fermi gas. Similarly, we find that
g5/2 appears in the expression for the energy (4.56).
Before we move onto the mathematical analysis of this expression, we should comment on the physical
interpretation of the result. On the right-hand side of Eq. (6.34), we have a term that accounts for
the number of particles found in the excited states, which scales with the volume. That is, the term is
extensive. The number of particles in the ground-state as captured by the second term on the right-
hand side, however, is (seemingly) intensive. Thus, we conclude that the number of particles in the
ground state typically does not significantly contribute to N . We have purposefully used the vague term
“seemingly”, as this does not hold in general. When the system is sufficiently cooled, particles are forced
into the ground state. In this case, the ground-state term becomes extensive. The transition between
intensive scaling and extensive scaling marks the formation of the Bose-Einstein condensate. We will
make this picture more mathematically rigorous next.
It should be noted that we have obtained the gn by converting a sum into an integral via the DoS. Now
we will convert the gn back to a sum — albeit with a different index — which will help us evaluate the
(energy) density:
g_n(z) = (z/Γ(n)) ∫_0^∞ dx x^{n−1} e^{−x} Σ_{m=0}^∞ z^m e^{−mx}
       = (1/Γ(n)) Σ_{m=1}^∞ (z^m/m^n) ∫_0^∞ du u^{n−1} e^{−u}
       = Σ_{m=1}^∞ z^m/m^n. (6.36)
Here, the integral in the second line is nothing more than the gamma function, hence the final result.
The function g_n is called a polylogarithm and is a monotonically increasing function of z. It coincides with
the Riemann zeta function ζ(n) for z = 1. For n = 3/2, we have that ζ(3/2) ≈ 2.612. N.B. This is likely
the first time that you will see the zeta function appear in a physics context. It will not be the last, if
you delve into the realm of theoretical physics. To the best of the author’s knowledge, there is nothing
profound about the appearance of this function here.
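As a quick numerical check of Eq. (6.36), the series can be summed directly; the truncation below is an arbitrary (assumed) choice that suffices for a few digits.

import numpy as np

def g_poly(n, z, m_max=200000):
    # truncated polylogarithm: sum_{m=1}^{m_max} z^m / m^n
    m = np.arange(1, m_max + 1)
    return np.sum(z**m / m**n)

print(g_poly(1.5, 1.0))   # ~2.61, i.e., zeta(3/2); convergence is slow at z = 1
print(g_poly(1.5, 0.5))   # smaller, consistent with g_n(z) increasing monotonically in z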
Returning to the analysis of the low-temperature limit, we note that when T ↓ 0 the fugacity z =
exp(βµ) ↑ 1. The reasoning was given in Chapter 4, where we discussed the requirements on maintain-
ing a constant density in the grand-canonical ensemble under these conditions. In brief, constant N
necessitates that µ goes to zero faster than T . We recognize that z ↑ 1 in Eq. (6.34) is problematic. That
is, the ground-state term on the right-hand side diverges, which is in contradiction with a finite and
constant value of N as per our equality. Simultaneously, the excited-state term vanishes, as Λ ∝ T −1/2
and the integral tends toward ζ(3/2) ≈ 2.612. The intuition is that — in the limit, but not necessarily
at the transition temperature — the excited states become nearly devoid of particles, while the ground
state contains nearly all N particles. Clearly, to ensure this and avoid the divergence, we must ensure that z does not reach the value 1. If we want to fully occupy the ground state (for fixed N), the scaling g_s z/(1 − z) suggests that z must tend to 1 − g_s/N, but that it may not exceed this threshold. This can be achieved by letting the chemical potential tend to zero faster than the inverse temperature β increases. It will turn out that there is a finite value of T for which z reaches this value in 3D. The answer is different for spatial dimensions other than three, as partly explored in Exercise Q49.
How do we now determine at which temperature BEC occurs? First, we should consider the physical
intuition. At the start of Chapter 4, we remarked that quantum effects would not play a role provided
Λ ≪ ρ^{−1/3}. That is, the inter-particle spacing should exceed the quantum coherence length, as specified by the de Broglie wavelength. Clearly, BEC is predicated on the quantum nature of bosons, so it is reasonable to expect that the de Broglie wavelength should grow to at least the inter-particle spacing for BEC to be observed. Substituting ≪ by = and solving for T (recall Λ ∝ T^{−1/2}) gives an estimate for the
transition temperature
T_c ≈ h² ρ^{2/3} / (2π m k_B). (6.37)
However, in the above expression there is no reference to degeneracy. We have thus identified a scaling
with density, but we have not quite reached the final result.
We can do slightly better than this coarse estimate by using Eq. (6.34). We have just used the ground-
state term on the right-hand side of Eq. (6.34) to establish the low-temperature limit of z. Examining instead the excited-state term, we consider a temperature that is low, but still sufficiently high that essentially all particles are in the excited states. This approximation is reasonable above the transition temperature, as the number of particles in the ground state remains intensive in this region.
We then approximate the integral over the Bose-Einstein statistics for the excited states using its limiting value ζ(3/2), i.e., its value at z = 1. This is permitted, because the integral does not vary strongly in this regime.
The resulting expression involving N can then be solved for T , as the temperature dependence only
remains present in Λ. Carrying out the necessary manipulations results in
T_c = (2πℏ²/(k_B m)) (ρ/(g_s ζ(3/2)))^{2/3}, (6.38)
which is more precise than our original estimate for Tc and involves the degeneracy. The accuracy of this
approximation is revealed in Fig. 6.9, which shows the numerically determined value of z as a function
of the reduced temperature T /Tc for gs /N = 0.001.
Figure 6.9: The fugacity z as a function of the reduced temperature T/T_c (red), where T_c is as in Eq. (6.38). The dashed blue line shows the dependency z ∝ (T/T_c)^{−3/2}, which sets in at high temperature. At low temperatures, z tends to a constant slightly smaller than 1, namely ≈ 1 − g_s/N. For this numerically computed z curve, we used g_s/N = 0.001.
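The curve in Fig. 6.9 is easily reproduced: at fixed N, Eq. (6.34) is solved for z at each temperature. A minimal sketch, assuming g_s = 1 and N = 1000 (illustrative values), and using ρΛ³ = ζ(3/2)(T_c/T)^{3/2} to eliminate the prefactor:

import numpy as np
from scipy.optimize import brentq

N = 1000
zeta32 = 2.612  # zeta(3/2)

def g32(z, m_max=20000):
    m = np.arange(1, m_max + 1)
    return np.sum(z**m / m**1.5)

def particle_balance(z, t):
    # excited-state term plus ground-state term minus N, all divided by N; t = T/T_c
    return t**1.5 * g32(z) / zeta32 + z / ((1.0 - z) * N) - 1.0

for t in [0.5, 0.9, 1.0, 1.5, 3.0]:
    z = brentq(particle_balance, 1e-12, 1.0 - 1e-12, args=(t,))
    print(f"T/Tc = {t:3.1f}   z = {z:.6f}")

Below T_c the solution hugs 1 − g_s/N, while well above T_c it decays as (T/T_c)^{−3/2}, as in the figure.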
To demonstrate that this temperature indeed defines a phase transition, we can consider the heat
capacity. However, first we examine the pressure using Ω = −pV . It follows that
βp = (g_s/Λ³) g_{5/2}(z) − (g_s/V) log(1 − z), (6.39)
where the second term accounts for the ground-state contribution. This function is smooth on z ∈
[0, 1 − gs /N ], indicating that there is no first-order phase transition. Turning to the energy density, we
find that
E/V = (3/2) g_s (k_B T/Λ³) g_{5/2}(z), (6.40)
where there is no ground-state contribution, because we set that energy level to E = 0. The associated
(constant-volume) heat-capacity per volume is given by
C_V/V = (15/4) g_s (k_B/Λ³) g_{5/2}(z) + (3/2) g_s (k_B T/Λ³) (∂g_{5/2}/∂z)(∂z/∂T)
      = (15/4) g_s (k_B/Λ³) g_{5/2}(z) + (3/2) g_s (k_B T/Λ³) (g_{3/2}(z)/z)(∂z/∂T). (6.41)
The subtleties come from the term ∂z/∂T; we discuss these next, in order to demonstrate that the scaling of the heat capacity on either side of T_c is different.
The derivative of z with respect to T in Eq. (6.41) is interesting. For T < Tc , z ≈ 1 − gs /N (a constant,
as also revealed by Fig. 6.9) and the derivative should approximately vanish. This implies
C_V ≈ (15/4) g_s (k_B/Λ³) ζ(5/2) ∝ T^{3/2}. (6.42)
For T > T_c the scaling is z ∝ T^{−3/2}, because we are working at constant density, as we argued in Chapter 4. Thus, we expect the derivative to contribute and, working out the scaling, it is easy to find that the heat capacity tends toward a constant. This constant should be 3N k_B/2, as follows from counting
degrees of freedom. In addition, monotonicity of the gn combined with the fact that ∂z/∂T < 0 in this
regime, indicates that CV is a decreasing function. Now, we have that CV increases with temperature
away from T = 0 and beyond Tc it decreases. Clearly, CV has a maximum somewhere around Tc .
Figure 6.10: The constant-volume heat-capacity C_V of an ideal Bose gas as a function of the temperature T; the vertical axis shows C_V in units of (3/2)N k_B and the horizontal axis T/T_c. Note that the function is continuous at the critical temperature T_c, but that it has a discontinuous derivative. The behavior near T_c follows from our derivation in the main text. Here, the full curve was solved for numerically using z = 1 below the transition.
We can do more than this. If we were to approach Tc from above, we know using Eq. (6.34) that
g3/2 (z) ≈ ρΛ3 , because we can ignore the contributions of the ground state. We can also Taylor expand
g3/2 (z) around z = 1 to gain a feeling for its behavior (scaling) around Tc . The Taylor expansion point
is sensible, as we can infer from Fig. 6.9 that in the thermodynamic limit, we have a kink in z(T /Tc )
at T = T_c. That is, we transition from z = 1 (T < T_c) to a function that decreases with T/T_c. The expansion is analytically involved — it can, however, be straightforwardly done with algebraic manipulation software such as Mathematica — but eventually leads to the form
g_{3/2}(z) ≈ ζ(3/2) − 2√π √(1 − z) + ···, (6.43)
where we have confirmed our intuition in approximating the integral at low temperature, as ζ(3/2) is
the leading-order term. Together with our previous relation, the expansion allows us to write
z ≈ 1 − (1/(4π)) (ζ(3/2) − Λ³ρ)²
  = 1 − (ζ(3/2)²/(4π)) ((T/T_c)^{3/2} − 1)²
  ≈ 1 − (9ζ(3/2)²/(16π)) ((T − T_c)/T_c)². (6.44)
Note that here we already see a feature of critical behavior: a power-law departure away from the critical point.
Substituting this expression into Eq. (6.40) and taking a derivative with respect to temperature, we
arrive at the following
C_V = (15/4) (ζ(5/2)/ζ(3/2)) N k_B ×
      { (T/T_c)^{3/2}                                          for T < T_c,
        1                                                       for T = T_c,      (6.45)
        1 − [ 9ζ(3/2)³/(20π ζ(5/2)) − 3/2 ] (T/T_c − 1)         for T > T_c,
which holds close to T = T_c to first order. This expression clearly has a kink, but not the divergence that one might expect on the basis of our intuition for phase transitions when thinking about the Ising model, see Chapter 7. The associated (full) expression for the heat capacity — in the thermodynamic limit —
is shown in Fig. 6.10. Note that the curve shown in this figure is continuous, as we would have expected
on the basis of Eq. (6.45). However, its derivative will not be, which is one of the hallmarks of a phase
transition. That is, we have a phase transition from a phase wherein bosons occupy many states to a
‘condensed’ phase, wherein nearly all particles are in the ground state: the Bose-Einstein condensate.
As the constant-volume heat-capacity can be related to a second-order derivative of a thermodynamic
potential, this transition is indeed continuous. It will turn out that Bose-Einstein condensation can be
classified as a second-order transition via a theoretical analysis that goes beyond the notes.
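A short numerical sketch of Eq. (6.45) near T_c illustrates this: the value of C_V is continuous, but the left and right slopes differ (the kink in Fig. 6.10). The zeta-function values below are rounded.

import numpy as np

zeta32, zeta52 = 2.612, 1.341          # zeta(3/2), zeta(5/2)
A = 15.0 / 4.0 * zeta52 / zeta32       # prefactor of Eq. (6.45), in units of N k_B

def cv(t):
    # t = T/T_c; the branches of Eq. (6.45)
    if t < 1.0:
        return A * t**1.5
    return A * (1.0 - (9.0 * zeta32**3 / (20.0 * np.pi * zeta52) - 1.5) * (t - 1.0))

eps = 1e-6
print(cv(1 - eps), cv(1 + eps))          # both ~1.93 N k_B: the value is continuous
print((cv(1.0) - cv(1 - eps)) / eps)     # slope from below, ~ +2.9 N k_B / T_c
print((cv(1 + eps) - cv(1.0)) / eps)     # slope from above, ~ -0.78 N k_B / T_c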
6.8 Exercises
Q38. Equilibrium Conditions
Use the second law of thermodynamics to show that in equilibrium p1 = p2 (mechanical equilib-
rium) and that µ1 = µ2 (chemical equilibrium) for a single-component system.
Q39. Heat Capacity (Again)
Show that the heat capacity CV = T (∂S/∂T )V,N is always greater than or equal to zero for a
stable system. Start by writing the heat capacity in terms of U and the natural variables of the
internal energy.
Q40. Coexistence: Surface versus Volume
Assume that we have two phases in coexistence, labeled 1 and 2. Explain why or show why we
can ignore the contributions from the interface between phases 1 and 2, when we are deriving the
equilibrium conditions for coexistence between the two phases. Provide a scaling argument. Hint:
For coexistence we are interested in the thermodynamic limit (N → ∞).
Q41. Equilibrium: One Component from David Chandler (Introduction to Modern Statistical Mechanics)
Consider a single component system with two possible phases: one labeled 1 and one labeled 2.
When the material is in phase 1, the equation of state is given by
βp = a(β) + b(β) (βµ)², (6.46)
where β = 1/(k_B T), and a(β) and b(β) are positive functions of β. In phase 2, the equation of state is given by
βp = c(β) + d(β) (βµ)², (6.47)
where c(β) and d(β) are positive functions of β and d > b and c < a. Determine the density change
(ρ2 − ρ1 ) that occurs when the material undergoes a phase transition from phase 1 to phase 2;
assume that the density ρ1 is smaller than ρ2 at coexistence. Also determine the pressure at which
the transition occurs. Hint: The Gibbs-Duhem equation might be useful. The final answer for
both questions should be an expression of a, b, c, d, and β only.
Q42. Coexistence from a Common-Tangent Construction
Show that the common-tangent construction as depicted in the right-hand panel to Fig. 6.5:
(a) ensures that the pressures in the coexisting phases are equal.
(b) ensures that the chemical potentials in the coexisting phases are equal.
This should follow the line of argument set out in Section 6.4, but for F/V as a function of ρ,
rather than F as a function of V .
Q43. Phase Diagram
A theorist develops an approximate theory to calculate the free energy of the system. The free
energy that she finds is shown in Fig. 6.11 for four different temperatures. Note that here she has
plotted F/V as a function of the density ρ = N/V . Use the four plots to sketch a phase diagram
of the system.
Q44. The Maxwell Equal-Area Rule
In this exercise, we examine the construction of binodals attributed to Maxwell, who considered
the presence of an unstable region in the van-der-Waals (vdW) equation of state (EoS). We will
Figure 6.11: The approximate Helmholtz free energy per volume F/V as a function of density ρ for four values of the (reduced) temperature T (panels: T = 0.13, 0.14, 0.15, and 0.16).
return to the vdW gas in Chapter 13; for the purpose of this exercise we only need to know the shape of the isotherms. The vdW EoS is given by βp = ρ/(1 − bρ) − βaρ², with ρ = N/V
the particle number density, b an excluded volume per particle, and a a measure for attraction
between particles.
(a) Rewrite the above EoS to obtain (v-p) isotherms, i.e., the pressure p as a function of the
volume per particle v = V /N .
(b) Sketch a (v-p) isotherm below the critical temperature T < Tc . Indicate an approximate
lower bound for v in terms of the vdW gas parameters. Label your axes!
(c) Why should one replace the “wiggly part” of the isotherms with T < Tc ? By what should it
be replaced?
(d) Maxwell argued that peq must be chosen such that within the wiggle the areas between the
constant pressure line and the isotherm are equal. This leads to the expression
∫_{v_g}^{v_l} dv (p(v) − p_eq) = 0, (6.48)
where integration takes place between the two coexistence volumes per particle, labelled g and
l for gas and liquid, respectively. Use the definition of the pressure to rewrite p(v) in terms of
the free energy and evaluate the above integral to obtain −peq = (f (vl ) − f (vg )) / (vl − vg ),
where f = F/N is the free energy per particle.
(e) Explain how this proves that the Maxwell equal-area rule is equivalent to the common tangent.
(f) There is a preference for the common-tangent construction over the equal-area rule. Explain
what is problematic with the equal-area rule, referencing the spinodal region.
Figure 6.12: A phase diagram for an ouzo-like mixture of oils, water, and alcohol.
The ternary phase diagram has the pure phases at each corner, the two-phase diagrams at each
edge, with the arrows indicating the increasing (weight) fraction of one of the phases, and the
three-phase states in the body of the triangle.
Q46. More on Phase Diagrams from David Chandler (Introduction to Modern Statistical Mechanics)
A hypothetical experimentalist measures the hypothetical equation of state for a substance near
the liquid-solid phase transition. She finds that over a limited range of temperatures and densities,
the liquid phase can be characterized by the following formula for the Helmholtz free energy per
unit volume:
F/V = (1/2) a(T) ρ² (6.49)
where ρ is the number density and a(T ) is a function of the temperature given by a(T ) = α/T
with α a constant. Similarly, in the solid phase she finds
F/V = (1/3) b(T) ρ³ (6.50)
with b(T ) = γ/T , where γ is a constant. At a given temperature, the pressure of the liquid can be
adjusted to a particular pressure ps at which the liquid freezes. Calculate the coexistence densities
associated with ps , and determine ps as a function of temperature.
Q47. Phase Transition from David Chandler (Introduction to Modern Statistical Mechanics)
Consider a hypothetical system consisting of N “partitions”. For simplicity, assume that the
system forms a closed ring. A small section of it is pictured in Fig. 6.13. Each “cell” contains
exactly two atoms. One atom is always at the top of the cell and one at the bottom. However,
each atom can either be in the right or left position. The walls between the cells we will call a
partition. The energies of the possible configurations are given by the following rules (see also Fig. 6.13):
(i) Unless exactly two atoms are associated with each partition in the system, the energy of the
configuration is +∞.
(ii) If two atoms are on the same side of a partition, then the energy contribution to the config-
uration is zero, i.e., ϵi = 0.
(iii) If two atoms are on opposite sides of a partition, then the energy contribution of that partition
to the configuration is a constant: ϵi = ϵ.
Questions:
(a) Using the above rules, what are the energy levels possible for a system of N partitions and
associated atoms?
(b) How many states are present for each level? That is, what is the degeneracy?
(c) Using the results of (a) and (b), what is the canonical partition function for the system?
(d) Compute the free energy per particle in the thermodynamic limit, and show that the energy
(U ) per particle becomes discontinuous at some temperature in the thermodynamic limit.
Determine the entropy per particle at the same temperature. Is it discontinuous?
(e) What is the transition temperature? Is the free energy discontinuous at the transition tem-
perature? Is the first derivative (with respect to temperature) of the free energy discontinuous
at the transition temperature?
(f) What type of phase transitions is this? Why?
Figure 6.13: Visualization of a piece of the closed ring composed of cells separated by partitions (configurations i: ε_i = ∞, ii: ε_i = 0, iii: ε_i = ε). Each cell contains two atoms and the system's energy is determined on a partition basis, according to the rules provided.
A/M = (1/2) k x² (6.51)
with x the length per unit mass and M the mass of the spring. The free energy A = U − T S obeys
dA = −SdT + f dL + µdM , with f the tension and L the length. After breaking, the free energy
is given by
A/M = (1/2) h (x − x_0)² + c. (6.52)
The constants k, h, x0 , and c are all independent of x but do depend on T . Furthermore, we
assume k > h, c > 0, and x0 > 0 for all T .
(a) Determine the equation of state f = f (T, x) for the spring at small and large extensions.
(b) Similarly, determine the chemical potentials µ.
(c) Show that µ = A/M − f x.
(d) Find the force that, at a given temperature, will break the spring.
(e) Determine the discontinuous change in x when the spring breaks.
(f) Is this like a first-order or a second-order phase transition?
(a) Start by writing down the particle number N in integral form with density of states g(ϵ) and
Bose-Einstein distribution f (ϵ). Draw the functions of the integrand; that is, draw f (ϵ), g(ϵ),
but also f (ϵ)g(ϵ).
(b) For non-conserved bosons, answer the following questions:
(i) What happens to the particle number N if you increase/decrease the temperature T ?
(ii) What happens to N if you increase/decrease the chemical potential µ?
(iii) Can non-conserved bosons undergo a Bose-Einstein condensation? Hint: consider con-
served bosons first.
(iv) Is the distribution of non-conserved bosons at high temperatures well described by the
Maxwell-Boltzmann distribution? Hint: consider conserved bosons first.
(c) For conserved bosons, answer the following questions:
(i) What happens to the chemical potential µ if you increase/decrease the temperature T ?
(ii) What happens when the chemical potential is at the single-particle-ground-state energy
µ = ϵ0 and you decrease the temperature? Where do particles accumulate and what does
it have to do with Bose-Einstein condensation?
(iii) What happens to eβµ in the limit of large temperature? Show that in this case, the Bose-
Einstein distribution can be well approximated by the Maxwell-Boltzmann distribution.
Chapter 7
The Ising Model
In this chapter, we will learn about the Ising model, named after Ernst Ising, which dates back to 1920. The model was proposed by Wilhelm Lenz as a simple model for ferromagnetism and solved
in one dimension by his student Ernst Ising. We will study several approaches to obtaining analytic
expressions for its behavior here. It was later solved by Lars Onsager in two dimensions, which we
will touch upon, but remains unsolved in three dimensions. In the years since the model was first
developed, it has seen wide use. Not only is it suited to describe systems of spins, but the concept of
universality allows it to be applied to an amazing variety of systems including, e.g., protein folding,
biological membranes, and social behavior. We will comment on universality further in Chapter 10.
Additionally, since the Ising model has been solved analytically in one and two dimensions, it provides
valuable insight into the properties of phase transitions for any (theoretical) physicist.
Looking more closely at the two terms which occur in Eq. (7.1), we see that for J > 0, the first term
is minimized when the spins on sites i and j are pointing in the same direction. This is the situation
encountered in a ferromagnet, see Fig. 7.1. If J < 0, the interaction is minimized when neighboring
spins point in opposite directions, as in an antiferromagnet, see Fig. 7.1. The second term appearing in
Eq. (7.1) is the coupling between the spins and the external magnetic field. When H > 0 this term is
minimized when the spins point in the direction of the field; for H < 0 the energy is minimized when
the spins point in the direction opposite to the field.
Figure 7.1: Ground states of the Ising model depend on the sign of the spin-spin coupling parameter J. Positive values of J (top) correspond to ferromagnetic behavior and negative values of J (bottom) to anti-ferromagnetic behavior.
We characterize the system’s order by examining the average magnetization. The net magnetization is
defined as
M = Σ_{i=1}^{N} S_i, (7.2)
while the average magnetization is given by
m = M/N. (7.3)
In the ground state, i.e., T = 0, all the spins are aligned and the magnetization is given by M = N .
At high temperatures, the spins are not aligned, and the average value of the magnetization is simply
M = 0. So what about finite temperature?
Let us first examine the Ising model in one dimension with N spins. Because there is no field, it is
equally favorable to have all spins pointing up or down; the only source of deviations away from the ground state comes from defects. A single defect is formed when the right-hand side of the spin chain
has an opposite magnetization to that of the left-hand side, see Fig. 7.2, with the defect being localized
at the disconnection point. Note that a single flipped spin constitutes two defects, one on either side
of the flipped spin, and is energetically even less favorable than the defect state shown in Fig. 7.2. We
thus find that in one dimension, a single defect removes all correlations between the left- and right-hand
side of the chain. That is, it would not be possible to grow a single, system spanning cluster with even
a single flipped spin.
Figure 7.2: Defects in the ferromagnetic Ising model (J > 0). (top) One-dimensional Ising model ground
state and a state with a single defect. The defect disconnects the right-hand side of the spin chain from
its left-hand side. (bottom) Two-dimensional Ising ground state (left) and a state with a single defect
(right). In contrast to the one-dimensional case, the spins neighboring the defect can remain pointing
upwards, since they each interact with three other neighbors which point up. Hence, the defect does
not appear to have a long-range effect on the ordering in the system.
This might not immediately present itself as a problem, because the two half spaces in Fig. 7.2 are essentially very large in the thermodynamic limit, i.e., when N → ∞. However, at any non-zero
temperature we expect to have a finite fraction of all spins flipped. This means that at any T > 0, the
string of spins will be subdivided into a number of different regions with spins “up” and “down”. Spins
know only about their nearest neighbors, so that the length of these domains is random. That is, there
is no energetic penalty for having a longer or shorter domain with spins pointing opposite to those of its
neighboring domains. Therefore, we cannot distinguish this defect-rich state from a state with random
magnetization. Hence, there is no phase transition from a disordered to an ordered system. While not exactly a hard proof, our intuition in this case is accurate, as we will show shortly.
We can also examine the two-dimensional Ising model in a similar fashion. If we start in the ground
state and introduce a single defect, see Fig. 7.2, the remaining spins on all sides of the flipped spin are
still connected to three up-pointing spins. Hence, the energy is minimized if they stay pointing upwards.
In other words, a cluster can survive a single defect spin and hence we expect that the system would
be able to stay ordered for a range of finite temperatures. It therefore seems likely that the system will
undergo a phase transition at a finite temperature. Again, while this is not a hard proof, it does indicate
the behavior of the 2D system will be significantly different from the 1D case. Our intuition will again
turn out to be correct and a similar argument holds for three (and higher) dimensions.
The expression in square brackets can now be solved exactly by summing over S1 = ±1:
[ Σ_{S_1} e^{βJ S_1 S_2} ] = e^{βJ S_2} + e^{−βJ S_2} = 2 cosh(βJ S_2). (7.5)
However, since S2 can only take on values of +1 and -1, and since cosh (x) = cosh (−x), we can simplify
the above expression to
[ Σ_{S_1} e^{βJ S_1 S_2} ] = 2 cosh(βJ). (7.6)
From the partition function, we can determine the Helmholtz free energy
F(N, β) = −(1/β) log Z(N, β)
        = −(1/β) log[ 2 (2 cosh(βJ))^{N−1} ]
        = −(1/β) [ log 2 + (N − 1) log(2 cosh(βJ)) ]. (7.8)
β
Finally, in the thermodynamic limit, i.e., N → ∞, we have
F(N, β) = −(N/β) log(2 cosh(βJ)). (7.9)
As discussed in Chapter 6, for there to be a phase transition, there must be a discontinuity in a derivative
(1st derivative, 2nd derivative, etc.) of F . Clearly, F is perfectly smooth, so there is no phase transition
in the 1D Ising model with H = 0. This holds for both signs of J, so that our intuition of Section 7.2 also holds for the antiferromagnetic 1D Ising model.
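The smooth behavior can also be checked by brute force; the sketch below enumerates all 2^N states of a short open chain and compares the partition function with the closed form 2(2 cosh βJ)^{N−1} that enters Eq. (7.8). The values N = 10 and βJ = 0.7 are arbitrary choices.

import itertools
import numpy as np

J, beta, N = 1.0, 0.7, 10

Z_brute = 0.0
for spins in itertools.product([-1, 1], repeat=N):
    s = np.array(spins)
    energy = -J * np.sum(s[:-1] * s[1:])   # open boundary conditions, zero field
    Z_brute += np.exp(-beta * energy)

Z_exact = 2.0 * (2.0 * np.cosh(beta * J))**(N - 1)   # closed form used in Eq. (7.8)
print(Z_brute, Z_exact)   # the two agree to machine precision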
where
I = (1/π) ∫_0^{π/2} dϕ log[ (1/2)(1 + (1 − κ² sin²ϕ)^{1/2}) ], (7.11)
with κ = 2 sinh(2βJ)/cosh²(2βJ). While we will not do so here, it is possible to show that the above partition function leads to a spontaneous magnetization for T below T_c = 2J/[k_B log(1 + √2)] ≈ 2.27 J/k_B, where k_B is Boltzmann's constant. Above T_c the system is disordered. Hence, there is a
phase transition in the 2D Ising model. As we will explore later, it turns out that this phase transition
is an example of a second-order phase transition, i.e., one characterized by a discontinuity in a second-order derivative of the free energy.
Figure 7.3: Critical phenomena in the zero-field, two-dimensional Ising model. (left) The heat capacity per particle C/N as a function of the temperature T normalized by the critical temperature T_c. (right) The magnetization per particle M/N as a function of T/T_c. The full solution by Onsager is shown in thick blue, while the asymptotic behavior near the critical point is given by dashed red lines.
The three-dimensional Ising model still has not been solved analytically. However, numerical work on the
system has found that there is also a phase transition in the 3D Ising model, which is similar to that of
the 2D system, and that the corresponding critical temperature is approximately 4.52J/kB . Near the
critical temperature, the heat capacity has the form
C/N ∝ |T − T_c|^{−α} (7.14)
with α ≈ 0.1096. For T < Tc , the magnetization is given by
M/N ∝ (T_c − T)^β, (7.15)
with β ≈ 0.32653. The numbers come from [Campostrini, et al., Phys. Rev. E 65, 066127 (2002)]. The
exponents α and β are referred to as critical exponents and are typically used to characterize the phase
1 Note, we have also previously seen a finite, but spiked heat capacity for Bose-Einstein condensates in Chapter 6, which, however, did not diverge at the transition.
Figure 7.4: Sketch of the 1D Ising model with N spins (S_1, S_2, S_3, ..., S_N) and periodic boundary conditions.
transition. This characterization is not only used in discussing the Ising model, but for critical points in general, as we will return to in Chapter 12.
H = −J Σ_{i=1}^{N} S_i S_{i+1} − (H/2) Σ_{i=1}^{N} (S_i + S_{i+1}). (7.16)
Here, we use periodic boundary conditions, namely SN +1 = S1 , see Fig. 7.4. This contrasts with our
solution approach for the Ising model without a field, where we used open boundary conditions. The
partition function for this system is given by
Z(β, H) = Σ_{S_1} Σ_{S_2} ··· Σ_{S_N} exp[ β Σ_{i=1}^{N} (J S_i S_{i+1} + (H/2)(S_i + S_{i+1})) ]
        = Σ_{S_1} Σ_{S_2} ··· Σ_{S_N} Π_{i=1}^{N} e^{β(J S_i S_{i+1} + (H/2)(S_i + S_{i+1}))}
        = Σ_{S_1} Σ_{S_2} ··· Σ_{S_N} Π_{i=1}^{N} T_{S_i, S_{i+1}}, (7.17)
We know from matrix algebra that the square of matrix A with matrix elements Ai,j can be written
(A²)_{i,j} = (AA)_{i,j} = Σ_k A_{i,k} A_{k,j}. (7.23)
Therefore we can rewrite the partition function further as
Z(β, H) = Σ_{S_1} Σ_{S_3} Σ_{S_4} ··· Σ_{S_N} ( Σ_{S_2} T_{S_1,S_2} T_{S_2,S_3} ) T_{S_3,S_4} ··· T_{S_N,S_1}
        = Σ_{S_1} Σ_{S_3} Σ_{S_4} ··· Σ_{S_N} (T²)_{S_1,S_3} T_{S_3,S_4} ··· T_{S_N,S_1}
        = Σ_{S_1} (T^N)_{S_1,S_1}
        = Tr(T^N), (7.24)
where Tr (A) is the trace of matrix A. Again, from matrix algebra, we know that the trace of a matrix
is simply the sum of its eigenvalues2 . Moreover, if the eigenvalues of a matrix A are λ+ and λ− , then
the eigenvalues of A^N are λ_+^N and λ_−^N.
2 This follows from the invariance of the trace under the (similarity) transformation that the orthogonalizing matrices that put A into its diagonal form represent.
However, from Eq. (7.27), we have that λ_+ > λ_− always. Therefore, in the limit of large N, (λ_−/λ_+)^N goes to zero and the expression for the reduced free energy per particle in the thermodynamic limit reads
βF/N = −log λ_+. (7.29)
Filling in λ+ and simplifying, we obtain
βF(β, H) = −βNJ − N log[ cosh(βH) + (e^{−4βJ} + sinh²(βH))^{1/2} ]. (7.30)
Once we have the free energy, we can calculate the other thermodynamic properties of the system. For
example, the magnetization is given by
M(β, H) = −(∂F/∂H)_T = N sinh(βH) / (e^{−4βJ} + sinh²(βH))^{1/2}. (7.31)
Note that the free energy in Eq. (7.30) indeed reduces to Eq. (7.9) in the zero-field limit. The same limit is less interesting for the magnetization, which simply vanishes at H = 0.
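As a numerical cross-check (a sketch, with illustrative parameter values), one can build the 2×2 transfer matrix, take its largest eigenvalue, and compare with Eqs. (7.30) and (7.31).

import numpy as np

beta, J, H = 1.0, 0.5, 0.2   # illustrative values

def free_energy_per_spin(h):
    # f = F/N = -(1/beta) log(lambda_+), from the largest transfer-matrix eigenvalue
    T = np.array([[np.exp(beta * (J + h)), np.exp(-beta * J)],
                  [np.exp(-beta * J),      np.exp(beta * (J - h))]])
    lam_max = np.max(np.linalg.eigvalsh(T))
    return -np.log(lam_max) / beta

# closed form of Eq. (7.30), per spin
f_exact = -J - np.log(np.cosh(beta * H)
                      + np.sqrt(np.exp(-4 * beta * J) + np.sinh(beta * H)**2)) / beta
print(free_energy_per_spin(H), f_exact)   # these agree

# magnetization per spin, Eq. (7.31), via the numerical derivative m = -df/dH
dh = 1e-6
m_numeric = -(free_energy_per_spin(H + dh) - free_energy_per_spin(H - dh)) / (2 * dh)
m_exact = np.sinh(beta * H) / np.sqrt(np.exp(-4 * beta * J) + np.sinh(beta * H)**2)
print(m_numeric, m_exact)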
7.6 Exercises
Q51. Gas on a Lattice
Consider a simplified model for a gas. Assume that the gas consists of atoms which sit at lattice
sites (the lattice has M sites and there are N ≤ M particles). Furthermore, assume that each
lattice site can have at most a single gas particle. Neighboring particles interact with energy ϵ
according to the following Hamiltonian:
H = (ϵ/2) Σ_{i=1}^{M} Σ′_j n_i n_j, (7.32)
where ni is 1 if there is a single particle on site i or 0 if there is no particle. Note that the sum
is over the number of sites M , which is a bit strange. The number of particles N is the relevant
parameter for a Hamiltonian, so naively one would expect a Hamiltonian that depends on the
position of the N particles. Hint: Does it matter if you sum over the number of (occupied) sites
instead?
(a) Write the grand canonical partition function for the lattice gas model in one dimension. Note:
You should not try and evaluate this, but just write down the expression.
(b) Using the substitution Sj = 2nj − 1, show that this is equivalent to the canonical partition
function for the Ising model with a field.
(c) You have seen that there is a mapping between the canonical Ising model and the grand-
canonical lattice gas. Discuss in a few words what having a continuous phase transition in
the 2D Ising model implies for the transition in the lattice gas. Reference your understanding
of a general gas-liquid phase diagram to see where this mapping may be most useful.
This exercise is a setup toward the concept of universality as we will discuss in Chapter 12.
Q52. Transfer Matrix for a Spin-1 Model
Consider a 1D system of spins Si which can have values −1, 0, 1. Assume that the spins interact
via the Hamiltonian
H = −K Σ_i S_i S_{i+1}, (7.33)
where K is the coupling constant and assume K > 0.
(a) What is the transfer matrix for this problem?
(b) What are the eigenvalues? Using Mathematica is probably wise.
(c) Plot the eigenvalues using Mathematica. Which is the largest eigenvalue? Note that regard-
less of K, it is always the same eigenvalue which is largest.
(d) What is the partition function in the thermodynamic limit?
(e) What is the resulting free energy?
Q53. Ising Model with Nearest-Neighbor and Next-Nearest-Neighbor Interactions
The zero-field Ising model with nearest and next-nearest neighbor interactions can be written
H = −J_nn Σ_{i=1}^{N−1} S_i S_{i+1} − J_nnn Σ_{i=1}^{N−2} S_i S_{i+2}. (7.34)
1 − 2 exp(−4βJ_nnn) / { [exp(−4βJ_nnn) + sinh²(βJ_nn)]^{1/2} [cosh(βJ_nn) + (exp(−4βJ_nnn) + sinh²(βJ_nn))^{1/2}] }. (7.37)
(d) What are ⟨S_i S_{i+1}⟩ and ⟨S_i S_{i+2}⟩ in the limit as J_nnn → 0?
(e) What are ⟨S_i S_{i+1}⟩ and ⟨S_i S_{i+2}⟩ in the limit as J_nn → 0?
H = −(J/2) Σ_i Σ′_j S_i S_j − H Σ_i S_i, (7.38)
where H > 0. Define the number of “up” spins as N+ and the number of “down” spins as N− .
Then the total number of spins in this system is given by N = N+ + N− . Furthermore, define
the number of pairs of neighboring spins with both spins up to be N++ , one up and one down to
be N+− , and both down to be N−− . Finally, assume periodic boundary conditions, that is to say,
that the last spin in the chain (in any direction) is paired with the first spin in the chain. Hence,
if the system is one dimensional, and there are three spins in the chain, then there are also three
spin pairs.
where z is the number of neighbors of each lattice site. For simplicity, assume that in 2D
we have a square lattice where each spin has 4 neighbors, and in 3D we have a cubic lattice,
where each spin has 6 neighbors. (In 1D, each spin has two neighbors). Note that a single
spin can be part of multiple pairs and that a, b, c, and d are constants and not functions of z.
(b) Show that the Hamiltonian, for a fixed number of spins N , can be rewritten in terms of N+
and N++ as
H = −J [ (1/2) z N − 2 z N_+ + 4 N_{++} ] − H (2N_+ − N). (7.41)
(c) In the random mixing approximation, we assume that the probability of finding a spin up is given simply by p = (1 + L)/2, and the probability of finding a spin down is q = (1 − L)/2. Show that
the Hamiltonian (within this approximation) can then be written
H = −(1/2) J L² z N − H L N, (7.42)
(d) Within the above approximation, the partition function of the system becomes
Z(β) = Σ_L g(L) e^{βN( (1/2) zJL² + HL )}, (7.43)
where g(L) is the multiplicity factor associated with a particular value of L and is given by
g(L) = N !/((N p)!(N q)!). Explain why this is the multiplicity.
(e) Show that the value of L that maximizes the summand in Eq. (7.43) is given by L = L*, with L* given by
(1/2) log[ (1 + L*)/(1 − L*) ] = β(zJL* + H). (7.44)
Hint: You will need to use Stirling’s approximation.
(f) Approximating the partition function by its value at L∗ , determine the free energy and the
internal energy, and show that the entropy takes the value
where c measures the interaction strength. Let L = N^{−1} Σ_i S_i, and rewrite the Hamiltonian in terms of L. Show that in the limit of N → ∞ and c → 0, with the limits taken such that Nc is kept constant, this gives the random mixing Hamiltonian from Exercise Q54 with Nc = Jz. Interpret this result.
Chapter 8
Mean-Field Theory
In this chapter, we cover mean-field theory. This is an approximate approach for describing systems
with a known Hamiltonian, but for which an exact partition function is not available or cannot be
readily derived. At its core, mean-field theory isolates a single object, for instance a single spin in the
Ising model, and treats it exactly. However, this object’s interactions with the rest of the system are
treated ‘on average’. In terms of the Ising model, this could, e.g., be the single spin coupling to the
average orientation of the spins in the rest of the system. In other words, the system described by an
exact many-body partition function is approximated by ignoring correlations between the objects, which
results in a one-body partition function that approximates the behavior in the real system. This will
become more clear when we examine our first example.
An equivalent way of describing mean-field theory is simply to rewrite the Hamiltonian ignoring corre-
lations between the “objects” we treat exactly. To see explicitly how this works, we write the spins i
as
Si = ⟨S⟩ + δSi , (8.2)
where ⟨S⟩ is the average spin of the system and δSi is the difference between the spin Si and the average.
We can then rewrite Si Sj in the following manner:
S_i S_j = (⟨S⟩ + δS_i)(⟨S⟩ + δS_j)
        = ⟨S⟩² + ⟨S⟩(δS_i + δS_j) + δS_i δS_j. (8.3)
Figure 8.1: Cartoon showing a mean-field approximation for a two-dimensional Ising spin system. (left) We focus on a single spin S_i (red outline). (right) The rest of the system is treated as an effective field (green outline). When this process is repeated for all of the particles in the system, the resulting Hamiltonian will be of the form H = H_eff Σ_i S_i.
The approximation in mean-field theory ignores correlations between spins; hence within this approxi-
mation Si Sj becomes
S_i S_j ≈ ⟨S⟩² + ⟨S⟩(δS_i + δS_j). (8.4)
Finally, rewriting δSi and δSj in terms of Si and Sj , see Eq. (8.2), we obtain
S_i S_j ≈ ⟨S⟩(S_i + S_j) − ⟨S⟩². (8.5)
Putting this back into the Hamiltonian for the Ising model we obtain
H = −(J/2) Σ_i Σ′_j S_i S_j
  ≈ −(J/2) Σ_i Σ′_j [ ⟨S⟩(S_i + S_j) − ⟨S⟩² ]
  ≈ (JNz/2) ⟨S⟩² − H̃ Σ_i S_i, (8.6)
2 i
where we have summed over the nearest neighbors j; z is the number of nearest neighbors around a
single spin and H̃ is given by
H̃ = Jz ⟨S⟩ . (8.7)
The average spin ⟨S⟩ is the magnetization per particle, which we will denote m ≡ M/N = ⟨S⟩.
Now, we want to use Eq. (8.6) to determine whether or not there is a phase transition in the Ising model,
and if it exists, determine the mean-field prediction for the transition temperature Tc . As discussed in
the introduction to the Ising model, there is a phase transition if there exists a finite temperature where
the magnetization becomes non-zero, i.e., when the system orders. Hence, we now use the Hamiltonian
given in Eq. (8.6), i.e., the mean-field approximation to the original Hamiltonian, to calculate the
magnetization per particle (canonical ensemble):
m = ⟨S_i⟩ = [ Σ_{S_1} Σ_{S_2} ··· Σ_{S_N} S_i exp(−βH) ] / [ Σ_{S_1} Σ_{S_2} ··· Σ_{S_N} exp(−βH) ]
   = tanh(βJzm). (8.8)
Equation (8.8) is called a self-consistent equation, since the magnetization (on the left-hand side) is a
function of itself. While we cannot solve this expression analytically, we can solve it graphically.
Figure 8.2: Graphical solution to the self-consistent mean-field approximation to the Ising model as given by Eq. (8.8). The solid black line corresponds to y = tanh(x), while the dashed lines correspond to y = x/(βJz) ≡ αx for various choices of βJz (α = 0.3, 1, and 2 are shown). Note that for α > 1, there is only the trivial solution m = 0, while when α < 1 there are three possible solutions: m = 0, m = m_f, and m = −m_f. The two solutions at m = ±m_f are equivalent and simply represent states with the spins predominantly pointing either "up" or "down".
To solve Eq. (8.8) graphically, we first rewrite the expression. Letting x = βJzm we have
x/(βJz) = tanh(x). (8.9)
In Fig. 8.2 we plot y = tanh (x) as well as y = x/(βJz), for various choices of βJz; the crossings of
the two curves indicate solutions to Eq. (8.8). We see that for small βJz (the fraction is larger than
1), there is only a single solution corresponding to m = 0. However, as we lower the temperature, i.e.,
increase β and thereby decrease the fraction, eventually there arise three possible solutions: m = 0,
m = mf and m = −mf . Here, the value of mf is determined by the graphical solution (or numerically).
The solutions m = mf and m = −mf are equivalent solutions and are simply a result of the “up/down”
symmetry of the zero-field Ising model. This graphical solution, however, still leaves us with a dilemma:
which of the two solutions (m = 0 or |m| = |mf |) should we use when the coupling is large (βJz > 1)?
If the answer is m = 0 then we would conclude that there is no disorder to order transition in the Ising
model. However, if there exists a temperature for which we go from m = 0 to m = mf then we could
conclude that mean-field theory predicts a phase transition in the Ising model. In order to answer this
question, we have to determine which solution corresponds to the minimum in the free energy.
The free energy per spin is obtained using the much reduced expression for the mean-field partition sum
f = −(1/β) log( Σ_{S_i=±1} e^{βJzmS_i − (1/2)βJzm²} )
  = −(1/β) log( e^{βJzm − (1/2)βJzm²} + e^{−βJzm − (1/2)βJzm²} ). (8.10)
The free-energy difference between m = 0 and m = mf is given by
Δf = ∫_0^{m_f} dm (∂f/∂m)
   = (1/β) ∫_0^{m_f} dm ( βJzm − βJz tanh(βJzm) ). (8.11)
Using our previous change of variable x = βJzm, we obtain
Δf = (1/β) ∫_0^{βJzm_f} dx [ x/(βJz) − tanh(x) ]
   = (1/β) (A_linear − A_tanh), (8.12)
where Alinear is the area under the dashed line in Fig. 8.2 from the m = 0 point to the m = mf point
while Atanh is the area under the tanh-function for the same range in m. For all cases where there is more
than a single solution to Eq. (8.8), the area under the linear line is less than under the tanh-function.
Consequently, Alinear < Atanh and ∆f < 0 for this range in β, and the solution corresponding to m = mf
has a lower free energy. We conclude that mean-field theory predicts the system to be ordered for all
temperatures where the graphical solution in Fig. 8.2 finds more than one solution. The transition
temperature Tc is the temperature associated with the transition between a single solution and three
solutions to to Eq. (8.8). This transition occurs when βJz = 1, so that the transition temperature
predicted by mean-field theory is kB Tc = Jz. For the 2D square-lattice Ising model, we have that
z = 4, which means that the mean-field approximation overestimates the critical temperature, which
according to Onsager is given by Tc ≈ 2.27J/kB . This overestimate makes sense, as correlations between
spins, which have been ignored in mean-field theory, should become relevant when the system becomes
ordered. In the coming chapters, we will see just how important they are.
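The self-consistent equation (8.8) is also easily solved numerically. The sketch below uses fixed-point iteration with J = 1 and z = 4 (square lattice) as illustrative choices, and shows that a non-zero magnetization only appears for k_B T < Jz.

import numpy as np

J, z = 1.0, 4

def magnetization(T, n_iter=2000):
    beta = 1.0 / T          # units with k_B = 1
    m = 0.9                 # start from an ordered guess
    for _ in range(n_iter):
        m = np.tanh(beta * J * z * m)
    return m

for T in [2.0, 3.0, 3.8, 4.2, 5.0]:
    print(f"T = {T:3.1f}   m = {magnetization(T):.4f}")
# non-zero m only below the mean-field critical temperature k_B T_c = Jz = 4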
One should note that we have made no reference to the dimension of the system in the entire mean-field
treatment of the Ising model. Hence, our prediction of a phase transition is independent of dimension.
However, as we saw in Chapter 7, an exact treatment of the Ising model finds no phase transition in
the one-dimensional Ising model, hence the mean-field treatment fails to capture the 1D behavior. It
will be left as an exercise to explain why this is the case.
Figure 8.3: Cell theory for the face-centered cubic (FCC) crystal formed by hard spheres, adapted from an original by the University of Washington. (left) Representation of the FCC crystal showing several cut planes and 3D views to indicate the stacking of the spheres and the cubic nature of
the unit cell. (right) Two-dimensional hexagonal arrangement of spheres in a single layer from a FCC
crystal. In our mean-field approximation we calculate the partition function for the light colored disk
assuming that the other particles all remain at their lattice sites. Note that σ is the diameter of a sphere
and a is the spacing between the centers of mass of nearest particles.
where r is the 3D distance between the particles and σ is the particle diameter.
The crystal phase which is stable for a hard-sphere system can be shown to be a face-centered cubic (FCC) crystal, as depicted in the left-hand panel to Fig. 8.3. In this case, the object we treat exactly is a single
hard sphere. We assume that the other spheres are located at their “ideal” lattice positions. Hence, we
can write the partition function as a product of volumes, that is
Z_N = Π_{i=1}^{N} (V_i/Λ³), (8.14)
where Vi is the free volume in which the center of mass particle i can move assuming all the other
particles are frozen to their lattice sites. This is sketched in the right-hand panel to Fig. 8.3. The sphere
of interest is thus confined to a single “cell” of the idealized crystal.
For a face centered cubic lattice, all nearest neighbors are a distance a from the target particle. We
make the additional assumption that the center of mass can only move within a sphere of radius a − σ.
In reality, the volume in which the center of mass can move is not exactly spherical, but as you can
judge from Fig. 8.3, this is a fairly decent approximation. The free volume per particle is then given by
V_i = (4π/3)(a − σ)³. (8.15)
Using the above expression and the approximate partition function, we obtain the mean-field free energy
βF = −log(Z_N)
   = −N log[ (4π/3) ((a − σ)/Λ)³ ]. (8.16)
We will now reduce this free-energy expression further by using scaling arguments. When all the spheres are touching, i.e., when the system is in its close-packed arrangement, the (dimensionless) number density is σ³ρ_cp = σ³N/V = √2, where N is the number of particles in the volume V. The lattice spacing scales
with the close-packed density as
a3 ρ = σ 3 ρcp , (8.17)
so that we can rewrite the free energy as
βF = −N log[ (4π/3) ( (ρ_cp/ρ)^{1/3} − 1 )³ (σ³/Λ³) ]. (8.18)
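A sketch evaluating Eq. (8.18) numerically; setting σ = Λ = 1 is an arbitrary choice of units that only shifts the free energy per particle by a constant.

import numpy as np

rho_cp = np.sqrt(2.0)   # close-packed (dimensionless) density of the FCC crystal

def beta_f_cell(rho):
    # beta F / N from Eq. (8.18), with sigma = Lambda = 1
    return -np.log(4.0 * np.pi / 3.0 * ((rho_cp / rho)**(1.0 / 3.0) - 1.0)**3)

for rho in [0.9, 1.0, 1.1, 1.2, 1.3]:
    print(f"rho sigma^3 = {rho:.2f}   beta F/N = {beta_f_cell(rho):.3f}")
# the free energy rises steeply as the density approaches rho_cp ~ 1.414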
8.3 Exercises
Q56. Binary Lattice Gas
Consider a binary gas of particles (labeled A and B) at temperature T where the particles are
restricted to moving on a simple cubic lattice. Note that in a simple cubic lattice, each lattice
site has 6 neighbors. Assume that the lattice has M lattice sites. Assume that the density of
particles of type A is ρA and that the density of particles of type B is ρB . Also assume that
the particles do not interact with any of the other particles of the same species at all, but if two
particles of different species are on the same site or on adjacent sites (horizontally or vertically),
they repel, leading to an interaction energy ϵ. The total number of particles is N = NA + NB ,
with Ni the number of particles of type i. In this exercise, we will explore the phase transitions
for this system using a mean-field approximation, and as such ignore all correlations between the
particle positions.
(a) Given a homogeneous density ρB of particles of species B, what is the average energy of a
single particle of species A? And what is the average energy ⟨E⟩ of the entire system?
(b) Within such a simple mean-field approximation, what is the canonical partition function of
the full system?
(c) Define the composition x of the system as the fraction of particles that belongs to species A:
x = NA /(NA + NB ). Show that the Helmholtz free energy can be written as
βF(N, x, V, T)/N = x log x + (1 − x) log(1 − x) + Kx(1 − x) + log V_0 ρ − 1, (8.20)
where ρ = (NA + NB )/V is the total density and V0 is a unit of lattice volume, and K is a
function of β, ϵ and ρ.
Now consider a large system containing this binary gas at composition x = 1/2, so NA = NB =
N/2. We divide this system into two parts, marked 1 and 2. Both subsystems have the same
volume V1 = V2 = V /2 and contain the same number of particles N1 = N2 = N/2, so they have
the same total density ρ. However, the two subsystems can have different compositions x1 and x2 ,
and therefore also different numbers of particles of each species (NA,1 and NA,2 can be different
from NB,1 and NB,2 ). The total free energy can then be written as a function of either x1 or x2 .
In terms of x1 , the total free energy is given by
f(x_1) = βF/N = x_1 log x_1 + (1 − x_1) log(1 − x_1) + Kx_1(1 − x_1) + log V_0 ρ − 1. (8.21)
(d) Show that x1 = 1/2 is always an extremum of this free energy. Explain what this means.
Expanding the free energy around x1 = 1/2 up to fourth order gives the following:
f(x_1) ≃ C + (2 − K)(x_1 − 1/2)² + (4/3)(x_1 − 1/2)⁴. (8.22)
Here, C is independent of x1 . All higher order terms in the expansion are positive even powers of
(x_1 − 1/2). We can see this as a Landau expansion of the order parameter m = x_1 − 1/2; we will discuss these in detail in Chapter 9.
(e) Sketch the Landau free energy for this system for (i) T > T_c, (ii) T = T_c, and (iii) T < T_c,
where Tc is the critical temperature.
(f) Sketch the order parameter m as a function of temperature T, associated with the global minimum of the Landau free energy.
(g) What type of phase transition occurs in this system? Explain.
(h) What is the value of K associated with the phase transition?
(i) What does the system look like for T > Tc ? What does the system look like for T < Tc ?
That is, describe the phases associated with the phase transition.
(j) In this exercise we used an approximate method to determine the phase behavior of this
system. Describe one way in which we could improve this approximation. There are many
possible correct answers to this question.
(a) Explicitly calculate the partition function Zc for the cluster in one and two dimensions (q =
2, 4). Show that both cases can be expressed as
⟨s_1⟩ = ⟨s_0⟩ + (2^q e^{−j}/Z_c) ( cosh^{q−1}(−j + b) e^b − cosh^{q−1}(j + b) e^{−b} ). (8.27)
Hint: sinh(x) = cosh(x) − exp(−x) = − cosh(x) + exp(x).
(d) Since the system is translation invariant, we know that ⟨s0 ⟩ = ⟨s1 ⟩. The only freedom we
have to satisfy this equation is in the value of Beff . Show that
(e) Note that Beff = 0 is always a solution of the above equation. What are the limits of the
left-hand and right-hand sides of this equation for Beff → ∞? Argue that if the slope of the
left-hand side (as a function of Beff ) is greater than 2β at Beff = 0, there will be another
solution at finite Beff .
(f) Set the slope of the left-hand side to 2β with Beff = 0, and calculate kB Tc /J for one and two
dimensions in this case. Explain why this can be used to determine the critical temperature
Tc . Are the results better than for the mean-field approximation including only a single spin?
Figure 8.4: Simple-cubic and face-centered-cubic lattices (left) and their respective unit cells (right).
(a) What is the highest packing fraction possible for hard spheres organized on an FCC lattice?
a simple cubic lattice? Note that the packing fraction is the amount of space occupied by
the spheres. Assume that the spheres have diameter σ.
(b) Use cell theory to determine the free energy of both an FCC and simple cubic lattice of hard
spheres as a function of the number density ρσ 3 = N/V .
(c) Show that within cell theory the hard sphere FCC crystal has a lower free energy than
the hard sphere simple cubic crystal. Note: You may want to plot the free energies using
Mathematica.
(a) Show that the highest area packing fraction (η_cp) possible for hard disks on a hexagonal lattice is given by π/(2√3), and that when the disks are organized on a simple square lattice it is π/4.
(b) Explain how to use a mean-field approximation to compute A_i as a function of η; you may want to make a sketch. You should arrive at A_i ≈ π(a − σ)², with a the lattice spacing. Does it matter whether you do this for a square or hexagonal crystal?
(c) Show that within cell theory the hard-disk hexagonal crystal has a lower free energy than
that of the simple-square crystal. Why is that expected?
Assume that the Helmholtz free energy of the perfect hexagonal crystal is given by Fperf (N =
M, A, T ), where M is the number of lattice sites and T is the temperature. Note that in a perfect
crystal, the number of lattice sites M , is the same as the number of particles N . Additionally,
assume that the Helmholtz free energy associated with changing a specific particle into a vacancy,
i.e., removing that specific particle, is given by fvac (ρM , T ), where ρM = M/A. Define Nvac as
the number of vacancies, such that M = N + Nvac and Nvac ≪ M, N . Finally, we assume that
the vacancies do not interact with each other and that they are randomly distributed throughout
the crystal.
(d) Argue that Z_vac = [M!/(N_vac!(M − N_vac)!)] exp(−βN_vac f_vac) and provide the Helmholtz free energy of the
system with vacancies.
Lastly, assume that the equation of state is not affected by the presence of vacancies. In other
words, the pressure P (M, N, A, T ) is not dependent on Nvac . The Gibbs free energy is then given
by
βG(M, N, P, T) = βNµ_perf(P, T) + β(M − N)µ_vac(P, T) + N log(N/M) + (M − N) log[(M − N)/M],
where µ_perf(P, T) is the chemical potential of the perfect crystal and µ_vac = µ_perf(P, T) + f̃_vac(P, T). Here, f̃_vac(P, T) is equal to f_vac(ρ_M, T), evaluated at the ρ_M corresponding to P in a perfect crystal.
(e) Demonstrate that βG(M, N, P, T ) is as above and indicate where the assumptions are used.
(f) When µvac is approximately 9kB T (close to the melting point of the crystal), show that the
equilibrium vacancy concentration is approximately ⟨(M − N)/M⟩ = 1 · 10^{−4}.
y-axis, the spins in this model can be described by an angle θ ∈ [−π, π] with respect to the x-axis.
This angle is defined by cos θi = si · x̂ with x̂ the unit vector pointing along the x-axis and ‘·’ the
inner product. The Hamiltonian for this system can be written as
H = −(J/2) Σ_i Σ'_j s_i · s_j − h · Σ_i s_i = −(J/2) Σ_i Σ'_j cos(θ_i − θ_j) − h Σ_i cos θ_i,   (8.29)
where Σ'_j indicates a sum over nearest neighbors, and h ≡ h x̂ is an external field pointing along
the x-axis. For the first few subproblems, we will assume h = 0 and that the sites are located on
a one-dimensional (1D) chain.
(a) Assume a chain of N sites, which is not periodic. Sketch a few microstates.
(b) Demonstrate that the expression for the canonical partition sum Z(N, T ) for this open-ended
1D chain — N is the total number of sites and T the temperature — can be written as
Z(N, T) = 2π [∫_{−π}^{π} dθ exp(βJ cos θ)]^{N−1} ≡ (2π)^N I_0(βJ)^{N−1},   (8.30)
where I_0 is the modified Bessel function of the first kind.
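As a quick sanity check on Eq. (8.30), the short sketch below numerically confirms that the angular integral indeed equals 2π I_0(βJ); the value βJ = 1.3 is an arbitrary test choice.

# Quick numerical check (a sketch, not part of the exercise) that the angular
# integral in Eq. (8.30) equals 2*pi*I_0(beta*J), with I_0 the modified Bessel
# function of the first kind.
import numpy as np
from scipy.integrate import quad
from scipy.special import iv

betaJ = 1.3                                   # arbitrary test value
integral, _ = quad(lambda th: np.exp(betaJ * np.cos(th)), -np.pi, np.pi)
print(integral, 2.0 * np.pi * iv(0, betaJ))   # the two numbers should agree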
We now consider the same rotating-spin model on a two-dimensional (2D) square lattice of N spins.
Assume h ̸= 0 from here on. Define zi = exp(ιθi ) with i the index and ι the imaginary unit. Let
zi = w + δzi with w ≡ ⟨zi ⟩ the statistical average.
(e) Write down the complex conjugate relations for zi using the standard ∗ notation. Show that
cos(θi − θj ) = Re(zi∗ zj ), with Re indicating the real part.
(f) Use the result from (e) to expand the Hamiltonian in Eq. (8.29) up to O(δz 2 ), i.e., neglecting
terms such as δzi δzj . Show that the mean-field Hamiltonian is given by:
H_MF = 2JN|w|² − 2J Σ_i Re(w^* z_i + w z_i^*) − (h/2) Σ_i (z_i^* + z_i).   (8.31)
(g) Explain in a few words why the above free energy should be minimized when w points in the
same direction as the external field, i.e., w ∈ R and, assuming h > 0, w > 0. Show that this
leads to
H_min^{MF} = 2JNw² − (h + 4Jw) Σ_i cos θ_i,   (8.32)
Minimizing Eq. (8.33) leads to w = I1 (β(h + 4Jw)) /I0 (β(h + 4Jw)), with I1 the derivative with
respect to the argument of I0 .
(h) What kind of equation is this for w and what role does w serve in describing the phase
transition? Explain in a few words.
(i) Based on the above analysis, do you think the 2D rotating-spin model has a phase transition?
Explain why using only a few words.
(a) Show that the above exponential inequality holds. Hint: use the result e^x ≥ 1 + x for all x ∈ ℝ and start from e^{x−⟨x⟩}.
(b) Use the definition of the partition function in terms of the Helmholtz free energy F to show that F ≤ Tr ρH + k_B T Tr ρ log ρ. The first term on the right-hand side is simply the average of the Hamiltonian with respect to the distribution ρ and can also be written as ⟨H⟩_ρ.
(c) Give an interpretation to the second term on the right-hand side. The resulting expression is also known as the Bogoliubov inequality.
(d) What happens when we assume that ρ = exp (−βH) /Z?
The inequality gives a handle on how to approximate the free energy of a system of interest. A functional form with some free (unspecified) parameters is chosen to approximate ρ. The right-hand side of the inequality is subsequently minimized with respect to the free parameters to obtain the optimal approximation.
Mean-field theory is obtained by assuming a distribution ρ that can be factorized into a product of independent single-particle distributions. That is, if ρ_i is the probability distribution belonging to the degrees of freedom of particle i, then the mean-field distribution reads
ρ = Π_i ρ_i.   (8.34)
One route toward choosing an appropriate ρi requires the use of an order parameter, see Chapter 9
for more information. The other is to choose another non-interacting Hamiltonian H0 and minimize
with respect to it.
We will take the latter route for the Ising model
H = −(J/2) Σ_{i=1}^{N} Σ'_j S_i S_j − H Σ_{i=1}^{N} S_i.   (8.35)
Let H_0 = −h Σ_i S_i with h our free parameter. We have that the partition function for this Hamiltonian is Z_0 = 2^N cosh^N(βh).
(e) Show that this implies β⟨H⟩_0 = −N(zβJ⟨S_i⟩_0²/2 + βH⟨S_i⟩_0), with z the number of nearest neighbors as before and the subscript 0 denoting the average with respect to the Boltzmann-based probability density coming from H_0.
(f) Recall the expression for ⟨Si ⟩0 and compute kB T Trρ log ρ.
(g) Minimize the resulting right-hand side of the inequality with respect to h to obtain a familiar
expression.
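For those who want to see the variational route in action, here is a hedged numerical sketch (not the worked solution of this exercise): it minimizes the Bogoliubov bound F_0 + ⟨H − H_0⟩_0 per spin over the single parameter h and compares the resulting magnetization with the self-consistent mean-field value m = tanh(β(zJm + H)). All parameter values are arbitrary test choices.

# A hedged numerical sketch of the variational (Bogoliubov) route to mean-field
# theory for the Ising model: the trial Hamiltonian is H_0 = -h*sum_i S_i, and the
# upper bound F <= F_0 + <H - H_0>_0 is minimized over the single parameter h.
# The optimum should reproduce the self-consistency m = tanh(beta*(z*J*m + H)).
import numpy as np
from scipy.optimize import minimize_scalar, brentq

z, J, H, beta = 4, 1.0, 0.0, 0.6   # square lattice, coupling, field, inverse temperature

def beta_f_var(h):
    """Variational bound per spin, beta*(F_0 + <H - H_0>_0)/N."""
    m0 = np.tanh(beta * h)
    return (-np.log(2.0 * np.cosh(beta * h))
            - beta * 0.5 * z * J * m0**2 - beta * H * m0 + beta * h * m0)

res = minimize_scalar(beta_f_var, bounds=(1e-6, 10.0), method="bounded")
m_var = np.tanh(beta * res.x)

# For comparison: the self-consistent mean-field magnetization.
m_sc = brentq(lambda m: m - np.tanh(beta * (z * J * m + H)), 1e-9, 1.0)
print(m_var, m_sc)   # should agree (non-zero, since beta*z*J > 1 for these values)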
Chapter 9
Landau Theory
Thus far, we have discussed two different types of phase transitions, namely first-order phase transitions
and continuous phase transitions. A first-order phase transition is characterized by a discontinuity in
the first derivative of the free energy, with an nth-order (n > 1 implies continuous) phase transition
having a discontinuity in the nth derivative. A first-order phase transition is special in the sense that one of the discontinuities appears in the equation of state. This implies that such a phase transition can exhibit phase coexistence, see Chapter 6. Hence, one of the most fundamental problems associated with phase transitions is identifying whether a given transition is first order or continuous. Specifically, it would be extremely
valuable if we were able to write down a simple theory that allows us to determine the order of the
transition. Landau addressed this question with a mean-field theory in 1937, the result of which today
is commonly referred to as Landau theory, which is the subject of this chapter.
9.1 Introduction
In Landau theory, we assume that a phase transition can be characterized by an order parameter that
measures the degree of ordering in the system. For instance, if we were looking at the ferromagnetic
transition of the Ising model, we would use the magnetization; for a system which displays a phase
transition between a fluid and a phase separated liquid-gas mixture, the order parameter could be the
density difference between the liquid and the gas: ρl −ρg . Each transition will have its own order param-
eter and there might be multiple order parameters that equivalently describe a single transition, e.g.,
there are various ways to measure crystalline structure for a solid-to-solid transition. Frequently, the
order parameter is chosen such that it is zero in the disordered phase and non-zero in the ordered phase.
There are, however, cases where it is simpler to relax this condition.
Critical to the understanding of order parameters is that these are quantities that emerge by setting the
state variables. However, be very careful, they are not the same as the state variables! This is readily
seen when considering the magnetization in the Ising model, which results from picking a temperature
(and coupling parameter). The difference becomes somewhat muddied when we consider, for instance,
a density difference ∆ρ between a gas and a liquid, as one would set N , V , and T in a system that
phase separates, i.e., density and temperature. The distinction is that the density gap emerges δρ by
setting ρ and T ; this is not the same as the ρ you imposed. In addition, the gas-liquid transition is a bit
unusual, as the density gap can only be defined a posteriori, knowing that the system phase separates
by a first-order transition.
One of the main assumptions in Landau theory is that we can expand the relevant free energy of the disordered phase in terms of the order parameter. We emphasize here that this is an assumption and it
is most definitely not obvious that we can do this. In fact, near a critical point, such as we encountered
for the Ising model in Chapter 7, the free energy cannot be written in this form. In Chapter 10 we
will study a method, which can be used to study the behavior at and close to critical points. However,
despite the evident shortcomings of such an expansion, the Landau expansion of the free energy turns
out to be extremely useful in understanding phase transitions. Specifically, we write for the expansion
f(T, m) = am + bm² + cm³ + dm⁴ + · · · ,   (9.1)
where a, b, c, d, · · · are the coefficients associated with the expansion and m is the order parameter.
For a given system, the coefficients are chosen such that they respect the symmetry of the disordered
phase. We will return to the concept of symmetry in the last sections of this chapter, but it is this which
lies at the heart of the usefulness of Landau theory.
Before continuing, let us examine Eq. (9.1) more closely. In this equation, we have not made any
reference to the volume, strength of a magnetic field, etc. For simplicity, they have been intentionally
left out of this expression, since the relevant variables depend on the system in question. However,
we assume that all these variables are fixed such that we are looking at the free energy for a given
single state point, e.g., for a given temperature. The only variable — in the equation sense; not in
the thermodynamic sense — left is the order parameter m. Hence, we are comparing the free energy
of different possible phases at the same state point. For example, in the Ising model this means that
we are comparing states with different levels of magnetization (disordered and ordered) for a given
temperature, field, and coupling. Going back to what we know from thermodynamics, the stable phase
will be the phase with the lowest free energy, which can be obtained by studying the behavior of f in
Eq. (9.1) as a function of m.
From our definition of the order parameter, the disordered phase always occurs at m = 0. Thus, it
follows that there must be a minimum in the free energy at m = 0 for T > Tc , i.e., ∂f /∂m|m=0 = 0.
Hence, the linear term in the expansion must be zero: a = 0. Furthermore, since m = 0 corresponds to
a stable phase for T > Tc, we must have f(T, m = 0) be a local minimum, i.e., ∂²f/∂m²|_{m=0} > 0, hence
b > 0 for T > Tc .
Figure 9.1: Examples of Landau free energies with a single minimum. (left) The minimum is located
at m = 0, hence at the state point corresponding to this Landau free energy, we would predict the
disordered phase to be stable. (right) The minimum is located at m = m0 ̸= 0, hence in this state point,
the ordered phase is stable.
Now, let us consider some of the possibilities which may arise when we look at example expressions for
the Landau free energy. In the left-hand panel to Fig. 9.1 we see a Landau free energy, for which there
is a single minimum located at m = 0. Hence, the stable phase in this case will be the disordered phase.
In the right-hand panel to Fig. 9.1, we have the opposite scenario, there is a single minimum, but in
this case it is at m = m0 . This constitutes an example of a stable ordered phase.
Figure 9.2: Examples of Landau free energies with two minima. (left) A state point for which the
disordered phase m = 0 is stable and the ordered phase m = m0 is metastable. (right) A state point
with the opposite stability conditions.
However, the situation is not always this simple. Sometimes the Landau free energy has more than one
minimum, such as in Fig. 9.2. In both cases, the stable phases are still easy to identify. In the left-hand
panel, the disordered phase corresponds to the global minimum in the free energy, so that the disordered
phase is stable. Equivalently, in the right-hand panel, the global minimum occurs at m = m0 , which
means that the ordered phase characterized by m = m0 is stable. The “local” minima occurring in
these plots are generally referred to as extra minima or metastable states. Note also that because f
in Eq. (9.1) is dependent on an order parameter, not on state variables, you cannot and must never
perform a common-tangent construction on the Landau free energy!
To quickly summarize, if we were able to write a Landau expansion of the free energy, we would be able
to determine which state was stable at each state point. This still does not answer the question we set
out to solve: Can we say something about the type of the phase transition (first order or continuous)
by the form of the Landau free energy? In the following two sections we are going to examine two “case
studies” in the hope of answering this question. In the last sections of this chapter, we see how Landau
theory can be applied by writing the Landau free energy of two models, namely the Ising model and a
new system which we have not yet encountered: liquid crystals.
9.2 Case 1: A Continuous Phase Transition
As a first case, consider a Landau free energy of the form
f(T, m) = b(T)m² + d(T)m⁴,   (9.2)
where we have made the temperature dependence explicit and we assume d(T) > 0 for all T. In principle,
all of the coefficients of the Landau free energy could be temperature dependent, however, we will find
Figure 9.3: A zoom-in on the Landau free energy near m = 0. (left) For b(T ) > 0 corresponding
to T > Tc , the shape implies that the disordered phase is stable or metastable. (right) The opposite
situation with b(T ) < 0 corresponding to T < Tc ; the disordered phase is neither stable nor metastable.
it useful for the purpose of studying the type of phase transition to only examine the temperature
dependence of b(T ) and assume that the others are temperature independent, i.e., d(T ) = d.
The assumption that d > 0 ensures that the free energy goes to infinity as m goes to infinity in both the positive and negative directions, i.e., the global minimum is found for a finite m. The minima of Eq. (9.2) are given by m = 0 for b(T) > 0 and m = ±√(−b(T)/2d) for b(T) < 0. If we zoom in on the minimum around m = 0, we see that changing the sign of b(T) from positive to negative takes it from being a stable or metastable state, i.e., at least a local minimum in the free energy, to being completely unstable. If we identify the temperature associated with changing the sign of b(T) to be Tc, then we can approximate b(T) by b(T) ≈ b′(T − Tc), where b′ is a positive constant. Thus for T > Tc, b(T) > 0 and the disordered phase is (meta)stable, while for T < Tc the disordered phase is not stable, see Fig. 9.3. With this rewrite, the minima are given by m = ±√(−b′(T − Tc)/2d). The free energies for T > Tc,
T = Tc and T < Tc are shown in Fig. 9.4. The minimum as a function of T − Tc continuously goes away
from m = 0. As a result, the global minimum in the free energy goes continuously away from m = 0, a
clear signature of a continuous phase transition.
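The following short Python sketch illustrates this behavior by plotting Eq. (9.2) with b(T) = b′(T − Tc) for a temperature above, at, and below Tc, and marking the predicted minima m = ±√(b′(Tc − T)/2d); the coefficient values are arbitrary.

# A small illustration (sketch) of the Case 1 Landau free energy
# f(m) = b'*(T - Tc)*m^2 + d*m^4: for T > Tc the only minimum is at m = 0,
# while for T < Tc two symmetric minima appear at m = +/- sqrt(b'(Tc - T)/(2d)).
import numpy as np
import matplotlib.pyplot as plt

bprime, d, Tc = 1.0, 1.0, 1.0
m = np.linspace(-1.2, 1.2, 400)

for T in (1.3, 1.0, 0.7):
    b = bprime * (T - Tc)
    plt.plot(m, b * m**2 + d * m**4, label=f"T = {T}")
    if T < Tc:                                   # mark the predicted minima
        m_min = np.sqrt(bprime * (Tc - T) / (2.0 * d))
        plt.plot([-m_min, m_min], 2 * [b * m_min**2 + d * m_min**4], "ko")

plt.xlabel("m"); plt.ylabel("f(T, m)"); plt.legend(); plt.show()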
Note also that due to the symmetry of the problem, there are two global minima. This could, for
instance, correspond to the two solutions associated with spins pointed up and spins pointed down in
the Ising model. Additionally, note that the Ising model is an example of a system which obeys this
symmetry. We can go even further here and look at the properties of the order parameter near the
critical temperature Tc. Here, our Landau theory predicts that the order parameter behaves as
m ∝ (Tc − T)^{1/2}   for T < Tc.   (9.3)
Figure 9.4: (left) Free energy as a function of m where the Landau free energy is given by Eq. (9.2).
(right) Minima of the Landau free energy as a function of temperature with respect to the critical
temperature T − Tc . Note that the minimum goes away from m = 0 continuously. This is an example
of a continuous phase transition. Note that for T < Tc two minima appear, due to the symmetry in this
problem. This could, for instance, correspond to the Ising model where one minimum would correspond
to all spins pointing “up” while the other minimum would correspond to all spins pointing “down”.
9.3 Case 2: A First-Order Phase Transition
Next, consider a Landau free energy that also contains a cubic term,
f(T, m) = b(T)m² − c(T)m³ + d(T)m⁴,   (9.4)
where c(T) > 0 and d(T) > 0. Again for simplicity, let us assume that both c and d are not temperature
dependent. Additionally, as in the previous case, the disordered phase becomes completely unstable
when the sign of b(T ) changes from positive to negative. This time we will call this temperature T0 and
write b(T) ≈ b′(T − T0). Remembering that a minimum requires ∂f/∂m = 0 and ∂²f/∂m² > 0, we find that for b(T) > 9c²/32d there is only a single minimum located at m = 0. This situation changes at b(T) = 9c²/32d, where a second minimum appears at m = m_f corresponding to an ordered state, see
Fig. 9.5. The temperature associated with the appearance of this extra minimum we denote as T1 .
Figure 9.5: Landau free energy given by Eq. 9.4 as a function of m for a range of temperatures. For
T > T1 the only minimum corresponds to the disordered state. For Tc < T < T1, there are two minima, but the disordered state (m = 0) is the global minimum. Hence, the disordered state is stable, while
the ordered state is metastable. At Tc we see a jump in the global minimum from the disordered state
to the ordered state. Hence, at Tc , we see a first-order phase transition between the disordered state
(m = 0) and the state characterized by the minimum at finite m.
For the region c²/4d < b(T) < 9c²/32d there continue to be two minima, located at m = 0 and m = m_f, respectively. Within this range the free energy at m = 0 is always less than the free energy at m = m_f, and so the disordered state is stable. However, at b(T) = c²/4d, the global minimum in the free
energy jumps discontinuously from m = 0 to m = mf . This discontinuity is a signature of a first-order
phase transition, and we define the temperature associated with this transition to be Tc . Finally, at
T = T0 the sign of b changes and the disordered phase is no longer even metastable, i.e., there is only
a single minimum present in the Landau free energy corresponding to the ordered phase.
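The jump can also be made quantitative with a few lines of code. The sketch below assumes the Case 2 form f(T, m) = bm² − cm³ + dm⁴ (the sign convention of the cubic term only determines on which side of m = 0 the ordered minimum lies), locates the ordered minimum analytically from ∂f/∂m = 0, and checks whether it lies above or below f(0) = 0 for a few values of b; the values of b, c, and d are arbitrary.

# A sketch of the Case 2 behaviour, assuming f(T, m) = b*m^2 - c*m^3 + d*m^4.
# For each b the ordered minimum m_f is located from f'(m) = 0 and compared with
# f(0) = 0: the global minimum jumps from m = 0 to m = m_f once b drops below c^2/(4d).
import numpy as np

c, d = 1.0, 1.0
print("second minimum appears below b =", 9.0 * c**2 / (32.0 * d))
print("global minimum jumps at        b =", c**2 / (4.0 * d))

for b in (0.30, 0.28, 0.25, 0.20):
    disc = 9.0 * c**2 - 32.0 * b * d
    if disc < 0:
        print(f"b = {b:.3f}: only the disordered minimum at m = 0 exists")
        continue
    m_f = (3.0 * c + np.sqrt(disc)) / (8.0 * d)        # ordered (local) minimum
    f_mf = b * m_f**2 - c * m_f**3 + d * m_f**4
    winner = "m = 0" if f_mf > 0 else "m = m_f"
    print(f"b = {b:.3f}: m_f = {m_f:.3f}, f(m_f) = {f_mf:+.4f}  ->  global minimum at {winner}")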
For the Ising model, we know that the free energy must be the same under a global flip of all spins,
because the (zero-field) Hamiltonian is invariant under this operation. The magnetization, however,
changes sign (m → −m) under a global spin flip. Thus, a Landau free energy expansion describing
the Ising model should satisfy: f (T, m) = f (T, −m). This can only be the case when the expansion
exclusively has terms even in m:
f(T, m) = b(T)m² + dm⁴ + · · · .   (9.5)
Finally, the free energy must go to infinity as m goes towards both plus and minus infinity. As we saw
in Cases 1 and 2, the coefficient b(T ) changes sign when the disordered phase is no longer stable nor
metastable. Consequently, we know that the free energy must contain at least a fourth-order power in
m. When we assume d > 0, we can terminate the expansion at m4 , which leads to
f(T, m) ≈ b(T)m² + dm⁴.   (9.6)
This constitutes the minimal Landau expansion that satisfies the underlying microscopic symmetry
property of the Ising model.
Referencing the form we studied in Case 1, we immediately see that this expansion describes a second-
order phase transition. Hence, considering only the underlying microscopic symmetries, we have pre-
dicted that the phase transition in the Ising model is second order. A natural question would now be:
What if we added further even terms, would the order of the transition change? One of the exercises
will show you that this is indeed the case. Clearly, one should exercise caution in using Landau theory
and interpreting its result.
Along that line of thinking, we should remark that from Eq. (9.3) we obtain a critical exponent of
β = 1/2. This is clearly not the right value for two- and three-dimensional Ising models, also see
Chapter 7. As a mean-field description, Landau theory lacks predictive power near the critical point,
so this is not unexpected.
randomly distributed throughout space. In a crystal, the particles are situated on a lattice, i.e., they are
positionally ordered. If we know the average position of a particle at one site of a lattice, we can fairly
accurately infer the positions of the other particles. Similar observations can be made about the phases
of particles that interact via a spherically symmetric pair potential. By such a pair potential, we mean
that if particle 1 is at position r 1 and the position of particle 2 is given by r 2 : ϕ(r 1 , r 2 ) = ϕ(|r 1 − r 2 |).
Figure 9.6: Cartoon of an ellipsoid showing the position of its center of mass r together with a vector
defining its orientation p. Note that both are required to fully describe the particle.
However, there are many situations which can arise where the interaction potential is not spherically
symmetric. One way to introduce anisotropy into the pair potential is simply to consider non-spherical
particles, we will return to this topic in Chapter 17. Then the interaction potential between the two
particles will depend on both the orientation of the particles, as well as their center-to-center distance.
One of the simplest examples of a non-spherical particle is an ellipsoid. Here, we denote its center of
mass by r and its orientation by p, see Fig. 9.6.
Extra phases compared to those associated with spherically symmetric potentials emerge as a con-
sequence of the added rotational degrees of freedom needed to describe the ellipsoid. In particular,
we encounter situations that have neither orientational nor positional order, the isotropic phase, see
Fig. 9.7a. When the particles are globally aligned along a director n, but do not possess positional
order, the system is in the nematic phase, see Fig. 9.7b. The particles can be orientationally ordered,
as well as positionally ordered into layers, which is called a smectic phase, see Fig. 9.7c. Finally, we can
have full positional and orientational order, the crystal phase, see Fig. 9.7d. The difference between a
smectic and a crystal is that in the smectic the ellipsoids are disordered in the orthogonal directions, i.e.,
the layers are fluid-like.
Here, we study the isotropic to nematic phase transition in ellipsoids. In both these phases, the particles
are not positionally ordered. The main difference is found in the orientational order, absent in the
isotropic phase and present in the nematic phase. Specifically, in the nematic phase, the particles on
average point in a specific direction, which we will denote by the unit vector n. For this transition,
we do not have an immediate inroad to a symmetry argument. In fact, we do not even have an order
parameter, yet, so let us define that first.
To describe this system, we require an order parameter m which is zero in the case where the particles
are not orientationally ordered and non-zero when the particles point along the director n. As a first attempt, we might try something like m_trial = (1/N) Σ_i p_i · n, where p_i is a unit vector which indicates the orientation of the ith particle, and N is the number of particles in the system. However, the symmetry
of the particle is such that we cannot distinguish a particle pointing in the n direction from a particle pointing in the −n direction. Hence, we need the order parameter to also be equivalent for n and −n. One way to accomplish this is to take the square of the dot product: m_square = ⟨(1/N) Σ_i (p_i · n)²⟩ = ⟨cos²(θ)⟩, where θ is the angle between n and p_i. However, we still have a problem: we want the order parameter
to be zero in the disordered phase and nonzero in the ordered phase. To calculate the expected value
of the order parameter in the disordered phase we write
m_square = (1/4π) ∫_0^{2π} dϕ ∫_0^{π} dθ sin θ cos²(θ) = 1/3,   (9.7)
which means that m_square is not zero in the disordered phase. We can correct this, however, by subtracting this part from our order parameter. We are now left with m ∝ (1/N) Σ_i [cos²(θ_i) − 1/3]. Finally,
although not strictly necessary, we can choose m such that its maximum value is 1. Note that this does
not change the behavior in the disordered phase, but it is typically done in literature and so we will do
the same here. We thus obtain the definitive order parameter for the isotropic-nematic transition
m = (1/N) Σ_i (1/2)[3 cos²(θ_i) − 1].   (9.8)
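A quick numerical check of the normalization chosen here: for isotropically distributed orientations cos θ is uniform on [−1, 1], so the average of (3cos²θ − 1)/2 vanishes, while perfectly aligned particles give m = 1. The sketch below verifies this by direct sampling.

# A quick Monte Carlo check (sketch) of the normalization in Eqs. (9.7)-(9.8):
# for randomly oriented unit vectors <cos^2(theta)> = 1/3, so the nematic order
# parameter averages to ~0 in the isotropic phase, while perfect alignment gives 1.
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

# Uniformly distributed orientations: cos(theta) uniform on [-1, 1].
cos_theta_iso = rng.uniform(-1.0, 1.0, N)
m_iso = np.mean(0.5 * (3.0 * cos_theta_iso**2 - 1.0))

# Perfect alignment along the director: cos(theta) = 1 for every particle.
m_aligned = np.mean(0.5 * (3.0 * np.ones(N)**2 - 1.0))

print(m_iso, m_aligned)   # ~0 (up to statistical noise) and exactly 1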
Figure 9.7: Possible phases for ellipsoidal particles. (a) The isotropic phase is characterized by the
absence of positional and orientational order. (b) The nematic phase possesses average orientational
order, but no positional order. The particles are orientated in the direction of the blue arrow, which
represents the nematic director n. (c) The smectic phase has orientational order along n and positional
order in one direction, i.e., it is layered in one direction and disordered (positionally) in the orthogonal
planes. (d) The crystal phase possesses both threefold positional order and orientational order.
Now that we have an order parameter for the isotropic to nematic phase transition, we would like to write
down the free energy. We know that there should be two possible minima in the free energy (depending
on the temperature), one when the system is disordered and one when the system is ordered. Unlike in
the Ising model, here we keep both the even and odd ordered terms. Hence, up to fourth order we obtain
f(T, m) = b(T)m² − cm³ + dm⁴,   (9.9)
with c and d positive; c is positive since there should be some lowering of the free energy for aligned
ellipsoids while d is positive to ensure that the free energy goes to infinity for m → ±∞. We conclude
that the free energy is of the form studied in Case 2. This Landau theory therefore describes a first-order phase transition between the isotropic phase and the nematic phase. This turns out to be accurate,
as we shall see using Onsager theory in Chapter 17.
But wait, what symmetry argument motivates this? Is the ellipsoid not also front-aft symmetric? Why
do we not therefore eliminate the odd m³ term? We do not, because flipping each ellipsoid individually is not a natural symmetry operation on the entire space inhabited by the ellipsoids. The
global symmetries that preserve the ellipsoid Hamiltonian are rotations. However, these do not impose
constraints on the existence of an m3 term, as the nematic director rotates along with the entire system,
leaving our m invariant as well. In the case of the Ising model, the global operation did not leave its m
invariant, i.e., m → −m in that case.
The above line of argument may not be entirely satisfactory, as we only touch upon the concept of
symmetry in these notes, rather than formalize it. However, we hope to have given you a flavor of
what Landau theory is and why researchers would be enthusiastic about a method that allows for the
prediction of the order of a phase on the basis of symmetries that leave the Hamiltonian invariant.
9.6 Exercises
Q62. Landau Theory and the Ising model
(a) Starting from the mean-field expression for the free energy of the Ising model without a field,
show that the Landau expansion is of the form of Case 1.
(b) What is the transition temperature associated with this Landau expansion? How does it compare to the transition temperature we determined from mean-field theory?
Q63. Landau Theory from Pathria and Beale (Statistical Mechanics)
Consider a system where the Landau free energy can be expanded as
f(T, m) = rm² + sm⁴ + um⁶,   (9.10)
where u > 0. Define α = −(3ur)^{1/2}, γ = −(4ur)^{1/2}, and ξ = (r/u)^{1/4}. Then as we did when we
considered the two possible cases, characterize the phase transition associated with this expansion.
Proceed along the following route:
(a) Plot the function using Mathematica for several different values of r, s and u. What types
of phase transitions seem possible for this Landau free energy?
(b) What are the five possible minima associated with this Landau free energy?
(c) Show that for r > 0 and s > α , m0 = 0 is the only real solution.
(d) Show that for r > 0 and γ < s ≤ α, m0 = 0 or ±m1 , where
m_1² = [−s + √(s² − α²)] / (3u).   (9.11)
However, the minimum of f at m0 = 0 is lower than the minimum at m0 = ±m1 , so the
global minimum is at m0 = 0.
(e) Show that for r > 0 and s = γ, m0 = 0 or ±ξ. Now, the minimum of f at m0 = 0 is of the same
height as the ones at m0 = ±ξ, so a nonzero spontaneous magnetization is as likely to occur
as the zero one.
(f) Show that for r > 0 and s < γ, m0 = ±m1 . Explain how this indicates a first-order phase
transition. Note that the line s = γ, with r > 0 is generally referred to as a “line of first-order
phase transitions”.
(g) Show that for r = 0 and s < 0, m0 = ±(2|s|/3u)^{1/2}.
(h) Show that for r < 0, m0 = ±m1 for all s. As r → 0, m1 → 0 if s is positive.
(i) Show that for r = 0 and s > 0, m0 = 0 is the only solution. Explain why the line with r = 0 and s > 0 is a line of second-order phase transitions.
Q64. Simple Model for the Isotropic to Nematic Phase Transition
In this exercise, we examine a simple model system that undergoes an isotropic-like to nematic-like
phase transition. We consider a system of particles on a square lattice. Each particle covers L
lattice sites, arranged in a line. There are two species of particles A and B, oriented horizontally
and vertically, respectively, see Fig. 9.8.
All A particles are identical and all B particles are identical. Whenever two particles overlap,
they have an interaction energy of ϵ > 0. Particles that do not overlap have an interaction energy
of 0. Note that each particle can interact with multiple other particles, even at the same lattice
site. Assume there are NA particles of species A and NB particles of species B, in a volume V at
temperature T, so that the number densities of the two species are ρ_A = N_A/V and ρ_B = N_B/V.
Additionally, assume periodic boundary conditions.
Figure 9.8: A horizontal A-type particle (left) and a vertical B-type particle (right) for L = 3.
(e) In the disordered phase, p_A = 1/3. Define an order parameter m such that m = (3/2)(p_A − 1/3). Note that it is 0 in the disordered phase and non-zero in the ordered phase. Rewrite the free
energy in terms of m. Show that the Landau expansion for the free energy can then be
written
βF/N = −log(3) − K/3 + [(1 − K)/3] m² − m³/3 + m⁴/2 + O(m⁵).   (9.19)
(f) Using the Landau free energy, determine the transition temperature for this system. Is the
phase transition continuous or discontinuous? Explain.
Chapter 10
Real-Space Renormalization Group
In Chapter 7, we saw that for the 2D Ising model there is a logarithmic divergence of the heat-capacity
as a function of the temperature T near the critical point, i.e., the temperature T = Tc for which the
phase transition takes place. The magnetization exhibits a power-law behavior in T near this point.
Divergences and power-law behavior are a common feature of systems that undergo a continuous phase
transition. Unfortunately, the 2D Ising model is rather involved to study and the 1D Ising model has no
finite-temperature phase transition. In this chapter, we will therefore consider another (simple) model
that has such a phase transition: site percolation. This model will allow us to gain intuition into the
features of a critical system that underlie these divergences. It will turn out that these are correlations
and a loss of ‘scale’ in the system. A more precise definition of the concept of correlation will be given in
Chapter 11. However, the notion of scale invariance will be all that is required to set up the theoretical
machinery to study the critical point, as we will see.
In most cases in physics, we extract information from complicated systems by first determining which
length scales are important. This allows us to coarse grain the model so that we can treat the system
correctly with respect to these important length scales: such procedures typically lead to perturbation
theories. However, at critical points scale invariance interferes with this paradigm: there is no set scale
around which to perturb the system. Scale invariance also means that the system has self-similarity.
That is, if there is no intrinsic scale, then the system must ‘look similar’ irrespective of the size we
are considering. Here, we introduce a completely different and rather beautiful method for studying
continuous phase transitions that exploits this scale invariance: renormalization-group theory (RG).
RG was developed in the early 1970s by Kenneth Wilson, and its importance was recognized with the
1982 Nobel Prize in Physics. Wilson’s Nobel lecture can be found on the Nobel Prize website. RG has
been applied to an enormous variety of systems. It should be mentioned that RG is more of a way
of looking at a system than a method in itself. That is, the exact implementation will depend on the
system in question. Hence, the main goal of this chapter is to give the reader a flavor for what is possible
within the RG framework.
Figure 10.1: A square lattice where grey squares are occupied and white ones are not. On the left, we
see that the occupied squares do not percolate the system, while in the system on the right they do.
Here, p is the probability of coloring a square grey.
of interest. Let us assume that we have a 2D square lattice with periodic boundary conditions, e.g.,
see Fig. 10.1. Each block on the lattice is either occupied or unoccupied. In this percolation model
we assume that each block is occupied with a probability p. Note that we assume that the blocks do
not interact. Then there exists a probability pc above which the occupied blocks will percolate the
lattice, i.e., form a connected path from left to right and/or top to bottom, and below which they do
not. This probability is referred to as the percolation transition or percolation threshold. Where the
threshold lies exactly depends on the specific choice of percolation criterion. One could imagine demanding only that a path from left to right exists, though for this system unidirectional percolation coincides
with bidirectional percolation in the thermodynamic limit.
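A direct numerical test of this statement is straightforward, and the sketch below is one way to do it: occupy sites of an L × L lattice with probability p, group occupied sites into nearest-neighbor clusters with scipy.ndimage.label, and check whether any cluster touches both the left and right edge. For simplicity it uses open (non-periodic) boundaries and only left-to-right spanning; L, the sample count, and the p values are arbitrary choices.

# A sketch of a direct numerical check of (left-to-right) percolation on the square
# lattice: occupy sites with probability p, label nearest-neighbour clusters, and
# test whether any cluster touches both the left and the right edge.
import numpy as np
from scipy.ndimage import label

def percolates(p, L, rng):
    occupied = rng.random((L, L)) < p
    labels, _ = label(occupied)            # default structure = nearest neighbours
    left, right = set(labels[:, 0]), set(labels[:, -1])
    return len((left & right) - {0}) > 0   # label 0 is the empty background

rng = np.random.default_rng(1)
L, samples = 64, 200
for p in (0.45, 0.59, 0.63):
    frac = np.mean([percolates(p, L, rng) for _ in range(samples)])
    print(f"p = {p:.2f}: spanning fraction = {frac:.2f}")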
In Fig. 10.2, typical configurations are shown for various system sizes of a system below, at, and above
the percolation threshold, with occupied sites in the same cluster indicated by the same color. Here,
clusters are determined by the requirement that any two occupied sites that share a horizontal or
vertical border are in the same cluster. Note that the structure of the clusters is special at the critical
point. The structure appears frayed and it is not easy (or even possible) to order the snapshots by size,
without zooming into the smallest scale of a single site for reference. This lack of intrinsic order is a
characteristic of scale invariance as at each scale the system is self-similar, i.e., it looks like itself. You
may have already encountered scale invariance in studying fractal systems. It is possible to assign a
fractal-like quality to systems at the critical point. This will impact how they behave, more so than
their microscopic interaction rules, as we will see in Chapter 12.
Let us now gain a feeling for RG by applying it to calculate the phase transition in this system. To
do so, we need to make a mapping between the original blocks and a new set of super blocks, which
are simply groupings of blocks. The mapping we choose is shown in Fig. 10.3. Note that this mapping
is only an approximation. That is, we make a choice for what we consider connecting and what we
do not. The consequence is that our outcome for percolation depends on the choice. We now look at
the behavior of the system as we apply consecutive renormalization steps, each corresponding to the
mapping in Fig. 10.3. Each step can be thought of as ‘zooming out’: we look at the system at a larger
length scale. Since at the critical point the system looks the same at all length scales, we will look for
Figure 10.2: Example of critical behavior in a percolation model, see Section 10.1. Sites are considered
occupied or empty with a probability p and (1 − p), respectively. The sites are then clustered based
on nearest neighbors, and sites belonging to the same cluster have been given the same color. Below
the critical point, for p = 0.45 (left column), there are no system-spanning clusters, as revealed by
the small streaks representing clusters in the highest-magnification (bottom row); from top to bottom
the magnification increases by a factor of 5 for each row. Above the critical point, for p = 0.63 (right
column), there is a system spanning cluster. The smallest zoom-in shows little islands of unoccupied
sites within it. However, at the critical point p = pc ≈ 0.592746 (middle column), the relative size of
the clusters compared to the box does not change noticeably. The system looks similar on all length
scales. Yellow hues correspond to empty sites, clusters are indicated with reds.
[Figure: the five classes of 2 × 2 original-block configurations occur with probabilities (1 − p)⁴, 4p(1 − p)³, 6p²(1 − p)², 4p³(1 − p), and p⁴.]
Figure 10.3: Graphical representation of the renormalization-group transformation that we are using
for the percolation system. In the column titled Original blocks we have the system before a RG step,
while the mapping for that set of blocks is shown in the column Super blocks. The probability of finding the super blocks is provided in terms of the original probability p for being occupied in the last column.
values of p where this zooming out does not change the structure of the system. The implication is that
we are searching for that point where the structure is self-similar, which is exactly the property of the
critical phase. In detecting this point, we are thus able to localize the phase transition.
We proceed in this direction by determining a relationship between the probabilities of the original
blocks being occupied (p) and the super blocks being occupied (p′ ):
p′ = R(p). (10.1)
These probabilities are also shown in Fig. 10.3. When we sum the total probabilities for the occupied
super blocks we obtain the following:
R(p) = p⁴ + 4p³(1 − p).   (10.2)
To determine the probabilities for which the system looks the same at all length scales, we look for
the fixed points associated with Eq. (10.1). Since R(p) is given by Eq. (10.2), Eq. (10.1) is simply a
polynomial, which we can solve exactly, either by hand, using Mathematica, or numerically. We obtain three fixed points: p∗ = 0, 1, and (1/6)(1 + √13) ≈ 0.767592. In Fig. 10.4, we summarize the RG
procedure. The top panel shows that a system with p < pc will become less populated by making super
blocks, while for p > pc it will become more populated (middle panel). This leads to the RG flow diagram shown
in the bottom panel, which indicates the three fixed points, two stable ones at p∗ = 0 and p∗ = 1 and
an unstable fixed point at p = pc . The arrows depict the direction of flow from p to p′ that occurs when
performing a renormalization step in the two regions. We observed that p∗ = pc corresponds to the only
non-trivial fixed point, and is therefore the percolation threshold.
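The fixed points and the flow are easy to verify numerically; the sketch below finds the roots of R(p) − p and then iterates the map starting just below and just above p_c.

# A sketch verifying the fixed points of the 2x2 super-block transformation
# R(p) = p^4 + 4 p^3 (1 - p): solve R(p) = p and iterate the map from both sides of
# the non-trivial fixed point to see the flow toward p = 0 and p = 1.
import numpy as np

def R(p):
    return p**4 + 4.0 * p**3 * (1.0 - p)

# R(p) - p = -3p^4 + 4p^3 - p; find its roots and keep the physical ones.
roots = np.roots([-3.0, 4.0, 0.0, -1.0, 0.0])
phys = sorted(r.real for r in roots if abs(r.imag) < 1e-12 and 0.0 <= r.real <= 1.0)
print("fixed points in [0, 1]:", phys)      # 0, (1 + sqrt(13))/6 ~ 0.7676, and 1

for p0 in (0.70, 0.80):
    p, traj = p0, [p0]
    for _ in range(6):
        p = R(p)
        traj.append(p)
    print(p0, "->", np.round(traj, 4))       # flows to 0 below p_c, to 1 above p_c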
The mapping described in Fig. 10.3 is only an approximation. This should be clear from the fact that
our RG theory predicts another value of pc than we used in Fig. 10.2. The result is dependent on the
way the RG procedure is set up, i.e., RG makes errors by introducing a specific super block format.
Figure 10.5 shows an example of two original blocks side-by-side that are not system spanning, but whose two resulting super blocks are. The mapping we used does not properly address these configurations.
Figure 10.4: Graphical example of a single renormalization step for a system below the percolation
transition (top) and above the percolation transition (middle). The associated RG flow diagram for
the square lattice with the mapping shown in Fig. 10.3 is shown at the bottom. The value p = pc = (1/6)(1 + √13) ≈ 0.7676 represents an unstable fixed point, with the flow going toward 0 and 1 to the
left and right of this point.
Figure 10.5: Graphical example of one of the errors introduced by the renormalization transition de-
scribed in Fig. 10.3. The system on the left does not percolate while the system on the right does.
Summarizing, we have shown an example of RG in action and how it can be used to determine the critical
point. We have also come to the realization that RG is an approximative procedure, which exploits the
nature of the critical phase, but which is not going to give the exact point where the transition occurs.
10.2 RG for the Zero-Field 1D Ising Model
Referring back to Chapter 7, we see that the canonical partition function of the zero-field Ising model
can be written as
Z(N, β) = Σ_{S_1} Σ_{S_2} · · · Σ_{S_N} e^{−βH(S_1, S_2, ..., S_N)}
        = Σ_{S_1} Σ_{S_2} · · · Σ_{S_N} e^{(βJ/2) Σ_{i=1}^{N} Σ'_j S_i S_j}
        = Σ_{S_1} Σ_{S_2} · · · Σ_{S_N} e^{βJ(S_1 S_2 + S_2 S_3 + · · · + S_{N−1} S_N)}.   (10.3)
Performing the sum over the even spins, i.e., S_2 = ±1, S_4 = ±1, etc., and writing K ≡ βJ, we obtain
Z(N, K) = Σ_{S_1} Σ_{S_3} · · · Σ_{S_N} [e^{K(S_1 + S_3)} + e^{−K(S_1 + S_3)}] [e^{K(S_3 + S_5)} + e^{−K(S_3 + S_5)}] · · ·   (10.5)
Now, in a renormalization group treatment, we would like this evened-out partition function, which now
depends on N/2 particles, to be of the same form as the original partition function, which concerned N
particles. In other words, we want the result to have the same form after eliminating half of the degrees of freedom. Explicitly, we would like to be able to write
Z(N, K) = g(K) Z(N/2, K′),   (10.6)
where g(K) is an as-of-yet-unknown function. Next, we would be able to do the same thing for
Z(N/2, K ′ ), which leads to
Z(N/2, K ′ ) = g(K ′ )Z((N/2)/2, K ′′ ), (10.7)
and so on. Equation (10.6) thus defines a recursion relation for Z(N, K). The procedure we are following
here is also often referred to as a decimation-based RG transformation1 , and this is visualized in Fig. 10.6.
1 Note that “decimation” does not solely refer to ‘reduction’ by 10% in its historic context.
Figure 10.6: Visualization of the decimation procedure for the zero-field, one-dimensional Ising model.
From top to bottom, we show two recursion steps of the RG.
Examining Eq. (10.5), we note that the relation described by Eq. (10.6) is obeyed if
e^{K(S′ + S′′)} + e^{−K(S′ + S′′)} = f(K) e^{K′ S′ S′′}.   (10.8)
Specifically, if we use Eq. (10.8), the new partition function can be written as
Z(N, K) = [f(K)]^{N/2} Z(N/2, K′).   (10.9)
Now, we are left with two unknowns in Eq. (10.9): f (K) and K ′ . However, note that we have only four
possible choices for sets of S ′ and S ′′ . We can have S ′ = S ′′ = ±1 and S ′ = −S ′′ = ±1. Substituting
S ′ = S ′′ = ±1 into Eq. (10.8) we obtain
e^{2K} + e^{−2K} = f(K) e^{K′},   (10.10)
which simplifies to
2 cosh(2K) = f(K) e^{K′}.   (10.11)
In the case where S ′ = −S ′′ = ±1, Eq. (10.8) reduces to
2 = f(K) e^{−K′}.   (10.12)
We now have two independent equations, namely Eqs. (10.11) and (10.12), and two unknowns K ′ and
f (K). Solving this system results in
and
1
K′ =
log [cosh (2K)]. (10.14)
2
Note that at this stage, we can already partially solve for a fixed point, namely that of Eq. (10.14).
Unsurprisingly, K ∗ = 0 and K ∗ = ∞ are the only fixed points for the parameter K. The former is
attractive and stable, corresponding to the high-temperature limit, the latter is unstable and corresponds
to the limit T ↓ 0. This makes sense, as we know that the 1D Ising model should not undergo a phase
transition and that the fully aligned state, which can only exist at zero temperature, is unstable to small
increases in the temperature. The associated RG recursion and flow are shown in the left-hand panel
to Fig. 10.7. The right-hand panel to Fig. 10.7 shows the corresponding result for the 2D Ising model,
which does have a fixed point. The recursion relation for this model is given by the logical extension of
the 1D decimation scheme
K′_2D = (3/8) log[cosh(4K_2D)].   (10.15)
The graphical representation in Fig. 10.7 immediately suggests a criterion for stability of a fixed point,
namely based on the value of the derivative at K ∗ . This can be generalized to more complex flows.
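A few lines of Python make this flow explicit. The sketch below iterates Eqs. (10.14) and (10.15) from a few starting couplings, locates the non-trivial 2D fixed point by root finding, and evaluates the slope dK′/dK there (a slope larger than one signals an unstable fixed point).

# A sketch of the decimation flow of Eqs. (10.14) and (10.15): iterate K -> K'.
# In 1D every finite K flows to K* = 0; the approximate 2D recursion has an extra
# unstable fixed point near K* ~ 0.507, where |dK'/dK| > 1.
import numpy as np
from scipy.optimize import brentq

K1 = lambda K: 0.5 * np.log(np.cosh(2.0 * K))            # 1D decimation, Eq. (10.14)
K2 = lambda K: 3.0 / 8.0 * np.log(np.cosh(4.0 * K))      # 2D analogue,   Eq. (10.15)

for name, step, K in (("1D", K1, 1.0), ("2D (K < K*)", K2, 0.45), ("2D (K > K*)", K2, 0.55)):
    traj = [K]
    for _ in range(8):
        K = step(K)
        traj.append(K)
    print(name, np.round(traj, 4))

Kstar = brentq(lambda K: K2(K) - K, 0.1, 1.0)             # non-trivial 2D fixed point
slope = 1.5 * np.tanh(4.0 * Kstar)                        # dK'/dK at the fixed point
print("2D fixed point K* =", round(Kstar, 4), " dK'/dK =", round(slope, 3))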
Figure 10.7: RG for the zero-field 1D (left) and 2D (right) Ising model. The recursion relation (red
curve) is shown using the staircase construction (black line) for some initial choice of K, with the
direction of the flow indicated using the black arrows. The 1D RG transformation only has two fixed
points K ∗ = 0 and K ∗ = ∞, the former being stable. 2D RG has a third fixed point, indicated using
a black dot, which is non-trivial and unstable. This corresponds to a fixed-point value of K ∗ ≈ 0.507,
as indicated by the dashed line. The blue line K = K ′ is an RG flow together with the blue arrows
indicating the direction; the full construction is provided here to give additional insight in the recursion
process.
We could stop at this point, as we have recovered our physical intuition. However, we have technically
not completed the full RG argument, since we do not have an expression for the partition function or
free energy in terms of K yet. Recall that the free energy is extensive, i.e., proportional to the number
of particles in the system and it may therefore be written as
log Z(N, K) = N f_1(K),   (10.16)
Note that the function f1 (K) depends only on K and not the number of particles in the system. This
is a special result, because it implies that features of the single particle interactions are washed out in
favor of some effective measure that accounts for the structure of the system, i.e., its self-similarity at
the critical point. Combining Eqs. (10.16) and (10.9) we obtain
f_1(K) = (1/2) log(f(K)) + (1/N) log(Z(N/2, K′)).   (10.17)
However, from Eq. (10.16) we also have
log(Z(N/2, K′)) = (N/2) f_1(K′),   (10.18)
which allows us to eliminate the explicit dependence on the partition function and rewrite Eq. (10.17)
to
f_1(K) = (1/2) log(f(K)) + (1/2) f_1(K′).   (10.19)
Using Eqs. (10.19) and (10.13), we can further express f1 (K ′ ) in terms of f1 (K) and K only
f_1(K′) = 2 f_1(K) − log[2 cosh^{1/2}(2K)].   (10.20)
Here, the factor of 2 is due to the decimation length scale we have employed. This procedure, originally due to the American researcher Leo Philip Kadanoff, can also be applied to larger sets of spins, say of length
s, which implies that factors of s would naturally appear in the above expressions. Equations (10.14)
and (10.20) together comprise the renormalization-group equations. In the exercises, we will go through
these calculations ourselves.
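As a preview of that exercise, here is a sketch of the procedure run in reverse: starting from a very small coupling, where log Z/N ≈ log 2, one repeatedly inverts Eq. (10.14) to step to a larger K and uses Eq. (10.19) to update f_1, comparing at each step with the exact result for the chain in the thermodynamic limit, log(2 cosh K).

# A sketch of the decimation construction of the free energy per spin: invert
# K' = (1/2) ln cosh(2K) to step toward larger K, update f1 via Eq. (10.19),
# and compare with the exact 1D result ln(2 cosh K) at every step.
import numpy as np

Kp, f1 = 0.01, np.log(2.0)       # initial guess: at K ~ 0 the spins are effectively free
print(f"{'K':>8} {'f1 (RG)':>12} {'exact':>12}")
for _ in range(8):
    K = 0.5 * np.arccosh(np.exp(2.0 * Kp))                    # invert Eq. (10.14)
    f1 = 0.5 * np.log(2.0 * np.sqrt(np.cosh(2.0 * K))) + 0.5 * f1
    print(f"{K:8.4f} {f1:12.6f} {np.log(2.0 * np.cosh(K)):12.6f}")
    Kp = K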
10.3 Exercises
Q66. Vertical Spanning Cluster
Consider a square lattice. Assume that a cell spans if there is a vertically spanning cluster. Show
that R(p) = 2p²(1 − p)² + 4p³(1 − p) + p⁴. Find the corresponding nontrivial fixed point.
Q67. Renormalization in 3x3
Enumerate all the possible spanning configurations for a cell on a square lattice where the “super
blocks” are 3 × 3 instead of 2 × 2 as we did in the notes. Assume that a cell is occupied if a cluster
spans the cell both vertically and horizontally. Determine the probability of each configuration and
find the renormalization transformation R(p). What are the fixed points? Explain the significance
of the fixed points. Hint: Use the computer to tabulate these configurations.
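One possible realization of the hinted computer tabulation is sketched below: it enumerates all 2⁹ occupations of the 3 × 3 cell, uses scipy.ndimage.label for the nearest-neighbor clusters, and counts, for each number n of occupied sites, how many configurations contain a single cluster spanning both vertically and horizontally; R(p) then follows as Σ_n c_n pⁿ(1 − p)^{9−n}.

# A sketch of the brute-force tabulation suggested in the hint: enumerate all 2^9
# occupations of a 3x3 cell, keep those with a cluster spanning both vertically and
# horizontally, and collect the coefficients c_n of R(p) = sum_n c_n p^n (1-p)^(9-n).
import itertools
import numpy as np
from scipy.ndimage import label

def spans_both(cell):
    labels, _ = label(cell)                       # nearest-neighbour clusters
    for k in range(1, labels.max() + 1):
        rows, cols = np.where(labels == k)
        if rows.min() == 0 and rows.max() == 2 and cols.min() == 0 and cols.max() == 2:
            return True
    return False

counts = np.zeros(10, dtype=int)                  # c_n: spanning configs with n occupied sites
for bits in itertools.product((0, 1), repeat=9):
    cell = np.array(bits).reshape(3, 3)
    if spans_both(cell):
        counts[cell.sum()] += 1

print("c_n =", counts)
# R(p) can then be assembled and the non-trivial root of R(p) = p located numerically.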
Q68. Renormalization on a Triangular Lattice
Assume that particles can move on a two-dimensional (2D) triangular lattice and that they are not
able to overlap. The particles can occupy sites with probability p. Take hexagonal groupings of
7 lattice sites as the “super blocks” for a renormalization group argument, see Fig. 10.8. A super
block is filled whenever at least 4 of its 7 sites are occupied by particles. Compute the non-trivial
fixed point of this RG recursion.
Figure 10.8: (left) A single super block is comprised of 7 sites. (right) Several super blocks superimposed
on the triangular lattice being subjected to real-space renormalization.
Table 10.1: Evaluation of the renormalization group recursion relations associated with Eqs. (10.21)
and (10.22). The exact solution can be evaluated from Eq. (7.9). Note that a small error in the initial
guess of K and f1 (K) leads to increasingly smaller errors in the sequential evaluations.
Q70. RG for the 1D Ising Model with an External Field from David Chandler (Introduction to
Modern Physics)
Consider the one-dimensional Ising model with an external magnetic field. With suitably reduced
variables, the canonical partition function is
Z(K, h, N) = Σ_{S_1} Σ_{S_2} · · · Σ_{S_N} exp[ h Σ_{i=1}^{N} S_i + K Σ_{i=1}^{N−1} S_i S_{i+1} ].   (10.23)
(a) Show that summing over the even-numbered spins yields
Z(K, h, N) = [f(K, h)]^{N/2} Z(K′, h′, N/2),   (10.24)
where
h′ = h + (1/2) log[ cosh(2K + h) / cosh(−2K + h) ];   (10.25)
K′ = (1/4) log[ cosh(2K + h) cosh(−2K + h) / cosh²(h) ],   (10.26)
and
f(K, h) = 2 cosh(h) [ cosh(2K + h) cosh(−2K + h) / cosh²(h) ]^{1/4}.   (10.27)
(b) Discuss the flow pattern for the renormalization equations, h′ = h′ (h, K), K ′ = K ′ (K, h), in
the two-dimensional parameter space (K, h).
(c) Start with the estimate that at K = 0.01,
Chapter 11
Correlation Length and Functions
Before we can continue with our analysis of the critical point, we need to discuss correlation. This will
enable us to distinguish between various forms of critical behavior in Chapter 12. We have run into this
concept a few times during this course. For example, we used a lack of correlation to argue that there is
no phase transition in the zero-field 1D Ising model. We also encountered the concept of correlations in
our discussion of mean-field theory, where we purposefully made the approximation to ignore these. This
greatly simplified the theoretical description. However, this simplification came at a price, as mean-field
theory predicts a phase transition for the zero-field 1D Ising model. Thus, clearly, there are cases where
we must deal with correlations in the system in order to describe it accurately. Continuing this line of
thought, it should also come as no surprise that self-similarity at the critical point is a strong indicator
of (divergent) correlation.
In this chapter1 , we will make the concept of correlations — generally speaking statistical associations,
which are not necessarily causal — more precise and discuss correlation functions and their relation
to phase transitions. We will also introduce a new concept: correlation length as a means to quantify
spatial correlation2 . Correlation length can be understood as follows. Imagine cutting a material up
into smaller pieces. Most of the time, if you cut it up, the smaller pieces behave much the same as the
“whole”. However, if we cut it up sufficiently small, the properties of the material will change. The
length scale at which this change occurs is the correlation length. In addition, this length
depends on external conditions, such as the temperature, pressure, etc. This will become more clear
shortly as we discuss several physical examples.
1 The interested reader is referred to the book by John Cardy, “Scaling and Renormalization in Statistical Physics”. Parts of this chapter are inspired by the discussion in his book.
2 The temporal equivalent is the decorrelation time, as for time-dependent quantities one is often interested in how quickly correlations between measurements decay in time.
nature of your measurement. Now imagine a similar graph of the number of shark attacks as a function
of temperature. These can also be shown to be positively correlated; more people tend to go swimming
as the temperature of the water increases. This then implies that when you graph the number of ice
cream sales against shark attacks there will also be a positive correlation. Clearly, the latter is a case
of correlation not implying causation.
In physics, we typically wish to have a more useful notion of correlation, which is why the correlations we
study are often associated with a microscopic quantity of interest. For example, in a molecular system,
we can look at correlations in the positions of particles, as we will do next. In the case of the Ising
model, we examine correlations in the orientation of the spins separated by a given number of lattice
sites. In both cases, we will find that the correlations that we measure contain important information
about the microscopic organization in the system. This is directly tied into its structure, which relates
to the nature of its phases, as well as ultimately the associated transitions between these. Thus, here,
we are interested in correlations for which the causation is baked into their form.
Figure 11.1: The radial distribution function for three phases of water, adapted from figures by
Christophe Rowley. (left) Determining the number of molecules that are a distance r away from a
chosen molecule, i.e., the number of molecules of which the center is in the shell with radius r and
infinitesimal width dr. (right) The resulting (averaged) radial distribution function g(r) for the gas,
liquid, and solid phase. The main text explains what g(r) measures.
If we consider a fluid and a crystal, clearly there is a difference, which is associated with the organization
of particles. The fluid has little positional order, while the crystal is periodic3 . But that does not mean
there is no positional ordering in the fluid. How can we properly assess this?
3 The situation can also become more complicated. For instance, in the case of liquid crystals there is both positional
and orientational ordering: the isotropic system had neither, while the nematic phase had orientational ordering but no
positional ordering. The smectic phase even has orientational and partial positional order.
Let us take a single fluid particle, then on average other fluid particles are arranged in ‘shells’ around this
particle, as the molecules in the fluid cannot (fully) interpenetrate, see the left-hand panel to Fig. 11.1.
That is, knowledge of the position of a single molecule imposes limitations on where the other molecules
can be, at least locally. We can quantify this by determining a histogram of the number of particles
found at a radial distance r away from a particle of choice. When we average this over all particles in the
system, we can obtain a smooth curve in the thermodynamic limit. We expect the number of particles
in a bin to increase with r, and thus the function that we have obtained shows a similar trend. This
volumetric scaling can be divided out to arrive at the radial distribution function g(r), shown in the
right-hand panel to Fig. 11.1; note it tends to 1 due to the normalization. We will define g(r) exactly
in Chapter 14 and provide the simple relation between the radial distribution function and the pair-
correlation function for the fluid and gas phase. In a few words the relation is as follows. If you know
the vector-based positional correlation between a pair of particles, and the environment is homogeneous,
then taking one particle as the frame of reference and radially averaging gives g(r). Thus, we will use
g(r) as a stand-in for the proper pair-correlation function. If you do not enjoy this level of hand-waving
descriptiveness, please page ahead to Chapter 14.
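To make the histogram procedure concrete, the sketch below computes g(r) for an 'ideal gas' of randomly placed points in a periodic cubic box, for which the result should be close to 1 at all distances; the particle number and box size are arbitrary choices.

# A sketch of the histogram procedure described above, applied to randomly placed
# points in a periodic box: after dividing out the volume of each radial shell and
# the average density, the resulting g(r) should be ~1 (up to noise at small r).
import numpy as np

rng = np.random.default_rng(2)
N, L = 400, 10.0                                   # particles, box edge
pos = rng.random((N, 3)) * L
rho = N / L**3

edges = np.linspace(0.0, L / 2.0, 51)              # radial bins up to half the box
hist = np.zeros(len(edges) - 1)

for i in range(N - 1):
    d = pos[i + 1:] - pos[i]
    d -= L * np.round(d / L)                       # minimum-image convention
    r = np.linalg.norm(d, axis=1)
    hist += np.histogram(r, bins=edges)[0]

shell_vol = 4.0 / 3.0 * np.pi * (edges[1:]**3 - edges[:-1]**3)
g = 2.0 * hist / (N * rho * shell_vol)             # factor 2: each pair counted once
print(np.round(g, 2))                              # close to 1 everywhere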
It should be obvious that for a gas there is no real structure, beyond that particles cannot be at the
same position; the density is too low to promote shell formation. This is expressed by a nearly flat
g(r) which is zero in the region where the particles overlap. The small bump in g(r) indicates that you
are slightly more likely to find a particle one diameter away from the center of another, because they
cannot overlap. For a typical fluid we see a set of peaks that decrease in height. This implies that some
distances are more common than others, i.e., the particles are arranged in shells around our central one.
These give the peaks, while the valleys correspond to the mid-points between shells which do not readily
accommodate the presence of a particle, due to exclusions. Note that the height of the peaks decreases
with distance, implying that the system has only a short-range correlation. Or, in other words, the
shells become more spread out, as the particles have significant freedom to move in a fluid, leading to
loss of correlation. We can now readily imagine that the correlation length in the gas and fluid phase is
associated with the decay of the peaks. It turns out that this decay is exponential and thus leads to a
single natural length scale.
The behavior in the gas and fluid is distinct from that in a(n idealized) crystal, where the correlations
do not decay to zero, even at very long distances. That is, in a crystal the particles organize themselves
onto a lattice, such that knowledge of a single unit cell gives information on the position of a particle
many unit cells away. As a result, the correlations in a crystal are often called long-range. However,
it should be understood that long-range correlations do not have to stem from long-range interactions!
Even a system as simple as hard spheres can undergo a disordered-to-ordered (fluid-crystal) transition
as the density is increased.
From the above it should be clear that correlations and correlation functions give us another way to
characterize phases, as well as phase transitions, beyond the description provided by order parameters.
Types of ordering are usually divided into three groups: (i) short-ranged ordering where the correlations
decay rapidly to zero (typically exponentially), (ii) long-ranged ordering where the correlations decay to a
constant, and (iii) quasi-long ranged order where the correlations decay slowly (typically algebraically) to
zero. The slow-decaying correlations can, for instance, be found in the 2D hexatic phase that lies between
a fluid and a crystal of hard disks moving in a plane, see, for example, the work by Thorneywork et
al. [Phys. Rev. Lett. 118, 158001 (2017)]. A phase transition should be sought in a change of the
correlation, or equivalently in the behavior of the correlation length; let us turn to this aspect next.
For a continuous phase transition, however, the result is markedly different. To illustrate this, consider
the 2D or 3D Ising model without an external field. If we smoothly lower the temperature, the system
abruptly changes from a disordered to an ordered state at a temperature Tc. This is associated with a
change in the average magnetization: above Tc the average magnetization is zero, while below Tc the
magnetization is finite. In other words, above Tc the spins point up and down randomly, so intuitively there
should be limited correlation above Tc , while below Tc the spins are aligned and there is more significant
correlation in the system. Yet, looking at it like this, there does not appear to be a major difference
between a first-order and continuous phase transition. So what makes the continuous transition special?
The order parameter characterizing the system does not have a jump. The system accomplishes this
by being a single phase that has features of both the disordered and the ordered phase at the critical
point. This phase is often referred to as the critical phase. The critical phase has features of both phases
on all length scales, as has been visualized for the site-percolation model in Chapter 10. This leads to
correlations over all length scales. For every piece of ordered phase, you will encounter some order at
some larger length scale, and vice versa. The correlation length for such a structure turns out to be
divergent, as we will compute for the Ising model in the next section. This divergent correlation length
is a general property of continuous phase transitions and together with the structure of the system it
gives an inroad towards describing these systems theoretically.
For a ferromagnetic system, it is logical to study the spin-spin correlation function defined by
G(i, j) = ⟨S_i S_j⟩ − ⟨S_i⟩⟨S_j⟩ ,    (11.1)
which measures the degree of correlation between spins at sites i and j. This definition is an extension
of the concept of covariance you have encountered in probability theory. We expect G(i, j) to be a useful
quantity: at infinite temperature, knowing the orientation of a spin at i does not inform us of the
orientation of a spin at j, so G(i, j) is identically zero. In the ground state (all spins up or all down),
knowing the spin at i implies that we know the spin at j, and G(i, j) is also uniformly zero. Anywhere
in between, there will be some finite range of decay of the correlation, as the interaction tends to align
neighboring spins. We will quantify this range next.
The development of this section closely follows the discussion in Pathria and Beale, Statistical Mechanics.
11.4 Correlation Function for the 1D Ising Model
Because in the 1D Ising model only the ground state is fully ordered, we have that ⟨Si ⟩ = 0 for any
T > 0. Thus, we can reduce Eq. (11.1) to
G(i, j) = ⟨S_i S_j⟩ .    (11.2)
Now recall from Chapter 7 that we can write the partition function of the Ising model as
Z(N, β) = Σ_{S_1} Σ_{S_2} · · · Σ_{S_N} ∏_{i=1}^{N−1} e^{βJ S_i S_{i+1}} .    (11.3)
Next, we generalize this expression such that the pairwise interaction parameter J becomes site depen-
dent: Ji . This will allow us to perform a mathematical trick, as we will see shortly. In this case, the
partition function becomes
Z(N, β) = Σ_{S_1} Σ_{S_2} · · · Σ_{S_N} ∏_{i=1}^{N−1} e^{βJ_i S_i S_{i+1}} .    (11.4)
These sums can be carried out in exactly the same manner as we did in Chapter 7 resulting in
(1/N) log Z(N, β) = log 2 + (1/N) Σ_{i=1}^{N−1} log(cosh(βJ_i)) .    (11.5)
You can see this by taking the derivative of Eq. (11.4) with respect to βJ_k and noting that the result,
divided by Z, is precisely the definition of the average of S_k S_{k+1}. Hence, applying the same derivative
to the evaluated partition function of Eq. (11.5), the correlation function for nearest neighbors becomes
⟨S_k S_{k+1}⟩ = tanh(βJ_k).
We can now expand the (not nearest neighbor) spin-spin correlation function: since S_i² = 1, we may write
S_k S_{k+n} = (S_k S_{k+1})(S_{k+1} S_{k+2}) · · · (S_{k+n−1} S_{k+n}), and repeating the derivative trick for each factor
gives ⟨S_k S_{k+n}⟩ = ∏_{i=0}^{n−1} tanh(βJ_{k+i}).
However, in the case of the simple Ising model that we studied in Chapter 7, we had Ji = J, i.e., the
coupling constant was not site dependent. Consequently, the spin-spin correlation function reduces to
⟨S_k S_{k+j}⟩ = tanh^j(βJ) .    (11.10)
We now define the correlation length ξ by writing Eq. (11.10) in the form
⟨S_k S_{k+j}⟩ = e^{−j/ξ} ,    (11.11)
which assumes that there is a single, natural length scale in the system over which (on average) the
system looses the information that a spin at a given site k is pointing in a certain direction. This
assumption leads to the following expression for the correlation length
ξ = 1 / log(coth(βJ)) .    (11.12)
In the limit of low temperature, i.e., when βJ ≫ 1, the correlation length simplifies to
ξ ≈ (1/2) e^{2βJ} ,    (11.13)
which implies that as T goes to 0, ξ diverges. This is consistent with what we found in Chapter 7. In
particular we found that the 1D Ising model (at zero field) did not have a phase transition, and more
specifically, that the 1D Ising model was ordered only when T = 0, i.e., any thermal fluctuation is
sufficient to disorder the system.
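A quick numerical check of Eqs. (11.12) and (11.13) can be instructive. The following minimal Python sketch compares the exact correlation length with its low-temperature asymptote; the range of βJ values is chosen arbitrarily for illustration.

import numpy as np

betaJ = np.linspace(0.5, 3.0, 6)
xi_exact = 1.0 / np.log(1.0 / np.tanh(betaJ))   # coth(x) = 1/tanh(x), Eq. (11.12)
xi_asym = 0.5 * np.exp(2.0 * betaJ)             # Eq. (11.13)

for bJ, xe, xa in zip(betaJ, xi_exact, xi_asym):
    print(f"betaJ = {bJ:.2f}:  xi exact = {xe:10.3f},  (1/2) e^(2 betaJ) = {xa:10.3f}")

The two expressions rapidly approach each other as βJ grows, illustrating the exponential divergence of ξ as T goes to 0.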
There is a subtlety here, however. At the start of this section, we noted that G(i, j) = 0 for T = 0.
In our analysis, a divergent correlation length ξ reinserted into Eq. (11.11) would lead to a value of
G(i, j) = 1, i.e., the infinite correlation length crushes any finite positional difference in the exponent.
This seems a contradiction, until we note that for T = 0 we have ⟨S_i⟩ ≠ 0, whereas we assumed ⟨S_i⟩ = 0 in the above
calculation. In general, the correlation function in an ordered material does not decay to zero. For
example, in the 2D Ising model, finite-temperature defects below Tc imply a finite value of 0 < ⟨S⟩ ≤ 1.
However, we know G(i, i) = 1 and that spins close to our central spin are more likely to be pointed in
the same direction. The decay toward the finite limiting value of G defines the correlation length in this
case. Note that there is a second divergence of ξ as the temperature is lowered to T = 0 for the 2D
Ising model, as the correlation function flattens with the defects becoming further spaced out. In the
1D Ising model, Tc and T = 0 pathologically coincide.
First, note that G(i, j) depends only on the distance between spins i and j. Hence, we can write
G(i, j) = G(r). Then, the correlation function for the Ising model in d dimensions can be written
G(r) ∝ (a²/(ξr))^{(d−2)/2} K_{(d−2)/2}(r/ξ) ,    (11.14)
where the correlation length is given by ξ = a(c′ /t)1/2 , with a the effective lattice constant, c′ a number
of order unity which depends on the exact structure of the lattice, and Kµ (x) a modified Bessel function.
Additionally, t = (T − Tc )/Tc , with Tc the transition temperature. In three dimensions (d = 3), the
Bessel function is given by
K_{1/2}(r/ξ) = e^{−r/ξ} √(πξ/(2r)) .    (11.15)
The correlation function then reduces to:
G(r) = (a/r) √(π/2) e^{−r/ξ} .    (11.16)
As T approaches Tc , the correlation length ξ ∝ (T − Tc )−1/2 diverges. Thus, correlations in the system
occur at all length scales. This is typical of continuous phase transitions. Note that this scaling defines
a critical exponent, which we will return to in Chapter 12. Lastly, note that the above expressions reveal
that there is a short-ranged (relative to the exponential decay) power-law component to the correlation
function. This is a result that holds in general.
11.6 Exercises
Q71. Correlations and Eigenvalues of the Transfer Matrix
As we will prove in exercise Q72, the correlation length can also be found using the eigenvalues
of the transfer matrix, whenever a system is amenable to such a method for obtaining an exact
form of the partition function. Here, we will first show this to be true for the 1D Ising model by
evaluating the expression for the transfer-matrix-based correlation length:
ξ = [log(λ_+/λ_−)]^{−1} ,    (11.17)
where λ_+ is the largest eigenvalue and λ_− is the second-largest eigenvalue of the transfer matrix
T . That is, in general, an N × N transfer matrix has N eigenvalues λi , for which we require
λ+ > λ− > · · · > λN . In the case of the 1D Ising model there are only 2 eigenvalues, see Eq. (7.27).
Perform the necessary algebraic manipulations to recover the expression in Eq. (11.13). Hint: Note
the condition on the magnetic field that results in Eq. (11.13). Plot how the correlation length
departs from this zero-field result for several values of H.
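As a starting point for the plotting part of this exercise, the sketch below evaluates Eq. (11.17) numerically. It assumes the standard eigenvalues of the 1D Ising transfer matrix in a field (cf. Eq. (7.27)); treat that expression as an assumption if your notation differs.

import numpy as np
import matplotlib.pyplot as plt

def xi(betaJ, betaH):
    # eigenvalues lambda_pm = e^{betaJ} cosh(betaH) +/- sqrt(e^{2betaJ} sinh^2(betaH) + e^{-2betaJ})
    root = np.sqrt(np.exp(2 * betaJ) * np.sinh(betaH)**2 + np.exp(-2 * betaJ))
    lam_p = np.exp(betaJ) * np.cosh(betaH) + root
    lam_m = np.exp(betaJ) * np.cosh(betaH) - root
    return 1.0 / np.log(lam_p / lam_m)

betaJ = np.linspace(0.2, 2.5, 200)
for betaH in [0.0, 0.05, 0.2, 0.5]:
    plt.plot(betaJ, xi(betaJ, betaH), label=f"betaH = {betaH}")
plt.xlabel("betaJ"); plt.ylabel("correlation length xi"); plt.legend(); plt.show()

The betaH = 0 curve reproduces Eq. (11.12); a finite field suppresses the divergence of ξ at low temperature.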
Q72. The Transfer-Matrix Method
In this exercise, we will show that Eq. (11.17) holds for a general partition function that can be
decomposed by a transfer-matrix approach. Let H be a Hamiltonian, which depends on interac-
tions between N particles on a lattice that can all assume the same k states, labelled here as m.
Then we may write the Hamiltonian as follows
H = Σ_{i=1}^{N} Σ_{j=1}^{N} m_i C_{m_i m_j} m_j + Σ_{i=1}^{N} f_{m_i} m_i .    (11.18)
Here, Cmi mj indicates the coupling between states mi and mj and fmi represents interaction of
state mi with an external field.
(a) What would the form of m, C, and f have to be in order to recover the 1D Ising model?
(b) Write down the partition function for the general model.
Next, we assume that the labeling i corresponds to some 1D positional ordering with fixed separa-
tion on a ring, i.e., there are periodic boundary conditions. In addition, we assume that there are
only nearest-neighbor interactions and that the field is homogeneous across space. In this case,
we can use the transfer-matrix approach.
(c) Start by transforming the system to an orthonormal representation of the m_i. That is, the
sth possible value of m_i is identified with the unit vector ê_s of length k, where the sth element
is 1 and the rest are zero. What is the expression for m_i in terms of these unit vectors? Assume
that there is a diagonal matrix M that brings the orthonormal basis into the regular basis.
Write down H in the matrix form, as well as the partition function in this new basis.
(d) Let T i denote the ith transfer matrix for this model. Show that the individual terms in the
partition sum are given by êTi T i êi+1 , where the superscript T denotes transposition. Provide
an expression for the full transfer matrix. Use the properties of the unit vectors to show that
the partition function may be written
Z_N = Σ_{ê_1} ê_1^T ( ∏_{i=1}^{N} T_i ) ê_1 = Tr( ∏_{i=1}^{N} T_i ) .    (11.19)
Finally, assume that all T_i are the same, i.e., write T. Introduce the orthogonal matrix O
by which T can be diagonalized, i.e., T = OΛO^T with Λ the diagonal matrix with eigenvalues
λ_+ > λ_− > · · · > λ_k. We will now compute the correlation between particles 1 and L, which
are a reduced distance L apart. We are now in a position to derive the correlation length.
(e) What is entailed in the assumption that all T i are the same?
(f) Write down the expression for the correlation function ⟨m1 mL ⟩ in terms of the partition
function ZN and the Hamiltonian H in our orthonormal basis.
(g) Rewrite this using the transfer matrices to read
where the v are eigenvectors to Λ and the sums are over the elements.
(h) Take the thermodynamic limit to reveal
lim_{N↑∞} ⟨m_1 m_L⟩ = m̃_+ m̃_+ + Σ_{k≠+} (λ_k/λ_+)^L m̃_k m̃_+ ,    (11.22)
where m̃_i = Σ_{q=1}^{k} m_q v_q^T v_i, i.e., the value of m with respect to the eigenvectors of T.
(i) Argue that analogously, in the thermodynamic limit, ⟨mj ⟩ = m̃+ and that consequently the
correlation function reads
G(L) = lim_{N↑∞} (⟨m_1 m_L⟩ − ⟨m_1⟩⟨m_L⟩) = Σ_{k≠+} (λ_k/λ_+)^L m̃_k m̃_+ .    (11.23)
(j) Finally, take the limit L ↑ ∞ of the function −(1/L) log G(L) to obtain ξ^{−1} and thereby
recover the result in Eq. (11.17).
Chapter 12
Critical Exponents and Universality
In Chapter 10, we discovered that continuous phase transitions are special in the sense that there is a
scale invariance. This gave us a handle on how to compute the critical point (approximately) using RG
theory and obtain expressions for the free energy and partition function at this point. The latter were
dependent on the self-similar structure of the system, rather than on the interaction details. This then
implies that near a critical point, many properties of a system are largely independent of the microscopic
details of the interactions between the individual particles. Admittedly, that is an oversimplification, as
the 2D and 3D Ising models have different critical exponents and are therefore not quite the same in
this regard. However, it turns out that there are systems that appear vastly different from a microscopic
perspective, which nonetheless possess the same critical exponents. These systems are said to belong to
the same universality class. In this chapter, we will briefly explore what this means.
Using these definitions, we can write down the following five critical exponents for the Ising model:
α: The specific heat diverges as C ∝ A|t|^{−α} for t > 0 and as C ∝ A′|t|^{−α′} for t < 0,
for which it can be shown that α = α′. Note that A is not typically equal to A′, but (we will
not show this in these notes) renormalization-group theory predicts that the ratio A/A′ is the
same for members of the universality class. For the 2D Ising model α = 0, which implies that the
divergence is not power-law. There is a lower-order divergence, however, which is logarithmic in
nature, as we have seen previously for the 2D Ising model in Chapter 7. For the 3D Ising model
the exponent is given by α = 0.11008.
β: The magnetization behaves as M ∝ (−t)^β for t < 0. That is, this exponent is defined for temperatures
smaller than the critical temperature. For the 2D and 3D Ising models β = 1/8 and 0.326419, respectively.
γ: The zero-field susceptibility diverges as χ ∝ |t|^{−γ} (with exponent γ′ below Tc).
It follows from renormalization-group theory that γ = γ′. For the 2D and 3D Ising models γ = 7/4
and 1.237075, respectively.
δ: The magnetization at the critical temperature, i.e., when t = 0, follows the power law
M ∝ |h|^{1/δ} .    (12.4)
ν: The correlation length diverges as ξ ∝ |t|^{−ν}. For the 2D and 3D Ising models ν = 1 and 0.629971, respectively. Above and below the critical
point the power may be different in general, i.e., there can be a ν and ν ′ . Note that the correla-
tion length is related to the asymptotic behavior of the correlation function associated with the
fluctuations in the local magnetization by
G(r) = e^{−r/ξ} / r^{d−2+η} ,    (12.6)
with d the dimension of space and η another critical exponent, given by η = 1/4 and 0.036298 for
the 2D and 3D Ising models, respectively.
In the case of a system that undergoes a gas-liquid phase transition, we can determine the same critical
exponents. This might require some explanation. The liquid-gas transition is a first-order transition,
after all. However, recall from Chapter 6 that there is a temperature, the critical temperature Tc ,
above which the distinction between a gas and a liquid vanishes. The phase above this temperature is
referred to as a fluid. At the critical temperature, the density gap between gas and liquid vanishes and
the transition between the two becomes continuous. Therefore, we can examine critical exponents for a
gas-liquid phase transition, provided we are in the rather special case of T = Tc.
Clearly, our variables need to be changed to suit the system. The pressure with respect to the critical
pressure P − Pc takes the place of the applied field H. The magnetization m is replaced by ρl − ρc ,
where ρl is the density of the liquid and ρc is the density at the critical point. We now find
α: The specific heat is given by C_V ∝ |t|^{−α} when ρ_l = ρ_c.
β: The shape of the coexistence curve in the vicinity of the critical point is given by (ρ_l − ρ_g) ∝ (−t)^β
for t < 0.
γ: The zero-field susceptibility is replaced by the isothermal compressibility, which behaves as
κ_T ∝ |t|^{−γ} near the critical point.
δ: At t = 0 we have P − P_c ∝ |ρ_l − ρ_c|^δ sgn(ρ_l − ρ_c), which describes the shape of the critical isotherm in the vicinity of the critical point.
ν: The correlation length has the same shape as that of the ferromagnet. However, the correlation
function is now a density-density rather than spin-spin correlation function.
Interestingly, the experimentally obtained critical exponents for argon are α = 0.108±0.010, β = 0.339±
0.006, γ = 1.20 ± 0.002, and η = 0.045 ± 0.010, from which it follows δ ≈ 4.5 and ν ≈ 0.62 [Anisimov et
al., Sov. Phys. JETP 49, 844 (1979)]. Similar analyses have been performed on xenon, carbon dioxide,
and hydrogen [Green et al., Phys. Rev. Lett. 8, 1113 (1967)], which led to comparable numbers. Lastly,
for the archetypal Lennard-Jones simulation model, see Chapter 13 for more information, the following
numbers were obtained: α ≈ 0.11, β = 0.3285(7), γ ≈ 1.23, δ ≈ 4.7, ν = 0.63(4), and η = 0.043
[Watanabe et al., J. Chem. Phys. 136, 204102 (2012)]. Here, we have used scaling relations
νd = 2 − α = 2β + γ = β(δ + 1) = γ (δ + 1)/(δ − 1) ;    (12.8)
2 − η = γ/ν = d (δ − 1)/(δ + 1) ,    (12.9)
to obtain the numbers that were not provided, as indicated using ≈ signs. We will touch upon the
concept of scaling relations in one of the exercises.
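The quoted 3D Ising exponents can be checked against the scaling relations with a few lines of Python; this is a minimal sketch using only the numbers given above, with δ obtained from β(δ + 1) = 2 − α as in the text.

d = 3
alpha, beta, gamma, nu, eta = 0.11008, 0.326419, 1.237075, 0.629971, 0.036298

delta = (2 - alpha) / beta - 1
print("delta from beta*(delta+1) = 2 - alpha :", round(delta, 3))
print("nu*d          =", nu * d, "  vs  2 - alpha =", 2 - alpha)
print("2*beta + gamma=", 2 * beta + gamma, "  vs  2 - alpha =", 2 - alpha)
print("gamma/nu      =", gamma / nu, "  vs  2 - eta =", 2 - eta)
print("d*(delta-1)/(delta+1) =", d * (delta - 1) / (delta + 1), "  vs  2 - eta =", 2 - eta)

All combinations agree to within the quoted precision, illustrating that the exponents are not independent.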
The take-away message is that there are only a finite number of exponents and these are not fully
independent. The startling conclusion is that the critical behavior of the 3D Ising model, which is a
lattice-based description of (anti)ferromagnetism, and a general off-lattice molecular system undergoing
gas-liquid phase separation have the same critical exponents! Or in other words, these seemingly different
systems behave identically near the critical point and therefore belong to the same universality class.
The transition to the Bose-Einstein condensate in a Bose gas belongs to a different class.
a continuous phase transition. When h = 0, we can calculate the equilibrium value of m by finding the
minimum of f. Taking the derivative with respect to m gives us
2b′ t m + 4d m³ = h .    (12.11)
When h = 0 and t < 0, i.e., in the ordered phase, the above expression implies that the magnetization
scales as m ∝ (−t)1/2 , which yields the exponent β = 1/2. The zero-field susceptibility (h → 0+ ) is
given by
χ = lim_{h↓0} [(∂h/∂m)_t]^{−1} = 1/(2b′ t + 12d m²) .    (12.12)
When t > 0, the system is disordered and m = 0. Hence, the zero-field susceptibility is given by
χ ∝ 1/(2b′ t) ,    (12.13)
implying that γ = 1. In contrast, when t < 0, i.e., when the system is ordered, the magnetization scales
as m = √(b′|t|/(2d)). The positive root is chosen, since we are assuming that h ↓ 0. In this limit, we
obtain
χ ∝ 1/(4b′|t|) ,    (12.14)
and therefore γ ′ = 1. Lastly, at the critical temperature, t = 0, the magnetization can be read off from
Eq. (12.11), which gives us
m ∝ (h/(4d))^{1/3} ,    (12.15)
and δ = 3. Using the scaling relations we can summarize the critical exponents for the Landau expression
of Eq. (12.10) as: α = 0, β = 1/2, γ = 1, and δ = 3. The values of ν and η depend on the dimension of
the system and cannot be obtained in pure Landau theory. It turns out that this Landau theory is in
the same universality class as the four-dimensional Ising model. This might not be entirely surprising
as a mean-field theory should become better as the dimensionality increases.
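The Landau exponents β = 1/2 and δ = 3 can also be recovered numerically. The sketch below minimizes an assumed quartic Landau free energy f(m) = b′t m² + d m⁴ − hm (the form consistent with Eq. (12.12)) and extracts the exponents from log-log slopes; constants and ranges are arbitrary illustrations.

import numpy as np
from scipy.optimize import minimize_scalar

b, d = 1.0, 1.0

def m_eq(t, h):
    res = minimize_scalar(lambda m: b * t * m**2 + d * m**4 - h * m,
                          bounds=(0.0, 10.0), method="bounded",
                          options={"xatol": 1e-12})
    return res.x

# beta: m ~ (-t)^beta at vanishing field
ts = np.logspace(-4, -2, 10)
m_t = np.array([m_eq(-t, 1e-12) for t in ts])
print("beta    ~", np.polyfit(np.log(ts), np.log(m_t), 1)[0])   # expect 1/2

# delta: m ~ h^(1/delta) at t = 0
hs = np.logspace(-6, -3, 10)
m_h = np.array([m_eq(0.0, h) for h in hs])
print("1/delta ~", np.polyfit(np.log(hs), np.log(m_h), 1)[0])   # expect 1/3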
12.3 Exercises
Q73. Critical Exponents from Pathria and Beale (Statistical Mechanics)
Consider a system with a Landau free energy given by
with u a positive constant and t = (T − Tc )/Tc . Approach the tri-critical (three-phase critical)
point along the r-axis by setting r(t) = r′ t. Show that the critical exponents are β = 1/4, γ = 1,
and δ = 5. What is the value of α?
using the magnetic analogue of a Maxwell relation. Here, c_M is the specific heat at constant
magnetization, which is defined analogously to c_B. Hint: You need to make use of Maxwell's
relations, Legendre transforms, and the triple-product rule.
(b) Use stability of the system to write down the inequality for c_B, namely c_B ≥ (T/χ_T) (∂M/∂T)_B²,
from Eq. (12.19).
(c) Assume H = 0 and t < 0 and plug in the near-critical power-law expansions for the M , cB ,
and χT to obtain α′ + 2β + γ ′ ≥ 2.
Further inequalities may be derived from the convexity of the free energy, such as the Griffiths
inequality α′ + β(δ + 1) ≥ 2. Proving the equalities requires a more sophisticated approach.
Chapter 13
Classical Non-Ideal Gases and Dilute Fluids
The classical ideal-gas laws pV = N k_B T and E = (3/2) N k_B T, see Chapter 4, do not hold for real gases at
finite density ρ = N/V , because there are interactions between the atoms or molecules comprising the
gas. In theoretical descriptions of such systems, it is often assumed that the interactions are pairwise
additive, i.e., the interaction energy is a sum of terms characterized by a pair potential ϕ(r i − r j ) that
depends on the relative coordinates of particle i and j. For simple fluids the pair potential is radially
symmetric, which means that the Hamiltonian can be written as
H(Γ) = Σ_{i=1}^{N} p_i²/(2m) + Σ_{i<j}^{N} ϕ(r_{ij}) ,    (13.1)
with rij = |r i − r j | the radial distance between particle i and j and Γ the full 6N-dimensional phase-
space coordinates, as in Chapter 3. Radial symmetry is a good approximation for the noble gases, and
a fair approximation for small molecules like CH4 (methane), N2 , and O2 . The typical form of ϕ(r) for
such simple atomic systems is depicted in Fig. 13.1.
Over the years, many empirical ‘laws’ have been introduced to account for the deviations from ideality
for systems with real interactions. One of the most fundamental of these is due to Kamerlingh-Onnes,
who introduced, on empirical grounds, the virial expansion for the pressure. It is written as
p(ρ, T) = k_B T [ ρ + B_2(T)ρ² + B_3(T)ρ³ + · · · ] ,    (13.2)
where the temperature dependent virial coefficients Bn (T ) can be obtained by fitting to the observed
deviations from ideal-gas behavior. In this chapter, we will derive the functional form of Eq. (13.2)
using the ensemble formalism developed in previous chapters, and find explicit expressions for the virial
coefficients in terms of the interaction potential between the particles.
Figure 13.1: Typical pair potential ϕ as a function of the particle separation r for molecules of ‘diameter’
σ comprising a simple fluid. ϵ sets the attraction strength.
• The interaction is steeply repulsive for r ≲ σ, with σ a measure for the diameter of the particle.
For simple fluids σ is typically 2 to 5 Å. This short-ranged repulsion is due to Pauli exclusion (and
Coulomb repulsion) between the outer-shell electrons of two particles in close proximity.
• The interaction is attractive for r ≳ σ, with Van der Waals interactions ϕ(r) ∝ −r^{−6} when
r ≫ σ. These attractions are caused by correlated (induced) dipole fluctuations between the two
particles. The range over which these attractions are appreciable is typically ≃ 2σ. The depth of
the minimum, −ϵ, which occurs at r ≃ σ, depends on the chemical species.
Physically, one wishes to capture these salient features of the molecular interactions using relatively
simple potentials, as we will return to shortly. The question we will address in this chapter is how to
relate microscopic interactions to macroscopic observables such as the pressure, and to phenomena such
as liquid condensation.
A common choice is the Lennard-Jones pair potential
ϕ_LJ(r) = 4ϵ [ (σ/r)^{12} − (σ/r)^{6} ] ,    (13.3)
which gives good agreement with many experiments by adjusting the well depth ϵ and 'size' σ. The
term with exponent 12 captures the short-ranged repulsive nature, but this value of the exponent has
no physical motivation; it is (and remains) simply convenient to compute. A less realistic, but an
analytically more tractable form is the square-well potential
ϕ_SW(r) = ∞    for r < σ ,
        = −ϵ   for σ < r < λσ ,    (13.4)
        = 0    for r > λσ ,
where σ denotes the hard-core diameter and where λ > 1 is a measure for the range of the attractive
well. Another important potential is the hard-sphere potential
ϕ_HS(r) = ∞   for r < σ ,
        = 0   for r > σ ,    (13.5)
where σ is the hard-sphere diameter. The hard-sphere system does not contain any attraction, but does
describe the short-ranged atomic repulsions crudely. Neglecting attractions may seem unphysical at first
sight, but we will see that the hard-sphere fluid plays a crucial role as a reference zeroth-order approx-
imation in perturbation theory of liquids, where the attractions are treated as a small modification of
the purely repulsive short-ranged interactions. The hard-sphere fluid itself is also extremely interesting,
as it undergoes a fluid-solid phase transition that is exclusively driven by entropy. For this reason the
hard-sphere fluid has been of great theoretical importance. Moreover, due to advances in the synthesis
of colloidal particles — mesoscopic solid particles with diameter in the range from 1 nm to 1 µm —
experimental realizations of hard-sphere systems actually exist in the form of colloidal suspensions.
Figure 13.2 shows a plot of the pressure as a function of density at several temperatures; the curves
follow from Van der Waals' expression [Eq. (13.6)]. At sufficiently high temperatures, T > Tc, the
pressure increases monotonically with density, whereas at low enough temperatures, T < Tc , there is
a density regime with (∂p/∂ρ)T < 0. The critical isotherm, at temperature Tc , separates these two
regimes, and shows a point of inflection with zero slope at the critical density ρc . The critical point
(ρc , Tc ) follows from the conditions
(∂p/∂ρ)_{Tc} = 0   and   (∂²p/∂ρ²)_{Tc} = 0   =⇒   ρ_c b = 1/3   and   k_B T_c = 8a/(27b) .    (13.7)
This result will be worked out in detail in one of the exercises. As discussed in Chapter 6, a negative
slope in the isotherm p(ρ), i.e., a negative compressibility, signifies a thermodynamic instability. That
Figure 13.2: The reduced Van der Waals’ pressure p∗ = b2 p/a as a function of the reduced density ρb
for temperatures T above, at, and below the critical temperature Tc , as indicated by the labeling.
is, this is an example of an approximate free energy of the kind mentioned in that chapter. Critical exponents
can also be derived for the Van der Waals model, but we will not do so here.
We will now determine the Helmholtz free energy for this system. From p = −(∂F/∂V ) = −f +
ρ(∂f /∂ρ)T , with f = F/V the free-energy density, it follows from a straightforward integration that
f_VdW ≡ F_VdW/V = ρ k_B T [ log( ρΛ³/(1 − bρ) ) − 1 ] − aρ² ,    (13.8)
where the integration constant is chosen such that the ideal-gas free energy is obtained in the limit
ρ → 0. A plot of fVdW , in reduced units, is shown in Fig. 13.3.
Note that the convexity of the free energy is directly related to the slopes of the equation of state in
Fig. 13.2, since
(∂p/∂ρ)_T = (∂/∂ρ)[ −f + ρ (∂f/∂ρ)_T ]_T = ρ (∂²f/∂ρ²)_T .    (13.9)
A negative compressibility is therefore equivalent to a concave part in f (ρ). Recall from Chapter 6
that the set of points ρ(T ) for which (∂ 2 f /∂ρ2 )T = 0 is called the spinodal. The spinodal densities
at T = 0.85Tc are indicated by the arrows in Fig. 13.3. Following the common-tangent methodology
described in Chapter 6, the full phase diagram of this system can now be determined. This will be left
as an exercise for the reader.
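The critical-point conditions of Eq. (13.7) are also easy to verify numerically. The sketch below uses the Van der Waals pressure in the form of Eq. (13.42), p = k_B T ρ/(1 − bρ) − aρ², in units where k_B = 1; the starting guess is arbitrary.

import numpy as np
from scipy.optimize import fsolve

a, b = 1.0, 1.0

def dp_drho(rho, T):
    return T / (1 - b * rho)**2 - 2 * a * rho       # first derivative of p(rho)

def d2p_drho2(rho, T):
    return 2 * b * T / (1 - b * rho)**3 - 2 * a     # second derivative of p(rho)

rho_c, T_c = fsolve(lambda x: [dp_drho(*x), d2p_drho2(*x)], x0=[0.3, 0.3])
print("rho_c * b =", rho_c * b, " (expect 1/3)")
print("T_c       =", T_c, " (expect 8a/27b =", 8 * a / (27 * b), ")")

The same derivatives locate the spinodal at any T < T_c, which is one way to start the phase-diagram construction left as an exercise above.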
Figure 13.3: The reduced Van der Waals’ free-energy density f ∗ = b2 f /a (Λ = b1/3 ) as a function of the
reduced density ρb. The function is plotted for three temperatures T above (T = 1.15Tc ), at (T = Tc ),
and below (T = 0.5Tc ) the critical temperature Tc , as indicated using the labeling. The left-hand graph
shows the regular form of the reduced free-energy density. The positions of the coexistence points (black)
and the inflection points (gray) are indicated using points. The right-hand graph shows the same data,
but the reduced free-energy density is now shifted by the reduced critical pressure p∗c and a linear form
involving the reduced critical chemical potential µ∗c has been added. The latter makes it slightly easier
to see the common-tangent line for the sub-critical free energy. Note that this form implies that at the
critical point, the common-tangent coincides with the bρ-axis and that the critical point is located on
this axis, as indicated using ρ∗c .
with C_n(T) the coefficients to be determined. Note the subtle difference in the way we define f here
compared to Eq. (13.8). Recalling that p/(k_B T) = −f + ρ(∂f/∂ρ)_T, we have
(p − p_id)/(k_B T) = −f_ex + ρ (∂f_ex/∂ρ)_T = Σ_{n=2}^{∞} (n − 1) C_n(T) ρ^n ,    (13.11)
which in combination with the virial expansion for the pressure, Eq. (13.2), yields
C_n(T) = B_n(T)/(n − 1) ,   n ≥ 2.    (13.12)
The Helmholtz free energy can thus be written as
F/(V k_B T) = f = ρ log(ρΛ³) − ρ + B_2(T)ρ² + (B_3(T)/2) ρ³ + · · · .    (13.13)
Combining Eq. (13.14) with the Taylor expansion log(1 + x) = Σ_{n=1}^{∞} ((−1)^{n+1}/n) x^n, and collecting terms
with the same power of z̃, we can write
log Ξ = V Σ_{j=1}^{∞} b_j z̃^j ,    (13.16)
Inserting the expansion of Eq. (13.16) into the expressions for p(z̃, T ) = kB T (log Ξ)/V and N =
z̃(∂ log Ξ/∂ z̃)T = ρ(z̃, T )V yields
p(z̃, T) = k_B T Σ_{j=1}^{∞} b_j z̃^j ;    (13.21)
ρ(z̃, T) = (z̃/V) (∂ log Ξ/∂z̃)_T = Σ_{j=1}^{∞} j b_j z̃^j .    (13.22)
Now we have both p and ρ as a power series in z̃, whereas much experimental data involves the density
dependence of the pressure, e.g., see Eq. (13.2). It is our task now to eliminate z̃ between the Eqs. (13.21)
and (13.22). This can be accomplished algebraically by writing
z̃ = a_1 ρ + a_2 ρ² + a_3 ρ³ + · · · ,    (13.23)
where the yet unknown coefficients aj follow by inserting Eq. (13.23) into Eq. (13.22) and equating the
resulting coefficients of each power of ρ on both sides of the equation. This gives
a_1 = 1 ;    (13.24)
a_2 = −2b_2 ;    (13.25)
a_3 = −3b_3 + 8b_2² .    (13.26)
Higher-order terms become more complicated but are, in principle, tractable as well, though in practice
one would use a computer to keep track of all the coefficients. Inserting the density expansion of z̃, given by the
Eq. (13.23) and the above ai , into the fugacity expansion of p, Eq. (13.21), yields a density expansion
of the pressure as phenomenologically provided in Eq. (13.2), but now with explicit expressions for the
virial coefficients,
B_2(T) = −b_2 ;    (13.27)
B_3(T) = 4b_2² − 2b_3 .    (13.28)
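The series inversion just described can be delegated to a computer-algebra system. The following sympy sketch repeats the elimination of z̃ between Eqs. (13.21) and (13.22) to third order (taking b_1 = 1, consistent with a_1 = 1) and recovers Eqs. (13.25)-(13.28); it is only valid up to the order shown.

import sympy as sp

rho, z = sp.symbols("rho z")
b2, b3, a2, a3 = sp.symbols("b2 b3 a2 a3")

# Ansatz (13.23) with a1 = 1, truncated at third order.
z_of_rho = rho + a2 * rho**2 + a3 * rho**3

# rho(z) = z + 2*b2*z^2 + 3*b3*z^3 from Eq. (13.22); substitute and match powers of rho.
rho_of_z = z + 2 * b2 * z**2 + 3 * b3 * z**3
eq = sp.expand(rho_of_z.subs(z, z_of_rho) - rho)
sol = sp.solve([eq.coeff(rho, 2), eq.coeff(rho, 3)], [a2, a3])
print(sol)                            # expect a2 = -2*b2, a3 = 8*b2**2 - 3*b3

# beta*p = z + b2*z^2 + b3*z^3 from Eq. (13.21); express as a density series.
betap = sp.expand((z + b2 * z**2 + b3 * z**3).subs(z, z_of_rho).subs(sol))
print("B2 =", betap.coeff(rho, 2))    # expect -b2
print("B3 =", betap.coeff(rho, 3))    # expect 4*b2**2 - 2*b3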
B_2(T) = −b_2
       = −(1/(2V)) ∫ dr_1 dr_2 ( exp[−βϕ(r_12)] − 1 )
       = −(1/2) ∫ dr f(r) ,    (13.29)
where we used translational invariance of the pair interaction, ignored (small) surface effects that arise
when r 1 and/or r 2 are close to the wall of the container, and where we introduced the Mayer function
(named after the couple that first performed this analysis)
f(r) ≡ exp[−βϕ(r)] − 1 .    (13.30)
We have now expressed the lowest-order correction to the ideal-gas pressure in terms of the pair interac-
tion ϕ(r). The Mayer function is temperature dependent, and is shown in Fig. 13.4 for the Lennard-Jones
potential (13.3).
We first remark that f (r) for r < σ is rather insensitive to the details of ϕ(r); it equals −1 as long
as ϕ(r) ≫ kB T . At r ≈ σ, the Mayer function changes sign quite abruptly, goes through a positive
maximum and decays to zero as f(r) ≃ −βϕ(r) for r ≫ σ. The latter implies that B_2 exists, i.e., is finite,
if ϕ(r) decays to zero more rapidly than r^{−3}. This convergence criterion excludes application of the
Mayer theory to the important cases of Coulombic fluids (electrolytes, ionic liquids, plasmas, etc.), as well
as dipolar fluids (water, magnetic colloids). However, systems interacting through, e.g., Lennard-Jones,
Figure 13.4: Mayer function f (r) for the Lennard-Jones potential, at several values of the reduced
temperature (inverse thermal energy) βϵ.
square-well, hard-sphere, and screened-Coulomb potentials, can be treated within this framework. Note
that the temperature dependence of f (r) implies that B2 (T ) can change sign at the Boyle temperature
TB , i.e., B2 (TB ) = 0. At low temperatures T < TB , we have B2 (T ) < 0, signifying that the Van der
Waals attractions reduce the pressure with respect to the ideal-gas pressure. At temperatures T > TB,
the hard-core repulsions increase the pressure beyond the ideal-gas pressure.
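Evaluating B_2(T) for the Lennard-Jones potential, and locating the Boyle temperature, is a short numerical exercise. This is a minimal sketch in reduced units (σ = ϵ = 1); the integration cutoff and bracketing interval are ad hoc choices, and the printed numbers come from this sketch rather than from the notes.

import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def phi_lj(r):
    return 4.0 * (r**-12 - r**-6)

def B2(T):
    # B2 = 2*pi * int_0^inf dr r^2 (1 - exp[-phi(r)/T]), cf. Eqs. (13.29)-(13.30)
    integrand = lambda r: r**2 * (1.0 - np.exp(-phi_lj(r) / T))
    val, _ = quad(integrand, 1e-6, 50.0, limit=200)
    return 2.0 * np.pi * val

print("B2(T* = 1) =", B2(1.0))              # negative: attractions dominate
T_boyle = brentq(B2, 2.0, 6.0)
print("Boyle temperature T_B* ~", T_boyle)   # should come out close to 3.4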
B_3(T) = 4b_2² − 2b_3
       = −(1/(3V)) [ (Q_3 − 3Q_2 Q_1 + 2Q_1³) − 3Q_1^{−1}(Q_2 − Q_1²)² ]
       = −(1/(3V)) [ Q_3 − 3Q_1^{−1}Q_2² + 3Q_1 Q_2 − Q_1³ ]
       = −(1/(3V)) ∫ dr_1 dr_2 dr_3 ( exp[−β(ϕ_12 + ϕ_13 + ϕ_23)] − 3 exp[−β(ϕ_12 + ϕ_13)] + 3 exp[−βϕ_12] − 1 )
       = −(1/3) ∫ dr_1 dr_3 f(r_12) f(r_13) f(r_23) ,    (13.31)
3
where we used the shorthand notation ϕij = ϕ(rij ), the fact that Q1 = V , and that r i are dummy
integration variables. The product of three Mayer functions in Eq. (13.31) implies that B_3(T) involves
three particles, and since f(r_ij) vanishes when particles i and j are far apart, the product will vanish unless
all three particles are simultaneously close to each other.
points (representing coordinates r i , r j , etc.) and lines connecting points (representing f (rij )). Using
this we can write
B_2(T) = −(1/(2V)) ∫ dr_1 dr_2 [ two points connected by a line ] ;    (13.32)
B_3(T) = −(1/(3V)) ∫ dr_1 dr_2 dr_3 [ triangle: three points, all connected ] ;    (13.33)
B_4(T) = −(1/(8V)) ∫ dr_1 dr_2 dr_3 dr_4 [ 3 × (four-point ring) + 6 × (four-point ring with one diagonal) + (fully connected four-point diagram) ] .    (13.34)
It has been proven that all diagrams appearing in the integrand of the virial coefficients are doubly
connected, i.e., are still connected when any point and all of its associated lines are removed. This
means that diagrams like the following
[ three points connected in a chain, which falls apart when the middle point is removed ]
do not occur. We remark that in most of the literature the integral symbol and the integral measure
dr_1 · · · dr_n are dropped when the diagrams are defined, and often even the prefactors, such as the "3"
and the "6" in the expression for B_4, are absorbed in the definitions. We will not pursue this here.
The number of diagrams for Bn increases rapidly with n, e.g., 468 in B7 . For the hard-sphere potential
(diameter σ) B2 , B3 , B4 are known analytically,
2π 3
B2 = σ ≡ b0 ; (13.35)
3
2
5π 6 5
B3 = σ = b20 ; (13.36)
18 8
√ !
89 219 2 4131 1
B4 = − + + arccos √ b30 ; (13.37)
280 2240π 2240π 3
≃ 0.28695b30 , (13.38)
while higher-order terms have been calculated by various numerical techniques, e.g., by Monte Carlo
integration methods
B_5 = 0.1097(3) b_0⁴ ;    (13.39)
B_6 = 0.0386(4) b_0⁵ ;    (13.40)
B_7 = 0.0138(4) b_0⁶ .    (13.41)
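The truncated virial series for hard spheres is easily evaluated. The sketch below rewrites βp/ρ = 1 + Σ_n B_n ρ^{n−1} in terms of the packing fraction, using b_0 ρ = 4η and the coefficients of Eqs. (13.35)-(13.41); the chosen η values are illustrative.

import numpy as np

# B_n / b0^(n-1) for n = 2..7
coeffs = [1.0, 5.0 / 8.0, 0.28695, 0.1097, 0.0386, 0.0138]

def Z_virial(eta, order):
    """Compressibility factor beta*p/rho, truncated after B_order."""
    return 1.0 + sum(c * (4.0 * eta)**(n + 1)
                     for n, c in enumerate(coeffs[:order - 1]))

for eta in [0.1, 0.3, 0.5]:
    print(f"eta = {eta}:",
          [round(Z_virial(eta, order), 3) for order in range(2, 8)])

At η = 0.1 the successive truncations converge quickly; at η = 0.5 they are still drifting upward term by term, in line with the discussion of Fig. 13.5 below.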
The convergence of the hard-sphere virial expansion can be judged from Fig. 13.5, where the numbers
label the second, third, etc. virial approximation, and where the full curve is the ‘exact’ result, which
was obtained using Metropolis Monte Carlo simulations.
Obviously, the deviations between the exact equation of state and the truncated virial expansions become
more pronounced as the packing fraction η = (π/6)ρσ 3 increases, and the 7th-order expansion gives a
good account up to moderate packing fractions, say η ≃ 0.3. More recent studies have computed
Figure 13.5: Equation of state of the hard-sphere fluid as a function of the dimensionless density η =
(π/6)ρσ³ (the packing fraction). The red solid line is the 'exact' result from simulations.
coefficients up to the 12th order [R.J. Wheatley, Phys. Rev. Lett. 110, 200601 (2013)]. Nonetheless, we
may conclude from Fig. 13.5 that the virial expansion is not suitable to describe liquids quantitatively.
This is because a typical liquid density is η ≃ 0.5 and many terms will need to be computed at great
expense to describe this with sufficient accuracy, whilst there exist other, more efficient ways to treat
such dense fluids. Another reason why the virial expansion is cumbersome when applied to liquids is
that the virial coefficients are generally T -dependent, so that lengthy calculations must be performed
at many values of T. This contrasts with the hard-sphere case, where the virial coefficients are temperature
independent. For these reasons other methods have been devised to deal with dense liquids.
The virial expansion is, nevertheless, a valuable tool in the study of dilute or moderately dense gases.
For instance, much of our knowledge of atomic pair potentials stems from measurements of virial coef-
ficients. Although we focused on the application of the virial expansion to a one-component, classical,
monoatomic gas, it is possible to extend these results to more components and polyatomic molecules,
as well as to describe quantum effects. We will return to the former two in Chapter 17.
13.5 Exercises
Q75. Van der Waals Equation of State
The van der Waals equation of state p(ρ) is given by
p(ρ) = N k_B T/(V − Nb) − aρ² ,    (13.42)
where N is the number of particles, V is the volume, kB is Boltzmann’s constant, ρ is the density,
p is the pressure, and a, b are phenomenological constants.
(a) Explain in your own words what a and b are. Explain in what way this is a mean-field
approximation. Argue why both a and b are positive constants.
(b) Use Mathematica and plot the equation of state for different values of a and b. Notice that
for fixed a and b, the shape of the equation of state changes as the temperature changes.
(c) Write the isothermal compressibility in terms of the density ρ. Recall that the isothermal
compressibility is always positive in stable systems. Argue that this implies that there are T
where, for a range of densities, the equation of state describes an ‘unstable’ system.
(d) Using your observations from part (c), argue that the critical point can be determined from
setting
(∂p/∂ρ) = 0   and   (∂²p/∂ρ²) = 0 .    (13.43)
(e) Show that at the critical point
k_B T_c = 8a/(27b)   and   ρ_c b = 1/3 .    (13.44)
(f) Use the isothermal compressibility to determine the spinodals. Note: You may wish to use
Mathematica to solve this.
(g) Determine the free energy assuming that in the limit of ρ → 0 it is that of the ideal gas.
(h) Let a = 1 and b = 1, then plot the free energy for various temperatures. Identify in each plot
if there is a coexistence region or not.
(i) Using the free energies from part (h), determine the phase diagram for this system.
Q76. Dieterici's Equation of State
Dieterici's equation of state reads p = [k_B T ρ/(1 − bρ)] exp[−aρ/(k_B T)],
where p is the pressure, ρ = N/V with N the number of particles and V the volume, and a and b
are coefficients that set the properties of the gas.
(a) Show that in the low-density limit Dieterici’s equation of state reduces to that of the Van der
Waals gas and provide the expression for the third virial coefficient.
(b) Interpret the ‘improvement’ made to the Van der Waals gas by the exponential term on the
right-hand side of Dieterici’s equation of state equation using a few words.
(c) Sketch three isotherms of p as a function of v = V /N for the Dieterici’s equation of state
above, at, and below the critical temperature Tc , respectively, indicating which is which.
Label your axes and indicate the spinodal points.
(d) Show that kB Tc = a/(4b), the critical pressure is given by pc = a/(4b2 e2 ), and critical volume
per particle is vc = 2b. Here kB denotes the Boltzmann constant.
(e) Describe in a few words what happens with κT when you cross a spinodal curve for T < Tc .
(f) Introduce t = k_B(T − T_c) and expand the compressibility around t = 0 and v = v_c. What is
the value of the critical exponent γ, i.e., κ_T ∝ |t|^{−γ}? Describe in a few words what feature
of the system leads to this behavior.
(a) Define the reduced second virial coefficient of the Lennard-Jones (LJ) fluid as B ∗ = B2LJ /B2HS ,
where B2LJ,HS is the second virial coefficient of the LJ, HS system with the same σ. Give an
expression for B^*(T^*) as a function of the reduced temperature T^* = k_B T/ϵ. The (dimen-
sionless) integral cannot be calculated analytically, but is easily evaluated numerically.
(b) The pair potential ϕ(r) must decay to zero sufficiently fast with increasing r for B2 to be
finite. Show, for the case that ϕ(r) ∝ r−n for r → ∞, that B2 only exists provided n > d
with d the spatial dimensionality. Does B2LJ exist in d = 3? Does B2 exist for Coulomb
interactions in d = 3?
Chapter 14
Classical Dense Fluids
The simplest way to take into account the effect of interactions between the particles, is through a
low-density (virial) expansion such as in Eq. (13.2). In terms of the pair potential such an approach
yields for the second virial coefficient
B_2(T) = (1/2) ∫ dr ( 1 − exp[−βϕ(r)] ) ,    (14.1)
but expressions for higher-order coefficients, say beyond B5 or so, become hopelessly complicated. In
fact, analytic calculations of B3 are already far from trivial. The rapidly increasing complexity of higher-
order virial coefficients prohibits practical applications of the virial expansion to dense fluids such as
molecular liquids. In fact, the problem is not only practical but also fundamental, as it is not guaranteed
that the radius of convergence of the virial series is sufficiently enough to include the high densities of
interest. Moreover, the virial expansion does not take into account the true nature of a liquid, in which
each molecule constantly interacts strongly with all its neighbors. That is, this route approximates
the liquid as ‘small groups’ of particles in vacuum, rather than as a small group embedded in a liquid
comprised of other similar molecules.
In order to obtain better expressions for the thermodynamics of dense fluids, we have to explicitly take
into account the structure of the fluid. The structure of the fluid is captured by correlation functions
such as the structure factor and nth-order real-space correlation functions, with the second-order being
the pair-correlation function that we encountered in Chapter 11. In this chapter, we will examine these
correlation functions, and using Ornstein-Zernike theory, we will explore the behavior of dense fluids.
Scattered radiation in a particular direction θ, with outgoing wavevector ko , satisfies |ko | = k for
Figure 14.1: Schematic setup of a scattering experiment, indicating incoming and outgoing wave vectors
ki and ko , respectively, as well as the scattering angle θ.
elastic scattering. Consider now the path-length difference ∆s = s2 − s1 between the path “source →
R → detector”, with R an arbitrary point in the sample, and the path “source → r → detector”, where
r is the position of a scattering particle. It follows from the geometry that ks1 = ki · (r − R) and
ks2 = ko · (r − R). From this the phase difference ∆ψ of the two paths at the detector is obtained as
∆ψ = 2π∆s/λ = k∆s = (k_o − k_i) · (r − R) = q · (r − R) ,    (14.2)
where we defined the momentum transfer q ≡ ko − ki in the scattering process. The contribution from
this particle to the field amplitude A at the detector is proportional to exp[i∆ψ]. The total amplitude
at the detector is given by the contribution from all particles, and can be written as
A(θ) ∝ Σ_{j=1}^{N} exp[iq · (r_j − R)] ,    (14.3)
for some static configuration of N particles in the irradiated volume. The measured intensity is the
ensemble or time average of the squared modulus of the amplitude, and can be written as
I(θ) = ⟨|A(θ)|²⟩ ∝ ⟨ Σ_{j,k} exp[iq · r_jk] ⟩ ∝ S(q) ,    (14.4)
where we define the structure factor
S(q) ≡ (1/N) ⟨ Σ_{j,k}^{N} exp[iq · r_jk] ⟩ = 1 + (1/N) ⟨ Σ_{j≠k}^{N} exp[iq · r_jk] ⟩ .    (14.5)
Note that q is directly related to θ through q ≡ |q| = 2k sin(θ/2) from elementary geometry. It is
implicitly assumed that the fluid of interest is homogeneous and isotropic, so that only the modulus q
is relevant and not the direction of q. A typical liquid structure factor is shown in Fig. 14.2.
Figure 14.2: Structure factor of a Lennard-Jones fluid close to its triple point (ρσ 3 = 0.844, kB T /ϵ =
0.72), and that of a hard-sphere fluid close to freezing (η = 0.495), as obtained from computer sim-
ulations. Image adapted from [J.P. Hansen and I.R. McDonald, Theory of Simple Liquids, Academic
Press, 2nd ed. (1990)].
which are called the one-particle distribution and the pair distribution function, respectively. Higher-
order distributions can be defined accordingly, but we do not need them here because we restrict attention
to pairwise interactions. The angular brackets in Eqs. (14.6) and (14.7) denote an ensemble average,
either canonical or grand canonical. ρ(1) (r) is a measure for the probability density that a particle is
present at position r. Because of the normalization ∫ dr ρ^{(1)}(r) = N, we see that ρ^{(1)}(r) is the local
density, and equals ρ = N/V in a homogeneous bulk system. ρ^{(2)}(r, r′) is called the pair distribution
function, and is a measure for the probability that there is a particle at position r and another one at
r ′ simultaneously. Within the canonical ensemble, we obtain from Eq. (14.7) that
ρ^{(2)}(r, r′) = (1/Q_N) ∫ dr^N exp[−βΦ(r^N)] Σ_{i=1}^{N} Σ_{j≠i}^{N} δ(r − r_i) δ(r′ − r_j)
            = (N(N − 1)/Q_N) ∫ dr^N exp[−βΦ(r^N)] δ(r − r_1) δ(r′ − r_2)
            = (N(N − 1)/Q_N) ∫ dr_3 · · · dr_N exp[−βΦ(r, r′, r_3, · · · , r_N)] .    (14.8)
At sufficiently long distances |r − r ′ | these probabilities become uncorrelated, and we have ρ(2) (r, r ′ ) →
ρ(1) (r)ρ(1) (r ′ ). In isotropic, homogeneous systems such as liquids and gases we can use translational
invariance to define the radial distribution function g(r) by
ρ^{(2)}(r, r′) = ρ² g(|r − r′|) .
Note that ρg(r) is the average particle density at a distance r from a fixed particle. Also note that
limr→∞ g(r) = 1. For systems with pairwise additive interactions, the thermodynamics follows com-
pletely from g(r), that is from g(r; ρ, T ) in the canonical ensemble or from g(r; µ, T ) in the grand
canonical ensemble. There are three independent routes from g(r) to thermodynamics, viz.
1. Virial route:
p = ρ k_B T − (ρ²/6) ∫ dr r ϕ′(r) g(r) ;    (14.11)
2. Caloric route:
E/V = (3/2) ρ k_B T + (ρ²/2) ∫ dr ϕ(r) g(r) ;    (14.12)
3. Compressibility route:
k_B T (∂ρ/∂p)_T = 1 + ρ ∫ dr [ g(r) − 1 ] .    (14.13)
The virial and caloric route follow straightforwardly from the canonical partition function of a pairwise
additive system, as will be shown in one of the problems. The compressibility route is necessarily derived
grand canonically, and follows directly from the normalization ∫ dr dr′ ρ^{(2)}(r, r′) = ⟨N(N − 1)⟩ = ⟨N²⟩ − ⟨N⟩.
Knowledge of g(r) does not only lead to the thermodynamics of the fluid, but also to the structure
factor S(q) (that can be measured in scattering experiments). The relation between S(q) and g(r) is
obtained as follows
S(q) = 1 + (1/N) ⟨ Σ_{i≠j} exp[iq · r_ij] ⟩                                   [Eq. (14.5)]
     = 1 + (1/(N Q_N)) ∫ dr^N exp[−βΦ(r^N)] Σ_{i≠j} exp[iq · r_ij]
     = 1 + (1/N) ∫ dr_1 dr_2 exp[iq · r_12] (N(N − 1)/Q_N) ∫ dr_3 · · · dr_N exp[−βΦ(r^N)]
     = 1 + (1/N) ∫ dr_1 dr_2 exp[iq · r_12] ρ^{(2)}(r_1, r_2)                  [Eq. (14.8)]
     = 1 + ρ ∫ dr exp[iq · r] g(r) ,    (14.14)                               [Eq. (14.10)]
i.e., S(q) is essentially the Fourier transform of g(r). Since g(r) approaches unity for large r it is
convenient to rewrite Eq. (14.14) as
S(q) = 1 + ρ ∫ dr exp[iq · r] [ g(r) − 1 ] + (2π)³ ρ δ(q) ,    (14.15)
where the last term is irrelevant as long as the scattering angle θ, and hence the scattering vector q, do
not vanish. Clearly, we can also invert Eq. (14.14) with the result
ρ g(r) = (1/(2π)³) ∫ dq [ S(q) − 1 ] exp[iq · r] ,    (14.16)
which can be used to deduce g(r) from a measurement of S(q). Typical g(r)s for dense fluids are shown
in Fig. 14.3.
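For a homogeneous, isotropic fluid the angular part of the transform in Eq. (14.15) can be done analytically, leaving S(q) = 1 + (4πρ/q) ∫_0^∞ dr r sin(qr) [g(r) − 1]. The sketch below evaluates this radial transform numerically for a tabulated g(r); the model g(r) used here is a crude placeholder (zero inside the core, a damped oscillation outside), not data from the notes.

import numpy as np

def structure_factor(r, g, rho, q):
    """Radial Fourier transform of h(r) = g(r) - 1 on a given r-grid (trapezoid rule)."""
    h = g - 1.0
    q = np.atleast_1d(q)
    integrand = r[None, :] * np.sin(q[:, None] * r[None, :]) * h[None, :]
    vals = 0.5 * (integrand[:, 1:] + integrand[:, :-1]) * np.diff(r)[None, :]
    return 1.0 + 4.0 * np.pi * rho / q * vals.sum(axis=1)

r = np.linspace(1e-3, 20.0, 4000)
g_model = np.where(r < 1.0, 0.0,
                   1.0 + np.exp(-(r - 1.0)) * np.cos(7.0 * (r - 1.0)))
q = np.linspace(0.5, 25.0, 200)
S = structure_factor(r, g_model, rho=0.8, q=q)

The inverse route, Eq. (14.16), is implemented in exactly the same way with the roles of r and q interchanged.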
This name stems from the fact that the gradient −∇i w(r12 ) is the average force f (r12 ) acting on particle
1, keeping 1 and 2 fixed and averaging over all the others,
∇_1 w(r_12) = −(k_B T / g(r_12)) ∇_1 g(r_12)                                                   [Eq. (14.17)]
           = − [ ∫ dr_3 · · · dr_N exp[−βΦ(r^N)] (−∇_1 Φ(r^N)) ] / [ ∫ dr_3 · · · dr_N exp[−βΦ(r^N)] ] ;    [Eq. (14.8)]
which is a measure for the ‘influence’ of molecule 1 on molecule 2 a distance r12 away. In 1914, Ornstein
and Zernike proposed to split this influence into two contributions, a direct part and an indirect part.
The direct contribution is defined to be given by what is called the direct correlation function, denoted
c(r12 ). The indirect part is due to the direct influence of molecule 1 on a third molecule, labeled 3, which
in turn influences molecule 2, directly and indirectly. Clearly, this indirect effect must be weighted by
the density of particle 3, and averaged over all its possible positions. Mathematically this decomposition
can be written as
h(r_12) = c(r_12) + ρ ∫ dr_3 c(r_13) h(r_32) ,    (14.20)
Figure 14.3: Radial distribution functions g(r). (left) Dense hard-sphere fluid at volume fraction η =
0.49 showing excellent agreement between exact simulation results and the theoretical result based on
the Percus-Yevick closure. (right) Radial distribution function of triple-point liquid Argon (85 K) as
measured by neutron scattering. The ripples at small r are artifacts of the data analysis. The large
number of oscillations in g(r) is indicative of a dense liquid close to its triple point. Images adapted from [J.P.
Hansen and I.R. McDonald, Theory of Simple Liquids, Academic Press, 2nd ed. (1990)].
which is called the Ornstein-Zernike (OZ) equation. It can be viewed as the defining equation for the
direct correlation function c(r). One may also argue, however, that we have rewritten a function we wish
to calculate, h(r), in terms of another function that we do not know, c(r). In that sense Eq. (14.20) can
be viewed as a single equation with two unknowns, which can only be solved provided another relation
between c(r) and h(r) is given. Such an additional relation is called the closure. The power of the
decomposition given by Ornstein and Zernike is that approximate closures can be given, that allow for
explicit calculation of c(r) and h(r) at a given density and temperature.
Before discussing an example of such a closure, we remark that the OZ equation (14.20) can be rewritten
in terms of the Fourier transforms ĥ(q) and ĉ(q) of h(r) and c(r), respectively, as
ĉ(q) = ĥ(q)/(1 + ρ ĥ(q))   and   ĥ(q) = ĉ(q)/(1 − ρ ĉ(q)) .    (14.22)
From Eqs. (14.15) and (14.22) we find that ĉ(q) is related to the structure factor through
S(q) = 1/(1 − ρ ĉ(q)) .    (14.23)
A very successful and relatively simple closure is named after Percus and Yevick (PY). It consists of the
approximation that
c(r) ≈ g(r) ( 1 − exp[+βϕ(r)] )   (PY) .    (14.24)
The physical motivation behind the PY closure is as follows. First we note that the direct correlation c(r)
can be seen as the difference between the (total) pair correlation g(r) ≡ exp[−βw(r)], see Eq. (14.17),
and an indirect term g_ind(r). This indirect term, which is defined as g_ind(r) = g(r) − c(r), is now
approximated as g_ind(r) = exp[−β(w(r) − ϕ(r))] = g(r) exp[+βϕ(r)], i.e., as the Boltzmann factor of
w(r) − ϕ(r). Equation (14.24) then follows readily. Keeping in mind that w(r) is the (total) potential
of mean force, while ϕ(r) is the pair potential, this approximation indeed captures the idea that the
indirect contribution is not due to direct pair interactions. Other, more technical motivations for the
PY closure can also be given, e.g., in terms of ignoring some subclasses of diagrams in the diagrammatic
expansion of c(r), but this is beyond the present goals.
The OZ equation (14.20) with the PY closure (14.24) constitutes two independent equations that can
be solved for the two unknown functions h(r) and c(r), at least in principle. In practice this can often
only be done numerically, but for the important case of the hard-sphere potential (13.5) analytic results
have been found. It is instructive to consider the PY closure (14.24) for the hard-sphere case explicitly,
as it can be rewritten as
g(r) = 0 for r < σ   and   c(r) = 0 for r > σ ,    (14.25)
where σ is the hard-sphere diameter. This closure reflects the idea that for hard spheres the direct
correlation c(r) indeed vanishes if there is no direct hard-core overlap (this is an approximation, however).
Independently from each other, Wertheim and Thiele showed, in 1963, that the PY closure to the OZ
equation of a hard-sphere fluid (diameter σ) at the dimensionless density η = (π/6)ρσ³ (i.e., the packing
fraction) yields for the direct correlation function
c(r) = [ −(1 + 2η)² + 6η (1 + η/2)² (r/σ) − (η/2) (1 + 2η)² (r/σ)³ ] / (1 − η)⁴   for r < σ ,
c(r) = 0   for r > σ .    (14.26)
This can be analytically Fourier transformed, from which the structure factor follows using Eq. (14.23).
Unfortunately, g(r) cannot be written down analytically, but a numerical Fourier transform of S(q) is
straightforward, and from Eq. (14.14) g(r) follows. The result is in very good agreement with computer
simulations of g(r) of hard spheres for 0 < η < 0.5. This is illustrated in the left-hand panel to Fig. 14.3.
Since the hard-sphere fluid freezes at η ≈ 0.494, the PY closure is accurate in the whole fluid regime of
hard spheres. In one of the problems we will calculate that the explicit form (14.26) for c(r) leads to
the pressure pc via the compressibility route (14.13), and to pv via the virial route (14.11), where
pc 1 + η + η2 pv 1 + 2η + 3η 2
= and = . (14.27)
ρkB T (1 − η)3 ρkB T (1 − η)2
The difference between these two expressions increases with increasing η, but both give good account
of the pressure that results from simulations. The (slight) inconsistency that results from the different
routes is due to the PY approximation; an exact theory would lead to fully consistent thermodynamics.
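The route from Eq. (14.26) to the structure factor, mentioned above, is easy to follow numerically: Fourier transform the analytic c(r) radially and insert the result into Eq. (14.23). A minimal sketch in units σ = 1; the printed q values are arbitrary examples.

import numpy as np
from scipy.integrate import quad

def c_py(r, eta):
    # PY direct correlation function of Eq. (14.26), sigma = 1
    l1 = -(1.0 + 2.0 * eta)**2 / (1.0 - eta)**4
    l2 = 6.0 * eta * (1.0 + 0.5 * eta)**2 / (1.0 - eta)**4
    l3 = -0.5 * eta * (1.0 + 2.0 * eta)**2 / (1.0 - eta)**4
    return np.where(r < 1.0, l1 + l2 * r + l3 * r**3, 0.0)

def S_py(q, eta):
    rho = 6.0 * eta / np.pi
    chat, _ = quad(lambda r: r * np.sin(q * r) * c_py(r, eta), 0.0, 1.0)
    chat *= 4.0 * np.pi / q              # radial Fourier transform of c(r)
    return 1.0 / (1.0 - rho * chat)      # Eq. (14.23)

for q in [2.0, 7.0, 2.0 * np.pi]:
    print(f"S(q = {q:.2f}) at eta = 0.49 :", round(S_py(q, 0.49), 3))

A numerical inverse transform of S(q), as in Eq. (14.16), then produces the g(r) shown in the left-hand panel of Fig. 14.3.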
It turns out that the linear combination pCS = (2pc + pv )/3, which is named after Carnahan and
Starling, is indistinguishable from the simulations up to η = 0.5,
p_CS/(ρ k_B T) = (1 + η + η² − η³)/(1 − η)³ .    (14.28)
In one of the problems it will be worked out that the Helmholtz free energy, FCS , that follows from pCS
reads
F_CS/(N k_B T) = log(ρΛ³) − 1 + (4η − 3η²)/(1 − η)² ,    (14.29)
where the first two terms are ideal gas terms, see Eq. (4.3), and the last one the excess term due to the
hard-sphere interactions.
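The two PY routes of Eq. (14.27) and the Carnahan-Starling interpolation of Eq. (14.28) are compared in the short sketch below, written as compressibility factors βp/ρ; the chosen packing fractions are arbitrary.

def Z_c(eta):   # PY, compressibility route, Eq. (14.27)
    return (1 + eta + eta**2) / (1 - eta)**3

def Z_v(eta):   # PY, virial route, Eq. (14.27)
    return (1 + 2 * eta + 3 * eta**2) / (1 - eta)**2

def Z_cs(eta):  # Carnahan-Starling, Eq. (14.28)
    return (1 + eta + eta**2 - eta**3) / (1 - eta)**3

for eta in [0.1, 0.3, 0.45]:
    print(f"eta = {eta}: Z_c = {Z_c(eta):.3f}, Z_v = {Z_v(eta):.3f}, "
          f"Z_CS = {Z_cs(eta):.3f}, (2*Z_c + Z_v)/3 = {(2*Z_c(eta) + Z_v(eta))/3:.3f}")

The last two columns coincide, which is just the statement p_CS = (2p_c + p_v)/3, and the spread between Z_c and Z_v grows with η, quantifying the thermodynamic inconsistency of the PY approximation.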
Because a typical triple point density, ρtr , of a simple fluid like argon satisfies ρtr σ 3 ≈ 1, it is interesting
to compare the triple-point structure with that of hard spheres at ρσ 3 ≃ 1, i.e., η ≃ 0.5. Figure 14.3
shows such a comparison: the structure of a real dense fluid (argon) is qualitatively similar to that
of a dense hard-sphere fluid! This implies that the fluid structure is mainly determined by the short-
ranged repulsions, while the attractions hardly affect the high-density structure. This notion is a crucial
ingredient of the perturbation theory to be discussed in the next section.
We consider a Hamiltonian of the form (3.1), and decompose the interaction part Φ, formally, into a
reference part Φ0 and a perturbation Φ1 . At this stage the decomposition is arbitrary, but a typical
one would be to include the repulsions into Φ0 and the attractions into Φ1 . We define the auxiliary
Hamiltonian
H_λ(Γ) = Σ_{i=1}^{N} p_i²/(2m) + Φ_0(r^N) + λΦ_1(r^N) ≡ H_0(Γ) + λΦ_1(r^N) ,    (14.30)
where H0 is the reference Hamiltonian, and λ ∈ [0, 1] a coupling constant or a switching parameter that
switches Hλ from the reference Hamiltonian at λ = 0 to the Hamiltonian of interest at λ = 1. The
Helmholtz free energy Fλ of the system with Hamiltonian Hλ can be written as
exp[−βF_λ(N, V, T)] = (1/(N! h^{3N})) ∫ dΓ exp[−βH_λ(Γ)] = (1/(N! Λ^{3N})) ∫ dr^N exp[−βΦ_0(r^N) − βλΦ_1(r^N)] .    (14.31)
Taking the derivative with respect to λ on both sides of Eq. (14.31), and rearranging terms gives
∂F_λ(N, V, T)/∂λ = [ ∫ dr^N Φ_1(r^N) exp[−β(Φ_0 + λΦ_1)] ] / [ ∫ dr^N exp[−β(Φ_0 + λΦ_1)] ] ≡ ⟨Φ_1⟩_λ ,    (14.32)
where the angular brackets ⟨·⟩λ denote the canonical ensemble average of systems with Hamiltonian Hλ .
Using Eq. (14.32), the Helmholtz free energy of interest can be written
F(N, V, T) = F_0(N, V, T) + ∫_0^1 dλ ⟨Φ_1⟩_λ ,    (14.33)
with F0 the free energy of the reference system. Note that Eq. (14.33) is an exact result. Thermodynamic
perturbation theory is based on a λ-expansion of the integrand of Eq. (14.33) about λ = 0. It follows
Let us now focus on the case that Φ(r N ) can be written as a sum of pair potentials, with
Φ_0(r^N) = Σ_{i<j}^{N} ϕ_0(r_ij)   and   Φ_1(r^N) = Σ_{i<j}^{N} ϕ_1(r_ij) ,    (14.36)
F(N, V, T) = F_0(N, V, T) + (Vρ²/2) ∫_0^1 dλ ∫ dr g_λ(r) ϕ_1(r)
           = F_0(N, V, T) + (Vρ²/2) ∫_0^1 dλ ∫ dr [ g_0(r) + λ g_0′(r) + · · · ] ϕ_1(r)
           = F_0(N, V, T) + (Vρ²/2) ∫ dr [ g_0(r) + (1/2) g_0′(r) + · · · ] ϕ_1(r) ,    (14.38)
where g0′ (r) = (∂gλ (r)/∂λ)λ=0 .
The crucial point of the perturbation theory for a dense liquid, such as argon near its triple point, i.e., a liquid
described by a pair potential of the form shown in the right-hand panel of Fig. 14.3, is that its radial
distribution function g(r) hardly differs from that of the corresponding hard-sphere fluid, see the left-
hand panel of that same figure. This implies that a decomposition in which Φ0 is taken to be the (hard-sphere-like) repulsion
is such that g0′(r) (and also higher-order derivatives with respect to λ) is small. Consequently, the free
energy is given accurately by first-order perturbation theory about the hard-sphere reference fluid, viz.
F(N,V,T) = F_{\rm HS}(N,V,T) + \frac{V\rho^2}{2}\int dr\, g_{\rm HS}(r)\,\big[\phi(r) - \phi_{\rm HS}(r)\big];
         = F_{\rm HS}(N,V,T) + \frac{V\rho^2}{2}\int_{r>\sigma_{\rm HS}} dr\, g_{\rm HS}(r)\,\phi(r),   (14.40)
where we used that gHS (r) = 0 for r < σHS , the hard-sphere diameter. One can now use the Carnahan-
Starling free energy of Eq. (14.29) as a very accurate representation of FHS , and the PY radial distri-
bution for gHS (r), to describe the thermodynamics of dense fluids quantitatively correctly. Note that
there is some freedom to choose σHS .
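To make the first-order recipe of Eq. (14.40) concrete, here is a minimal numerical sketch. It assumes a square-well attraction outside the hard core and crudely replaces gHS(r) by its low-density limit of 1; a quantitative calculation would instead use the PY result for gHS(r). Quantities are in units of σHS and kBT, and the parameter values are illustrative only.

```python
# Sketch of first-order perturbation theory, Eq. (14.40):
#   F ≈ F_HS + (V rho^2 / 2) ∫_{r>sigma} dr g_HS(r) phi(r).
# Illustrative only: g_HS(r) ≈ 1 outside the core (its low-density limit),
# and phi(r) = -eps for sigma < r < lam*sigma (square-well attraction).
import numpy as np

def f_excess_cs(eta):
    """Carnahan-Starling excess free energy per particle, beta F_exc / N, Eq. (14.29)."""
    return (4*eta - 3*eta**2) / (1 - eta)**2

def beta_f_perturbation(eta, beta_eps=1.0, lam=1.5):
    """First-order correction per particle, beta F_1 / N = (rho/2) ∫ dr g_HS(r) beta*phi(r)."""
    rho = 6*eta/np.pi                              # number density for sigma = 1
    r = np.linspace(1.0, lam, 2000)
    integrand = 4*np.pi*r**2 * 1.0 * (-beta_eps)   # g_HS ≈ 1, beta*phi = -beta_eps
    return 0.5 * rho * np.trapz(integrand, r)

for eta in (0.1, 0.2, 0.3, 0.4):
    print(f"eta = {eta:.1f}:  beta F_exc,HS/N = {f_excess_cs(eta):5.2f}, "
          f"first-order correction = {beta_f_perturbation(eta):6.2f}")
```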
14.5 Exercises
Q81. Refresher on Fourier Transforms
The mathematics of this chapter relies heavily on Fourier transforms. In this exercise, you will
consider several identities that are useful to the calculations that follow. We define the forward
and backward Fourier transforms as
\hat f(\omega) = \int dr\, f(r)\, e^{-i\omega\cdot r} \qquad\text{and}\qquad f(r) = \frac{1}{(2\pi)^d}\int d\omega\, \hat f(\omega)\, e^{i\omega\cdot r},   (14.41)
where d is the dimension of the vectors r and ω.
(a) What are the Fourier transforms of: (i) The Gaussian function in 1D, exp(−αx2 ) with α > 0
a constant? (ii) The Dirac distribution in 3D, δ(r − r 0 ) with r 0 a fixed point? (iii) The
Heaviside function θ(x), with θ(x) = 0 for x < 0 and θ(x) = 1 for x > 0? (iv) The 1D
function xn f (x) with f (x) a sufficiently smooth function that is itself Fourier transformable?
Note: One of these is a trick question.
(b) What is the Fourier representation of the gradient of a function f (r) with r a three-dimensional
vector. And what is it for the Laplacian?
(c) Show that the convolution of two Fourier-transformable functions,
h(x) = (f \ast g)(x) \equiv \int dy\, f(y)\, g(x - y),   (14.42)
transforms into the product of the individual Fourier transforms, ĥ(ω) = f̂(ω) ĝ(ω).
The Ornstein-Zernike equation h(r12) = c(r12) + ρ ∫ dr3 c(r13) h(r32), as given in Eq. (14.20), relates
h(r) to the direct correlation function c(r) of a fluid at density ρ. Here rij = |ri − rj|. We define
the Fourier transform ĥ(q) = ∫ dr exp(iq · r) h(r), and likewise for ĉ(q).
At higher densities a more accurate expression for c(r) is provided by PY theory, see Eq. (14.26),
which results in g(r) as plotted in Fig. 14.1, giving quantitative agreement with ‘exact’ simulation
results at densities as high as η = 0.49.
(a) How is the number density ρ (the number of spheres per volume) related to the packing
fraction η?
(b) The equation of state for the hard sphere fluid is approximately
\frac{P_{\rm liq} V}{N k_B T} = \frac{1+\eta+\eta^2-\eta^3}{(1-\eta)^3}.   (14.46)
What is the corresponding free energy? Hint: At very low packing fraction the hard sphere
liquid acts like an ideal gas.
(c) The hard sphere solid is a face-centered-cubic crystal. The equation of state of the hard
sphere solid is well approximated by
\frac{P_{\rm sol} V}{N k_B T} = \frac{3}{1-x} - 0.5921\,\frac{x - 0.7072}{x - 0.601},   (14.47)
where
x = \frac{6\eta}{\pi\sqrt{2}}.   (14.48)
The excess free energy at a packing fraction η = 0.545 is βFex /N = 5.91889. Note: the excess
free energy is the difference between the free energy of the system and the free energy of an
ideal gas. What is the free energy as a function of η?
(d) Using the free energies you just calculated, what equations would you need to solve to predict
the coexistence between the liquid and solid in this system?
(e) Using the result of part (d), what is the region of phase coexistence between the fluid and
the solid as a function of η?
(f) What is the pressure at coexistence?
(g) Which phase has the higher entropy at coexistence?
(h) Sketch the resulting phase diagram.
Q87. Equation of State for a Square-Well Fluid
Consider a system with a square-well (SW) pair-interaction potential given by:
\phi_{\rm SW}(r) = \begin{cases} \infty & r < \sigma \\ -\epsilon & \sigma \le r \le \lambda\sigma \\ 0 & r > \lambda\sigma \end{cases},   (14.49)
with λ > 1 a measure for the range of the attraction (strength ϵ) and r the center-of-mass
separation between the particles. The equation of state (EoS) can be determined using a virial
expansion at low densities. A more precise high-density EoS can be derived from the virial route.
(a) Show that the pressure of the SW fluid is given by
\beta p_{\rm SW} = \rho - \frac{\beta\rho^2}{6}\int_V dr\, r\,\phi_{\rm SW}'(r)\, g_{\rm SW}(r).   (14.50)
with ρ the particle number density, gSW (r) the SW pair-correlation function and ϕ′SW (r) the
derivative w.r.t. r of the pair potential.
(b) Demonstrate that you can write:
where Θ(x) is the Heaviside function, which has the property Θ(x < 0) = 0 and Θ(x > 0) = 1.
(c) Introduce e(r) = exp [−βϕSW (r)] and y(r) = gSW (r)/e(r) and use these functions to rewrite
Eq. (14.50) in terms of these functions. Show that
\beta p_{\rm SW} = \rho + \frac{2\pi\sigma^3}{3}\rho^2\big[\exp(\beta\epsilon)\,y(\sigma) - \lambda^3\big(\exp(\beta\epsilon)-1\big)\,y(\lambda\sigma)\big],   (14.52)
and use this to obtain
\beta p_{\rm SW} = \rho + \frac{2\pi\sigma^3}{3}\rho^2\big[g_{\rm SW}(\sigma^+) + \lambda^3\big(g_{\rm SW}(\lambda\sigma^+) - g_{\rm SW}(\lambda\sigma^-)\big)\big],   (14.53)
with the plus and minus signs indicating the direction in which the limit to the value is
taken, i.e., from above and below, respectively.
Obtaining good contact values for gSW (λσ + ) is in general difficult. The square-well EoS can
instead (more readily) be obtained by thermodynamic integration with respect to the hard-sphere
(HS) reference. For this reference system, the reduced free energy is given by
f_{\rm HS}(\eta) \equiv \frac{\pi\sigma^3}{6}\,\beta\,\frac{F_{\rm HS}(\eta)}{V} = \eta\log\eta - \eta + \eta\,\frac{4\eta-3\eta^2}{(1-\eta)^2},   (14.54)
in the Carnahan-Starling approximation, where we have ignored some constant offset.
(e) Give an expression for ϕ∗ (r) which defines the perturbative pair potential by which the system
goes from HS to SW interactions.
(f) Derive that the thermodynamic perturbation of the HS fluid can be written as
f_*(\eta) = \frac{3\beta}{\pi\sigma^3}\,\eta^2\int_{r>\sigma} dr\, g_{\rm HS}(r)\,\phi_{\rm SW}(r),   (14.55)
where gHS (r) is the HS pair-distribution function and fSW (η) = fHS (η) + f∗ (η).
(g) Show that that Eq. (14.55) may be approximated as f∗ (η) ≈ −12βϵη 2 gHS (σ + ) when λ ≈ 1.
(h) State in a few words how you would obtain the contact value of the pair-distribution function
gHS (σ + ) for hard spheres at low densities.
Our rewrite may not seem to have solved much, as we still require the contact value of gHS .
However, in the Carnahan-Starling approximation
g_{\rm HS}(\sigma^+) \approx \frac{1}{2}\,\frac{2-\eta}{(1-\eta)^3}.   (14.56)
(i) Determine the differential equation for the reduced equation of state β(πσ³/6)pSW in terms of
fSW(η) as a function of η. Do not plug in expressions!
(j) Taylor approximate fSW(η) up to O(η³) and show that in this low-density approximation
β(πσ³/6)pSW ≈ η + 4(1 − 3βϵ)η² holds. Can the system phase separate?
(k) Sketch a temperature T , density ρ, and coupling constant λ diagram, and explain using the
diagram what the purpose of thermodynamic perturbation theory can be in this regard.
(a) Calculate the Helmholtz free energy F0 (N, T ) of this system. Does F depend on the Ri ?
(b) How does F depend on the value of C? For which values of C does the free energy diverge?
Does that make sense physically?
We are now interested in the free energy F(N, V, T) of a crystalline phase formed by hard spheres.
We assume that the system is described by a Hamiltonian H = Σ_{i=1}^{N} p_i²/2m + Σ_{i<j}^{N} ϕ(r_ij). Note
that the particles are in principle free to move at low density. However, in the solid phase, they
may be assumed to be effectively bound to cages through constraints imposed by their neighbors.
The particles can rattle around in their cages about a well-defined center.
(c) Use the Einstein crystal as a reference point to define a difference potential by which you can
carry out thermodynamic integration on the system. Name your coupling constant λ. What
is an appropriate choice for the center points of the harmonic wells of the Einstein crystal?
Why is the hard-sphere interaction problematic in setting up the difference potential? Provide
your answers using only a few words.
(d) Suppose we remove the hard-sphere interaction and we physically connect the particles to
their closest lattice site. Explain in a few words what happens when λ ↓ 0?
(e) If we keep the hard-sphere interaction, explain in a few words how you can effectively remove
the particle interactions by appropriately choosing the value of C (assume a non-close-packed density).
(f) Perform thermodynamic integration and show that
F = F_0 + \int_0^1 d\lambda\,\langle\Phi_1\rangle_\lambda.   (14.57)
What is Φ1 in this context? Your answer should follow from the insights you have gained
above. What is the implicit assumption you make in setting F0 to your result from (a)?
Chapter 15

Gas-Liquid Interfaces and Classical Nucleation Theory
In the case of first-order phase transitions, we discovered that such systems have coexistences — for
instance, in an N V T ensemble, we can have part of our system in the liquid phase and part of it in a
gas phase; assuming we are in the coexistence region. However, we have completely neglected to discuss
the interface between these two regions. In Exercise Q40 of Chapter 6, we argued that for a system in
the thermodynamic limit, the surface free energy did not contribute to the free energy per particle in
the system. However, there are cases where it can matter. In this chapter, we will examine the surface
free energy for a gas-liquid interface, typically referred to as the surface tension. We then build upon
this to discuss what happens inside the region of the phase diagram between the binodal and spinodal
curves. This is the domain where there is an activation cost for going from one phase to another, as we
have argued in Chapter 6. What happens within the spinodal region will also be touched upon, but a
full theoretical description thereof goes beyond the scope of this course.
where pcoex and µcoex are the coexistence pressure and chemical potential, and where µ = (∂f /∂ρ) and
p = −f + ρ(∂f /∂ρ) = −f + µρ.
Here, we consider a fluid of volume V with overall density ρ ∈ [ρg , ρl ] at T < Tc , such that it has phase
separated into a gas phase (at high altitudes z) and a liquid phase (at lower altitudes), separated by a
planar interface of area A. We are interested in the density profile ρ(z) and the surface tension γ. The
In the case of a planar interface, the grand potential can be expanded in the gradients and higher-
order derivatives of the density profile ρ(z). By symmetry there is no contribution from terms linear
in ρ′ (z) ≡ ∂ρ(z)/∂z, and the lowest-order nontrivial expansion is the square-gradient expression, where
the grand potential of a system with a density profile ρ(z) is approximated by
\Omega = A\int_{-\infty}^{\infty} dz\,\Big[-W\big(\rho(z)\big) + \frac{m}{2}\,\rho'(z)^2\Big],   (15.5)
where m is a phenomenological prefactor that characterizes the stiffness of the interface: if m is large,
then there is a large free-energy penalty for density gradients. Our task is now to find the profile ρ(z)
that minimizes Eq. (15.5) for a given bulk gas-liquid coexistence characterized by µ, p, ρg, and ρl.
If one is familiar with functional differentiation one directly sees that the minimization condition implies
that ρ(z) must satisfy the condition
m\,\frac{\partial^2\rho(z)}{\partial z^2} = -\frac{\partial W(\rho(z))}{\partial\rho(z)}.   (15.6)
This equation is analogous to the equation of motion of a particle at position x(t) at time t in a potential
V (x), for which the equation of motion (Newton’s law) reads ẍ(t) = −V ′ (x).
Those who are not so familiar with functional differentiation can arrive at Eq. (15.6) by discretizing the
z-interval into equidistant points zk separated from each other by ∆z, such that
" 2 #
X m ρk+1 − ρk−1
Ω≃A ∆z −W (ρk ) + , (15.7)
2 2∆z
k
where ρk = ρ(zk ). Expression (15.7) is a function of a large discrete set of variables {ρk }, and we wish
to find the minimum of this expression. This requires that the derivative w.r.t. any of the ρk's must
vanish, i.e. ∂Ω/∂ρi = 0 for all i. This leads to
0 = -\frac{\partial W(\rho_i)}{\partial\rho_i} + \frac{m}{2}\,\frac{2}{(2\Delta z)^2}\Big[(\rho_i-\rho_{i-2}) - (\rho_{i+2}-\rho_i)\Big];
  = -\frac{\partial W(\rho_i)}{\partial\rho_i} - m\,\frac{\rho_{i+2}+\rho_{i-2}-2\rho_i}{(2\Delta z)^2},   (15.8)
Figure 15.1: Sketch of Eq. (15.14) for a set of density differences with the density profile ρ given as a
function of z, both in arbitrary units.
This analysis shows that ξ is the typical thickness of the meniscus, the length scale on which the crossover
takes place from the bulk gas phase to the bulk liquid phase. Since (ρl − ρg ) → 0 upon approaching the
critical point, one sees that the interface thickness then diverges; a sketch is provided in Fig. 15.1.
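A minimal numerical illustration of the discretized minimization, Eqs. (15.7)-(15.8), is sketched below. It assumes a simple double-well form for −W(ρ), with arbitrary amplitude and bulk densities, and relaxes an initial step profile by gradient flow until Eq. (15.6) is satisfied; the converged profile has the shape sketched in Fig. 15.1, and Eq. (15.15) then yields an estimate of γ.

```python
# Sketch: relax the discretized square-gradient functional, Eqs. (15.7)-(15.8).
# Assumption: -W(rho) = a*(rho - rho_g)**2 * (rho - rho_l)**2 up to an
# irrelevant constant, with illustrative values for a, m, rho_g, rho_l.
import numpy as np

rho_g, rho_l = 0.1, 0.7        # bulk gas and liquid densities (arbitrary units)
m, a = 1.0, 50.0               # interface stiffness and double-well amplitude (assumed)
z = np.linspace(-5.0, 5.0, 401)
dz = z[1] - z[0]

rho = np.where(z < 0, rho_g, rho_l).astype(float)   # step-profile initial guess, ends pinned

def dW_drho(rho):
    """d W / d rho for the assumed double-well grand-potential density."""
    return -2*a*(rho - rho_g)*(rho - rho_l)*((rho - rho_l) + (rho - rho_g))

dt = 0.2 * dz**2 / m           # stable step for the explicit relaxation
for _ in range(20000):
    lap = np.zeros_like(rho)
    lap[1:-1] = (rho[2:] - 2*rho[1:-1] + rho[:-2]) / dz**2
    rho[1:-1] += dt * (m*lap[1:-1] + dW_drho(rho[1:-1]))   # gradient flow toward Eq. (15.6)

gamma = m * np.sum(((rho[1:] - rho[:-1]) / dz)**2) * dz     # gamma = m * integral of rho'^2, Eq. (15.15)
i10 = np.argmax(rho > rho_g + 0.1*(rho_l - rho_g))
i90 = np.argmax(rho > rho_g + 0.9*(rho_l - rho_g))
print(f"10-90% interface width ~ {z[i90] - z[i10]:.2f}")
print(f"surface-tension estimate gamma = {gamma:.4f}")
```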
The surface tension, the free energy cost of creating the interface per unit surface area, can be calculated
by evaluation of the minimal grand potential, which yields
\Omega = A\int dz\,\Big[-W(\rho(z)) + \frac{m}{2}\rho'(z)^2\Big] = A\int dz\,\big[-p + m\,\rho'(z)^2\big];
       = -pV + A\,m\int dz\,\rho'(z)^2 \equiv -pV + \gamma A,   (15.15)
Note that the surface tension is positive by definition. Negative surface tensions are not permitted in
equilibrium, because the system would be able to lower its free energy arbitrarily by creating increasing
amounts of surface area. This is, to an extent, what happens within the spinodal region of the phase
diagram, where the system is unconditionally unstable. In this context, note that Eq. (15.16) implies
that γ → 0 upon approaching the critical point. This makes sense, as the system at the critical point is
characterized by scale invariance with patches of either phase being present, and thus ‘interfaces’ on all
length scales. In this context, phase-field models (relying on a free-energy density that has perturbative
gradient contributions) can be used to study the behavior of such systems. One of the more well-known
examples of this is the Cahn-Hilliard model.
Lastly, we note that you may have previously encountered the surface tension in a mechanical context,
in the Young-Laplace equation, which relates the pressure difference across an interface to its curvature. The above result, derived
from a statistical mechanics perspective, is exactly the surface tension that enters the Young-Laplace
equation. For strongly curved interfaces, i.e., very small droplets, corrections to the planar value of γ need to be made, which
requires the introduction of the Tolman length.
Classical nucleation theory (CNT) concerns itself with the description of these small nuclei — also
sometimes referred to as condensates — of a daughter phase in the parent phase. Within the framework
of CNT it is also possible to predict the rate at which critical nuclei are formed; these are the nuclei
that are of exactly the size required to cross the nucleation threshold or nucleation barrier and grow
out to a bulk daughter phase. Before we proceed, we should remark on one additional concept. CNT in
its simplest form concerns itself with homogeneous nucleation, that is, nucleation that occurs in the
bulk of the parent phase. This is generally not the most prevalent or favorable form of nucleation, as
heterogeneous nucleation has a lower nucleation barrier: ice tends to form on the edges of a container
being cooled, rather than somewhere in the middle of the fluid. Or recall the example of Chapter 6,
where crystallization was induced in an undercooled liquid by inserting a needle.
Note that the pressures in the parent and nucleating phase are not the same; in fact, the pressure in
the nucleus is higher. This might seem counterintuitive, as in bulk coexistence, there is mechanical
equilibrium, which implies equal pressure between the coexisting phases. However, if you have studied
the Young-Laplace equation, you may recall that the presence of interfacial tension induces a Laplace
pressure (pressure difference) across any interface that is curved. Many textbooks make the connec-
tion between nucleation and the Gibbs free energy, which requires a Legendre transform of the grand
potential in Eq. (15.18). However, you should exercise caution in reading such sources, as there are
underlying assumptions that are not necessarily transparently presented. We refer the interested reader
to W.W. Mullins [J. Chem. Phys. 81, 1436 (1984)] for a more detailed and correct derivation of CNT,
which is slightly involved, but that also covers crystal nucleation.
Examining the shape of Eq. (15.18) reveals that for small radii the surface term dominates the volume
term, meaning that the formation of the daughter phase is not favored; ∆Ω > 0. For large radii the
bulk (volume) term dominates, meaning that the nucleus can grow out, because ∆Ω < 0. These two
behaviors are separated by an effective barrier to growth, the location of which can be computed by
taking the derivative with respect to R, that is
\left.\frac{\partial\Delta\Omega}{\partial R}\right|_{R=R_c} = 0,   (15.19)
where Rc is the critical radius. However, be careful, γ is also dependent on R in this case. If one
ignores this dependence, one readily finds Rc = 2γ/∆P . This should be familiar, as it is simply
an expression for the Laplace pressure. Plugging Rc into Eq. (15.18), gives the nucleation barrier
∆Ωc ≡ ∆Ω(Rc ) = 16πγ 3 /(3∆P 2 ). Clearly, the probability to form a nucleus of this size in the parent
phase is given by pc ∝ exp(−β∆Ωc ). Simulation studies have shown that this functional form indeed
describes the probability of cluster sizes well. It is possible to determine the rate of nucleation under
the assumption of single-particle attachment to a growing cluster in the parent phase. This makes use
of both reaction rates and aspects of chemical equilibria as described in Chapter 5, but goes beyond the
scope of these notes.
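A small numerical sketch of the homogeneous barrier is given below. It assumes the standard CNT form ∆Ω(R) = −(4π/3)∆P R³ + 4πγR², which reproduces the quoted Rc = 2γ/∆P and ∆Ωc = 16πγ³/(3∆P²); the values of γ and ∆P are arbitrary illustrative numbers.

```python
# Sketch: homogeneous-nucleation barrier, assuming the standard CNT form
#   dOmega(R) = -(4*pi/3) * dP * R**3 + 4*pi * gamma * R**2,
# which gives R_c = 2*gamma/dP and dOmega_c = 16*pi*gamma**3/(3*dP**2).
import numpy as np

gamma = 1.0      # surface tension (k_B T per unit area, illustrative)
dP    = 0.5      # pressure difference driving growth (illustrative)

def delta_omega(R):
    return -4*np.pi/3 * dP * R**3 + 4*np.pi * gamma * R**2

R_c = 2*gamma/dP
barrier = 16*np.pi*gamma**3 / (3*dP**2)
print(f"critical radius      R_c      = {R_c:.3f}")
print(f"nucleation barrier   dOmega_c = {barrier:.3f}  (direct: {delta_omega(R_c):.3f})")
print(f"Boltzmann factor  exp(-beta dOmega_c) = {np.exp(-barrier):.3e}")
```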
We will finish this short excursion into CNT by considering why heterogeneous nucleation occurs more
readily. Assume we have a flat wall with surface area A, which is in contact with our metastable phase,
wherein nuclei are forming. We assume that the shape of the nucleus is a spherical cap1 , such that its
volume is given by
V_{\rm d} = \frac{\pi h^2}{3}\,(3r - h),   (15.20)
with r the sphere radius and h the height of the top of the sphere cap above the plane. It will prove
useful to express h = r(1 − cos θ), with θ the contact angle, as shown in Fig. 15.2.
Figure 15.2: Cross-sectional sketch of a heterogeneous nucleation process taking place at a flat wall
(grey). The nucleus (dark blue) is a spherical cap with radius r (red arrow) and contact angle θ, as
indicated by the angle in green. The pressures Pp and Pn of the metastable parent phase (light blue)
and the nucleus-forming daughter phase are indicated, as well as the various surface tensions γ.
The area between the parent and daughter phase is now given by Apn = 2πrh and the area of
the nucleus in contact with the wall is given by Anw = π(2rh − h²). Assume that the total volume of
the system is given by V and that the parent phase has pressure Pp , the daughter phase has a pressure
Pn , the surface tension between the wall and parent phase is γpw , and similarly γnw and γpn denote the
respective daughter-wall and parent-daughter surface tensions. Then the total grand potential of the
system is now given by
Ωtot = (Pn − Pp)Vn + (A − Anw)γpw + Anw γnw + Apn γpn,   (15.21)
1 This assumption is justified as a spherical cap minimizes the surface area between the parent and daughter phase as
well as the surface contact area and the three-phase contact line length.
such that the grand-potential difference with respect to the pure parent phase in contact with the wall
is given by
We now have an equation for ∆Ω in two (constrained) unknowns: r and 0 ≤ h ≤ 2r, or equivalently r and
0 ≤ θ ≤ π. We now wish to eliminate one of the variables in order to proceed with our argument. Note
that we could naively minimize ∆Ω with respect to both variables (under the assumption of constant
γs). However, it is advantageous to eliminate h at a given constant droplet volume, i.e., as we vary h or
θ we must also vary r to ensure a constant volume. It is convenient to perform this minimization using
θ and we write
\Delta\Omega = \frac{\pi r^3}{3}\,\Delta P\,(2 - 3\cos\theta + \cos^3\theta) + \pi r^2\,(\gamma_{nw} - \gamma_{pw})(1-\cos^2\theta) + 2\pi r^2\gamma_{pn}(1-\cos\theta).   (15.23)
Here, the last two terms on the right-hand side constitute the surface contribution. We can now write
this contribution in terms of dr and dθ. Similarly, we can write dVd = 0, since we impose a constraint
on the volume. The differential for the constraint can be solved for dr and substituted back into the
surface term. After some straightforward, but involved algebraic manipulation, we recover the relation
\cos\theta = \frac{\gamma_{pw} - \gamma_{nw}}{\gamma_{pn}}.   (15.24)
Note that this is simply the Young equation, which specifies the force balance on the contact line due to
the difference in surface tensions between three phases that meet there. In this context, θ is the contact
angle and is a three-phase material property. In other words, at a constant droplet volume, the shape
of the droplet is prescribed by energetic considerations involving the surface, for a given θ and γpn .
Substituting the Young equation into Eq. (15.23), we finally obtain for fixed-volume clusters (the free
parameter is r) that the grand-potential difference for heterogeneous nucleation is given by
\Delta\Omega_{\rm het} = \frac{\pi r^2}{3}\,\big(2 - 3\cos\theta + \cos^3\theta\big)\,\big(\Delta P\, r - 3\gamma_{pn}\big).   (15.25)
We can now compare this to the difference incurred in homogeneous nucleation, which is simply given
by evaluating ∆Ωhet for θ = π:
\Delta\Omega_{\rm hom} = \frac{4\pi r^2}{3}\,\big(\Delta P\, r - 3\gamma_{pn}\big).   (15.26)
The normalized difference — recall that ∆Ωhom > 0 up to and (to an extent) beyond the critical radius
— is thus
\frac{\Delta\Omega_{\rm het} - \Delta\Omega_{\rm hom}}{\Delta\Omega_{\rm hom}} = -\frac{1}{4}\,(2 - \cos\theta)(1 + \cos\theta)^2 \le 0,   (15.27)
where the last inequality obviously holds for all θ. We conclude that heterogeneous nucleation is a favored
pathway to nucleation, except in the limit of θ = π. This limit corresponds to a fully hydrophobic surface
in the case of water. Therefore, nucleation of water droplets occurs readily on hydrophilic surfaces.
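The reduction of the barrier by the wall depends only on the contact angle. The short sketch below tabulates the ratio ∆Ωhet/∆Ωhom = 1 − (2 − cos θ)(1 + cos θ)²/4 that follows directly from Eq. (15.27) for a few contact angles.

```python
# Sketch: wall-induced reduction of the nucleation barrier, following Eq. (15.27).
import numpy as np

for theta_deg in (0, 30, 60, 90, 120, 150, 180):
    c = np.cos(np.radians(theta_deg))
    ratio = 1.0 - (2.0 - c)*(1.0 + c)**2 / 4.0
    print(f"theta = {theta_deg:3d} deg:  dOmega_het / dOmega_hom = {ratio:.3f}")
```

The ratio interpolates between 0 (perfect wetting, θ = 0) and 1 (no barrier reduction, θ = π), consistent with the conclusion in the text.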
15.3 Exercises
Q89. Classical Nucleation Theory
In this exercise, we examine the derivation of Eq. (15.18). Our starting point is Eq. (15.17).
(a) Explain using only a few words why it is possible to split the grand potential into three
components.
(b) Use properties of the grand potential to introduce the pressure terms Pn and Pp into Ωtot .
You will need to remove a constant term to obtain the result in Eq. (15.18), why is this
permitted? Explain using a few words.
(c) Use the definition of the surface tension to rewrite the term Ωs (µ, A, T ). Here, we assume
that this term is only dependent on A, while in the derivation of γ we found that the interface
has a finite width. What is assumed about the interface? Explain using a few words.
(d) Use a Legendre transform to convert the grand-potential difference to a Gibbs free-energy
difference. What do you realize from this?
(e) The difference in pressures ∆P is often converted into a difference in chemical potential
Vn ∆P = Nn [µp (Pp ) − µn (Pp )]. This expression can be obtained by assuming the equation
of state in the nucleus is approximately linear between the coexistence pressure and current
pressure. Show this.
Be careful with your result from (e), it assumes that the nucleating vapor is incompressible, which
may not be an accurate description.
Q90. Classical Nucleation in 2D
Here we use a simple 2D system to examine why nuclei are typically round.
(a) Assuming a surface tension given by γ, which is independent of size, and pressure difference
∆P , calculate the free-energy barrier for nucleation in a two-dimensional system assuming
that the nucleus that forms in the system is circular. What is the critical radius and the
maximum barrier height?
(b) Calculate the barrier properties for nucleation in a 2D system assuming instead that the
nucleus which forms is a square. What is the critical side length of the square? What is the
maximum barrier height?
(c) Assuming that the kinetic prefactor κ is the same for both the square and circular nuclei,
which nucleus has the faster nucleation rate? Use this to explain why the nuclei in most systems
are circular (or spherical in 3D).
Q91. Cylindrical Nucleation
A crystalline nucleus might form in a sample of needles at sufficient density. Provide the general
expression for the Gibbs free energy difference and explain what the parameters represent that
you need to specify in order to obtain a nucleation barrier. Derive an expression for the size of the
critical nucleus and nucleation barrier, under the assumption that the nucleus is a cylinder with
radius r/2 and height r.
Q92. Classical Nucleation Theory with a Seed
Here we will rewrite classical nucleation theory with a “seed” crystal and see the effect this has on
the nucleation barriers. For a seed, assume that we have a single crystal plane of diameter RS as
shown in Figure 15.3. Note that the seed is in exactly the same phase as the solid so we will not
consider interactions between the seed and the nucleating crystal. However, the seed does change
the geometry of the nucleation. How does the volume of the nucleus depend on RS ? The area?
What is the resulting free energy barrier? Plot the free energy for different values of the surface
tension and the chemical potential difference between the crystal and fluid. How does the seed
change the free energy barriers? Note: Approximate all shapes as parts of spheres, not ellipses.
Figure 15.3: Illustration of nucleation with a seed, inspired by a similar figure from [Hermes et al., Soft
Matter 7, 4517 (2011)].
Chapter 16

Multi-Component Fluids and Colloidal Suspensions
Important classes of systems in physical chemistry and biology are solutions and suspensions. They usu-
ally consist of solvent (e.g., water, alcohol, cyclohexane) and solute (e.g., ions in ionic solutions, polymers
in polymeric solutions, and colloidal particles in colloidal suspensions). In many cases these systems are
essentially classical — quantum mechanical effects can be ignored — and their thermodynamic prop-
erties can be studied from modifications and extensions of the statistical mechanics of one-component
fluids discussed thus far. Such modifications and extensions are the topic of this chapter.
We distinguish two approaches. The first one is a direct generalization of the single-component theory
discussed in the previous chapter; it treats all chemical components in the system on the same footing.
For that reason it is most applicable to mixtures of fairly “similar” species, e.g., Argon-Neon mixtures
or salty water (e.g., a mixture of Na+ and Cl− in water). The second approach we discuss is more
readily applicable to very asymmetric mixtures, e.g., mesoscopic colloidal particles or macromolecules
in a microscopic solvent. In this approach, we determine an effective Hamiltonian of the mesoscopic
particles by integrating out the microscopic degrees of freedom in a suitably-defined partition function.
This formalism will be worked out explicitly for a noninteracting (ideal) “solvent”. This example has
direct relevance for understanding some colloid-polymer mixtures, since polymers can be modeled as
objects that can interpenetrate, i.e., an ideal gas.
Z(\{N\}, V, T) = \int \prod_{i=1}^{s}\frac{dr^{N_i}_{(i)}}{N_i!\,\Lambda_i^{3N_i}}\;\exp\!\big[-\beta\Phi(r^{N_1}_{(1)},\cdots,r^{N_s}_{(s)})\big],   (16.1)
It is possible to extend the low-density virial expansion for the pressure and the Helmholtz free energy,
as developed for one-component systems, to mixtures. In terms of the densities ρi = Ni /V and the De
Broglie wavelength Λi of species i, as defined in Eq. (3.16), one obtains
\frac{F}{V k_B T} = \sum_{i=1}^{s}\rho_i\big[\log(\rho_i\Lambda_i^3) - 1\big] + \sum_{i,j=1}^{s} B_2^{(ij)}(T)\,\rho_i\rho_j + \frac{1}{2}\sum_{i,j,k=1}^{s} B_3^{(ijk)}(T)\,\rho_i\rho_j\rho_k + \cdots,   (16.3)
where the second virial coefficients are given in terms of the pair interactions ϕ(ij) (r) between species i
and j by
B_2^{(ij)}(T) = -\frac{1}{2}\int dr\,\Big[\exp\!\big(-\beta\phi^{(ij)}(r)\big) - 1\Big].   (16.4)
This is a straightforward generalization of Eqs. (13.10) and (13.29) for one-component fluids. Higher-
order virial coefficients can also be generated accordingly. The first term of Eq. (16.3) is the ideal gas
contribution, the higher order terms are due to interactions.
One can also introduce pair correlation functions g (ij) (r) analogously to the one-component case, and
obtain the Helmholtz free energy F from a coupling constant integration after splitting Φ = Φ0 + Φ1
into a reference part Φ0 and an excess part Φ1 ,
F = F_0 + \frac{V}{2}\sum_{i,j=1}^{s}\rho_i\rho_j\int_0^1 d\lambda\int dr\,\phi_1^{(ij)}(r)\,g_\lambda^{(ij)}(r),   (16.5)
where F0 is the free energy of the s-component reference system, and where ϕ1^{(ij)}(r) is the perturbation
of the pair interaction between particles of species i and j with respect to the reference interaction.
The two expressions above are useful provided the mixed chemical species are not too
asymmetric, e.g., for a mixture of Argon and Neon, or for a (dilute) mixture of colloidal particles with
similar diameters.
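As a concrete illustration of Eq. (16.4), the sketch below evaluates the second virial coefficients of a binary hard-sphere mixture, for which the integral reduces to B2^(ij) = (2π/3)σij³ with σij = (σi + σj)/2; the diameters and densities used are illustrative values only.

```python
# Sketch: mixture second virial coefficients, Eq. (16.4), for additive hard spheres,
# where B2^(ij) = (2*pi/3) * sigma_ij**3 with sigma_ij = (sigma_i + sigma_j)/2.
import numpy as np

sigma = np.array([1.0, 1.2])            # diameters of species 1 and 2 (illustrative)
rho   = np.array([0.10, 0.05])          # number densities in units of sigma_1^-3 (illustrative)

sigma_ij = 0.5*(sigma[:, None] + sigma[None, :])
B2 = 2*np.pi/3 * sigma_ij**3            # hard-sphere B2^(ij)

# pair contribution to beta*F/V in the expansion of Eq. (16.3)
pair_term = np.einsum('i,j,ij->', rho, rho, B2)
print("B2^(ij) matrix:\n", B2)
print(f"sum_ij B2^(ij) rho_i rho_j = {pair_term:.4f}")
```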
To ease the notation, from here on we will consider instead of an s-component system, a 2-component
system of solutes (species 1, e.g., colloids or macromolecules) and a solvent (species s). It is straight-
forward to add more colloidal and solvent components later on, if desired. The role of the solvent can
also be played by any other chemical component that we wish to integrate out, e.g., ions in the case of
charged colloids or depleting polymers in colloid-polymer mixtures.
with p the total pressure, µ the chemical potential of the colloids, S the entropy, and ⟨Ns ⟩ the (average)
number of solvent molecules in the volume V . Note that in the thermodynamic limit we may write Ns
for ⟨Ns ⟩, which we will do henceforth. Denoting the coordinates of the colloids by {R}, and those of
the solvent molecules by {r}, the partition sum of this semi-grand or osmotic ensemble can be written
as
\exp[-\beta\Omega] = \sum_{N_s=0}^{\infty}\exp[\beta\mu_s N_s]\,Z(N, N_s, V, T);
\overset{(16.1)}{=} \frac{1}{N!\,\Lambda_1^{3N}}\int dR^N\;\underbrace{\sum_{N_s=0}^{\infty}\frac{\exp[\beta\mu_s N_s]}{N_s!\,\Lambda_s^{3N_s}}\int dr^{N_s}\exp[-\beta\Phi(\{R\},\{r\})]}_{\displaystyle\equiv\,\exp[-\beta\Phi^{\rm eff}(\{R\};\mu_s,T)]};
= \frac{1}{N!\,\Lambda_1^{3N}}\int dR^N\,\exp\!\big[-\beta\Phi^{\rm eff}(\{R\};\mu_s,T)\big],   (16.7)
where we defined the effective interaction Φeff between the colloids. The effective interactions consist of
direct interactions, i.e., interactions that would be present between the colloid particles in vacuum, and
of solvent-mediated interactions (that depend parametrically on µs and T ). This can be made explicit
by writing the interaction Hamiltonian as
Φ({R}, {r}) = Φ11 ({R}) + Φ1s ({R}, {r}) + Φss ({r}), (16.8)
where Φ11 denotes the bare colloid-colloid interactions, Φ1s the colloid-solvent interactions, and Φss the
solvent-solvent interactions. With this splitting of terms, which is completely general, we can write
\exp\!\big[-\beta\Phi^{\rm eff}(\{R\};\mu_s,T)\big] = \exp[-\beta\Phi_{11}(\{R\})]\;\underbrace{\sum_{N_s=0}^{\infty}\frac{\exp[\beta\mu_s N_s]}{N_s!\,\Lambda_s^{3N_s}}\int dr^{N_s}\exp\!\big[-\beta\Phi_{1s}(\{R\},\{r\}) - \beta\Phi_{ss}(\{r\})\big]}_{\displaystyle\equiv\,\exp[-\beta W(\{R\};\mu_s,T,V)]}
= \exp\!\big[-\beta\Phi_{11}(\{R\}) - \beta W(\{R\};\mu_s,T,V)\big].   (16.9)
In words, the effective colloidal interactions consist of bare interactions Φ11 and solvent-induced or
solvent-mediated interactions W . In fact, one sees from its definition that W is the grand potential of
the inhomogeneous solvent in the external field of the configuration {R} of the colloids, and exp[−βW ]
is the grand partition function of that system.
It is, of course, a gigantic problem to actually calculate W and hence Φeff = Φ11 + W , but the structure
of Eq. (16.9) suggests the following scheme to calculate W . First, consider the case that no colloidal
particles are present, N = 0. In that case the system is a one-component system consisting of solvent (at
chemical potential µs ) only, and W ≡ −p0 (µs , T )V with p0 the pressure of that system. Note that p0 is
the pressure of the pure solvent reservoir at one side of the membrane. Next, consider the case that only
1 colloidal particle is present, at position R1 . Then W = −p0 V + w1 (µs , T ), where w1 is, by definition,
the grand-potential excess of the solvent due to that solute particle. It incorporates entropic effects due
to the restructuring of the solvent close to the colloidal surface, and energetic effects due to attractions
and/or repulsions of the solvent molecules close to the colloidal surface. If w1 ≫ kB T the solvent is a poor
solvent for the colloids, and most likely the colloids will prefer to reside at the solvent meniscus or at the
walls of the container. If w1 ≪ −kB T the solvent quality is good. Regardless of the sign and magnitude,
due to translational invariance w1 is independent of R1 (for large enough V). Finally, consider the case
of two colloidal particles, at positions R1 and R2 . Then W = −p0 V + 2w1 + w2 (|R1 − R2 |; µs , T ),
which defines the solvent-induced pair interaction between the colloids. Note that limr→∞ w2 (r) = 0 by
construction. Extending this reasoning to the N -colloid system yields
W(\{R\};\mu_s,T) = -p_0(\mu_s,T)\,V + N\,w_1(\mu_s,T) + \sum_{i<j}^{N} w_2(R_{ij};\mu_s,T) + \sum_{i<j<k}^{N} w_3(R_{ijk};\mu_s,T) + \cdots,   (16.10)
where Rijk is short for the triangle-coordinates of the three colloids i, j, k, and where the dots represent
solvent-induced four-and-higher-body interactions. Note that we have not explicitly calculated p0 , w1 ,
w2 etc. in terms of Φ1s and Φss , we have just split up the grand potential of the inhomogeneous solvent
in terms of zero-, one-, two-body contributions, etc. But once they have been calculated, preferably by
someone else, one can write the effective interactions as
\Phi^{\rm eff}(\{R\};\mu_s,T) = -p_0(\mu_s,T)\,V + N\,w_1(\mu_s,T) + H(\{R\};\mu_s,T),   (16.11)
where we split off the coordinate-independent terms −p0V + Nw1 from the coordinate-dependent terms
that we collectively call H({R}). Before we carry on, we should note that the main advantage of the series
in Eq. (16.10) over a virial expansion is that the series is convergent and the approximation becomes
better with the addition of higher-order terms.
We are now ready to evaluate the thermodynamic potential of interest, Ω(N, µs , V, T ). Insertion of
Eq. (16.11) into Eq. (16.7) yields
\Omega(N,\mu_s,V,T) = -p_0(\mu_s,T)\,V + N\,w_1(\mu_s,T) + A(N,V,T;\mu_s),   (16.12)
where A is defined by
\exp[-\beta A] = \frac{1}{N!\,\Lambda_1^{3N}}\int dR_1\cdots dR_N\,\exp[-\beta H(\{R\})].   (16.13)
In other words, A is the Helmholtz free energy of the canonical system of N “dressed colloids” in a volume
V at temperature T interacting with the Hamiltonian H({R}; µs, T). Note that Eqs. (16.12) and (16.13)
are exact, i.e., it is not an approximation to view the colloid-solvent mixture as a one-component system
of colloids. Of course it will generally be difficult to calculate H exactly, and approximations will need
to be made. Typically, one ignores the induced triplet terms w3 and higher-body interactions, and
often even the calculation of w2 involves drastic approximations. For colloid-polymer mixtures, with
ideal polymers, an exact calculation is possible, as we will see in one of the problems.
The pressure p = −(∂Ω/∂V )N,T,µs of the suspension, and the chemical potential µ = (∂Ω/∂N )V,T,µs of
the colloids can be written from Eq. (16.13) as
p = p_0(\mu_s, T) + \Pi(\rho,\mu_s,T);   (16.14)
\mu = w_1(\mu_s, T) + \mu'(\rho,\mu_s,T),   (16.15)
where the osmotic pressure Π is the excess pressure (over that of the solvent reservoir) due to the
presence of colloids,
\Pi = -\left(\frac{\partial A}{\partial V}\right)_{N,\mu_s,T}.   (16.16)
Here, ρ = N/V is the colloid density. Note that the osmotic pressure is the pressure of the effective
colloids-only system, with interaction Hamiltonian H, i.e., the pressure one would calculate if one had
ignored the solvent from the start. The corresponding excess chemical potential is defined by
\mu' = \left(\frac{\partial A}{\partial N}\right)_{V,\mu_s,T}.   (16.17)
Note that the total chemical potential µ is shifted by an amount w1 with respect to µ′ , due to the
interactions with the solvent. As one is often interested in the effect of increasing colloid density for a
given solvent at a given temperature, i.e., at fixed T and µs , one can view w1 as an arbitrary offset that
need not be calculated or determined. The pressure of the solvent reservoir, p0 , can be treated similarly.
Of course it is still a big challenge to actually calculate w1 , w2 (Rij ), etc. for given µs and T , but the
formalism shows that, once they have been determined, a solution or suspension can be treated as an
interacting molecular “gas”, in which the dense solvent is solely present through µs and T . Often w3
and higher body terms are ignored (just as in simple fluids), and one thus assumes pairwise additivity
of the effective interactions. A well-known example of this is the Coulomb interaction q1 q2 /(ϵr) between
charges q1 and q2 in a dielectric medium with dielectric constant ϵ, where the only effect of the medium
is to modify the bare interaction q1 q2 /r (the interaction in vacuum) by a factor ϵ. Note that ϵ = ϵ(µs , T ).
In a solvent with mobile charges (e.g. salt ions), the bare Coulomb interaction is often approximated
by q1 q2 exp(−κr)/(ϵr), where κ({µs }, T ) is the inverse Debye length of the reservoir, i.e., it depends on
the chemical potentials {µs } of both the solvent and the ionic species.
Van 't Hoff's famous law Π = kB T ρ follows if the solute interactions can be ignored, H ≡ 0, as e.g.
in the low-density limit. At higher densities a virial expansion can be performed, and the second virial
coefficient B2(T, µs) is given by an integral over the Mayer function of the effective pair interaction, etc.
The free energy A and the osmotic pressure Π of dense suspensions can be studied by thermodynamic
integration, exploiting radial distribution functions and their different routes to thermodynamics, the
Ornstein-Zernike equation etc. for the effective colloids-only system. That is, the whole machinery of
liquid state theory discussed in Chapters 13 - 15 can be applied.
where RN1 is the set of spatial coordinates of particles of species 1, and r N2 for species 2. Moreover,
we focus attention to the pairwise additive case
\Phi_{12}(R^{N_1}, r^{N_2}) = \sum_{i=1}^{N_1}\sum_{j=1}^{N_2}\phi_{12}(R_i - r_j),   (16.19)
with ϕ12 (r) the pair-interaction between particles of species 1 and 2 at separation r. This structure of
Φ12 leads straightforwardly to
" #
h i N1 X
X N2 N2
Y N1
X
exp −βΦ12 (RN1 , r N2 ) = exp −β ϕ12 (Ri − r j ) = exp −β ϕ12 (Ri − r j ) . (16.20)
i=1 j=1 j=1 i=1
That is, the dependence of the Boltzmann factor on Φ12 factorizes with respect to the r_j. Consequently, using
the thermal-volume-weighted fugacity z̃2 = exp[βµ2]/Λ2³ of species 2 from Eq. (3.24), we can write
\exp\!\big[-\beta\Phi^{\rm eff}(R^{N_1};\mu_2,T)\big] \overset{(16.7)}{=} \sum_{N_2=0}^{\infty}\frac{\exp[\beta\mu_2 N_2]}{N_2!\,\Lambda_2^{3N_2}}\int dr_1\cdots dr_{N_2}\,\exp\!\big[-\beta\Phi(R^{N_1}, r^{N_2})\big];
\overset{(16.20)}{=} \exp[-\beta\Phi_{11}(R^{N_1})]\sum_{N_2=0}^{\infty}\frac{\tilde z_2^{N_2}}{N_2!}\Bigg(\underbrace{\int dr\,\exp\!\Big[-\beta\sum_{i=1}^{N_1}\phi_{12}(R_i - r)\Big]}_{\displaystyle\equiv\,V_{\rm f}(R^{N_1})}\Bigg)^{\!N_2};
= \exp[-\beta\Phi_{11}(R^{N_1})]\sum_{N_2=0}^{\infty}\frac{\big(\tilde z_2 V_{\rm f}(R^{N_1})\big)^{N_2}}{N_2!};
= \exp\!\big[-\beta\Phi_{11}(R^{N_1}) + \tilde z_2 V_{\rm f}(R^{N_1})\big],   (16.21)
where we defined the free volume Vf(R^{N1}). A comparison with Eq. (16.9) shows that W = −z̃2 kB T Vf now.
Note that the dimension of Vf is indeed that of a volume. The reason for this nomenclature will be
clarified later on. It follows from Eq. (16.21) that the effective interactions are given by
\Phi^{\rm eff}(R^{N_1};\mu_2,T) = \Phi_{11}(R^{N_1}) - \tilde z_2\, k_B T\, V_{\rm f}(R^{N_1}),   (16.22)
where the first contribution is the “bare” and the second one the “induced” interaction, tunable by
the fugacity (or chemical potential) of species 2. Since z̃2 kB T = p0(µ2, T), which is the pressure of
the one-component ideal-gas reservoir of species 2, one can strengthen the effect of the
induced interactions by increasing the pressure (or the density) of species 2.
their mutual interactions. Although they are statistically spherically symmetric, say with a diameter
σp , their flexibility allows for center-to-center distances less than σp without a high energetic cost. This
mutual interpenetration can, to a first approximation, be described by a vanishing polymer-polymer
interaction, Φ22 ≡ 0. We now regard the colloidal particles as hard spheres of diameter σc , i.e., two
colloids cannot approach each other more closely than a center-to-center distance σc . Moreover, although
a polymeric particle can overlap with another polymer, it cannot overlap with a solid colloidal particle.
This can be described by a colloid-polymer interaction, ϕ12 (r), that is hard-sphere like, with a distance
of closest approach given by
\sigma_{cp} = \frac{\sigma_c + \sigma_p}{2}.   (16.23)
Recalling now the definition of the free volume,
Z " N1
#
X
N1
Vf (R )= dr exp −β ϕ12 (Ri − r) , (16.24)
i=1
one sees that the only contribution to the integral over r stems from those regions of space that are
sufficiently far away from the center of any colloidal particle Ri , i.e., from those positions r with
|r − Ri | > σcp for all i = 1, . . . , N1 . This is indeed the “free” volume that is available to the polymers.
The volume excluded to the polymers consists of N1 spheres of radius σcp, centered about the colloidal
particles at positions Ri. However, this does not imply that Vf = V − N1(4π/3)σcp³, since the exclusion
spheres (of radius σcp) overlap with each other as soon as colloidal spheres are separated by a distance
smaller than 2σcp.
We will now restrict our attention, for simplicity, to pairwise exclusion overlaps only. It follows from
basic geometry (see one of the problems) that the (lens-shaped) overlap volume of a pair of spheres of
radius σcp at center-to-center distance Rij is given by
" 3 #
3
4πσcp
3 Rij 1 Rij
1− + Rij ≤ 2σcp
v(Rij ) = 3 4 σcp 16 σcp . (16.25)
0 Rij > 2σcp
Note that v(Rij) is non-negative, and varies smoothly from 4πσcp³/3 at Rij = 0 to 0 at Rij = 2σcp. Of
course, the regime Rij < σc is unphysical because of the colloidal hard core, so the physically relevant
interval of definition of v(Rij) is Rij ≥ σc. Ignoring triplet and higher-order overlaps, the free volume
can now be written as
V_{\rm f}(R^{N_1}) = V - N_1\,\frac{4\pi}{3}\sigma_{cp}^3 + \sum_{i<j}^{N_1} v(R_{ij}).   (16.26)
Assuming now a pairwise additive bare interaction Φ11 (RN1 ), with a bare pair potential ϕ11 (Rij ), one
arrives with Eq. (16.22) at the effective colloid-colloid interaction Hamiltonian of the form
\Phi^{\rm eff}(R^{N_1},\mu_2,T) = \Phi_0(N_1,V,\tilde z_2,T) + \sum_{i<j}^{N_1}\phi^{\rm eff}(R_{ij};\tilde z_2,T).   (16.27)
Here, Φ0(N1, V, z̃2, T) = −z̃2 kB T [V − N1(4π/3)σcp³] is independent of the colloidal coordinates R^{N1}.
Due to the linear dependence of Φ0 on N1 and V this term is irrelevant for the phase behavior; it can
be seen as a mere shift of the pressure and the chemical potential as discussed in detail before. The
effective pair potential that remains reads
\beta\phi^{\rm eff}(R_{ij};\tilde z_2,T) = \begin{cases} \infty & R_{ij} < \sigma_c \\ -\tilde z_2\, v(R_{ij}) & \sigma_c \le R_{ij} \le 2\sigma_{cp} \\ 0 & R_{ij} > 2\sigma_{cp}\end{cases}.   (16.28)
This effective pair potential, which is often referred to as the Asakura-Oosawa potential, describes
an attractive interaction between a pair of colloids at separation Rij in the range σc ≤ Rij ≤ σc +
σp . The strength of this attraction is proportional to the polymer volume-weighted fugacity z̃2 (or
its dimensionless form z̃2∗ = (π/6)σp3 z̃2 ), whereas its range is determined by σp (or the dimensionless
diameter ratio q ≡ σp /σc ). As an illustration we plot βϕeff for q = 0.4 and two values of z̃2∗ = (π/6)σp3 z̃2
in Fig. 16.1. Note that the potential minimum occurs at contact, Rij = σc .
Figure 16.1: The effective pair interaction of two colloidal hard spheres (diameter σc ) in a sea of
ideal polymers (diameter σp ) at the indicated dimensionless fugacity z̃2∗ = (π/6)σp3 z̃2 , for size ratio
q = σp /σc = 0.4. The polymers induce an attraction for colloidal separations Rij ∈ (σc , σc + σp ). The
vertical dashed line indicates the hard-sphere divergence.
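The sketch below evaluates the depletion attraction that follows from Eqs. (16.22), (16.25), and (16.26) for the parameters of Fig. 16.1; the function names are ours, and the piecewise form assembled here is a sketch of the Asakura-Oosawa pair potential described in the text.

```python
# Sketch of the Asakura-Oosawa depletion attraction:
#   beta*phi_eff(r) = +inf        for r < sigma_c (hard core),
#                   = -z2 * v(r)  for sigma_c <= r <= 2*sigma_cp,
#                   = 0           otherwise,
# with v(r) the lens-shaped overlap volume of Eq. (16.25).
import numpy as np

sigma_c = 1.0
q = 0.4
sigma_p = q*sigma_c
sigma_cp = 0.5*(sigma_c + sigma_p)

def overlap_volume(r):
    """Overlap volume of two exclusion spheres of radius sigma_cp, Eq. (16.25)."""
    x = r/sigma_cp
    v = 4*np.pi*sigma_cp**3/3 * (1 - 0.75*x + x**3/16)
    return np.where(r <= 2*sigma_cp, v, 0.0)

def beta_phi_ao(r, z2_star):
    z2 = 6*z2_star/(np.pi*sigma_p**3)          # z2* = (pi/6) sigma_p^3 z2
    return np.where(r < sigma_c, np.inf, -z2*overlap_volume(r))

for z2_star in (0.2, 0.6):
    r = np.array([1.0, 1.1, 1.2, 1.3, 1.4, 1.5])
    print(f"z2* = {z2_star}:", np.array2string(beta_phi_ao(r, z2_star), precision=2))
```

For q = 0.4 and z̃2* = 0.6 the contact value comes out close to −3 kBT, in line with the curve shown in Fig. 16.1.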
Now that the effective colloid-colloid interaction has been obtained, we can explicitly describe the colloid-
polymer mixture as an effective one-component system with a pair interaction given by Eq. (16.28).
It is now possible to apply all the techniques devised for one-component systems to calculate, e.g.,
the (osmotic) pressure and the Helmholtz free energy, and from the latter the phase diagram. A
convenient representation of such a phase diagram is the z̃2∗-ηc plane for a fixed diameter ratio q, with
ηc = (π/6)σc³N1/V the colloidal packing fraction, i.e., the dimensionless density of colloids. This
representation is very similar to the temperature-density representation of a truly one-component system
such as Ar; for instance, the tie-lines that connect two coexisting phases are horizontal.
that z̃2 plays the role of the inverse temperature: increasing z̃2 gives rise to a potential well that is
deeper in units of kB T , which corresponds to a lower temperature.
Figure 16.2 shows z̃2∗ -ηc phase diagrams for diameter ratios q = 0.1, 0.4, 0.6, and 0.8, i.e., for polymers
smaller than colloids. These phase diagrams are obtained with a Monte Carlo technique to calculate
F (N1 , V, T, z̃2 ) at fixed z̃2 and T for many densities N1 /V , involving a coupling constant integration to
gradually switch on the attractions starting from the hard-sphere fluid. The first observation is that
all phase diagrams of Fig. 16.2 reduce, at z̃2 = 0, to the hard-sphere phase diagram with a fluid phase
Figure 16.2: Phase diagrams of colloid-polymer mixtures for several size ratios q = σp /σc in the ηc -
z̃2∗ representation, with ηc = (π/6)σc3 N1 /V the dimensionless colloid density and z̃2∗ = (π/6)σp3 z̃2 the
dimensionless polymer fugacity. We distinguish a face centered cubic (fcc) crystalline phase at sufficiently
high ηc , and a fluid phase at lower ηc . For q = 0.1 the system shows (metastable) fcc-fcc coexistence,
where the two coexisting phases have a different density (or lattice spacing). For q < q∗ ≃ 0.5, the gas-liquid
coexistence is metastable with respect to the fluid-fcc transition, i.e., there is no liquid phase. At
sufficiently high q and z̃2∗ , the fluid phase splits into a dilute colloidal gas and a dense colloidal liquid
phase. Note that z̃2∗ is equal to the polymer packing fraction in a pure polymer system of fugacity z̃2 ,
since the polymers are ideal. Data kindly provided by M. Dijkstra.
at ηc < 0.494, an fcc solid phase at ηc > 0.545, and fluid-fcc coexistence in between. For nonzero z̃2
the phase diagrams are seen to strongly depend on the size ratio. For sufficiently large polymers, e.g.,
at q = 0.8 and 0.6, the phase diagram features a gas-liquid coexistence regime, where a liquid phase
exists between the critical and the triple point (dashed horizontal line). This is very similar to the phase
diagram of Ar. This was to be expected, since the range of the attraction for these values of q is of
the order of the colloidal diameter, just like the range of the Ar-Ar attractions is of the order of the Ar
diameter (see the Lennard-Jones potential).
Upon decreasing q, i.e., making the attractions relatively shorter ranged by using shorter polymers, it
is found that the critical z̃2 and the triple point z̃2 approach each other, thereby decreasing the liquid
regime, and finally annihilating it at q < q ∗ ≃ 0.5. That is, there is no (stable) liquid phase when the
range of the attractions is substantially shorter than the hard-core diameter. Instead, for q < q ∗ there
is a single fluid-fcc coexistence regime that broadens with increasing z̃2 . This coexistence regime does
contain a gas-liquid binodal, as indicated by the grey areas, but this is metastable with respect to the
fluid-fcc coexistence. Moreover, for q = 0.1 another metastable transition appears that is very similar
to the gas-liquid transition, except that it does not involve two fluid phases of different density but two
fcc phases. This (metastable) fcc-fcc transition, which ends in a critical point just like the fluid-fluid
(gas-liquid) transition, becomes stable at q = 0.05 (not shown here).
16.5 Exercises
Q93. Interactions in a Classical Two-Component Mixture
The interactions in a classical two-component mixture of N1 particles of type 1 (with coordinates
RN1 ) and N2 particles of type 2 (with coordinates r N2 ) can always be written in the form
(a) Give the canonical partition function Z(N1 , N2 , V, T ) of this system, and the “semi-grand”
partition function Ξ(N1 , µ2 , V, T ) with µ2 the chemical potential of component 2. The “corre-
sponding” thermodynamic potentials are the Helmholtz free energy F (N1 , N2 , V, T ) and the
semi-grand potential Ω(N1 , µ2 , V, T ), respectively. What is the relation between F and Ω?
(b) It follows from (a) that Ω can be written as
\exp[-\beta\Omega] = \frac{1}{N_1!\,\Lambda_1^{3N_1}}\int dR^{N_1}\,\exp[-\beta\Phi^{\rm eff}(R^{N_1})],   (16.30)
with the “effective” 1-1 interaction Φeff ≡ Φ11 (RN1 ) + W (RN1 ; µ2 , T ), with W the grand
potential of the inhomogeneous fluid of species 2 in the static external potential due to
particles of species 1 at positions Ri . Show this and give a formal expression for the “induced”
interactions W (RN1 ; µ2 , T ). Why is the nomenclature “effective” and “induced” useful?
(c) If Φ22 ≡ 0 and Φ12 = Σ_{i=1}^{N1} Σ_{j=1}^{N2} ϕ12(Ri − rj), i.e., component 2 is an ideal gas and the 1-2
interaction is pairwise additive, then
W(R^{N_1};\mu_2,T) = -\tilde z_2\, k_B T \underbrace{\int dr\,\exp\!\Big[-\beta\sum_{i=1}^{N_1}\phi_{12}(r - R_i)\Big]}_{\displaystyle\equiv\,V_{\rm f}(R^{N_1})},   (16.31)
with z̃2 = exp[βµ2]/Λ2³ the thermal-volume-weighted fugacity of component 2. Prove this.
(d) Give an interpretation of Vf (RN1 ) for the case that ϕ12 is a hard-sphere potential with
diameter σ12 . Calculate Vf (R1 , R2 ) = V (R12 ) for N1 = 2 hard spheres (diameter σ11 ) at a
distance R12 > σ11 . Does the ideal component induce an attraction or a repulsion between
the two hard spheres? Is this effect stronger or weaker at increasing density of species 2?
(e) The effective interaction between the spheres can also be interpreted as a larger available
volume, and hence a larger entropy, for component 2 upon a decreasing R12 . Explain this.
Q94. Demixing Revisited
Recall that a thermodynamic system at fixed particle numbers, volume, and temperature strives for
a minimum of its Helmholtz free energy. Consider now the Helmholtz free energy F (N1 , N2 , V, T )
of a binary mixture of N1 particles of species 1 and N2 particles of species 2 in a volume V at
temperature T .
(a) Show by considering 2 of these systems in diffusive contact, that for the system to be stable
against demixing into coexisting phases, one requires that
for any (physically possible, positive or negative) change of particle numbers ∆N1 and ∆N2 .
One could also consider a volume change ±∆V , but because of extensivity reasons this does
not lead to an additional stability condition.
(b) Consider now the stability with respect to infinitesimal changes δN1 and δN2 , and show that
the mixture is stable with respect to these fluctuations provided
(\delta N_1, \delta N_2)\cdot\begin{pmatrix}\dfrac{\partial^2 F}{\partial N_1^2} & \dfrac{\partial^2 F}{\partial N_1\partial N_2}\\[2mm] \dfrac{\partial^2 F}{\partial N_2\partial N_1} & \dfrac{\partial^2 F}{\partial N_2^2}\end{pmatrix}\cdot\begin{pmatrix}\delta N_1\\ \delta N_2\end{pmatrix} > 0,   (16.33)
Chapter 17

Anisotropic Particles
Thus far, we have mostly considered fluids with spherically symmetric pair interactions ϕ(|r 1 − r 2 |), i.e.,
the interaction only depends on the positions of the center of mass r i of the particles. The only possible
phases that these systems can exhibit are gas, liquid, and solid1. However, many systems in nature cannot
be described realistically by spherically symmetric interactions, and other phases apart from gas, liquid,
and solid exist. Such phases are referred to as mesophases or mesomorphic phases to emphasize the
in-between-ness of their structure: disordered in certain directions (liquid like) and ordered in others
(solid like). We briefly touched upon mesophases for the example of ellipsoidal particles in the context
of Landau theory in Chapter 9, but we return to this topic in more detail in this chapter.
To make the above more concrete, let us examine the phase behavior of needle-like objects. In general,
“elongated” molecules like cholesterol or “rod-like” colloidal particles such as Tobacco Mosaic Virus
1 Also the plasma phase, i.e., a mixture of positive ions and electrons, can be described by the spherically symmetric
Coulomb potential.
can show mesophases upon cooling or compressing2 , see Fig. 17.1. For such molecules the phases are
historically referred to as liquid crystal phases or simply liquid crystals, though the term is a bit of
an oxymoron. Generally speaking, liquid crystals have a degree of ordering in between that of liquids
(disordered, homogeneous, isotropic) and crystals (ordered, inhomogeneous, anisotropic). We refer back
to Chapter 9 for the definitions of these terms.
Many other combinations of translational and rotational symmetries can be broken, and the number
of liquid crystalline phases is vast (nematic, smectic, columnar, hexatic, blue phases, cholesteric, etc.).
For this reason there is considerable fundamental interest in the structure of liquid crystals and phase
transitions between different mesophases. Liquid crystals also find many industrial applications (e.g.,
LCD’s, sunroofs), thus the research into their properties has commercial and economic benefits. The
importance of liquid crystals is essentially due to their character being in between that of liquids and
crystals, e.g., possessing the flow properties of a liquid and the light-scattering properties of a crystal.
Lastly, liquid crystalline order has a strong connection with biophysics, e.g., the behavior of suspensions
of viruses, the organization and colony development found in bacteria, and the structuring and dynamics
of epithelia.
The properties of liquid crystals, their phases, and phase transitions can be (and are being) described
using the principles of statistical mechanics. In this chapter, we will use the techniques of Chapter 13
to obtain a theory for the isotropic-nematic phase transition, focusing on interactions between rod-like
particles. Perhaps surprisingly a direct generalization of the second virial approximation yields a realistic
description of the nematic phase for rod-like particles. However, it should also be noted that colloid
synthesis has advanced to the point that there is an entire zoo of shape-anisotropic particles, many of
which are amenable to such a description, hence the title of the chapter “Anisotropic Particles”. For the
purpose of convenience, we will focus on rod-like systems here, but it is important to keep the generality
in mind.
where the interaction potential Φ now depends on the positions and orientations of all the particles.
Here, I is the moment of inertia tensor of particle i and the superscript T denotes transposition of the
angular momentum. We work in a reference frame where the tensor is diagonal and it may contain three
potentially different diagonal entries, say Ix , Iy , and Iz , depending on the symmetry properties of the
particles. The inverse of I appears in Eq. (17.1) as a consequence of the angular velocity Ω and the angular
momentum being linearly related via the inertia tensor, L = IΩ, which implies that the rotational kinetic
term indeed has units of energy.
2 As we have seen in Chapter 16, both molecular fluids and colloidal suspensions may be described using
similar theoretical means, with the difference lying in an “effective” interaction for the latter.
Despite the addition of angular-momentum and orientation terms to the Hamiltonian (17.1), much of
the discussion in Chapter 3 can be readily extended to cover the new degrees of freedom. This should be
clear from the rather general result of the equipartition theorem derived therein. Of particular relevance
is the existence of a generalized partition function, which in the canonical ensemble is given by
Z(N,V,T) = \frac{1}{N!\,h^{6N}}\int d\Gamma\,\exp[-\beta H(\Gamma)],   (17.2)
where Γ ≡ (r N , pN ; q N , LN ) here and we have h6N instead of h3N to also account for the contribu-
tions of the angular momenta. The canonical average of observables independent of linear and angular
momentum, i.e., observables described by phase functions A(Γ) = A(r N , q N ), can be written as
\langle A\rangle = \frac{1}{N!\,h^{6N}\,Z(N,V,T)}\int d\Gamma\,\exp[-\beta H(\Gamma)]\,A(r^N, q^N);
= \frac{1}{Q(N,V,T)}\int dr^N\!\int dq^N\,\exp[-\beta\Phi(r^N,q^N)]\,A(r^N,q^N),   (17.3)
where the configurational integral is now defined over both positional and angular degrees of freedom,
Q(N,V,T) = \int dr^N\!\int dq^N\,\exp[-\beta\Phi(r^N,q^N)].   (17.4)
Note that
Z(N,V,T) = \frac{Q(N,V,T)}{N!\,\Lambda^{3N}\,\lambda_1^N\lambda_2^N\lambda_3^N},   (17.5)
where the De Broglie wavelength Λ is as before. The λi (i = x, y, z) are the dimensionless wavenumbers
for the degrees of freedom associated with angular momentum and are given by
\lambda_i = \frac{h}{\sqrt{2\pi I_i k_B T}}.   (17.6)
This implies that the classical, canonical partition function for N noninteracting particles in a volume
V at temperature T reduces to
Z(N, V, T) = [1/(N! Λ^{3N} λ_x^N λ_y^N λ_z^N)] (8π² V/σ)^N,   (17.7)
with the factor 8π 2 /σ stemming from the orientational integration. The factor σ is called the symmetry
number. How does one arrive at this factor? Suppose that a particle has no symmetry properties; then one can choose an axis, usually one of the axes imposed by making I diagonal. Complete rotation about
this axis contributes a factor of 2π to the orientational integral. Integration over all possible orientations
of this axis, which is constrained to the unit sphere, contributes another factor of 4π. However, this
overcounts identical configurations whenever the particle has symmetries. It can be that the particle
only possesses discrete symmetries, e.g., the Platonic solids. In this case, the symmetry number is equal
to the number of elements in its rotational symmetry group. Note that this leaves something to be desired when approximating continuous symmetries with discrete ones, such as a disk with a regular n-gon: clearly, the n-gon should acquire n → ∞ symmetries to become a disk. The resolution
to this problem comes from the freedom to ‘choose’ an orientational vector for the disk, which the
n-gon does not possess. In general, one must be quite careful with symmetry, especially when mixing
discrete and continuous symmetries. However, this is only an issue when examining the free energy or
partition function, as for thermodynamic quantities (derivatives of the free energy) the prefactor drops
out when considering a single-component system. Situations where these considerations do play a role involve mixtures and/or chemical equilibria.
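To get a feeling for the numbers that enter Eqs. (17.6) and (17.7), the short Python sketch below evaluates the λi and the resulting ideal free energy per particle; the mass, moments of inertia, number density, and symmetry number used here are illustrative, assumed values only, not parameters taken from these notes.

import numpy as np

# Ideal free energy per particle from Eq. (17.7), using Stirling's approximation:
# beta*F/N = log( rho * Lambda^3 * lambda_x*lambda_y*lambda_z * sigma / (8*pi^2) ) - 1.
# All particle parameters below are assumed, order-of-magnitude values.
h, kB = 6.626e-34, 1.381e-23       # J s and J/K
T = 300.0                          # K
m = 4.7e-26                        # kg, assumed particle mass
I = (3.5e-45, 3.5e-45, 1.0e-46)    # kg m^2, assumed principal moments of inertia
sigma = 2                          # assumed symmetry number (up-down symmetric rod)
rho = 2.5e25                       # 1/m^3, assumed number density

Lambda = h / np.sqrt(2.0 * np.pi * m * kB * T)                         # De Broglie wavelength
lam = np.array([h / np.sqrt(2.0 * np.pi * Ii * kB * T) for Ii in I])   # Eq. (17.6)
beta_f = np.log(rho * Lambda**3 * lam.prod() * sigma / (8.0 * np.pi**2)) - 1.0
print(f"Lambda = {Lambda:.2e} m, lambda_i = {lam.round(3)}, beta F/N = {beta_f:.1f}")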
In the 1940s, Onsager described a system of identical rods as a multi-component system by (i) regarding
particles with different orientations as chemically different species, (ii) applying the multi-component
virial expansion of the free energy, Eq. (16.3), and (iii) minimizing the free energy with respect to the
densities of particles pointing in each direction. With this scheme he could explain the (experimentally
observed) phase transition from a disordered isotropic fluid phase at low densities to an orientationally
ordered nematic phase at high densities. Moreover, he showed that the second virial approximation is
exact in the limit of large length-to-diameter ratio of the rods (for the isotropic phase), a situation that
we will consider in one of the exercises. Below we discuss the essential parts of Onsager’s theory briefly.
Onsager’s first step was to discretize the surface of the unit sphere into s domains (s ≫ 1) of area dω̂_i centered around unit vectors ω̂_i, with i = 1, · · · , s and the normalization Σ_{i=1}^{s} dω̂_i = 4π. He regarded a rod with orientation ω̂ as a particle of species i, whenever ω̂ is in the i-th domain of the unit sphere. Denoting the density of particles with orientation i by ρ_i, Onsager wrote the second-virial approximation to the Helmholtz free energy of the s-component system — using Eq. (16.3) — as

F/(V kB T) = f = Σ_{i=1}^{s} ρ_i [log(ρ_i V) − 1] + Σ_{i,j=1}^{s} B_2^{(ij)}(T) ρ_i ρ_j
            = ∫ dω̂ ρ(ω̂) [log(ρ(ω̂) V) − 1] + ∫ dω̂ ∫ dω̂′ B_2(ω̂, ω̂′; T) ρ(ω̂) ρ(ω̂′)   (s → ∞),   (17.8)
where the continuum limit s → ∞ results in integrals over the unit sphere. Here, the “thermal volume” V is the analogue for rods of the factor Λ³ for spheres; its precise form is immaterial as it has no bearing on physically measurable quantities. Recall that the second virial coefficient is given by
B_2^{(ij)}(T) = −(1/2) ∫ dr { exp[−βϕ^{(ij)}(r)] − 1 }
             = −(1/2) ∫ dr { exp[−βϕ(r, ω̂, ω̂′)] − 1 } ≡ B_2(ω̂, ω̂′; T)   (s → ∞),   (17.9)
where the pair potential ϕ(r, ω̂, ω̂ ′ ) between two rods at separation r depends on the two orientations,
ω̂ and ω̂ ′ , of the two rods under consideration. Before specifying the precise form of this pair potential,
we discuss Onsager’s next step: minimize the free-energy density f with respect to ρi (or ρ(ω̂)). Naively
one would solve the s coupled equations ∂f /∂ρi = 0 for the s unknown quantities ρi at fixed T and total
density ρ of the rods. The problem, however, is that the ρ_i must satisfy the normalization constraint Σ_{i=1}^{s} ρ_i = ρ (or, in the continuum limit, ∫ dω̂ ρ(ω̂) = ρ). This constraint is most easily taken into
account by introducing the Lagrange multiplier λ (see Chapter 2), and solving the resulting set of
coupled nonlinear equations
0 = (∂/∂ρ_i) [ f − λ Σ_{i=1}^{s} ρ_i ]
  = log(ρ_i V) + 2 Σ_{j=1}^{s} B_2^{(ij)} ρ_j − λ.   (17.10)
The value of λ will be fixed, at a later stage of the calculation, by the normalization constraint. The
(implicit) solution of Eq. (17.10) reads
ρ_i = [exp(λ)/V] exp[ −2 Σ_{j=1}^{s} B_2^{(ij)} ρ_j ],   (17.11)
which can be interpreted as a self-consistent Boltzmann distribution, in the sense that the density ρi is
proportional to a Boltzmann factor that depends on all ρj s (including the terms i = j). That is, the
term 2 Σ_{j=1}^{s} B_2^{(ij)} ρ_j is like a potential (divided by kB T) that acts on species i. This potential is, of
course, due to the interaction with all the other species (and its own species). The normalization follows
from Eq. (17.11) as
Σ_{k=1}^{s} ρ_k = [exp(λ)/V] Σ_{k=1}^{s} exp[ −2 Σ_{j=1}^{s} B_2^{(kj)} ρ_j ] = ρ,   (17.12)
and hence we obtain, from Eq. (17.11), the implicit equations for the minimizing densities
ρ_i = ρ exp[ −2 Σ_{j=1}^{s} B_2^{(ij)} ρ_j ] / Σ_{k=1}^{s} exp[ −2 Σ_{j=1}^{s} B_2^{(kj)} ρ_j ].   (17.14)
The continuum version of this equation, which we will analyze from now on, is given by

ρ(ω̂) = ρ exp[ −2 ∫ dω̂′ B_2(ω̂, ω̂′) ρ(ω̂′) ] / ∫ dω̂″ exp[ −2 ∫ dω̂′ B_2(ω̂″, ω̂′) ρ(ω̂′) ].   (17.15)

This equation is a nonlinear integral equation for ρ(ω̂) that must be solved for given and fixed total density ρ.
One solution that exists at every density is the isotropic distribution

ρ(ω̂) = ρ/(4π),   (17.16)

which describes the isotropic phase of the rod system, i.e., the phase with no preferred direction of the long axes of the rods. Inserting the isotropic solution Eq. (17.16) back into the expression for f, Eq. (17.8), yields the free-energy density of the isotropic phase (in the second virial approximation)
f_iso(ρ, T) = ρ [ log(ρV/(4π)) − 1 ] + b(T) ρ²,   (17.17)
with the orientation-averaged second virial coefficient
b(T) = [1/(4π)²] ∫ dω̂ dω̂′ B_2(ω̂, ω̂′; T).   (17.18)
It turns out that the isotropic distribution is the only solution of Eq. (17.15) at sufficiently low densities
ρ or sufficiently high temperatures T . That is, the second virial theory predicts that the isotropic phase
is the only possible phase for a fluid of rods in these regimes — this is consistent with experimental
observations. For typical rod interactions, which are such that B2 (ω̂, ω̂ ′ ) is smaller for smaller angles
between ω̂ and ω̂ ′ , there is also an anisotropic solution to Eq. (17.15) provided ρ is sufficiently high
or T sufficiently low. This anisotropic solution describes the nematic phase. Unfortunately, it is not
possible to calculate this nematic distribution analytically, but it is numerically straightforward to solve
the equation by means of, e.g., iteration. The idea is to guess, for fixed ρ and T , an explicit form for
ρ(ω̂), from which a second guess follows by evaluating the right-hand side of Eq. (17.15), etc., until a
self-consistent solution for ρ(ω̂) is found. This solution can then be inserted into Eq. (17.8) to obtain
the free-energy density fnem (ρ, T ) of the nematic phase. Note that this quantity only exists at high
ρ and/or low T . Of course, it should be clear that an explicit form for B2 (ω̂, ω̂ ′ ) is needed for such
numerical calculations.
The fact that f in Eq. (17.8) can be minimized by either an isotropic or a nematic distribution is due
to the fact that the first (ideal-gas) term of Eq. (17.8) is minimized by the isotropic distribution (with a
maximum entropy), while its second term is minimized by nematic distributions, since typical B2 (ω̂, ω̂ ′ )
for rods is such that small angles between ω̂ and ω̂ ′ have a lower B2 than large angles. At a sufficiently
low total density ρ, the second O(ρ2 ) term is dominated by the first O(ρ log ρ) term, and hence the
minimum of the sum of these terms is realized by the minimum of the first term. At higher densities,
the second term becomes relevant and its minimization requires orientational ordering.
Onsager considered hard spherocylinders: cylinders of length L and diameter D, capped by hemispheres, with particle volume v0. The corresponding pair potential ϕ(r, ω̂, ω̂′) is infinite whenever two particles overlap and zero otherwise, which is the direct analogue for “cigar”-shaped particles of the hard-sphere potential for spheres. The Mayer function that corresponds with this potential is therefore −1 in the case of overlap and 0 otherwise. It then follows from the geometry of the problem that
B_2(ω̂, ω̂′) = 4v0 + L²D |sin γ(ω̂, ω̂′)|  →  L²D |sin γ(ω̂, ω̂′)|   (L ≫ D),   (17.20)
where γ is the angle between ω̂ and ω̂′, i.e., cos γ = ω̂ · ω̂′. In the long-rod limit L ≫ D, the v0 term is O(LD²), and is thus vanishingly small compared to the O(L²D) term³. From now on we will work, implicitly, in this thin-needle (long-rod) limit.
³ This argument does not hold if the two rods are exactly parallel, γ = 0. This case is, however, of measure zero and does not affect the orientational integrals.
The isotropic distribution in Eq. (17.16) holds for any B2 (ω̂, ω̂ ′ ), and hence also for the hard-needle
case of interest here. Its orientation average, defined in Eq. (17.18), is given by b = 4v0 + (π/4)L2 D →
(π/4)L2 D, as will be worked out in one of the problems. It is independent of T in this hard-rod case.
With this b, the free energy of the isotropic phase is thus completely specified by Eq. (17.17) within the second-virial approximation. One expects nematic solutions that minimize Eq. (17.8) at sufficiently high ρ. The reason is that the last term of Eq. (17.8) is small, with the B2 of Eq. (17.20), if small angles γ
occur frequently, i.e., if ρ(ω̂) is peaked about a specific director, say n̂. The direction of n̂ is irrelevant
for the resulting free energy of a bulk fluid by symmetry, since the free energy will not be affected by
a global rotation. The symmetry is also such that a rotation about the symmetry axis n̂ does not affect
the free energy, and the resulting minimizing distribution can only depend on the polar angle θ of the
orientation ω̂ with respect to n̂, i.e., ρ(ω̂) = ρ(θ) with cos θ = ω̂ · n̂. The nematic distribution of hard
needles must therefore satisfy, from Eqs. (17.15) and (17.20), the nonlinear integral equation
ρ(θ) = ρ exp[ −2L²D ∫₀^π dθ′ sin θ′ K(θ, θ′) ρ(θ′) ] / ∫₀^π dθ″ sin θ″ exp[ −2L²D ∫₀^π dθ′ sin θ′ K(θ″, θ′) ρ(θ′) ].   (17.21)
Here, we use that we parameterized ω̂ as (sin θ sin φ, sin θ cos φ, cos θ) (choosing n̂ = (0, 0, 1)), and the kernel K(θ, θ′) follows from integrating |sin γ(ω̂, ω̂′)| over the azimuthal angle of ω̂′; note that K is then independent of φ. Numerical solutions to Eq. (17.21) can be obtained for given dimensionless
densities ρL2 D, and some resulting distributions are plotted in the left-hand panel to Fig. 17.2. Insertion
of the orientation distributions into the free-energy expression (17.8) gives the isotropic and nematic
free-energy densities fiso (ρ) and fnem (ρ). These are plotted in the right-hand panel to Fig. 17.2.
Figure 17.2: Properties of the Onsager theory for hard rods. (left) Angular distribution function of
hard rods at several dimensionless densities ρL2 D. The distribution of ρL2 D = 5 (red) is isotropic, i.e.,
independent of θ. The distributions for ρL2 D = 6 (orange), 7 (green), and 8 (blue) are peaked about
θ = 0 and θ = π and represent nematic distributions. Note the up-down symmetry and the increasing
orientational ordering with ρ. (right) The reduced Helmholtz free-energy density (blue curves) of the
isotropic and nematic phase of infinitely elongated hard spherocylinders. The common-tangent con-
struction (red, dashed line) yields coexistence of an isotropic phase at density ρI and a nematic phase
at density ρN , with ρI L2 D = 4.189 and ρN L2 D = 5.336, as indicated using the green, dashed lines.
At densities ρ < ρI the system is isotropic, at densities ρ > ρN the system is nematic, with increasing
orientational ordering as ρ increases.
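The iterative solution of Eq. (17.21) described above can be set up in a few lines of Python. The sketch below is illustrative only: the kernel K(θ, θ′) is constructed here as the azimuthal integral of |sin γ| (consistent with Eq. (17.20)), the distribution g(θ) = L²D ρ(θ) is normalized such that 2π ∫ dθ sin θ g(θ) = c with c = ρL²D (a convention chosen for this sketch), and simple mixing is used to stabilize the fixed-point iteration. Whether the iteration settles on the isotropic or on a nematic solution at a given c depends on the density and on the initial guess, and should be compared against Fig. 17.2.

import numpy as np

# Fixed-point iteration for the hard-needle equation (17.21), dimensionless density c = rho*L^2*D.
# Assumed conventions for this sketch: K(theta, theta') is the azimuthal integral of |sin gamma|,
# and g(theta) = L^2*D*rho(theta) is normalized as 2*pi * int dtheta sin(theta) g(theta) = c.
n_theta, n_phi = 90, 180
theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
dtheta = np.pi / n_theta
phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
dphi = 2.0 * np.pi / n_phi

cos_g = (np.cos(theta)[:, None, None] * np.cos(theta)[None, :, None]
         + np.sin(theta)[:, None, None] * np.sin(theta)[None, :, None] * np.cos(phi)[None, None, :])
K = np.sqrt(np.clip(1.0 - cos_g**2, 0.0, None)).sum(axis=2) * dphi   # int dphi' |sin gamma|

def normalize(g, c):
    return g * c / (2.0 * np.pi * np.sum(g * np.sin(theta)) * dtheta)

def solve_onsager(c, n_iter=2000, mix=0.1):
    g = normalize(np.exp(3.0 * np.cos(theta)**2), c)       # initial guess peaked about theta = 0, pi
    for _ in range(n_iter):
        field = 2.0 * K @ (g * np.sin(theta)) * dtheta      # exponent in Eq. (17.21)
        g = (1.0 - mix) * g + mix * normalize(np.exp(-field), c)
    return g

for c in (5.0, 6.0, 7.0, 8.0):
    g = solve_onsager(c)
    print(f"c = {c}:  g(0)/g(pi/2) = {g[0] / g[n_theta // 2]:.2f}")  # ~1 signals an isotropic solution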
17.4 Exercises
Q95. The Orientation-Averaged Second Virial Coefficient
The orientation-averaged second virial coefficient of two hard spherocylinders is given by
B_2^iso = 4v0 + [L²D/(4π)²] ∫ dω̂ dω̂′ |sin γ|,   (17.23)
with γ the angle between ω̂ and ω̂ ′ . Check this expression, calculate B2iso , and discuss the impor-
tance of the first term 4v0 as a function of L/D.
with ρα the density of particles with orientation α, and V the (irrelevant) thermal volume.
(a) Argue that Bαα′ is a symmetric 3 × 3 matrix. Calculate the second virial coefficients B11 =
B22 = B33 ≡ B∥ and B12 = B13 = B23 ≡ B⊥ for pairs of parallel and perpendicular rods,
respectively.
(b) Consider from now on the “needle” limit L/D → ∞. First calculate B∥ /L2 D and B⊥ /L2 D
in this limit, and then show that the dimensionless free energy ψ = F L2 D/V kB T takes the
form
ψ = Σ_α c_α [ log c_α − 1 + log(V/(L²D)) ] + 2(c1 c2 + c1 c3 + c2 c3),   (17.25)
with dimensionless densities c_α = L²D ρ_α. The constant term log(V/(L²D)) can be ignored; it is an irrelevant offset of the free energy or chemical potential, as we will see.
(c) Define the nematic order parameter S by c3 = c(1 + 2S)/3 and c1 = c2 = c(1 − S)/3, with
c = c1 + c2 + c3 = ρL2 D the total dimensionless density. Explain this nomenclature. Give
the range of the parameter S, keeping in mind that densities are non-negative.
(d) Calculate ψ(c, S). For a given c one needs to determine S such that it minimizes ψ (at the fixed c). Show that S = 0 is a solution of ∂ψ/∂S = 0 for any c. Which phase is associated with S = 0?
(e) The result of (d) does not guarantee that S = 0 yields a minimum of ψ. Argue on the basis
of (∂ 2 ψ/∂S 2 )S=0 that ψ is minimized by S ̸= 0 at c > c∗ . Calculate c∗ . Which phase do you
associate with S ̸= 0?
(f) Phase coexistence of a low-density isotropic phase, with density cI and order parameter
SI = 0, and a high-density nematic phase, with density cN and order parameter SN , requires
three conditions to fix the three unknowns cI , cN , and SN . Give these conditions.
(g) The coexistence conditions involve nonlinear algebraic equations that can easily be deter-
mined numerically, e.g., with Mathematica root-finding procedures. Write such a code, and
confirm that cI = 1.258, cN = 1.915, and SN = 0.915. Compare these numbers also to c∗ .
(h) Estimate, for hard rods with L/D = 100, the packing fractions beyond which orientational
ordering is to be expected on the basis of the results of (g).
(a) What are the isotropic and nematic phases for a (colloidal) liquid crystal? Illustrate using sketches.
The nematic director is the average orientation of the rods in the liquid crystal. Let θ measure the angle between a rod-like colloid and this director. It is sensible to create an order parameter that depends on a series in powers of cos θ, which takes values in the range [−1, 1].
(b) Explain why S = (1/2)⟨3 cos² θ − 1⟩ is a suitable order parameter for this transition, referencing the properties of the rods and Landau theory.
(c) The existence of an equilibrium state is guaranteed if the highest-order term in the expansion is even and the associated prefactor is positive. Explain.
(d) What does the linear term hS represent and why may we set h = 0?
The typical assumption is that A = a(T − Tc), with a a prefactor (dependent on p) and Tc the critical temperature. B and C are assumed to be nonzero and independent of temperature; we write B = −b and C = c to indicate this, and arrive at the following form for the IN Landau theory, after truncating the series at fourth order: FIN = F0 + a(T − Tc)S² − bS³ + cS⁴.
(e) Show that FIN − F0 = [a(T − Tc) − b²/(4c)] S² + cS² [S − b/(2c)]².
(f) For which two values of S does the right-hand side vanish? One solution requires an additional
condition on the temperature, call it T ∗ . What is the physical meaning of T ∗ ?
(g) What is the order of the isotropic-nematic phase transition? Use arguments supported by
the above Landau theory.
Chapter 18
Moving Away from Equilibrium
In this chapter, we will consider the dynamics of colloidal particles suspended in a fluid medium. We
have previously examined the interplay between the fluid medium and suspended colloids in Chapter 16,
focusing on the latter’s effective interactions. Here, we will study the random motion exhibited by the
colloids, i.e., Brownian motion. We saw in Chapter 3 that a time average of a quantity, when taken
over a sufficiently long interval and provided the system is ergodic, is equal to the ensemble average.
That is, the dynamics of the particles generate, over time, the configurations that are the basis of the
state-counting arguments that underlie statistical mechanics. Thus, the fluid medium may be said to
fluctuate in time around a well-defined average, which coincides with that of the ensemble average.
Consider a larger particle suspended in this fluctuating medium and examine a small fluid subvolume
in contact with a small part of the particle’s surface. Then we would expect that a difference in density,
due to a fluctuation in particle number, would lead to a slightly different force experienced by that
part of the particle’s surface. Clearly, on average, the effect of fluctuations should wash out. However,
there will be small positional excursions of the suspended particle, due to the instantaneously heterogeneous distribution of forces acting on it. That is, the particle will diffuse around its average position. The effect is more pronounced when the particle is closer in size to the molecules of the fluid. This
insight can be made mathematically rigorous, as was done by Einstein in one of his three seminal 1905
papers (the most cited one). It turns out that there is actually something more interesting going on,
which was later generalized and formalized in the fluctuation-dissipation theorem. This theorem was
proven by Herbert Callen and Theodore Welton in 1951 and expanded upon by Ryogo Kubo.
Briefly outlining the content, we will first set the historical background for the fluctuation-dissipation
theorem, before deriving the static result. Next, we discuss the Langevin equation, by which the
dynamics of a Brownian particle may be described macroscopically. Then we turn to the dynamic variant
of the fluctuation-dissipation theorem. After that, we build upon the Langevin formalism to provide
insight into a continuum description of the problem of diffusion, i.e., using the continuity equation and
density-based Fickian diffusion. This sets the stage for recovering the Stokes-Einstein relation, which is
what triggered much of the discussion on the fluctuation-dissipation theorem to begin with. We close
the chapter and these notes by giving a flavor of a truly non-equilibrium system that has received much
attention in modern physics. We will extend the Langevin equation to a system of particles that are
self-propelled, e.g., bacteria and camphor boats. These systems show many interesting behaviors, for which a full description using the methods of statistical physics thus far remains elusive.
Nowadays, we do not find it strange to think that Brown’s observations on the random motion of the “small particles inside the pollen of plants” are explained by the bombardment of those particles by solvent molecules. However, in the context of Brown’s time, this interpretation was far from trivial. The
existence of atoms and molecules was still widely disbelieved, with the continuum picture of matter
being favored. In addition, Antonie van Leeuwenhoek had used a microscope to observe the motion of
“tiny animals”. Brown’s work was on plants, so who was to say that the tiny particles in the pollen
of plants were not such creatures, or that they moved under the influence of some kind of ‘life force’?
Brown (presumably) realized this and therefore demonstrated the robustness of his finding by creating
powders of various minerals and even going as far as grinding up a piece of the nose of the Sphinx. In
hindsight, the thinking was likely that the Sphinx was the oldest man-made object known at the time.
Hence, powdering a piece of it would convince everyone that the effect was not life-based, as clearly
Sphinx’ material must be quite dead. These careful studies substantiated Brown’s hypothesis that the
motion was caused by the existence of molecules.
However, it took much longer before this picture was accepted, in part due to Rutherford’s efforts in demonstrating the existence of the nucleus, and in part due to careful measurements of Brownian motion
and a theoretical description thereof in terms of molecular theory. It was Einstein who theoretically
related the observable effect of Brownian displacements to the thermal motion of individual atoms or
molecules, which are unobservable themselves — X-ray microscopes did not yet exist1 . Einstein’s 1905
analysis led to the following relation
D = kB T / (6πηa),   (18.1)
where D is the diffusion coefficient of a spherical particle of radius a in a fluid medium with dynamic
viscosity η. The viscosity is essentially the frictional damping coefficient of the medium, or more
precisely the momentum diffusion coefficient. Convince yourself of this by examining the dimensions of
the combination η/ρ, with ρ the density of the medium. The particle’s diffusion coefficient in Eq. (18.1)
is defined in terms of the variance of the particle displacements 6Dt = ⟨|R(t) − R(0)|2 ⟩.
Equation (18.1) is remarkable. The diffusion coefficient and the viscosity are both (nontrivial) functions
of the temperature and the pressure, but in such a way that their product Dη ∝ T for a given colloidal
particle size a. We can make this a bit more explicit. The factor in the denominator is exactly the
resistance experienced by a spherical particle under the application of an external force. That is, in the
low-Reynolds-number regime of fluid dynamics — governed by the Stokes equation — we have that F = 6πηa u, with F the applied force and u the sphere’s velocity. The equation is therefore referred to as the Stokes-Einstein relation. Dimensionally, the result makes sense: D has units of m²/s and u of m/s,
while kB T has units of J and F of N = J/m. However, the implication of Eq. (18.1) is very profound:
thermodynamic fluctuations in a physical variable predict the response quantified by the susceptibility
1 Visualizing the dynamics of single atoms in solids using X-ray microscopy is considered state of the art.
of the same physical variable. This sloppy definition of the fluctuation-dissipation theorem is rather abstract, but we have encountered this concept at several points throughout the notes, without explicitly
commenting on it. We will start to build towards a general theorem in Section 18.2.
Before turning to the fluctuation-dissipation theorem, we wish to make a few closing remarks with regard to Eq. (18.1) and its historical significance.
• The Stokes-Einstein relation can be used to estimate the time it takes a particle to diffuse over
a distance of its radius. Denoting this diffusion time by tD , it follows from Eq. (18.1) that
tD = a2 /6D = πa3 η/kB T . Inserting typical numbers for colloids, a = 1.0 µm, η = 10−3 kg/ms
(water), and kB T ≃ 10⁻²¹ J (room temperature), one finds tD ≃ 1 s, i.e., the motion is time-resolvable under a microscope (see also the short numerical estimate following this list). It is sometimes erroneously reported that Brown studied the
random motion of pollen, a claim that is not substantiated by even a cursory examination of the
title of his paper. Brown studied the particles inside Clarkia pulchella pollen, which are colloidal
in size. The pollen themselves have a radius of ≈ 25 µm, which implies that he would have had to
wait at least 3 hours to have seen diffusive motion of pollen! Perhaps the mistake lies in disbelief
over the level of optical resolution Brown would have needed to observe these particles. Indeed
the colloidal matter that Brown studied was close to the edge of what could be resolved using the
microscopes at his disposal. These were able to resolve structures as fine as 0.7 µm [B.J. Ford,
Notes Rec. R. Soc. Lond. 55, 29–49 (2001)]; clearly, his observations are pretty impressive.
• Einstein pointed out that all quantities in Eq. (18.1) can be measured directly: D from the mean
squared displacement of the colloids observed under the microscope; η, e.g., from macroscopic
mechanical experiments with the medium; a with the microscope; T with a thermometer. Conse-
quently, the experiments on Brownian motion should yield the numerical value for the Boltzmann
constant kB . From this Avogadro’s number follows, since NA = R/kB with R the gas constant,
which is also known from macroscopic experiments of dilute gases.
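As a quick numerical check of the estimates in the first bullet point above, the following lines evaluate tD = πηa³/(kB T) both for the colloidal particles inside the pollen and for the pollen grains themselves; the slightly more precise value kB T ≈ 4 × 10⁻²¹ J is used here.

import math

# Diffusion time t_D = pi * eta * a^3 / (k_B T) for a sphere of radius a in water.
eta = 1.0e-3     # Pa s, water
kBT = 4.1e-21    # J, room temperature (the text rounds this to 1e-21 J)
for a in (1.0e-6, 25.0e-6):   # colloidal particle inside the pollen vs. the pollen grain itself
    t_D = math.pi * eta * a**3 / kBT
    print(f"a = {a * 1e6:4.0f} micron  ->  t_D = {t_D:9.1f} s")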
Einstein’s derivation motivated Jean Perrin to quantitatively study Brownian motion under the micro-
scope, and he could verify the validity of the predictions quantitatively, as well as obtain a value of NA .
This provided conclusive evidence that atoms and molecules actually exist and deservedly resulted in
Perrin’s 1923 Nobel prize.
18.2 Static Fluctuation-Dissipation Theorem

Several results encountered earlier in these notes already have the structure of a fluctuation-dissipation relation: the isothermal compressibility, the heat capacity, and, for instance, the magnetic susceptibility of a system of N spins,

χ = (β/N) (⟨M²⟩ − ⟨M⟩²).   (18.2)
That is, the susceptibility can be written in terms of the fluctuations of the magnetization M . However,
this property also determines the magnetization of a system in response to an external magnetic field
H. That is, M = χH, where H aligns the spins and competes with their internal, alignment-based interactions. Writing M = Σ_i S_i for the magnetization, we can rewrite the susceptibility as
χ = (β/N) Σ_{i,j} (⟨S_i S_j⟩ − ⟨S_i⟩⟨S_j⟩),   (18.3)
but the summand is obviously related to the spin-spin correlation function G(i, j), see Chapter 11.
Thus, the ability to align spins using external means is directly related to correlations in their thermal
alignment. Similarly, for the other two examples — compressibility and heat capacity — the conjugate forces are the pressure and the temperature, respectively. With this and the dynamic example of Brownian
motion, where fluctuations in the position relate to the mobility of the particle, it should now be
abundantly clear that there is something very generic about correlations in the system and the system’s
response to external driving. We shall formalize this next.
For the static case, consider an equilibrium system at temperature T; the time-independent canonical distribution function is then

f_c(Γ) = exp[−βH(Γ)] / (N! h^{3N} Z_H(N, V, T));   (18.4)

Z_H(N, V, T) = [1/(N! h^{3N})] ∫ dΓ exp[−βH(Γ)],   (18.5)
with H the Hamiltonian of the system. The equilibrium average of a macroscopic observable A(Γ) is
given by

⟨A⟩_H = ∫ dΓ A(Γ) f_c(Γ).   (18.6)
The subscript H is used to indicate against which Hamiltonian the distribution, partition function, and
average are taken. Let us now perturb the system with a time-independent potential V = −λA, which
is linear in our observable of interest. The perturbed Hamiltonian may now be written as H′ = H + V, where the factor λ is a constant that determines the strength of the perturbation; we will assume λ ≪ 1.
Writing the Hamiltonian this way, λ can be easily seen to be the conjugate variable to A. That is, if
we use the perturbed Hamiltonian to construct a free energy F then, λ = −∂F/∂A, which explains the
naming convention. Referring back to our examples, think pressure for λ and volume for A.
We should stress that we consider the static case here, such that under the perturbation, expressions
similar to those in Eqs. (18.4) and (18.5) are applicable, but then with H′ as the appropriate Hamiltonian. We start by Taylor expanding these to linear order in λ to obtain the following expression for
the perturbed partition function
Z_{H′}(N, V, T) = [1/(N! h^{3N})] ∫ dΓ exp[−βH′(Γ)]   (18.7)
                ≈ [1/(N! h^{3N})] ∫ dΓ exp[−βH(Γ)] (1 − βV) = Z_H(N, V, T) (1 − β⟨V⟩_H).   (18.8)
Suppose that we have another macroscopic observable B(Γ), then the perturbed average of B to first
order is given by
⟨B⟩_{H′} = [1/(N! h^{3N} Z_{H′}(N, V, T))] ∫ dΓ B(Γ) exp[−βH′(Γ)]
         ≈ [1/(N! h^{3N} Z_H(N, V, T)(1 − β⟨V⟩_H))] ∫ dΓ B(Γ) exp[−βH(Γ)] (1 − βV)
         ≈ (1 + β⟨V⟩_H) (⟨B⟩_H − β⟨BV⟩_H)
         ≈ ⟨B⟩_H + β (⟨B⟩_H⟨V⟩_H − ⟨BV⟩_H).   (18.9)
This result ignores any terms that are quadratic in λ, as we linearize in the perturbation. Let us now examine the change of B with respect to the unperturbed system,

⟨B⟩_{H′} − ⟨B⟩_H = βλ (⟨BA⟩_H − ⟨B⟩_H⟨A⟩_H) ≡ βλ C_H(BA).   (18.10)

In the last line, we have introduced the part of the observable B that is connected to the perturbation caused by observable A. That is, in a statistical sense, the part of B that is correlated with A. We can also rewrite this as the derivative of ⟨B⟩_{H′} with respect to λ; since we have Taylor expanded to linear order, the derivative is evaluated at λ = 0,

βλ C_H(BA) = λ ∂⟨B⟩_{H′}/∂λ |_{λ=0}.   (18.11)

However, the obvious physical interpretation of Eq. (18.11) is that this is the change of B with respect to an applied A, for small departures from the unperturbed system. Or in other words, this is the susceptibility of B to A, so that we may finally write

χ_BA ≡ ∂⟨B⟩_{H′}/∂λ |_{λ=0} = β C_H(BA).   (18.12)
This is the general static variant of the fluctuation-dissipation theorem. Summarizing, this theorem
states that the linear response of the system to a perturbation is given by the connected part of the
correlation function with respect to the unperturbed system. Clearly, the above examples assume A = B,
but this is not necessary in general, as we have now seen. Before we can cover the dynamic variant of the
fluctuation-dissipation theorem, we shall need to gain some feeling for dynamic correlation functions,
which we will do next.
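As a quick illustration of the static relation derived above, the following Python sketch takes A = B = M for N independent ±1 spins in a field H (a toy model assumed here purely for convenience) and compares the fluctuation side β(⟨M²⟩ − ⟨M⟩²) to the response ∂⟨M⟩/∂H estimated from a finite difference; both should agree with the analytic susceptibility of this model.

import numpy as np

# Check of the static fluctuation-dissipation relation with A = B = M for N independent
# +/-1 spins in a field H; this toy model is assumed for illustration only.
rng = np.random.default_rng(0)
N, beta, H, n_samples = 1000, 1.0, 0.2, 200_000

def sample_M(h):
    """Draw n_samples values of the total magnetization M at field h."""
    p_up = np.exp(beta * h) / (2.0 * np.cosh(beta * h))   # single-spin Boltzmann weight
    ups = rng.binomial(N, p_up, size=n_samples)           # number of up spins
    return 2.0 * ups - N                                  # M = N_up - N_down

M = sample_M(H)
chi_fluct = beta * M.var()                                # fluctuation side

dH = 0.01                                                 # response side, by finite difference
chi_resp = (sample_M(H + dH).mean() - sample_M(H - dH).mean()) / (2.0 * dH)

chi_exact = N * beta / np.cosh(beta * H)**2               # analytic susceptibility of this model
print(f"fluctuation: {chi_fluct:.1f}   response: {chi_resp:.1f}   exact: {chi_exact:.1f}")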
18.3 The Langevin Equation

The erratic motion of a Brownian particle of mass m can be described by the Langevin equation,

m dv(t)/dt = −mξ v(t) + f(t),   (18.13)

where ξ is a friction rate, −mξv(t) the systematic drag force exerted by the medium, and f(t) the rapidly fluctuating random force due to collisions with the solvent molecules. Let us ignore, for the moment, the effect of the random force; this situation is often referred to as quiescent to indicate that the suspending fluid is non-fluctuating. The Langevin equation is then
dv(t)/dt = −ξ v(t)   ⇒   v(t) = v₀ exp(−ξt),   (18.14)
where the initial velocity v 0 is an integration constant. We have now found a physical interpretation of
the parameter ξ: on a timescale O(ξ −1 ) the drag force reduces the initial velocity considerably, such that
after a period of a few ξ −1 the particle has essentially come to rest. The situation is more interesting
when f is not ignored. In that case one checks that the solution of the Langevin equation (18.13) reads
v(t) = v₀ exp(−ξt) + (1/m) ∫₀^t ds f(s) exp[ξ(s − t)].   (18.15)
That is, the solution depends on the details of f in the time interval [0, t]. Figure 18.1 shows an
illustration of the effect of the noise on the velocity decay.
Figure 18.1: Illustration of the velocity decay from the Langevin equation without thermal noise (blue,
dashed) and with thermal noise (red, solid) curves, respectively.
Even though we have now found the exact solution, we cannot say too much about the dynamics of
the particle as long as we do not know the details of f (t). However, it seems physically reasonable to
assume that the time average ⟨v 0 · f (t)⟩ = 0 for all t, since one expects that the random force and the
initial velocity are uncorrelated for all elapsed times. As a consequence, we can write for the correlation
between the initial velocity and the velocity at time t that
⟨v(t) · v₀⟩ = ⟨|v₀|²⟩ exp(−ξt) = (3kB T/m) exp(−ξt),   (18.16)
where the second equality follows from the equipartition theorem, (m/2)⟨v₀ · v₀⟩ = 3kB T/2, and we have
implicitly assumed that t > 0, which is rather important in order to properly account for causality! In
equilibrium any time correlation function only depends on the time-difference that elapsed between the
initial and final time, and therefore we can rewrite Eq. (18.16) slightly more generally as
⟨v(t) · v(t′)⟩ = (3kB T/m) exp(−ξ|t − t′|).   (18.17)
Note that a comparison of Eq. (18.17) with Eq. (18.15) shows that ⟨f(s) ⊗ f(s′)⟩ ≠ 0, i.e., the random forces are correlated (but only for very short time intervals s − s′, or even for s = s′ only). We now use
Eq. (18.17) to calculate the typical distance that the particle has moved away from its initial position
after a time t. Denoting the particle’s center-of-mass position at time t by R(t), it is trivial to write
R(t) = R(0) + ∫₀^t ds v(s).   (18.18)
Since ⟨v(t)⟩ = 0 in the absence of any macroscopic flow, we have ⟨R(t) − R(0)⟩ = 0, i.e., the random motion of the particle does not have a preferred direction. The mean squared distance, however, does not vanish. Combining Eqs. (18.17) and (18.18), one finds ballistic behavior, ⟨|R(t) − R(0)|²⟩ ≈ (3kB T/m) t², for times t ≪ ξ⁻¹, which crosses over to diffusive behavior,

⟨|R(t) − R(0)|²⟩ ≈ 6 kB T t/(mξ),   (18.19)

for times t ≫ ξ⁻¹.
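The crossover from ballistic to diffusive motion described here is easy to verify numerically. The sketch below integrates the one-dimensional Langevin equation with a simple Euler-Maruyama scheme in reduced units (m = kB T = ξ = 1); the noise strength ⟨f(t)f(t′)⟩ = 2mξ kB T δ(t − t′) used here is the standard choice that reproduces the equipartition value of ⟨v²⟩, and the time step and particle number are illustrative.

import numpy as np

# Euler-Maruyama integration of the 1D Langevin equation m dv/dt = -m*xi*v + f(t), in
# reduced units m = k_B T = xi = 1; <f(t)f(t')> = 2*m*xi*k_B*T*delta(t-t') is assumed.
rng = np.random.default_rng(1)
m = kBT = xi = 1.0
dt, n_steps, n_part = 1.0e-3, 20_000, 2_000

v = rng.normal(0.0, np.sqrt(kBT / m), n_part)   # equilibrium (Maxwellian) initial velocities
x = np.zeros(n_part)
v0 = v.copy()

sigma = np.sqrt(2.0 * m * xi * kBT * dt) / m    # standard deviation of the velocity kick
report = {200, 1000, 5000, 20_000}
for step in range(1, n_steps + 1):
    v += -xi * v * dt + sigma * rng.normal(size=n_part)
    x += v * dt
    if step in report:
        t = step * dt
        print(f"t = {t:5.1f}   <v(t)v(0)> = {np.mean(v * v0):+.3f} (theory {np.exp(-xi * t):.3f})"
              f"   MSD/(2t) = {np.mean(x**2) / (2.0 * t):.3f}   (kBT/(m*xi) = 1)")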
Before we come to the dynamic fluctuation-dissipation theorem, we will first examine the physical
consequences of the above formalism. We can calculate the typical time scale tB ≡ ξ −1 , the Brownian
time, at which the crossover from ballistic to Brownian motion takes place. Using Stokes law, mξ = 6πηa
as discussed above, we find that the Brownian time tB = 1/ξ is given by

tB = m/(6πηa).   (18.20)
We can now insert a few typical numbers for, e.g., a colloidal particle in water at room temperature: radius a = 1 µm, mass density 1 g/cm³, and solvent viscosity η = 10⁻³ kg/(m s). This yields tB ≈ 10⁻⁷ s. The corresponding distance that is traveled during this time is ℓB = tB √⟨v₀²⟩ ≈ 10⁻¹⁰ m, where we used the equipartition result and that kB T ≃ 10⁻²¹ J at room temperature. From this we can conclude
that ℓB ≪ a, i.e., the ballistic dynamics only lasts for an extremely short period of time, and during this
period the particle only travels a tiny distance compared to its own size. For this reason, we can ignore
the ballistic short-time behavior for all practical purposes, and instead focus on the long-time diffusive
dynamics for colloidal particles. This means that on the intermediate colloidal scale, the Langevin
equation may be replaced by the Brownian dynamics equation mξv(t) = f (t), which represents the
overdamped dynamics only. We will see that for active particles, which are the focus of the last section,
another ballistic regime will appear.
For the dynamic case, we consider a perturbation that was applied in the distant past and is switched off at t = 0, such that the Hamiltonian remains unperturbed from t = 0 onward. That is, H′(Γ, t) = H(Γ) + λΘ(−t)A(Γ), with A(Γ) an observable and H and H′ the original and perturbed Hamiltonian, respectively. Lastly, Θ denotes the Heaviside step function, Θ(t) = 1 for t > 0 and 0 for t < 0. What we want to know is how the system
relaxes back toward equilibrium, once the perturbing force is switched off. This will give us the same
insights for the situation where we switch the disturbance on, but it is conceptually simpler.
The initial state of the system is well described by the static formalism of Eqs. (18.4) and (18.5), with
H′ as the appropriate Hamiltonian. After the perturbation is switched off, however, the system will
evolve according to H. To first order in λ, using the same line of argument as for the static case, the
time evolution of B for t > 0 with respect to the unperturbed system is given by

⟨B(t)⟩_{H′} = ⟨B(t)⟩_H + βλ C_H(B(t)A(0)),   (18.21)

with C_H(B(t)A(0)) = ⟨B(t)A(0)⟩_H − ⟨B(t)⟩_H⟨A(0)⟩_H the connected dynamic correlation function.
The interpretation of Eq. (18.21) is that of a difference in averages between (i) a system that up to time
t = 0 has come into equilibrium with the perturbed Hamiltonian H′ — this is why it was necessary to
let time run in the domain (−∞, 0] — and (ii) a system which has remained in equilibrium with respect
to H. Thus, the average ⟨·⟩H′ is not truly an ensemble average over the entire time line. It is far more
appropriate to write ∆B(t) ≡ ⟨B(t)⟩H′ − ⟨B(t)⟩H and work in terms of ∆B(t) only. However, the above
argument is how the derivation is presented in all textbooks the author of this chapter is familiar with.
The interpretation of CH is that of the relaxation of an out-of-equilibrium B (in equilibrium with H′ )
relaxing toward equilibrium with H.
We still need to introduce the dynamic susceptibility to make progress toward the fluctuation-dissipation relation we seek. Briefly assume that we have an arbitrarily time-dependent λ(t). The change in the observable B then follows from the dynamic susceptibility χBA(t, s), with t the time at which we probe the response to a perturbation applied at time s, integrated over time against the applied field:
∆B(t) = ∫_{−∞}^{t} ds χBA(t, s) λ(s).   (18.23)
Note that the susceptibility is defined with respect to a system in equilibrium, hence χBA(t, s) = χBA(t − s), i.e., it must be a function of the time difference only. In addition, the susceptibility cannot look into the future (causality), so that χBA(t − s) is nonzero only for t > s, and is zero otherwise. Now we return to
our system with λ(t) = λΘ(−t). Here we apply the force only for t < 0, so that
∆B(t) = ∫_{−∞}^{0} ds χBA(t − s) λ,   (18.24)
which only applies for t > 0. Taking the time derivative of the two forms of ∆B, we arrive at
∂∆B/∂t = βλ (∂/∂t) C_H(B(t)A(0)) = −λ χBA(t)   ⇒   (18.26)

χBA(t) = −β (∂/∂t) C_H(B(t)A(0)),   (18.27)
for t > 0 and χBA (t) = 0 otherwise. For B = A we find χ(t) ≡ χAA (t) = −β(∂/∂t)⟨δA(t)δA(0)⟩H .
This expresses that the way a system responds in time to a perturbing potential is related to the temporal
nature of thermal excursions about its equilibrium. Returning to Fig. 18.1, the decay time of the velocity
is exactly the time over which the particle decorrelates with respect to its original position. The above
derivation is not quite as elegant as it could be, but a more general derivation requires more sophisticated
mathematical techniques. In such a derivation, a Fourier transform is taken and the resulting frequency-
based dynamic fluctuation-dissipation theorem is related to the spectral density of the system via the
Wiener-Khinchin theorem. This makes the connection between fluctuations and dissipation explicit.
As a concrete example, take the Brownian particle of the previous section, with B = v and A = x, so that the perturbation V = −λx corresponds to a constant applied force. Using time-translation invariance of the equilibrium correlation functions, we then have

(∂/∂t) C_H(v(t)x(0)) = −⟨v(t)v(0)⟩_H   ⇒   χ_vx(t) = β ⟨v(t)v(0)⟩_H.   (18.29)
The above expression already makes a clear connection between fluctuations in v and the mobility that
relates v to an applied force f .
We can now ask ourselves what the expression for the susceptibility is. The change of the particle
velocity due to the applied force can be related to the susceptibility according to
v(t) = f ∫_t^∞ ds χ_vx(s).   (18.30)
The Langevin equation (18.13) further tells us how the speed would decrease upon switching off the force f at t = 0 in the absence of fluctuations, namely v(t) = v(0) exp(−ξt) = f exp(−ξt)/(mξ). Together with Eq. (18.30) this implies that

χ_vx(t) = exp(−ξt)/m,   (18.31)

and hence

⟨v(t)v(0)⟩_H = (kB T/m) exp(−ξt),   (18.32)
which only holds for t > 0 and is an expression of the memory of the noise. This is exactly and
reassuringly the expression of Eq. (18.17). The system loses its memory with a characteristic time ξ −1 .
Lastly, we obtain the Stokes-Einstein relation (18.1) for a spherical particle. Note that we have used
mξv(0) = f to arrive at Eq. (18.31). However, we know that the velocity must be that of a sphere that
has been subjected to f from time t = −∞ onward, thus fully relaxed to its terminal velocity. For a
small sphere in a fluid medium, the hydrodynamic Stokes equation states that 6πηav(0) = f , so that
mξ = 6πηa and we arrive at
⟨|x(t) − x(0)|²⟩ = 2 [kB T/(6πηa)] t,   (18.33)
for t ≫ m/(6πηa). Here, we have used translational invariance on the correlation and double integration
over time of Eq. (18.32) along the same lines as in the previous section. Since the above discussion was
completely in terms of a 1-dimensional problem, we expect ⟨|x(t) − x(0)|2 ⟩ = 2Dt, thus recovering the
desired form of Eq. (18.1).
Consider a hypothetical fixed volume V of arbitrary shape with linear dimensions small compared to
the system size, but large compared to the particle’s size; the system
may be described by a continuum. The number of particles in this volume is given, at time t, by ∫_V dr ρ(r, t). Since particles cannot be
destroyed or created, the only way that the number of particles in V can change is by a net flux of
particles into or out of the surface of V . The flux of particles is called j(r, t) here, and is such that
j · ds is the number of particles that flows through a small surface element ds per unit time, where the
orientation of ds is parallel to its normal. We can now write
(∂/∂t) ∫_V dr ρ(r, t) = − ∫_S ds · j(r, t) = − ∫_V dr ∇ · j(r, t).   (18.34)
Here, S denotes the surface of V , ds is an outward pointing surface element, and ∇ denotes the gradient
with respect to r. We also employed Gauss’ theorem to obtain the second equality. Note that the minus
sign in Eq. (18.34) indicates that the number of particles in V increases if the flux is antiparallel with the
outgoing surface normal, and decreases if it is parallel. Because the volume V is arbitrary, Eq. (18.34)
should hold for any V and hence we have the continuity equation
∂ρ(r, t)/∂t = −∇ · j(r, t),   (18.35)
which provides one relation between the two fields ρ(r, t) and j(r, t). Thus, a second equation is needed
before a solution can be found.
Perhaps the simplest case emerges when the particle flux is assumed to be proportional to the (negative
of the) concentration gradient, j ∝ −∇ρ, which describes that particles tend to flow from high to low
concentrations. We call the proportionality factor D — we will see that D turns out to be the diffusion
coefficient introduced before — and hence we have

j(r, t) = −D ∇ρ(r, t),   (18.36)

which is referred to as Fick’s law and is the required second relation between ρ and j. Combining it
with the continuity equation leads to the partial differential equation known as the diffusion equation
∂ρ(r, t)/∂t = D ∇²ρ(r, t),   (18.37)
which can be solved, in principle, if an initial density profile ρ(r, t = 0) is given.
Figure 18.2: A cross section of the density profile for an initially δ-localized particle density, which
spreads out over time, from red to blue.
We consider the particular case ρ(r, t = 0) = N δ(r), i.e., all N particles are initially in the origin. The
solution of the diffusion equation (18.37) can then be written as
ρ(r, t) = [N/(4πDt)^{3/2}] exp(−r²/(4Dt)),   (18.38)
which is illustrated in Fig. 18.2. In three dimensions, we can decompose r 2 = x2 + y 2 + z 2 , from which
we see that ρ(r, t) = f (x)f (y)f (z) with f (x) ∝ exp(−x2 /4Dt). That is, we have a Gaussian with a
time-dependent variance given by 2Dt. From this we find the mean-square displacement of the particles
⟨r 2 ⟩ = 6Dt, which upon comparison with Eqs. 18.19 and 18.1 shows that D as defined in Eq. (18.36) is
indeed the diffusion coefficient.
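The spreading Gaussian of Eq. (18.38) is also easily reproduced numerically. The fragment below integrates the diffusion equation (18.37) in one dimension with an explicit finite-difference (FTCS) scheme, starting from a sharply peaked density, and checks that the variance of the profile grows as 2Dt; the grid, time step, and the value D = 1 are illustrative choices.

import numpy as np

# Explicit (FTCS) integration of the 1D diffusion equation (18.37), starting from a
# near-delta initial condition; the variance of the profile should follow 2*D*t.
D, L, nx = 1.0, 40.0, 801
x = np.linspace(-L / 2, L / 2, nx)
dx = x[1] - x[0]
dt = 0.2 * dx**2 / D                # respects the FTCS stability bound dt <= dx^2 / (2 D)

rho = np.zeros(nx)
rho[nx // 2] = 1.0 / dx             # approximate delta-function initial density

t = 0.0
for step in range(1, 20_001):
    lap = (np.roll(rho, 1) - 2.0 * rho + np.roll(rho, -1)) / dx**2   # periodic boundaries
    rho += D * dt * lap
    t += dt
    if step % 5_000 == 0:
        var = np.sum(x**2 * rho) / np.sum(rho)
        print(f"t = {t:6.2f}   <x^2> = {var:7.3f}   2*D*t = {2.0 * D * t:7.3f}")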
effects. While a full description of these efforts goes beyond the scope of these notes, we will have a
sneak peek at one of the most basic models for self-propulsion of bacteria.
Bacteria are mostly in the 1 to 10 µm size range, though there are single-celled organisms, such as Thiomargarita namibiensis, that can grow up to 0.3 mm. Clearly, overdamped dynamics is applicable to describe the motion of a single bacterium. In addition, it turns out that bacteria like to live near
surfaces², which implies that their motion is usually in plane. The bacterium’s motion may thus be described by specifying the evolution of the position of its center of mass r(t) and its orientation n̂, which can equivalently be specified by an angle θ for in-plane 2D motion.
Figure 18.3: Reduced mean-squared displacement ⟨r²(t)⟩/(4Dt τ) as a function of t/τ for the Active Brownian Model, for increasing self-propulsion speed v0; the lowest curve corresponds to Pe = 0 (a passive particle).
We start by specifying the Brownian dynamics equation for the bacterium’s velocity v(t) = ṙ(t):
γ_t [v(t) − v0 n̂(t)] = f_t(t).   (18.39)
Here, γt is the hydrodynamic translational friction coefficient (generally a tensor for a shape-anisotropic
particle) with associated diffusion coefficient Dt = kB T /γt . The parameter v0 specifies the constant
self-propulsion speed, which points in the direction n̂ that co-moves with the swimmer, see the inset to
Fig. 18.3 for an example. Lastly, f t is the thermal noise associated with translational motion as before:
⟨f t (t)⟩ = 0 and ⟨f t (t) · f t (t′ )⟩ = 4kB T γt δ(t − t′ ). The direction in which the bacterium points can be
updated either randomly or according to some more biorealistic update rule, such as a run-and-tumble
process³. For the sake of simplicity, reorientation on the basis of Brownian noise is typically used, which is also conveniently what we have studied thus far in the notes. Let n̂ be specified by the angle
θ(t) (2D motion), then
γr θ̇(t) = fr (t), (18.40)
with γr the rotational friction coefficient and ⟨fr (t)⟩ = 0 and ⟨fr (t)fr (t′ )⟩ = 2kB T γr δ(t − t′ ). Note
that the (inverse) rotational diffusion coefficient Dr−1 = γr /kB T has a dimension of time and is often
2 The discussion as to why this is the case exactly is still ongoing, but it is very relevant to medical and commercial problems.
3 For example, in Escherichia coli straight swimming (running) can be accomplished by bundling all the self-propelling flagella into one helical ‘braid’, while tumbles are accomplished by the bacterium pulling one of the flagella out of the braid, reorienting, and re-braiding.
called the rotational diffusion time τ. The resulting decay of the orientational correlations may be written as ⟨n̂(t) · n̂(t′)⟩ = exp(−|t − t′|/τ). These two equations specify what is known in the literature as the (2D) Active Brownian Model; variants exclude translational diffusion or use run-and-tumble statistics.
From the Active Brownian Model, it is straightforward to compute the translational mean-squared displacement. Working out the math, as you will do in one of the exercises, we find

⟨|r(t) − r(0)|²⟩ = 4 Dt t + 2 v0² τ [ t − τ(1 − exp(−t/τ)) ].   (18.41)

This implies that there is another ballistic regime of motion, associated with the self-propulsion of the particle: at times short compared to the reorientation time the active contribution reduces to v0² t². At times longer than the reorientation time, there is a fully diffusive regime with an enhanced diffusion coefficient Deff = Dt + v0²τ/2. The reorientation time is thus a control parameter
for the transition between active ballistic and (enhanced) diffusive motion; see Fig. 18.3 for examples of full mean-squared-displacement curves. Note that we can rewrite the result of Eq. (18.41) by introducing a Péclet number. Péclet numbers are dimensionless combinations that (typically) indicate the relative contribution of persistent to diffusive motion, and possible combinations here are: (i) Pe = v0 a/Dt, with a the size of the bacterium, and (ii) Pe = v0/√(Dt Dr). The former Pe number compares the time it takes to
self-propel one particle radius, with the time it takes the particle to diffuse the same distance, while the
latter does not make explicit reference to the size of the particle, i.e., it may be used in a point-particle
description. Clearly, whenever Pe ≫ 1 it is justified to reduce the above dynamics further and keep only the rotational noise term, ignoring contributions from translational diffusion. This also becomes clear by examining the evolution of the mean-squared-displacement curves in Fig. 18.3 with increasing Pe.
The above simple result of substantial enhancement of the diffusion coefficient has a number of impli-
cations. Microfluidic mixing turns out to be exceptionally time consuming due to the laminarity of fluid flows, meaning that mixing only takes place through diffusion, rather than via turbulence, as is the case
on our length scale. Bacteria and artificial variants thereof are proposed to enhance the diffusion in
the fluid medium, thereby leading to faster mixing. There are many more interesting aspects of active
matter, including how the Active Brownian Model together with simple interaction rules can give rise
to clustering, but this unfortunately goes beyond the scope of these lecture notes.
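To close this section, the result of Eq. (18.41) can be checked with a direct simulation of Eqs. (18.39) and (18.40). The sketch below uses a simple Euler discretization in reduced units (Dt = 1 and τ = 1/Dr = 1), with v0 setting the Péclet number; the parameter values, time step, and particle number are again purely illustrative.

import numpy as np

# Euler integration of the 2D Active Brownian Model, Eqs. (18.39) and (18.40), in reduced
# units D_t = 1 and tau = 1/D_r = 1; the self-propulsion speed v0 then sets the Peclet number.
rng = np.random.default_rng(2)
Dt, tau, v0 = 1.0, 1.0, 10.0
dt, n_steps, n_part = 1.0e-3, 20_000, 5_000

angle = rng.uniform(0.0, 2.0 * np.pi, n_part)   # orientation angle theta of n_hat
pos = np.zeros((n_part, 2))

for step in range(1, n_steps + 1):
    n_hat = np.column_stack((np.cos(angle), np.sin(angle)))
    pos += v0 * n_hat * dt + np.sqrt(2.0 * Dt * dt) * rng.normal(size=(n_part, 2))
    angle += np.sqrt(2.0 * dt / tau) * rng.normal(size=n_part)
    if step % 5_000 == 0:
        t = step * dt
        msd = np.mean(np.sum(pos**2, axis=1))
        theory = 4.0 * Dt * t + 2.0 * v0**2 * tau * (t - tau * (1.0 - np.exp(-t / tau)))
        print(f"t = {t:4.1f}   MSD = {msd:9.2f}   Eq. (18.41): {theory:9.2f}")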
18.7 Exercises
Q98. The Diffusion Equation
The expression ∂ρ/∂t = D∇2 ρ describes the evolution of the particle density ρ(r, t). In this
exercise, we will use Fourier transforms to obtain the solution to this equation, given that ρ0 (r) =
ρ(r, t = 0). The Fourier transform of a function f(r) is defined as f̂(k) = ∫ dr exp(ik · r) f(r).
with k² = k · k.
(c) Compute ρ̂0 (k) for ρ0 (r) = N δ(r).
(d) Compute ρ(r, t) from the answer to (c) and from this the mean-squared displacement
⟨r²(t)⟩ = (1/N) ∫ dr r² ρ(r, t).   (18.43)
Use the equilibrium distribution to rewrite the ρ- and U -dependent terms and obtain the Stokes-
Einstein relation.
(a) The number of forward steps is nf and the number of backward steps is nb. Express nf and nb in terms of n and k.
(b) In how many ways can nf and nb be chosen from the n steps?
(c) What is the probability of one of those combinations?
(d) Derive
W(k, n) = n! 2⁻ⁿ / { [(n + k)/2]! [(n − k)/2]! }.   (18.45)
(e) Assume n ≫ 1 and n ≫ k (is this reasonable?). Use the Stirling approximation and make use
of the second-order Taylor approximation for the logarithm ln(1 + x) ≈ x − x2 /2 to show that
W(k, n) ≈ √(2/(πn)) exp(−k²/(2n)).   (18.46)
πn
The probability of returning to the origin after a large number of steps n is thus W(0, n) ≈ √(2/(πn)).
(f) Define the covered distance as x = kx0 and introduce the time it takes to take n steps t = nt0 .
Show that the probability W (x, t)dx that the particle is between x and x + dx at a time t is
given by
W(x, t) = [1/√(4πDt)] exp(−x²/(4Dt)).   (18.47)
This describes diffusion with diffusion coefficient D = x0²/(2t0).
(g) Qualitatively sketch the behavior of W (x, t) as a function of t for a fixed x, as well as a
function of x for a fixed t.
(h) Show that W (x, t) satisfies
∂W/∂t = D ∂²W/∂x².   (18.48)
Q101. Active Brownian Model
Derive Eq. (18.41) using a variant of the approach that was employed to determine the mean-
squared displacement of the Langevin equation in Chapter 18. The first step should be to show
that ⟨n̂(t) · n̂(t′ )⟩ = exp(−|t − t′ |/τ ). Next, use this result to work out ⟨r(t) · r(t′ )⟩. Finally, Taylor
expand the obtained expression for large and small times compared to the reorientation time τ .