Information geometry in vapour–liquid equilibrium

2009 J. Phys. A: Math. Theor. 42 023001

(http://iopscience.iop.org/1751-8121/42/2/023001)

IOP PUBLISHING JOURNAL OF PHYSICS A: MATHEMATICAL AND THEORETICAL

J. Phys. A: Math. Theor. 42 (2009) 023001 (33pp) doi:10.1088/1751-8113/42/2/023001

TOPICAL REVIEW

Information geometry in vapour–liquid equilibrium

Dorje C Brody1 and Daniel W Hook2


1 Department of Mathematics, Imperial College London, London SW7 2AZ, UK
2 Blackett Laboratory, Imperial College London, London SW7 2AZ, UK

Received 9 September 2008, in final form 4 November 2008


Published 4 December 2008
Online at stacks.iop.org/JPhysA/42/023001

Abstract
Using the square-root map p → √p, a probability density function p can
be represented as a point of the unit sphere S in the Hilbert space of
square-integrable functions. If the density function depends smoothly on a
set of parameters, the image of the map forms a Riemannian submanifold
M ⊂ S . The metric on M induced by the ambient spherical geometry
of S is the Fisher information matrix. Statistical properties of the system
modelled by a parametric density function p can then be expressed in terms of
information geometry. An elementary introduction to information geometry is
presented, followed by a precise geometric characterization of the family of
Gaussian density functions. When the parametric density function describes
the equilibrium state of a physical system, certain physical characteristics can
be identified with geometric features of the associated information manifold
M. Applying this idea, the properties of vapour–liquid phase transitions are
elucidated in geometrical terms. For an ideal gas, phase transitions are absent
and the geometry of M is flat. In this case, the solutions to the geodesic
equations yield the adiabatic equations of state. For a van der Waals gas, the
associated geometry of M is highly nontrivial. The scalar curvature of M
diverges along the spinodal boundary which envelops the unphysical region
in the phase diagram. The curvature is thus closely related to the stability of
the system.

PACS numbers: 02.40.Ky, 02.50.Tt, 05.20.−y, 05.70.Fh, 64.60.A, 64.70.F

1. Statistical geometry

This paper is an overview of the information-geometric description of vapour–liquid phase
transitions in equilibrium statistical mechanics. The present section begins with a reasonably
self-contained account of the relevant background material on information geometry. As an
illustrative example we shall examine in some detail the geometry of the space of Gaussian
density functions. The relation between the information measure of Fisher and that of Shannon
and Wiener is also briefly discussed. In later sections these ideas are applied to the
information-geometric characterization of the equilibrium properties of noninteracting and
interacting gas molecules. The relevant references are provided in the bibliographical notes in
section 4, where we also provide a brief and perhaps incomplete history of information geometry.

1751-8113/09/023001+33$30.00 © 2009 IOP Publishing Ltd Printed in the UK

1.1. From probability to geometry


The concept of ‘information geometry’ is a simple one which emerged from an attempt to
discriminate among different probabilities in statistical analysis. The idea can be sketched as
follows. Let $\{p_i\}_{i=1,2,\ldots,N}$ denote a set of probabilities satisfying

$$0 \le p_i \le 1 \quad\text{and}\quad \sum_{i=1}^{N} p_i = 1. \tag{1}$$

We introduce the following square-root map:

$$p_i \to \xi_i = \sqrt{p_i}. \tag{2}$$

By construction, the square-root probabilities $\{\xi_i\}$ satisfy the normalization condition

$$\sum_{i=1}^{N} \xi_i^2 = 1. \tag{3}$$

If we regard the variables $\{\xi_i\}_{i=1,2,\ldots,N}$ as the coordinates of a vector in an N-dimensional
Euclidean space $\mathbb{R}^N$, then the normalization condition (3) implies that the endpoint of the
vector $\{\xi_i\}$ lies on the unit sphere S in $\mathbb{R}^N$. Now suppose that $\{\eta_i\}_{i=1,2,\ldots,N}$ corresponds to a
second set of square-root probabilities. Then the vector $\{\eta_i\}$ also lies on the unit sphere in $\mathbb{R}^N$.
Hence, we can measure the relative separation or overlap of two sets of probabilities in terms
of the angle

$$\phi = \cos^{-1} \sum_{i=1}^{N} \xi_i \eta_i \tag{4}$$

between the associated square-root probability vectors. The angular separation φ clearly
vanishes if $\{\xi_i\}$ and $\{\eta_i\}$ are equal. Conversely, if $\{\xi_i\}$ and $\{\eta_i\}$ are orthogonal then φ achieves
its maximum value $\tfrac{1}{2}\pi$. The angular separation φ defined in (4) is known as the Bhattacharyya
spherical distance. Note that the cosine square of the spherical distance resembles the
transition probability in quantum mechanics modelled on a finite-dimensional Hilbert space.
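The angle (4) is straightforward to evaluate numerically. The following is a minimal sketch, not part of the original text; the function name `bhattacharyya_angle` and the sample vectors are our own illustrative choices.

```python
import math

def bhattacharyya_angle(p, q):
    """Spherical distance (4) between two discrete probability vectors."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    # Inner product of the square-root vectors xi_i = sqrt(p_i), eta_i = sqrt(q_i).
    overlap = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    # Clamp to [0, 1] so that rounding errors cannot push acos out of its domain.
    return math.acos(min(1.0, max(0.0, overlap)))

uniform = [0.25, 0.25, 0.25, 0.25]
same = bhattacharyya_angle(uniform, uniform)                 # equal vectors
orthogonal = bhattacharyya_angle([1, 0, 0, 0], [0, 1, 0, 0]) # disjoint supports
print(same, orthogonal)
```

Equal probability vectors give a vanishing angle, whereas vectors with disjoint support give the maximal value π/2, in agreement with the remarks above.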
We turn to the notion of the so-called statistical geometry, which arises from the
embedding of probability density functions in the Hilbert space of square-integrable functions.
In probability theory one typically deals with a probability density function p(x) on, say, the
real line $\mathbb{R}$. For the function $p : x \to p(x)$ to represent the density of some random variable X
we require that $p(x) \ge 0$ for all $x \in \mathbb{R}$ and that $\int_{\mathbb{R}} p(x)\,dx = 1$. If we consider the square-root
map

$$p(x) \to \xi(x) = \sqrt{p(x)}, \tag{5}$$

then the function ξ(x) defined in this way belongs to the space $\mathcal{H} = L^2(\mathbb{R})$ of square-integrable
functions on the real line. In other words, we embed the density functions in Hilbert space
via the square-root map (5). In particular, since the square-root density functions satisfy the
normalization condition

$$\int_{\mathbb{R}} \xi(x)^2\,dx = 1, \tag{6}$$

the images of the map lie on the unit sphere $S \subset \mathcal{H}$.

Figure 1. Bhattacharyya’s spherical distance. Two vectors $\xi_1(x)$ and $\xi_2(x)$, corresponding to a
pair of probability density functions $p_1(x)$ and $p_2(x)$, lie on the surface of the positive orthant of
the unit sphere S in Hilbert space. The spherical distance between two unit vectors is given by the
angle φ defined in equation (7).

The advantage of working in Hilbert space H rather than the space of density functions is
that H is a vector space endowed with various geometric features that are familiar from other
branches of physics, such as quantum mechanics or general relativity.
Suppose we have a pair of density functions p1 (x) and p2 (x), and wish to compare the
overlap or separation of these two density functions. If the associated Hilbert space vectors
are given respectively by $\xi_1(x)$ and $\xi_2(x)$, then the overlap is measured in terms of the inner
product $\int_{\mathbb{R}} \xi_1(x)\xi_2(x)\,dx$. Since $\|\xi_1\| = \|\xi_2\| = 1$, i.e. both vectors have unit norm, this
overlap is given by the cosine of the angular separation. It follows that the Bhattacharyya
spherical distance between two square-root density functions is

$$\phi = \cos^{-1} \int_{\mathbb{R}} \xi_1(x)\xi_2(x)\,dx. \tag{7}$$
This idea is illustrated schematically in figure 1.
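For a pair of normal densities the overlap integral in (7) can be checked numerically. The sketch below is an illustration added here, not taken from the original text: it compares a midpoint-rule estimate with the standard closed form for the overlap of two Gaussian densities. The function names, integration range and step count are assumptions chosen for convenience.

```python
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)

def overlap_numeric(mu1, s1, mu2, s2, lo=-50.0, hi=50.0, n=20_000):
    """Midpoint-rule estimate of the inner product of sqrt(p1) and sqrt(p2)."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += math.sqrt(gaussian_pdf(x, mu1, s1) * gaussian_pdf(x, mu2, s2))
    return total * h

def overlap_closed_form(mu1, s1, mu2, s2):
    """Closed form of the same integral, obtained by completing the square."""
    var_sum = s1 ** 2 + s2 ** 2
    return math.sqrt(2 * s1 * s2 / var_sum) * math.exp(-(mu1 - mu2) ** 2 / (4 * var_sum))

def spherical_distance(mu1, s1, mu2, s2):
    """Angle (7) between the two square-root densities."""
    return math.acos(min(1.0, overlap_closed_form(mu1, s1, mu2, s2)))

print(overlap_numeric(0.0, 1.0, 1.0, 2.0), overlap_closed_form(0.0, 1.0, 1.0, 2.0))
print(spherical_distance(0.0, 1.0, 1.0, 2.0))
```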

1.2. Parametric density and Fisher–Rao geometry


In theoretical statistics one typically deals with a parametric family of probability density
functions pθ (x) = p(x|θ ). Here θ denotes one or more real parameters. For example, a
Gaussian density function is characterized by two parameters, i.e. the mean μ and the variance
σ². For each value, or set of values, of θ we have the normalization condition $\int_{\mathbb{R}} p_\theta(x)\,dx = 1$.
In problems of statistical inference, it is often convenient to consider the log-likelihood
function

$$l_\theta(x) = \ln p(x|\theta). \tag{8}$$

However, in physics it is more natural to work with the square-root density function

$$\xi_\theta(x) = \sqrt{p(x|\theta)}, \tag{9}$$

since, as indicated above, this permits formulation of the problem in a real Hilbert-space
context. As before, for each given θ the density function is mapped to a point on the unit
sphere S ⊂ H by the prescription (9). If the value of θ is changed, the image under the map in
general also varies on S. We assume that the density function is at least twice differentiable
with respect to the parameters. Then as the parameters change continuously, the image point
on S will vary smoothly over a parametric subspace M of the sphere S.
Given a parametric subspace M ⊂ S the metric of the ambient sphere S induces a
Riemannian metric on the subspace in the usual way. This can be seen as follows. Recall that
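the Hilbert space inner product is defined by

$$\langle \xi, \eta \rangle = \int_{\mathbb{R}} \xi(x)\eta(x)\,dx. \tag{10}$$

Therefore, if we set

$$\xi(x) = \xi_\theta(x) \quad\text{and}\quad \eta(x) = \xi_\theta(x) + \partial_i \xi_\theta(x)\,d\theta^i, \tag{11}$$

where $\partial_i = \partial/\partial\theta^i$, then the squared distance $ds^2$ of the difference vector $\xi(x) - \eta(x)$ is given
by

$$ds^2 = \left( \int_{\mathbb{R}} \partial_i \xi_\theta(x)\,\partial_j \xi_\theta(x)\,dx \right) d\theta^i\,d\theta^j. \tag{12}$$

Before we proceed further with the derivation of the metric, let us introduce the statistical
notion of the Fisher information matrix, which is usually defined by

$$G_{ij} = \int_{\mathbb{R}} p_\theta(x)\,\partial_i l_\theta(x)\,\partial_j l_\theta(x)\,dx, \tag{13}$$
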
where lθ (x) is the log-likelihood density (8). The Fisher information matrix is important in
statistics because it provides a lower bound for the variance of a parameter estimate. Consider,
for example, the case of a one-parameter family of density functions. That is, we have a
density function pθ (x) that depends upon a single unknown parameter θ . The objective is thus
to estimate the parameter by performing observations. If T(x) is an unbiased estimator for θ,
i.e. if the expectation of T(x) with respect to $p_\theta(x)$ yields θ, then we have

$$\int_{\mathbb{R}} (T(x) - \theta)\,\xi_\theta(x)^2\,dx = 0. \tag{14}$$

Differentiating with respect to θ we obtain

$$\int_{\mathbb{R}} (T(x) - \theta)\,\xi_\theta(x)\,\partial_\theta \xi_\theta(x)\,dx = \frac{1}{2}, \tag{15}$$

where $\partial_\theta = \partial/\partial\theta$. By the Schwarz inequality

$$\left( \int_{\mathbb{R}} (T(x) - \theta)\,\xi_\theta(x)\,\partial_\theta \xi_\theta(x)\,dx \right)^2 \le \int_{\mathbb{R}} (T(x) - \theta)^2\,\xi_\theta(x)^2\,dx \int_{\mathbb{R}} (\partial_\theta \xi_\theta(x))^2\,dx \tag{16}$$

we thus find

$$\int_{\mathbb{R}} (T(x) - \theta)^2\,\xi_\theta(x)^2\,dx \ge \frac{1}{4\int_{\mathbb{R}} (\partial_\theta \xi_\theta(x))^2\,dx}. \tag{17}$$

Note that the left-hand side is the variance of the estimator, whereas the denominator
of the right-hand side is the one-parameter form of the Fisher information matrix. The
relation (17) provides a lower bound for the variance, and is known as the Cramér–Rao
inequality. Equality in (16) is attained only when the two vectors are proportional, that
is, $\partial_\theta \xi_\theta(x) = c\,(T(x) - \theta)\,\xi_\theta(x)$ for some constant c. By scaling θ we can set $c = \tfrac{1}{2}$ without
loss of generality. Hence, the lower bound of the variance is attained only if the square-root
density assumes an exponential form:

$$\xi_\theta(x) = \frac{\exp\left( \tfrac{1}{2}\theta T(x) \right)}{\left( \int_{\mathbb{R}} \exp(\theta T(x))\,dx \right)^{1/2}}. \tag{18}$$


The exponential family (18) plays an important role in the applications to statistical mechanics
considered below.
In a multi-parameter context the reciprocal of the Fisher information matrix determines
lower bounds for the variance in an analogous manner. From the geometrical viewpoint, the
significance of the Fisher information matrix is that it defines the induced Riemannian metric
on the parametric subspace M of the unit sphere S in H. Specifically, comparing (12) and
(13) we see that

$$ds^2 = \tfrac{1}{4}\,G_{ij}\,d\theta^i\,d\theta^j. \tag{19}$$

The metric $\tfrac{1}{4}G_{ij}$ on M will be referred to as the Fisher–Rao metric. The factor of a quarter
is purely a matter of convention, and the Fisher–Rao metric is thus given by a quarter of the
Fisher information matrix.
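The factor of a quarter in (19) can be verified numerically in a simple case. The following sketch is an illustration added here, assuming a normal density with fixed σ and the mean μ as the only parameter, for which the Fisher information is known to be 1/σ². The function names, finite-difference step and quadrature grid are our own choices.

```python
import math

def sqrt_density(x, mu, sigma):
    p = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)
    return math.sqrt(p)

def induced_metric_mu(mu, sigma, eps=1e-5, lo=-40.0, hi=40.0, n=40_000):
    """Estimate the integral of (d xi / d mu)^2 over x, i.e. the coefficient of
    d mu^2 in the induced line element (12)."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        # Central difference for the derivative of the square-root density.
        d_xi = (sqrt_density(x, mu + eps, sigma) - sqrt_density(x, mu - eps, sigma)) / (2 * eps)
        total += d_xi * d_xi
    return total * h

sigma = 1.5
fisher_mu_mu = 1.0 / sigma ** 2      # Fisher information (13) for the mean
induced = induced_metric_mu(0.3, sigma)
print(induced, fisher_mu_mu / 4)
```

The two printed numbers should agree to within the discretization error, which is what (19) asserts for this one-parameter family.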

1.3. Riemannian structure of the exponential family


We introduce here some elementary concepts in Riemannian geometry that are relevant to the
ensuing discussion. We note first that all equilibrium distributions that we consider here are
represented in the exponential form

$$p_\theta(x) = q(x) \exp\left( -\sum_i \theta^i H_i(x) - \psi(\theta) \right), \tag{20}$$

where $\{\theta^i\}_{i=1,2,\ldots}$ are parameters, q(x) represents the prescribed equilibrium state at $\theta^i = 0$
for all i, and the functions $\{H_i(x)\}_{i=1,2,\ldots}$ determine the form of the energy. In other words,
we shall only consider equilibrium states represented in the canonical form. The parameters
$\{\theta^i\}$ may include inverse temperature, chemical potential, pressure, magnetic field and so on,
whereas the functions $H_i(x)$ may represent system energy, particle number, system volume,
magnetization and so on. The variable x ranges over the phase space Γ of the system. The
function

$$\psi(\theta) = \ln \int_\Gamma \exp\left( -\sum_i \theta^i H_i(x) \right) q(x)\,dx \tag{21}$$

determines the overall normalization. We refer to ψ(θ) as the thermodynamic potential of the
system. It should be evident by inspection that

$$-\frac{\partial \psi}{\partial \theta^i} = \frac{\int_\Gamma H_i(x) \exp\left( -\sum_j \theta^j H_j(x) \right) q(x)\,dx}{\int_\Gamma \exp\left( -\sum_j \theta^j H_j(x) \right) q(x)\,dx} = \int_\Gamma H_i(x)\,p_\theta(x)\,dx. \tag{22}$$

In other words, the first derivative of ψ(θ ) with respect to θ i determines the expectation value
of Hi (x) in the equilibrium state (20). As we shall indicate below, analogous calculations
show that higher derivatives of the thermodynamic potential ψ(θ ) determine higher moments
of the functions {Hi (x)}.
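The relation between derivatives of ψ(θ) and moments of $H_i(x)$ can be illustrated on a finite state space, where the integrals reduce to sums. The sketch below is our own illustration with one parameter; the energy levels and reference weights are hypothetical values chosen only for the example, and the derivatives are taken by finite differences.

```python
import math

def psi(theta, energies, weights):
    """Thermodynamic potential (21) for a finite state space."""
    return math.log(sum(w * math.exp(-theta * e) for e, w in zip(energies, weights)))

def moments(theta, energies, weights):
    """Mean and variance of H under the canonical density (20)."""
    z = sum(w * math.exp(-theta * e) for e, w in zip(energies, weights))
    probs = [w * math.exp(-theta * e) / z for e, w in zip(energies, weights)]
    mean = sum(p * e for p, e in zip(probs, energies))
    var = sum(p * (e - mean) ** 2 for p, e in zip(probs, energies))
    return mean, var

energies = [0.0, 1.0, 2.0, 5.0]   # hypothetical values of H(x)
weights = [0.4, 0.3, 0.2, 0.1]    # hypothetical reference state q(x)
theta, eps = 0.7, 1e-4

# Central finite differences for the first and second derivatives of psi.
d1 = (psi(theta + eps, energies, weights) - psi(theta - eps, energies, weights)) / (2 * eps)
d2 = (psi(theta + eps, energies, weights) - 2 * psi(theta, energies, weights)
      + psi(theta - eps, energies, weights)) / eps ** 2
mean, var = moments(theta, energies, weights)
print(-d1, mean)   # minus the first derivative against the expectation of H
print(d2, var)     # second derivative against the variance of H
```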
If the equilibrium density function assumes the form (20), then the expressions for the
corresponding Fisher–Rao metric and the coefficients of the associated metric connection on
the statistical manifold M are simplified. Let us state the results first.
Proposition 1. For a density function of the exponential form (20) the Fisher–Rao metric $G_{ij}$
and the Christoffel symbols $\Gamma_{ijk} = G_{il}\Gamma^{l}_{jk}$ are given, respectively, by

$$G_{ij} = \partial_i \partial_j \psi(\theta) \tag{23}$$

and

$$\Gamma_{ijk} = \tfrac{1}{2}\,\partial_i \partial_j \partial_k \psi(\theta), \tag{24}$$

in terms of the canonical parameterization $\{\theta^i\}$ on M, where $\partial_i = \partial/\partial\theta^i$.


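The Christoffel symbols characterize the geodesics on M. Specifically, to find the shortest
path from a to b on M we consider the variational problem:

$$\delta \int_a^b ds = 0. \tag{25}$$

From (19) we find

$$4\,\delta\,ds^2 = \frac{\partial G_{ik}}{\partial \theta^l}\,d\theta^i\,d\theta^k\,\delta\theta^l + 2G_{ik}\,d\theta^i\,d\delta\theta^k. \tag{26}$$

Bearing in mind that the right-hand side of (26) equals $8\,ds\,\delta\,ds$, we obtain

$$\int_a^b \left( \frac{1}{2}\frac{d\theta^i}{ds}\frac{d\theta^k}{ds}\frac{\partial G_{ik}}{\partial \theta^l}\,\delta\theta^l + G_{ik}\frac{d\theta^i}{ds}\frac{d\delta\theta^k}{ds} \right) ds = 0. \tag{27}$$

Integrating the second term in the integrand by parts and writing $u^i = d\theta^i/ds$ we see that (27)
reduces to

$$\int_a^b \left( \frac{1}{2}u^i u^k \frac{\partial G_{ik}}{\partial \theta^l} - \frac{d}{ds}(G_{il}u^i) \right) \delta\theta^l\,ds = 0. \tag{28}$$

Since this must hold for arbitrary $\delta\theta^l$ we have

$$\frac{1}{2}u^i u^k \frac{\partial G_{ik}}{\partial \theta^l} - \frac{d}{ds}(G_{il}u^i) = 0. \tag{29}$$

Writing

$$\Gamma^{i}_{kl} = \frac{1}{2}G^{im}\left( \frac{\partial G_{mk}}{\partial \theta^l} + \frac{\partial G_{ml}}{\partial \theta^k} - \frac{\partial G_{kl}}{\partial \theta^m} \right) \tag{30}$$

for the Christoffel symbol we find that (29) can be expressed in the form

$$\frac{d^2\theta^i}{ds^2} + \Gamma^{i}_{kl}\frac{d\theta^k}{ds}\frac{d\theta^l}{ds} = 0. \tag{31}$$
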
This is the geodesic equation that determines the shortest paths on M. Owing to their
nonlinearity, geodesic equations do not generally admit elementary analytic solutions, although
in some cases one can solve (31) in the closed form, as in the Gaussian example discussed
below.
Proof of proposition 1. From (20) we have

$$\partial_i l_\theta(x) = -(H_i(x) + \partial_i \psi(\theta)). \tag{32}$$

On the other hand, differentiating the normalization condition

$$\int_\Gamma p_\theta(x)\,dx = 1 \tag{33}$$

once with respect to $\theta^i$ and using $\partial_i p_\theta(x) = p_\theta(x)\,\partial_i l_\theta(x)$ we obtain

$$\int_\Gamma p_\theta(x)\,(H_i(x) + \partial_i \psi(\theta))\,dx = 0, \tag{34}$$

whence it follows that the expectation of $H_i(x)$ with respect to $p_\theta(x)$ is given by $-\partial_i \psi(\theta)$, as
shown in (22). Differentiating (34) with respect to $\theta^j$, we find that

$$-\int_\Gamma p_\theta(x)\,(H_i(x) + \partial_i \psi(\theta))(H_j(x) + \partial_j \psi(\theta))\,dx + \partial_i \partial_j \psi(\theta) = 0. \tag{35}$$

In view of (32) and (13), this implies that the Fisher–Rao metric $G_{ij}$ is given by (23). From
(30) we have

$$\Gamma_{ikl} = \tfrac{1}{2}\left( \partial_l G_{ik} + \partial_k G_{il} - \partial_i G_{kl} \right), \tag{36}$$

but since the metric is given by (23) we immediately deduce expression (24) for the Christoffel
symbols. □

It is important to note that if we choose an alternative parameterization for M, then the
components of the metric tensor and the Christoffel symbol cannot be calculated using the
simple expressions given in proposition 1, and we must use the defining equations (13) and
(30) to determine these quantities. Also note that the metric tensor is the covariance matrix
of the functions $\{H_i(x)\}$, whereas the components of $\Gamma_{ijk}$ are third-order cross-moments of
$\{H_i(x)\}$.
In terms of the Christoffel symbols $\Gamma^{i}_{jk}$ the Riemann curvature tensor $R^{i}{}_{ljk}$ can be expressed
as

$$R^{i}{}_{ljk} = \partial_k \Gamma^{i}_{lj} - \partial_j \Gamma^{i}_{lk} - \Gamma^{i}_{jh}\Gamma^{h}_{lk} + \Gamma^{i}_{kh}\Gamma^{h}_{lj}. \tag{37}$$

The Riemann tensor encodes the information concerning the local geometry of M, and is
related to the parallel transport of vectors on M. In particular, if we define the covariant
derivative by

$$\nabla_j A_i = \frac{\partial A_i}{\partial \theta^j} - \Gamma^{k}_{ij} A_k, \tag{38}$$

then the commutator of the covariant derivatives defines the Riemann tensor:

$$\nabla_j \nabla_k A_i - \nabla_k \nabla_j A_i = A_l R^{l}{}_{ijk}. \tag{39}$$

The symmetry properties of the Riemann tensor can be derived by lowering the index with the
metric and writing $R_{ijkl} = G_{im}R^{m}{}_{jkl}$. Specifically, this is given by

$$R_{ijkl} = \frac{1}{2}\left( \frac{\partial^2 G_{il}}{\partial \theta^j \partial \theta^k} + \frac{\partial^2 G_{jk}}{\partial \theta^i \partial \theta^l} - \frac{\partial^2 G_{ik}}{\partial \theta^j \partial \theta^l} - \frac{\partial^2 G_{jl}}{\partial \theta^i \partial \theta^k} \right) + G_{nm}\left( \Gamma^{n}_{jk}\Gamma^{m}_{il} - \Gamma^{n}_{jl}\Gamma^{m}_{ik} \right), \tag{40}$$

whence it follows that $R_{ijkl} = R_{klij}$. Along with the relation $R^{i}{}_{jkl} = -R^{i}{}_{jlk}$ that follows
from (37) we find that $R_{ijkl} = -R_{ijlk} = -R_{jikl}$. Therefore, the components of $R_{ijkl}$ vanish
when i = j or k = l. In the case of a two-dimensional manifold M the only nonvanishing
components of the Riemann tensor are given by

$$R_{1212} = -R_{1221} = -R_{2112} = R_{2121}. \tag{41}$$

In other words, $R_{1212}$ is the sole independent component in two dimensions.
Given a Riemann tensor $R^{i}{}_{jkl}$ we define the associated Ricci tensor by the symmetric
expression

$$R_{jl} = R^{k}{}_{jkl}. \tag{42}$$

Equivalently, we can write $R_{jl} = G^{ik}R_{ijkl}$. A further contraction with the metric defines the
scalar curvature:

$$R = G^{jl}R_{jl}. \tag{43}$$

We call (43) the Ricci scalar curvature. If the Ricci tensor $R_{jl}$ is proportional to the metric
tensor $G_{jl}$ then the manifold M is called an Einstein space. This is because the metric of such
a space satisfies the vacuum Einstein equation

$$R^{i}{}_{k} - \tfrac{1}{2}\,\delta^{i}_{k}\,R = 0, \tag{44}$$

where $R^{i}{}_{k} = G^{il}R_{lk}$. The significance of the Einstein equation in statistics or statistical
mechanics can be seen if we relax the normalization condition on the square-root density
function ξ(x) and thus eliminate the physically irrelevant degree of freedom associated with
the norm $\|\xi(x)\|$. Specifically, if T(x) represents an observable function on the phase space
Γ then its expectation value in the generic ‘state’ ξ(x) is

$$\langle T(x) \rangle = \frac{\int_\Gamma \xi(x)\,T(x)\,\xi(x)\,dx}{\int_\Gamma \xi(x)^2\,dx}. \tag{45}$$

Evidently, the expectation so defined is invariant under the scale transformation ξ(x) → λξ(x),
where λ is an arbitrary nonzero number. Thus, all the relevant statistical information is encoded
in the direction of the vector $\xi(x) \in \mathcal{H}$, irrespective of its length. Via the identification
ξ(x) ∼ λξ(x) we obtain a space of rays in $\mathcal{H}$, which is known as the real projective Hilbert
space. Suppose now that we consider the Einstein equation (44) for the metric on the projective
Hilbert space. Then there is a unique solution which is induced by the infinitesimal form of
the Bhattacharyya spherical distance (7). Specifically, this is obtained by setting $\xi_1(x) = \xi(x)$
and $\xi_2(x) = \xi(x) + d\xi(x)$ in

$$\cos^2\phi = \frac{\left( \int_{\mathbb{R}} \xi_1(x)\xi_2(x)\,dx \right)^2}{\int_{\mathbb{R}} \xi_1(x)\xi_1(x)\,dx \int_{\mathbb{R}} \xi_2(x)\xi_2(x)\,dx}, \tag{46}$$

Taylor expanding each side, and retaining terms of quadratic order. Then with the notation of
(10) we can write

$$\phi^2 = \frac{\langle d\xi, d\xi \rangle}{\langle \xi, \xi \rangle} - \frac{\langle \xi, d\xi \rangle^2}{\langle \xi, \xi \rangle^2}, \tag{47}$$

which defines a metric on the projective Hilbert space. It follows that the Einstein equation
uniquely determines the probabilistic properties of the space of densities.
For a statistical manifold M associated with a distribution of the exponential type (20),
the Riemann tensor assumes a simple form because the first two terms in (37) cancel, and only
the contractions of the Christoffel symbols remain.
The examples of statistical mechanical systems considered here are parameterized by a
pair of external variables, so that the statistical manifold M is two dimensional. In this case,
the expression for the scalar curvature admits further simplifications. Specifically, we have
the following:

Proposition 2. In terms of the canonical parameterization $(\theta^1, \theta^2)$, the scalar curvature of a
two-dimensional statistical model corresponding to the density function (20) is given by the
determinant

$$R = -\frac{1}{2G^2}\begin{vmatrix} \psi_{11}(\theta) & \psi_{12}(\theta) & \psi_{22}(\theta) \\ \psi_{111}(\theta) & \psi_{112}(\theta) & \psi_{122}(\theta) \\ \psi_{112}(\theta) & \psi_{122}(\theta) & \psi_{222}(\theta) \end{vmatrix}, \tag{48}$$

where $G = \det(G_{ij})$ is the determinant of the Fisher–Rao metric, and where $\psi_{12}(\theta) =
\partial^2\psi(\theta)/\partial\theta^1\partial\theta^2$, $\psi_{112}(\theta) = \partial^3\psi(\theta)/\partial\theta^1\partial\theta^1\partial\theta^2$, and so on.

Proof. Recall that the scalar curvature is determined by the contraction

$$R_{ijkl} = \left( \Gamma_{kmi}\Gamma_{jln} - \Gamma_{kmj}\Gamma_{iln} \right) G^{mn}. \tag{49}$$

Substituting (24) and the inverse of (23) into (49), we obtain (48) after some rearrangement
of terms. □

1.4. Geometry of Gaussian distributions


As an elementary illustrative example, consider the Gaussian (normal) distribution $N(\mu, \sigma)$
on the real line $\mathbb{R}$ with mean μ and standard deviation σ > 0. For the parameterized density
function we have

$$p(x|\mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right). \tag{50}$$

The normal density function can be rewritten in the canonical form

$$p(x|\theta) = \exp[-\theta^1 x^2 - \theta^2 x - \psi(\theta)], \tag{51}$$

where

$$\theta^1 = \frac{1}{2\sigma^2}, \quad \theta^2 = -\frac{\mu}{\sigma^2} \quad\text{and}\quad \psi(\theta) = \frac{(\theta^2)^2}{4\theta^1} + \frac{1}{2}\ln\frac{\pi}{\theta^1}. \tag{52}$$

By differentiating ψ(θ) with respect to the parameters $\{\theta^i\}$ we can determine the components
of the metric tensor $G_{ij}(\theta)$ in the coordinate system specified by the canonical parameterization
$\{\theta^i\}$.
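As a consistency check on the canonical form, one can differentiate ψ(θ) numerically and compare the result with the covariance matrix of $H_1 = x^2$ and $H_2 = x$. The sketch below is our own illustration: it assumes the closed form ψ(θ) = (θ²)²/(4θ¹) + ½ ln(π/θ¹), obtained by completing the square in (51), and uses the standard Gaussian moments Var(x) = σ², Cov(x², x) = 2μσ² and Var(x²) = 2σ⁴ + 4μ²σ². The helper names and the sample values of μ and σ are hypothetical.

```python
import math

def psi(t1, t2):
    # Assumed closed form: log of the integral of exp(-t1*x^2 - t2*x) over the real line.
    return t2 ** 2 / (4 * t1) + 0.5 * math.log(math.pi / t1)

def hessian(f, a, b, eps=1e-4):
    """Second derivatives of f(a, b) by central finite differences."""
    faa = (f(a + eps, b) - 2 * f(a, b) + f(a - eps, b)) / eps ** 2
    fbb = (f(a, b + eps) - 2 * f(a, b) + f(a, b - eps)) / eps ** 2
    fab = (f(a + eps, b + eps) - f(a + eps, b - eps)
           - f(a - eps, b + eps) + f(a - eps, b - eps)) / (4 * eps ** 2)
    return faa, fab, fbb

mu, sigma = 0.8, 1.3                       # hypothetical sample point
t1, t2 = 1 / (2 * sigma ** 2), -mu / sigma ** 2
g11, g12, g22 = hessian(psi, t1, t2)       # metric components from (23)

var_x2 = 2 * sigma ** 4 + 4 * mu ** 2 * sigma ** 2   # Var(x^2)
cov_x2_x = 2 * mu * sigma ** 2                       # Cov(x^2, x)
var_x = sigma ** 2                                   # Var(x)
print(g11, var_x2)
print(g12, cov_x2_x)
print(g22, var_x)
```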
Alternatively, we may regard the mean μ and standard deviation σ as coordinates on the
statistical manifold M. In terms of the parameters (μ, σ) the metric does not admit a simple
representation (23), and we must perform the Gaussian integration in the defining relation
(13). The line element then becomes

$$ds^2 = \frac{1}{\sigma^2}\left( d\mu^2 + 2\,d\sigma^2 \right), \tag{53}$$

which is defined on the upper-half plane −∞ < μ < ∞ and 0 < σ < ∞. Since the metric
$G_{ij}$ is diagonal in these coordinates, it is easily inverted and we obtain

$$G^{ij} = \sigma^2 \begin{pmatrix} 1 & 0 \\ 0 & \tfrac{1}{2} \end{pmatrix}. \tag{54}$$

A short calculation shows that the Christoffel symbols are given by

$$\Gamma^{1}_{12} = \Gamma^{1}_{21} = -2\Gamma^{2}_{11} = \Gamma^{2}_{22} = -\frac{1}{\sigma}, \qquad \Gamma^{1}_{11} = \Gamma^{1}_{22} = \Gamma^{2}_{12} = \Gamma^{2}_{21} = 0. \tag{55}$$

Since the inverse of the metric tensor (54) is diagonal, we need only determine the diagonal
components of the Ricci tensor in order to calculate the scalar curvature (that is, the
off-diagonal elements of the Ricci tensor vanish). These are

$$R_{11} = -\frac{1}{2\sigma^2} \quad\text{and}\quad R_{22} = -\frac{1}{\sigma^2}, \tag{56}$$

respectively. Hence, the resulting geometry is that of a hyperbolic space, which is a
homogeneous manifold of constant negative curvature:

$$R = -1. \tag{57}$$
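The constant value of the curvature can be cross-checked against the determinant formula (48) of proposition 2. The sketch below is an illustration added here, assuming the Gaussian potential ψ(θ) = (θ²)²/(4θ¹) + ½ ln(π/θ¹) in canonical coordinates; the partial derivatives are worked out by hand and the function name is our own.

```python
def scalar_curvature_gaussian(t1, t2):
    """Evaluate the determinant formula (48) for the assumed Gaussian potential
    psi = t2^2/(4 t1) + (1/2) ln(pi/t1), using hand-derived partial derivatives."""
    p11 = t2 ** 2 / (2 * t1 ** 3) + 1 / (2 * t1 ** 2)
    p12 = -t2 / (2 * t1 ** 2)
    p22 = 1 / (2 * t1)
    p111 = -3 * t2 ** 2 / (2 * t1 ** 4) - 1 / t1 ** 3
    p112 = t2 / t1 ** 3
    p122 = -1 / (2 * t1 ** 2)
    p222 = 0.0
    g = p11 * p22 - p12 ** 2                   # determinant of the metric (23)
    # Cofactor expansion of the 3x3 determinant along its first row.
    det3 = (p11 * (p112 * p222 - p122 * p122)
            - p12 * (p111 * p222 - p122 * p112)
            + p22 * (p111 * p122 - p112 * p112))
    return -det3 / (2 * g ** 2)

for t1, t2 in [(0.5, 0.0), (0.3, -0.5), (2.0, 1.5)]:
    print(t1, t2, scalar_curvature_gaussian(t1, t2))
```

For each sample point the value returned should equal −1 up to rounding, independently of θ, in agreement with (57).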
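This space has many interesting properties. For example, the geodesic equations (31)
characterizing trajectories of shortest paths on M are determined by the equations

$$\frac{d^2\mu(s)}{ds^2} - \frac{2}{\sigma(s)}\frac{d\mu(s)}{ds}\frac{d\sigma(s)}{ds} = 0 \tag{58}$$

and

$$\frac{d^2\sigma(s)}{ds^2} + \frac{1}{\sigma(s)}\left[ \frac{1}{2}\left( \frac{d\mu(s)}{ds} \right)^2 - \left( \frac{d\sigma(s)}{ds} \right)^2 \right] = 0. \tag{59}$$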

Since σ > 0 we can divide (58) and (59) by σ, obtaining

$$\left( \frac{\mu'}{\sigma} \right)' - \frac{\mu'}{\sigma}\frac{\sigma'}{\sigma} = 0 \quad\text{and}\quad \left( \frac{\sigma'}{\sigma} \right)' + \frac{1}{2}\frac{\mu'}{\sigma}\frac{\mu'}{\sigma} = 0, \tag{60}$$

where the prime indicates d/ds. It follows that

$$\left[ \left( \frac{\mu'}{\sigma} \right)^2 + 2\left( \frac{\sigma'}{\sigma} \right)^2 \right]' = 0, \tag{61}$$

and hence that

$$\left( \frac{\mu'}{\sigma} \right)^2 + 2\left( \frac{\sigma'}{\sigma} \right)^2 = v^2, \tag{62}$$

where v ≥ 0 is a constant. On the other hand, if we define $X = \mu'/\sigma$ then from the first
equation of (60) we have $X'\sigma - X\sigma' = 0$, or equivalently $(X/\sigma)' = 0$, and thus $\mu'/\sigma = c\sigma$,
where c is a constant. Substituting this into (62) we deduce that

$$c^2\sigma^2 + \left( \frac{\sigma'}{\sigma} \right)^2 = v^2, \tag{63}$$

where we have rescaled the integration constants, i.e. $c \to \sqrt{2}\,c$ and $v \to \sqrt{2}\,v$. There are
now two cases to consider, depending on whether c is zero or nonzero.
If c = 0, then μ is constant, and $\sigma' = v\sigma$, so that $\sigma(s) = a\,e^{vs}$ for some constants a and
v such that a > 0 and v > 0. This represents a straight line parallel to the σ-axis in the μ–σ
plane. If c ≠ 0, then cσ ≤ v from (63), hence $\sigma(s) = (v/c)\sin\gamma(s)$ for some γ(s). Substituting
this into (63) we obtain

$$\left( \frac{d\gamma}{ds} \right)^2 = v^2 \sin^2\gamma(s). \tag{64}$$

Since γ′ ≠ 0, γ(s) is monotonic and thus invertible. We may assume, without loss of
generality, that γ′ > 0 so that (64) implies $d\gamma/ds = v\sin\gamma$. Using $\sigma(s) = (v/c)\sin\gamma(s)$ we find
that

$$\frac{\sigma'}{\sigma} = \frac{1}{\sigma}\frac{d\sigma}{d\gamma}\frac{d\gamma}{ds} = v\cos\gamma, \tag{65}$$

and further, using (62) with rescaled c and v we obtain

$$\mu'(s) = v\,\sigma(s)\sin\gamma(s). \tag{66}$$

If we regard s = s(γ) as parameterized by γ we can write

$$\mu(s) = \int \frac{d\mu}{d\gamma}\,d\gamma = \int \mu'\,\frac{ds}{d\gamma}\,d\gamma = \frac{v}{c}\int \sin\gamma\,d\gamma = -\frac{v}{c}\cos\gamma + b, \tag{67}$$

where b is an integration constant.
To summarize, if we regard γ(s) as the independent parameter, then the solutions to the
geodesic equations for the Gaussian family of densities (50) are

$$\mu(s) = -\frac{v}{c}\cos\gamma(s) + b \quad\text{and}\quad \sigma(s) = \frac{v}{c}\sin\gamma(s). \tag{68}$$

These equations represent half-circles on the μ–σ plane centred on the μ axis (σ = 0) with
radius v/c. In figure 2 we sketch examples of geodesic curves for the Gaussian family, in the
case c ≠ 0.
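The two first integrals used in the derivation, namely the squared speed in (62) and the constant c in μ′/σ = cσ, can be monitored along a numerically integrated geodesic. The sketch below is our own illustration: it integrates (58) and (59) with a classical fourth-order Runge–Kutta step (a choice made here, not taken from the text) from a hypothetical initial point and velocity, and compares the invariants before and after.

```python
def geodesic_rhs(state):
    """First-order form of the geodesic equations (58) and (59)."""
    mu, sigma, dmu, dsigma = state
    ddmu = 2.0 * dmu * dsigma / sigma
    ddsigma = (dsigma ** 2 - 0.5 * dmu ** 2) / sigma
    return (dmu, dsigma, ddmu, ddsigma)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the parameter s."""
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def invariants(state):
    mu, sigma, dmu, dsigma = state
    speed_sq = (dmu / sigma) ** 2 + 2 * (dsigma / sigma) ** 2   # left side of (62)
    c = dmu / sigma ** 2                                        # from mu'/sigma = c*sigma
    return speed_sq, c

state = (0.0, 1.0, 0.6, 0.2)     # hypothetical (mu, sigma, mu', sigma') at s = 0
start = invariants(state)
for _ in range(2000):            # integrate up to s = 2
    state = rk4_step(state, 1e-3)
end = invariants(state)
print(start, end)
```

Both quantities should remain constant along the trajectory to within the integration error.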

Figure 2. Geodesic curves for Gaussian distributions. The statistical manifold M in this case is
the upper half plane parameterized by μ and σ. We have −∞ < μ < ∞ and 0 < σ < ∞. The
shortest path joining the two normal distributions $N(\mu_1, \sigma_1)$ and $N(\mu_2, \sigma_2)$ is given by the unique
semi-circular arc through the given two points and centred on the boundary line σ = 0.

It follows from the solutions to the geodesic equations that the separation of a pair of
normal distributions $N(\mu_1, \sigma_1)$ and $N(\mu_2, \sigma_2)$ is given by

$$D(\rho_1, \rho_2) = \frac{1}{\sqrt{2}}\log\frac{1 + \delta_{1,2}}{1 - \delta_{1,2}}, \tag{69}$$

where the function $\delta_{1,2}$ defined by

$$\delta_{1,2} = \sqrt{\frac{(\mu_2 - \mu_1)^2 + 2(\sigma_2 - \sigma_1)^2}{(\mu_2 - \mu_1)^2 + 2(\sigma_2 + \sigma_1)^2}} \tag{70}$$

lies between 0 and 1. These results follow directly from the fact that the geodesics are
semi-circular arcs centred on the boundary line σ = 0 (this line itself is not part of the manifold M
because σ > 0). In the exceptional case when $\mu_1 = \mu_2$, the geodesic is a straight line μ =
constant, and

$$D(\rho_1, \rho_2) = \frac{1}{\sqrt{2}}\left| \log\frac{\sigma_1}{\sigma_2} \right|. \tag{71}$$
The above example illustrates how various geometric aspects of a statistical manifold
M can be investigated in a systematic manner. It is interesting to note, in particular, that
the Gaussian distributions define an elementary hyperbolic geometry with constant negative
curvature. Before examining various geometric characterizations of ideal and interacting gases
in thermal equilibrium, let us discuss the relation between the Fisher–Rao distance measure
and various measures of entropy, a topic of some interest.
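The separation (69) between two normal distributions is easily coded. The following is a minimal sketch added for illustration, assuming that $\delta_{1,2}$ is the square root of the ratio appearing in (70), which is the form for which (69) reduces to (71) when $\mu_1 = \mu_2$. The function name is our own.

```python
import math

def fisher_rao_distance(mu1, sigma1, mu2, sigma2):
    """Separation (69)-(71) of N(mu1, sigma1) and N(mu2, sigma2)."""
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    if mu1 == mu2:
        # Exceptional case (71): the geodesic is the vertical line mu = constant.
        return abs(math.log(sigma1 / sigma2)) / math.sqrt(2)
    num = (mu2 - mu1) ** 2 + 2 * (sigma2 - sigma1) ** 2
    den = (mu2 - mu1) ** 2 + 2 * (sigma2 + sigma1) ** 2
    delta = math.sqrt(num / den)      # assumed square root in (70)
    return math.log((1 + delta) / (1 - delta)) / math.sqrt(2)

print(fisher_rao_distance(0.0, 1.0, 0.0, 3.0))
print(fisher_rao_distance(0.0, 1.0, 2.0, 1.0))
```

Along a vertical geodesic the distances add, so that for example the separation between σ = 1 and σ = 4 at equal means is the sum of the separations between σ = 1, 2 and σ = 2, 4.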

1.5. From entropy to Fisher information


We have observed how the notion of information geometry arises from the Fisher information
matrix commonly used in statistical analysis. On the other hand, the term ‘information’ often
suggests the concept of entropy, rather than the Fisher matrix. Indeed, the entropy concept is
essential in thermodynamics and statistical mechanics. Therefore, it would be appropriate to
clarify the interrelation between these two notions of information.
We shall discuss entropy in a fairly general context, and consider as before a parametric
family of probability density functions p(x|θ ) which we assume to be defined, say, on the real
line $\mathbb{R}$, where $\theta = \{\theta^i\}$. Then with respect to any twice-differentiable concave function $\varphi(p)$
we define the associated entropy functional by the expression

$$H_\varphi(p) = \int_{\mathbb{R}} \varphi[p(x|\theta)]\,dx. \tag{72}$$

Now, if f represents a vector in the tangent space (at p) of the manifold of density functions,
then the derivative of the entropy $H_\varphi$ at p in the direction f is defined by

$$dH_\varphi(p; f) = \left. \frac{d}{ds}H_\varphi(p + sf) \right|_{s=0} = \int_{\mathbb{R}} \varphi'[p(x)]\,f(x)\,dx, \tag{73}$$

where $\varphi'(p) = d\varphi(p)/dp$. Similarly, if f and g are two vectors in the tangent space at p, we
define the Hessian of $H_\varphi$ by

$$d^2H_\varphi(p; f, g) = \int_{\mathbb{R}} \varphi''[p(x)]\,f(x)\,g(x)\,dx. \tag{74}$$

The corresponding quadratic form is

$$\Delta_f H_\varphi(p) = 4\,d^2H_\varphi(p; f, f), \tag{75}$$

or equivalently,

$$\Delta_f H_\varphi(p) = 4\int_{\mathbb{R}} \varphi''[p(x)]\,f^2(x)\,dx, \tag{76}$$

where the factor of 4 here is purely conventional. The concavity of $\varphi$ then implies that

$$-\Delta_f H_\varphi(p) \ge 0. \tag{77}$$

In particular, if we choose f to be $\partial_i p$, where $\partial_i = \partial/\partial\theta^i$, then we have

$$\Delta_{\partial p} H_\varphi(p) = 4\int_{\mathbb{R}} \varphi''[p(x|\theta)]\,(\partial_i p(x|\theta))^2\,dx. \tag{78}$$
Thus far, we have not specified the form of the function ϕ, except for the requirements
of concavity and twice differentiability. As a special case, let us consider the one-parameter
family of concave functions

$$\varphi_\alpha(z) = (\alpha - 1)^{-1}(z - z^\alpha), \tag{79}$$

where α > 0. This determines a one-parameter family of entropies $H_\alpha(p)$ given by

$$H_\alpha(p) = \frac{1}{\alpha - 1}\left( 1 - \int p^\alpha(x)\,dx \right). \tag{80}$$

Note that when α = 1 we have $\varphi_1(z) = -z\ln z$ and hence

$$H_1(p) = -\int p(x)\ln p(x)\,dx. \tag{81}$$

In other words, we recover the expression for the familiar Shannon–Wiener entropy in the
limit α → 1. In the general case, expression (80) defines the Havrda–Charvát entropy (also
known as the α-order entropy), which is related to the well-known Rényi entropy $R_\alpha(p)$ as
follows:

$$R_\alpha(p) = \frac{1}{1 - \alpha}\ln\left( 1 + (1 - \alpha)H_\alpha(p) \right). \tag{82}$$

In particular, $R_\alpha$ is monotonic in $H_\alpha$.
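The relations among these entropies can be checked numerically. The sketch below is an illustration added here; it replaces the integrals over densities by sums over a discrete distribution, which is an assumption made for simplicity, and the function names and sample distribution are our own.

```python
import math

def havrda_charvat(p, alpha):
    """alpha-order entropy (80) for a discrete distribution, alpha != 1."""
    return (1.0 - sum(pi ** alpha for pi in p if pi > 0)) / (alpha - 1.0)

def shannon(p):
    """Shannon-Wiener entropy (81)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def renyi(p, alpha):
    """Renyi entropy written directly in terms of the sum of p^alpha."""
    return math.log(sum(pi ** alpha for pi in p if pi > 0)) / (1.0 - alpha)

p = [0.5, 0.3, 0.2]      # hypothetical sample distribution
alpha = 2.5
h = havrda_charvat(p, alpha)
renyi_from_h = math.log(1 + (1 - alpha) * h) / (1 - alpha)   # relation (82)
print(renyi(p, alpha), renyi_from_h)
print(havrda_charvat(p, 1 + 1e-6), shannon(p))               # limit alpha -> 1
```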
Also, choosing $\varphi_\alpha$ as in (79) we find that

$$ds_\alpha^2 = -\frac{1}{4\alpha}\Delta_{\partial p} H_\varphi(p) = -\frac{1}{\alpha}\,d^2H_\varphi(p; \partial_i p, \partial_j p)\,d\theta^i\,d\theta^j \tag{83}$$

is positive definite and defines a Riemannian metric. We can summarize this as follows.

Proposition 3 (Burbea–Rao metric). The coefficients of the differential metric

$$ds_\alpha^2 = G^{(\alpha)}_{ij}\,d\theta^i\,d\theta^j \tag{84}$$

associated with the Hessian of the α-order entropy (80) are

$$G^{(\alpha)}_{ij} = \int p^\alpha\,(\partial_i \ln p)(\partial_j \ln p)\,dx. \tag{85}$$

In particular, when α = 1, $G^{(\alpha)}_{ij}$ reduces to the Fisher–Rao metric.

Proof. The expression in (85) follows at once from (83) by virtue of the relation $\varphi''_\alpha(p) =
-\alpha p^{\alpha-2}$. □
We find, therefore, that the so-called α-order entropy metric is closely related to the
Fisher–Rao geometry of the statistical manifold. In addition, there is another significant
relationship between the derivative of the entropy and the α-order Fisher information matrix.
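This can be established as follows. From the defining equation (80) we have

$$\partial_i \partial_j H_\alpha(p) = -\alpha\int_{\mathbb{R}} p^{\alpha-2}(\partial_i p)(\partial_j p)\,dx - \frac{\alpha}{\alpha - 1}\int_{\mathbb{R}} p^{\alpha-1}\,\partial_i \partial_j p\,dx, \tag{86}$$

and therefore we deduce that

$$G^{(\alpha)}_{ij} = -\frac{1}{\alpha}\,\partial_i \partial_j H_\alpha(p) - \frac{1}{\alpha - 1}\int_{\mathbb{R}} p^{\alpha-1}\,\partial_i \partial_j p\,dx. \tag{87}$$

For the canonical distribution (20), the limit α → 1 of this relation yields the Shannon–Wiener
entropy

$$H_1(p) = \sum_i \theta^i \int p(x|\theta)\,H_i(x)\,dx + \psi(\theta). \tag{88}$$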
In other words, the thermodynamic potential and the entropy are related by a Legendre
transformation. Consequently, the Fisher–Rao geometry and the geometry arising from
the Hessian of the Shannon–Wiener entropy are related by the general theory of Legendre
transforms.

2. Classical ideal gas

We shall now characterize the geometry of the statistical manifold that arises from the
equilibrium distribution of a gas of noninteracting particles. Although this system displays
no phase transition, the analysis presented here will provide an enlightening contrast with the
results of section 3 where we shall examine the geometry of the van der Waals gas, which
does exhibit a liquid–vapour transition.

2.1. Partition function in P–T distribution


To elucidate the geometrical representation of gaseous systems in statistical mechanics, we
begin our analysis with a system of noninteracting identical particles in the absence of potential
energy. Physically, this system corresponds to a classical ideal gas immersed in a heat bath.
As we shall show below, not only the Riemann curvature but also the geodesic equations for
this system can be solved exactly. We consider, in particular, a pressure–temperature (P– T)
distribution (also known as the Boguslavski distribution) of the form
exp(−βH − αV )
p(H, V |α, β) = (89)
Z(α, β)
13
J. Phys. A: Math. Theor. 42 (2009) 023001 Topical Review

defined on the phase space Γ of a system of noninteracting particles. Here the partition
function Z(α, β) is determined by the phase-space and volume integral
 ∞  
1
Z(α, β) = exp(−βH ) dq dp exp(−αV ) dV . (90)
N!h3N 0 Γ

As usual, we have β = 1/kB T , α = P /kB T , where P denotes the pressure, h the Planck
constant and N the number of particles. The Hamiltonian H is just the free particle kinetic
energy
$$ H = \sum_{i=1}^{N} \frac{p_i^2}{2m}. \tag{91} $$
Thus, we consider a closed system of noninteracting gas molecules immersed in a heat
bath at inverse temperature β and effective pressure α. Since the system is in contact
with a bath at fixed temperature and pressure, the system energy and volume fluctuate. In
thermal equilibrium, the distribution of these variables is determined by (89). For a real gas,
the constituent particles inevitably interact. Nevertheless, the ideal gas represented by the
distribution (89) adequately characterizes the properties of a real gas at high temperature or
low density, where the effects of inter-particle interactions can be neglected.
Comparing (89) and (20) we observe that the thermodynamic potential is given by
ψ(α, β) = ln Z(α, β). Therefore, to determine the Fisher–Rao metric (23) we must perform
the integration (90). Noting the fact that each q-integration in (90) gives the volume V of the
system, one obtains the partition function
$$ Z(\alpha, \beta) = \left( \frac{2\pi m}{h^2 \beta} \right)^{3N/2} \alpha^{-(N+1)}. \tag{92} $$
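This follows from the fact that
$$ \int_\Gamma \exp(-\beta H)\, dq\, dp = \int_q \int_p e^{-\beta \sum_{i=1}^{N} p_i^2/2m} \prod_{i=1}^{N} dp_i \prod_{i=1}^{N} dq_i \tag{93} $$
is just a product of Gaussian integrals, and the identity
$$ \frac{1}{N!} \int_0^\infty V^N e^{-\alpha V}\, dV = \alpha^{-(N+1)} \tag{94} $$
that holds for α > 0.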
Note that the partition function z(β, V) in the canonical ensemble is
$$ z(\beta, V) = \frac{1}{N!\,h^{3N}} \int_\Gamma \exp(-\beta H)\, dq\, dp = \frac{1}{N!} \left( \frac{2\pi m}{h^2 \beta} \right)^{3N/2} V^N, \tag{95} $$
from which one can calculate the Helmholtz free energy
F (β, V ) = −kB T ln z(β, V ) (96)
and thus obtain the equation of state
$$ P = -\left( \frac{\partial F}{\partial V} \right)_\beta = \frac{N k_B T}{V} \tag{97} $$
satisfied by a classical ideal gas.


2.2. Curvature and geodesics for ideal gas


Expression (92) for the partition function clearly shows that the Riemannian geometry of
the statistical model M associated with the classical ideal gas depends upon the number N
of particles. Although finite size effects in small systems are sometimes of interest, here
we are primarily concerned with the geometry that arises in the so-called thermodynamic
limit N → ∞. Thus, we consider the thermodynamic potential ψ(α, β) per particle in the
thermodynamic limit, given by
$$ \psi(\alpha, \beta) = \lim_{N\to\infty} N^{-1} \ln Z(\alpha, \beta) = \frac{3}{2} \ln \frac{2\pi m}{h^2 \beta} - \ln \alpha. \tag{98} $$
The components of the Fisher–Rao metric, with respect to the parameterization (α, β), can
then be calculated by differentiation, with the result
$$ G_{ij} = \begin{pmatrix} \alpha^{-2} & 0 \\ 0 & \tfrac{3}{2} \beta^{-2} \end{pmatrix}. \tag{99} $$
From this expression we deduce the following.

Proposition 4. All components of the Riemann tensor, and consequently also the scalar
curvature, of the statistical manifold M associated with the classical ideal gas vanish, and
thus the manifold is flat.

Proof. From the components of the metric (99) one can calculate the components of the
Christoffel symbol $\Gamma^i_{jk}$ and the Riemann tensor $R^i_{jkl}$ using the definitions (24) and (37).
Alternatively, to show that this manifold is flat, it suffices to display a change of coordinates
which transforms the metric (99) into a Euclidean metric. Here, we adopt the latter approach
because this also permits more expeditious solution of the geodesic equations. We recall that
under a coordinate transformation x i → x̄ i the metric of a Riemannian manifold transforms in
the usual tensorial manner, so that the components of the inverse metric in the new coordinate
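system are determined by the contraction
$$ \bar{G}^{ij} = \frac{d\bar{x}^i}{dx^k} \frac{d\bar{x}^j}{dx^l}\, G^{kl}. \tag{100} $$
Consider the following coordinate transformation:
$$ \alpha \to \alpha' = \ln \alpha \quad \text{and} \quad \beta \to \beta' = \sqrt{\tfrac{3}{2}}\, \ln \beta. \tag{101} $$
A straightforward calculation then shows that the components of the inverse metric in the
(α′, β′) coordinate system are
$$ \bar{G}^{ij} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{102} $$
and thus the manifold is indeed flat. □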
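The statements above lend themselves to a quick numerical check. The following is a minimal sketch, not part of the original analysis: it evaluates the Hessian of the potential (98) by finite differences and compares it with (99), and then applies the transformation rule (100) to the rescaled logarithmic coordinates (ln α, √(3/2) ln β). The function names, the unit choice m = h = 1 and the sample point are assumptions made purely for illustration.

```python
import math

def psi(alpha, beta, m=1.0, h=1.0):
    """Potential per particle (98): (3/2) ln(2*pi*m / (h^2 * beta)) - ln(alpha)."""
    return 1.5 * math.log(2.0 * math.pi * m / (h * h * beta)) - math.log(alpha)

def hessian(f, x, y, eps=1e-4):
    """Central finite-difference Hessian (f_xx, f_xy, f_yy) of f at (x, y)."""
    fxx = (f(x + eps, y) - 2.0 * f(x, y) + f(x - eps, y)) / eps**2
    fyy = (f(x, y + eps) - 2.0 * f(x, y) + f(x, y - eps)) / eps**2
    fxy = (f(x + eps, y + eps) - f(x + eps, y - eps)
           - f(x - eps, y + eps) + f(x - eps, y - eps)) / (4.0 * eps**2)
    return fxx, fxy, fyy

def flat_metric(alpha, beta):
    """Diagonal of the inverse metric in the coordinates (ln alpha, sqrt(3/2) ln beta)."""
    g_inv = (alpha**2, (2.0 / 3.0) * beta**2)        # inverse of the diagonal metric (99)
    jac = (1.0 / alpha, math.sqrt(1.5) / beta)       # derivatives of the new coordinates
    return jac[0] ** 2 * g_inv[0], jac[1] ** 2 * g_inv[1]

alpha0, beta0 = 0.7, 2.3                              # arbitrary sample point
g_aa, g_ab, g_bb = hessian(psi, alpha0, beta0)
print(g_aa, 1.0 / alpha0**2)                          # numerical versus alpha^-2
print(g_bb, 1.5 / beta0**2)                           # numerical versus (3/2) beta^-2
print(flat_metric(alpha0, beta0))                     # both entries should be close to 1
```

Because the metric is diagonal and each coordinate is rescaled separately, the contraction in (100) reduces to a product of one-dimensional factors, which is all that `flat_metric` computes.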

Since the statistical manifold associated with the ideal gas is flat, solution of the geodesic
equations is straightforward. The result can be summarized as follows.

Proposition 5. The geodesic curves on the statistical manifold M associated with the classical
ideal gas are given by
$$ \frac{P}{P_0} = \left( \frac{k_B T}{k_B T_0} \right)^{1+c}, \tag{103} $$

where P0, T0 and c are integration constants. In particular, the geodesics include the adiabatic
equation of state for the ideal gas, corresponding to the choice c = CV/NkB, where CV is
the constant-volume heat capacity: along an adiabat P ∝ T^{γ/(γ−1)}, and γ/(γ−1) = 1 + CV/NkB.

Proof. The geodesic equations for the variables α and β assume identical forms, i.e.
$$ \frac{d^2 x}{ds^2} - \frac{1}{x} \left( \frac{dx}{ds} \right)^2 = 0 \tag{104} $$
for x = α, β. This can be rewritten as
$$ \frac{d}{ds} \ln \frac{dx}{ds} = \frac{d}{ds} \ln x, \tag{105} $$
from which we see that the general solution is $x(s) = c_1 e^{c_2 s}$. Thus, we obtain
$$ \frac{P}{k_B T} = c_1 e^{c_0 s} \quad \text{and} \quad \frac{1}{k_B T} = c_3 e^{c_2 s} \tag{106} $$
as the general solution to the geodesic equations. Combining these two equations, we have
$$ P = \frac{c_1}{c_3^{\,c_0/c_2}}\, (k_B T)^{1 - c_0/c_2}. \tag{107} $$
Setting s = 0 we find c1 = P0/kBT0 and c3 = 1/kBT0, which, with c = −c0/c2, yields at once
the expression in (103). □
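As a hedged illustration of the proof, the sketch below evaluates the exponential solution (106) for one set of hypothetical constants (in units with kB = 1), checks by finite differences that it satisfies (104), and confirms the power law (103) with c = −c0/c2. The numerical values and function names are assumptions chosen only for the example.

```python
import math

def geodesic(s, c0, c1, c2, c3):
    """General solution (106): alpha = P/(kB T) and beta = 1/(kB T) at parameter s."""
    alpha = c1 * math.exp(c0 * s)
    beta = c3 * math.exp(c2 * s)
    return alpha, beta

def residual(x, s, eps=1e-3):
    """Left-hand side of (104), x'' - (x')^2 / x, estimated by central differences."""
    x0, xp, xm = x(s), x(s + eps), x(s - eps)
    d1 = (xp - xm) / (2.0 * eps)
    d2 = (xp - 2.0 * x0 + xm) / eps**2
    return d2 - d1 * d1 / x0

# Hypothetical constants, units with kB = 1.
P0, T0 = 2.0, 1.5
c0, c2 = 0.6, -0.4
c1, c3 = P0 / T0, 1.0 / T0          # fixed by the initial point s = 0
c = -c0 / c2                        # exponent parameter appearing in (103)

for s in (0.0, 0.5, 1.0):
    alpha, beta = geodesic(s, c0, c1, c2, c3)
    T = 1.0 / beta
    P = alpha * T
    print(s, P / P0, (T / T0) ** (1.0 + c))   # the two columns should agree

print(residual(lambda s: geodesic(s, c0, c1, c2, c3)[0], 0.3))  # close to zero
```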

3. van der Waals gas

The geometry of the statistical manifold changes considerably if the gas molecules interact.
In particular, if the system exhibits a phase transition, then the curvature tends to become
singular at the transition point. This property seems to be universal and appears in many
systems exhibiting critical phenomena. The van der Waals gas model is not only of physical
interest, but also illustrates many of the universal geometrical features of the associated
manifold of equilibrium states.

3.1. Equation of state


The idealized system of noninteracting particles considered above is inadequate for the
description of phase transition phenomena, that is, the condensation of gas molecules. Here
we shall extend the model to include inter-particle interactions, which leads to the van der
Waals equation of state:
$$ \left( P + a \frac{N^2}{V^2} \right) (V - bN) = N k_B T, \tag{108} $$
where N is the total number of molecules and a, b are constants determined by the properties of
the molecule. The liquid–vapour transition occurs at the critical point where the temperature
T, pressure P and volume V simultaneously assume the values
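$$ P_c = \frac{a}{27 b^2}, \quad V_c = 3bN \quad \text{and} \quad T_c = \frac{8a}{27 k_B b}. \tag{109} $$
The critical point is determined by the simultaneous solution of the equations
$$ \frac{\partial P}{\partial V} = 0 \quad \text{and} \quad \frac{\partial^2 P}{\partial V^2} = 0. \tag{110} $$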

Figure 3. Equations of state for the van der Waals gas in terms of dimensionless variables,
plotted as pressure P̂ against volume V̂. The isothermal curves correspond to T̂ = 1.4, T̂ = 1,
T̂ = 27/32 (maximum superheating temperature) and T̂ = 0.6. Note that the isothermal curves
associated with temperatures below Tm allow metastable regions for which P̂ < 0.

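Using the dimensionless variables
$$ \hat{P} = \frac{P}{P_c}, \quad \hat{V} = \frac{V}{V_c} \quad \text{and} \quad \hat{T} = \frac{T}{T_c}, \tag{111} $$
the equation of state can be rewritten in the universal form
$$ \left( \hat{P} + 3 \hat{V}^{-2} \right) \left( \hat{V} - \tfrac{1}{3} \right) = \tfrac{8}{3} \hat{T}, \tag{112} $$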
independent of the parameters a and b. In figure 3 we plot the pressure P̂ as a function of the
volume V̂ for T̂ > 1, T̂ = 1 and T̂ < 1.
Note that the positivity of the pressure implies a bound on the temperature. In particular,
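from (112) we deduce that the condition P̂ ≥ 0 is equivalent to the bound:
$$ \hat{T} \geq \frac{9\hat{V} - 3}{8 \hat{V}^2}, \tag{113} $$
in terms of the dimensionless volume V̂. The right-hand side attains its maximum value 27/32
at V̂ = 2/3. If we demand the positivity of P̂ for all volumes V̂ ≥ 1/3, then we must therefore
require T̂ ≥ 27/32, or equivalently
$$ T \geq \frac{27}{32} T_c. \tag{114} $$
The temperature
$$ T_m = \frac{27}{32} T_c \approx 0.84\, T_c \tag{115} $$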
is known as the temperature of maximum superheating and is related to the nucleation of
bubbles when the liquid is heated very abruptly. In particular, if T < Tm then the liquid can
be contained at low external pressure, whereas if T > Tm the liquid cannot exist under low
external pressure and thus evaporates. Experimental data show, for example, that Tm = 0.89Tc
for ether, Tm = 0.92Tc for alcohol and Tm = 0.84Tc for water, indicating the fairly accurate
predictive power of the van der Waals equations of state.
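The bound just derived is easy to confirm numerically. The sketch below, given only as an illustration, solves (112) for the reduced pressure, checks that the critical isotherm passes through P̂ = 1 with vanishing slope at V̂ = 1, and scans the right-hand side of (113) on a grid to locate its maximum. The grid spacing and the helper names are assumptions of the example.

```python
def p_hat(v_hat, t_hat):
    """Reduced van der Waals pressure obtained by solving (112) for the pressure."""
    return 8.0 * t_hat / (3.0 * v_hat - 1.0) - 3.0 / v_hat**2

def t_bound(v_hat):
    """Right-hand side of (113): the reduced temperature at which p_hat vanishes."""
    return (9.0 * v_hat - 3.0) / (8.0 * v_hat**2)

def derivative(f, x, eps=1e-4):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

# Critical point: unit pressure and a horizontal tangent on the critical isotherm.
print(p_hat(1.0, 1.0))
print(derivative(lambda v: p_hat(v, 1.0), 1.0))

# Scan the bound (113) over volumes above the excluded volume 1/3.
grid = [1.0 / 3.0 + 0.0001 * k for k in range(1, 50000)]
v_star = max(grid, key=t_bound)
print(v_star, t_bound(v_star), 27.0 / 32.0)
```

The scan should place the maximum near V̂ = 2/3 with a value close to 27/32, which is the maximum superheating temperature in reduced units.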
Turning to the equation of state, at a given pressure P the cubic equation for the volume V can
have three distinct real roots when the temperature is below its critical value Tc. Amongst these three roots,


Figure 4. Isothermal curve and equal area law in the pressure–volume plane (above), and pressure
dependence of the Gibbs free energy (below). As the pressure P of the gas is slowly increased, the
Gibbs free energy G(P ) increases along the thick solid line in the lower diagram, until P reaches
the coexisting pressure P2 . The gas then undergoes a phase transition and condenses. During this
transition the volume changes from Vv to Vl , at which point the entire system enters the liquid
phase. The value of the coexisting pressure P2 is determined by Maxwell’s equal area law.

the intermediate root corresponds to a point at which (∂P /∂V )T > 0. Hence, this root is
unstable, since the pressure increases with volume for fixed temperature. It follows that one of
the remaining two roots should correspond to thermal equilibrium. To ascertain which of the
two roots is stable, we recall that the condition for stability is determined by the minimization
of the free energy. If we let G(T , P ) denote the Gibbs free energy, then Maxwell’s relation
$V = (\partial G/\partial P)_T$ implies that
$$ G(T, P) = G(T, P_0) + \int_{P_0}^{P} V(u, T)\, du. \tag{116} $$

Therefore, when viewed as a function of pressure P for a fixed temperature T < Tc below
the critical point, the free energy G(T , P ) describes one of the two distinct curves, depending
on whether P is reduced from high values or increased from low values. This is shown
schematically in figure 4.
If the free energy assumes its minimum value, then as the value of P changes, G(T , P )
must describe one of the two curves in figure 4 until its intersection with the other curve at
pressure P = P2 , whereafter G(T , P ) follows the other curve. At the point P = P2 the liquid
and vapour phases coexist. Therefore, if we, say, reduce the pressure from high values, then
after reaching the value P2 the pressure remains constant until the liquid is entirely converted
into vapour. During this transition the volume changes from Vl to Vv as indicated in figure 4.


Figure 5. Lennard–Jones potential φ as a function of the separation r. There is a strong
repulsive force at short distances r < r0 and a weak attractive force at long distances r > r0,
where $r_0 = \sqrt[6]{2}\, d$ and d represents the radius of the gas molecule.

The value of the coexisting pressure P2 is determined, for each fixed T < Tc, by Maxwell’s
equal area principle. That is, the vertical line in figure 4 is chosen so that the areas of the
two shaded regions are exactly equal.
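The equal area construction can be carried out numerically. The following sketch works in the reduced variables of (111) and (112) and uses plain bisection; it is one possible implementation under these assumptions, not the procedure used to produce the figures. The function names and the sample temperature T̂ = 0.9 are hypothetical choices.

```python
import math

def p_hat(v, t_hat):
    """Reduced van der Waals isotherm, equivalent to (112)."""
    return 8.0 * t_hat / (3.0 * v - 1.0) - 3.0 / v**2

def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def coexistence(t_hat):
    """Return (p2, v_liquid, v_vapour) from the equal area rule, for 0 < t_hat < 1."""
    # Spinodal volumes: dP/dV = 0 is equivalent to 4 t v^3 = (3v - 1)^2.
    spin = lambda v: 4.0 * t_hat * v**3 - (3.0 * v - 1.0) ** 2
    v_a = bisect(spin, 1.0 / 3.0, 1.0)                    # local minimum of the isotherm
    v_b = bisect(spin, 1.0, 9.0 / (4.0 * t_hat) + 1.0)    # local maximum of the isotherm

    def roots(p):
        v_l = bisect(lambda v: p_hat(v, t_hat) - p, 1.0 / 3.0 + 1e-12, v_a)
        v_v = bisect(lambda v: p_hat(v, t_hat) - p, v_b,
                     (8.0 * t_hat / p + 1.0) / 3.0 + 1.0)
        return v_l, v_v

    def area(p):
        # Integral of (p_hat - p) dv between the liquid and vapour roots.
        v_l, v_v = roots(p)
        return ((8.0 * t_hat / 3.0) * math.log((3.0 * v_v - 1.0) / (3.0 * v_l - 1.0))
                + 3.0 / v_v - 3.0 / v_l - p * (v_v - v_l))

    p_lo = max(p_hat(v_a, t_hat), 0.0) + 1e-9
    p_hi = p_hat(v_b, t_hat) - 1e-9
    p2 = bisect(area, p_lo, p_hi)
    return (p2,) + roots(p2)

p2, v_l, v_v = coexistence(0.9)
print(p2, v_l, v_v)
```

The net area is positive just above the local minimum of the isotherm and negative just below its local maximum, and it decreases monotonically with the trial pressure, so a bracketing method is sufficient.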

3.2. Canonical partition function


The equation of state (108) was first deduced empirically by van der Waals, directly from
experimental observations. However, it can also be derived analytically from the canonical
partition function associated with an empirically postulated intermolecular potential. Assume
that the interaction energy between a pair of molecules separated by a distance r is given by
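the Lennard–Jones potential
$$ \phi(r) = 4\phi_0 \left[ \left( \frac{d}{r} \right)^{12} - \left( \frac{d}{r} \right)^{6} \right] = \phi_0 \left( \frac{r_0}{r} \right)^{12} - 2\phi_0 \left( \frac{r_0}{r} \right)^{6}, \tag{117} $$
where $r_0 = 2^{1/6} d$ and d is a parameter which can be regarded as the radius of the gas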
molecule. Clearly, φ(d) = 0 and φ(r) assumes its minimum value at r = r0 . As we see from
figure 5, this inter-molecular potential energy gives rise to a weak long-range attractive force
and a strong short-range repulsive force between each pair of molecules.
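As a small illustration, the two forms of the potential in (117) can be compared directly. The sketch below assumes units in which φ0 = d = 1; the function names are hypothetical.

```python
def lennard_jones(r, phi0=1.0, d=1.0):
    """Pair potential (117) in its first form, 4*phi0*[(d/r)^12 - (d/r)^6]."""
    x = (d / r) ** 6
    return 4.0 * phi0 * (x * x - x)

def lennard_jones_r0(r, phi0=1.0, d=1.0):
    """The same potential written in terms of r0 = 2^(1/6) d, the second form in (117)."""
    r0 = 2.0 ** (1.0 / 6.0) * d
    y = (r0 / r) ** 6
    return phi0 * y * y - 2.0 * phi0 * y

r0 = 2.0 ** (1.0 / 6.0)
print(lennard_jones(1.0))                            # value at r = d
print(lennard_jones(r0))                             # value at the minimum r = r0
print(lennard_jones(1.3), lennard_jones_r0(1.3))     # the two forms should agree
```

The value at r = d vanishes, and the value at r = r0 is close to −φ0, the depth of the attractive well.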
The canonical partition function can thus be written as
$$ z(\beta, V) = \frac{1}{N!\,h^{3N}} \int \prod_{i=1}^{N} d^3p_i\, d^3r_i\, \exp\left( -\beta \sum_{i=1}^{N} \frac{p_i^2}{2m} - \beta \sum_{(ij)} \phi_{ij} \right), \tag{118} $$
where φij = φ(rij ), with rij denoting the distance between the ith and j th molecules. Thus,
the canonical partition function can be expressed as a product
z(β, V ) = z0 (β, V )Q(β, V ), (119)
where z0(β, V) is the canonical partition function for the ideal gas (95) and
$$ Q(\beta, V) = \frac{1}{V^N} \int d^3r_1 \cdots \int d^3r_N\, \exp\left( -\beta \sum_{(ij)} \phi_{ij} \right) \tag{120} $$
is the contribution from the interaction energy.

Now, as an approximation to the Lennard–Jones potential we assume that exp(−βφij) = 0
for rij < d. In other words, we regard the molecules as hard spheres of radius d, which cannot
overlap. As a consequence, the overlapping region can be removed from the range of the
volume integration (120). Defining the so-called Mayer function fij = f (rij ) by
fij = exp(−βφij ) − 1, (121)
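we rewrite the integral (120) as
$$ \begin{aligned} Q(\beta, V) &= \frac{1}{V^N} \int_{r_1 > d} d^3r_1 \cdots \int_{r_N > d} d^3r_N \prod_{(ij)} (1 + f_{ij}) \\ &= \frac{1}{V^N} \int_{r_1 > d} d^3r_1 \cdots \int_{r_N > d} d^3r_N \left( 1 + \sum_{(ij)} f_{ij} + \sum_{(ij)} \sum_{(kl)} f_{ij} f_{kl} + \cdots \right). \end{aligned} \tag{122} $$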

Assuming that the parameter φ0 in (117) is sufficiently small, the contribution arising from
fij in the specified integration range can be regarded as an infinitesimal. The first term on the
right-hand side of (122), i.e. the integral of unity, can, on the other hand, be approximated by
$$ V (V - v_0) \cdots [V - (N-2) v_0]\,[V - (N-1) v_0] \approx (V - bN)^N, \tag{123} $$
where we put $v_0 = 2b = \tfrac{4}{3} \pi d^3$.
The integrations are performed consecutively, so that the first
particle can occupy volume V without constraints, the second particle can occupy volume V
less the volume v0 occupied by the first particle, the third particle can occupy volume V less
the volume 2v0 occupied by the first two particles, and so on. Similarly, the second term on
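the right-hand side of (122) can be approximated by
$$ \begin{aligned} \int_{r_1 > d} d^3r_1 \cdots \int_{r_N > d} d^3r_N\, f_{ij} &= (V - bN)^{N-1} \int_{r_j > d} d^3r_j\, f_{ij} \\ &\approx -(V - bN)^{N-1}\, \beta \pi \int_d^\infty \phi(r)\, r^2\, dr. \end{aligned} \tag{124} $$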
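Assembling these results, we can approximate Q(β, V) in the following closed form:
$$ \begin{aligned} Q(\beta, V) &\cong \left( 1 - b \frac{N}{V} \right)^{N} \left( 1 + \beta \frac{a N^2}{V} + \cdots \right) \\ &= \left( 1 - b \frac{N}{V} \right)^{N} \exp\left( \beta \frac{a N^2}{V} \right), \end{aligned} \tag{125} $$
where we have defined
$$ a = -\pi \int_d^\infty r^2 \phi(r)\, dr. \tag{126} $$
Using the above expression for Q(β, V) we finally obtain the canonical partition function
$$ z(\beta, V) = \frac{1}{N!} \left( \frac{2\pi m}{\beta h^2} \right)^{3N/2} (V - bN)^N \exp(a \beta N^2 / V). \tag{127} $$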

3.3. Critical behaviour of the van der Waals gas


From the expression for the partition function of the canonical distribution we deduce the
equation of state
$$ P = \frac{1}{\beta} \left( \frac{\partial \ln z(\beta, V)}{\partial V} \right)_\beta = \frac{N k_B T}{V - bN} - a \frac{N^2}{V^2}, \tag{128} $$

where for clarity we have substituted β = 1/kB T . Observe that this is precisely the van
der Waals equation introduced in (108). If we had not applied various approximations in the
derivation of (127), then additional terms of order (N/V )3 and higher would have appeared
on the right-hand side of (128).
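The step from the partition function (127) to the equation of state (128) can be checked by differentiating ln z numerically. The sketch below keeps only the V-dependent part of ln z, since the factors N! and (2πm/βh²)^{3N/2} do not contribute to the derivative; the parameter values are arbitrary assumptions in units with kB = 1.

```python
import math

def log_z(beta, V, N, a, b):
    """V-dependent part of ln z from (127): N ln(V - bN) + a beta N^2 / V."""
    return N * math.log(V - b * N) + a * beta * N**2 / V

def pressure_from_z(beta, V, N, a, b, eps=1e-5):
    """P = (1/beta) d(ln z)/dV at fixed beta, by a central difference."""
    return (log_z(beta, V + eps, N, a, b) - log_z(beta, V - eps, N, a, b)) / (2.0 * eps * beta)

def pressure_vdw(beta, V, N, a, b):
    """Right-hand side of (128) with kB*T = 1/beta."""
    return N / (beta * (V - b * N)) - a * N**2 / V**2

# Hypothetical values in arbitrary units.
beta, V, N, a, b = 0.5, 40.0, 10, 2.0, 0.3
print(pressure_from_z(beta, V, N, a, b), pressure_vdw(beta, V, N, a, b))
```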
To analyse the behaviour of the system near a critical point we introduce the deviation
parameters
p = P̂ − 1, v = V̂ − 1 and t = T̂ − 1. (129)
In terms of these shifted variables the equation of state (128) becomes
$$ p = \frac{8(t + 1)}{3v + 2} - \frac{3}{(v + 1)^2} - 1. \tag{130} $$
We can then expand the equation of state (130) for small v and t, obtaining
$$ p = t\,(4 - 6v) - \tfrac{3}{2} v^3 + \cdots. \tag{131} $$
Similarly the Gibbs free energy
G = P V − kB T ln z(T , V ) (132)
can be expanded as follows:
$$ G = P_c V_c \left[ (p - 4t)\, v + 3 t v^2 + \tfrac{3}{8} v^4 + 1 + p \right] - \tfrac{3}{2} N k_B T \ln\left( \frac{2\pi m k_B T}{h^2} \right). \tag{133} $$
For fixed pressure p and temperature t, the volume v in thermal equilibrium is that which
minimizes the Gibbs free energy G. The equation of state (131) is a necessary but not
sufficient condition for the Gibbs free energy to assume its minimum. Therefore, at the
coexisting pressure p = 4t below the critical temperature (t < 0) we have the three roots
$$ v = \pm 2\sqrt{-t},\ 0 \tag{134} $$
for the volume determined by the equation of state (131). Differentiating (133) with respect
to v, we find that the first two roots $\pm 2\sqrt{-t}$ minimize the free energy G and thus correspond
to stable states, while the root v = 0 maximizes G and thus corresponds to an unstable state.
Stable states represent the coexisting phase of liquid and vapour, with pressure given by
$$ P_2 = P_c (1 + 4t) = P_c \left( 1 - 4\, \frac{T_c - T}{T_c} \right). \tag{135} $$
The liquid phase is more stable when P2 < P < Pc , and the vapour phase more stable when
P < P2 .
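The near-critical expansion can be illustrated with a few lines of code. The sketch below compares the shifted equation of state (130) with its truncation (131) for small v and t, and evaluates the v-dependent bracket of (133) on the coexistence line p = 4t to show that the roots ±2√(−t) lie below the root v = 0. The sample values and function names are assumptions of the example.

```python
import math

def p_exact(v, t):
    """Shifted equation of state (130)."""
    return 8.0 * (t + 1.0) / (3.0 * v + 2.0) - 3.0 / (v + 1.0) ** 2 - 1.0

def p_series(v, t):
    """Leading terms (131) of the expansion in small v and t."""
    return t * (4.0 - 6.0 * v) - 1.5 * v**3

def gibbs_v(v, p, t):
    """v-dependent part of the bracket in (133), in units of Pc*Vc."""
    return (p - 4.0 * t) * v + 3.0 * t * v**2 + 0.375 * v**4

t = -0.01
for v in (-0.05, 0.0, 0.05):
    print(v, p_exact(v, t), p_series(v, t))

# On the coexistence line p = 4t the stationary points are v = 0 and v = +/- 2 sqrt(-t).
p = 4.0 * t
v_star = 2.0 * math.sqrt(-t)
print(gibbs_v(v_star, p, t), gibbs_v(-v_star, p, t), gibbs_v(0.0, p, t))
```

Setting the v-derivative of the bracket in (133) to zero reproduces (131), which is why the three roots in (134) are exactly the stationary points of `gibbs_v` at p = 4t.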

3.4. The thermodynamic limit


The existence of the instability in the van der Waals system studied above is related to the
fact that in the canonical distribution the volume V of the system is held fixed, whereas in
a real gas volume fluctuations are significant in the vicinity of the critical point. In other
words, the canonical distribution does not provide a completely accurate physical description
of the vapour–liquid equilibrium. Therefore, as in the case of an ideal gas, we consider the
pressure–temperature distribution, with the corresponding partition function
$$ Z(\alpha, \beta) = \frac{1}{b} \int_{bN}^{\infty} z(\beta, V) \exp(-\alpha V)\, dV \tag{136} $$
wherein the volume fluctuation is integrated out. Recall that $b = \tfrac{2}{3} \pi d^3$ represents the
smallest volume each molecule can occupy. Hence, the random variable V representing


the total volume ranges from bN to infinity. When the canonical partition function (127) is
substituted into (136), the resulting integral does not admit an elementary analytical expression.
Nevertheless, in the thermodynamic limit N → ∞ we can implicitly determine the potential
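$\psi(\alpha, \beta) = N^{-1} \ln Z(\alpha, \beta)$ by the method of steepest descent.
We proceed as follows. First we write the integrand in (136) as
$$ \exp(-\alpha V)\, z(\beta, V) = \exp[N g(\alpha, \beta, \hat{v})], \tag{137} $$
where v̂ ≡ V/N and
$$ g(\alpha, \beta, \hat{v}) = 1 - \alpha \hat{v} + \ln\left[ \left( \frac{2\pi m}{\beta h^2} \right)^{3/2} (\hat{v} - b) \right] + \frac{\beta a}{\hat{v}}. \tag{138} $$
In deriving (138) we have used the Stirling formula $\ln N! \simeq N \ln N - N$. It should be evident
from (132) that
$$ G = -\beta^{-1} g(\alpha, \beta, \hat{v}) \tag{139} $$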
is the Gibbs free energy. Also, note that g(α, β, v̂) must have at least one maximum in the range
v̂ ∈ [b, ∞) corresponding to the minimum Gibbs free energy, because g(α, β, v̂) → −∞ in
the limits v̂ → b and v̂ → ∞. The value of v̂ at which g(α, β, v̂) is maximized therefore
determines the equation of state (128) for the canonical distribution. However, in the P–T
distribution the volume is a random variable, hence we must take its expectation to obtain the
equation of state:
$$ \langle \hat{v} \rangle = -\frac{1}{N} \frac{\partial \ln Z(\alpha, \beta)}{\partial \alpha}, \tag{140} $$
where ⟨v̂⟩ denotes the expected volume per particle in the P–T distribution characterized by
the density function $Z^{-1}(\alpha, \beta) \exp[N g(\alpha, \beta, \hat{v})]$.
Applying the change of variable V → v̂, the partition function (136) can be written in
the form
$$ Z(\alpha, \beta) = \frac{N}{b} \int_b^\infty \exp[N g(\alpha, \beta, \hat{v})]\, d\hat{v}. \tag{141} $$
Recall that we are interested in the thermodynamic potential per particle in the thermodynamic
limit:
$$ \psi(\alpha, \beta) = \lim_{N \to \infty} N^{-1} \ln Z(\alpha, \beta). \tag{142} $$
Using the method of steepest descent, we find that ψ(α, β) in this limit is given by
$$ \psi(\alpha, \beta) = g(\alpha, \beta, \bar{v}) = -\alpha \bar{v} + \ln z(\beta, \bar{v}), \tag{143} $$
where v̄ = v̄(α, β) is the function of α and β which maximizes g(α, β, v̂). Since v̄ minimizes
the Gibbs free energy, it is the solution of the van der Waals equation of state. Although the
exact functional form of v̄(α, β) is not at our disposal owing to the cubic nature of the equation
of state, we can nonetheless determine the exact expression for the scalar curvature in terms
of the variables β and v̄. Before we proceed, however, we first establish the following result.

Proposition 6. The thermal expectation value of the volume v̂ per particle in the P–T
distribution is given by v̄, that is, ⟨v̂⟩ = v̄.

Proof. Differentiating (143) and using the chain rule we find
$$ \frac{\partial \psi}{\partial \alpha} = -\bar{v} + \left( -\alpha + \frac{\partial \ln z}{\partial \bar{v}} \right) \frac{\partial \bar{v}}{\partial \alpha}. \tag{144} $$
However, by definition v̄ maximizes g(α, β, v̂) so that
$$ \frac{\partial g}{\partial \bar{v}} = -\alpha + \frac{\partial \ln z}{\partial \bar{v}} = 0, \tag{145} $$
and hence
$$ \frac{\partial \psi}{\partial \alpha} = -\bar{v}. \tag{146} $$
On the other hand, from (140) we have ⟨v̂⟩ = −∂ψ/∂α, and thus ⟨v̂⟩ = v̄. □
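The steepest descent argument and the statement of proposition 6 can be illustrated numerically. The sketch below integrates exp[N g] on a grid for increasing N and compares N⁻¹ ln Z with the maximum of g, and the mean volume with the maximizing volume. It rests on several assumptions made only for the example: units with kB = 1, the constants a = 3 and b = 1/3, a supercritical state so that g has a single maximum, and omission of the v-independent thermal term of (138), which shifts both sides of the comparison equally.

```python
import math

A, B = 3.0, 1.0 / 3.0        # hypothetical constants a and b (kB = 1)

def g(alpha, beta, v):
    """Exponent (138) without the v-independent term (3/2) ln(2 pi m / beta h^2)."""
    return 1.0 - alpha * v + math.log(v - B) + beta * A / v

def laplace_check(alpha, beta, n, v_max=30.0, steps=60000):
    """Return (ln Z / n, <v>) for Z = (n/b) * integral of exp(n g) dv, midpoint rule."""
    dv = (v_max - B) / steps
    grid = [B + (k + 0.5) * dv for k in range(steps)]
    vals = [g(alpha, beta, v) for v in grid]
    g_top = max(vals)
    weights = [math.exp(n * (gv - g_top)) for gv in vals]   # rescaled to avoid overflow
    total = sum(weights)
    log_z = n * g_top + math.log(total * dv * n / B)
    mean_v = sum(w * v for w, v in zip(weights, grid)) / total
    return log_z / n, mean_v

# Choose v_bar and beta, then fix alpha from the stationarity condition (145).
beta, v_bar = 0.25, 1.5
alpha = 1.0 / (v_bar - B) - beta * A / v_bar**2
for n in (50, 500, 5000):
    print(n, laplace_check(alpha, beta, n), g(alpha, beta, v_bar))
```

The difference between N⁻¹ ln Z and g(α, β, v̄) should shrink roughly like (ln N)/N, and the mean volume should approach v̄ as N grows.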

3.5. Geometry of the van der Waals manifold


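As we have just indicated, the functional form of v̄(α, β) is unknown. Nevertheless, we can
implicitly determine expressions for ∂v̄/∂α, ∂v̄/∂β, and so on, in the following manner. First,
define Φ by
$$ \Phi \equiv -\frac{\partial g}{\partial \bar{v}} = \alpha - \frac{1}{\bar{v} - b} + \beta \frac{a}{\bar{v}^2}. \tag{147} $$
Since v̂ = v̄ maximizes g(α, β, v̂) we have by definition the relation Φ = 0. This, however,
is just the equation of state for the van der Waals gas. Now, consider the total derivative of Φ:
$$ d\Phi = \frac{\partial \Phi}{\partial \alpha}\, d\alpha + \frac{\partial \Phi}{\partial \beta}\, d\beta + \frac{\partial \Phi}{\partial \bar{v}}\, d\bar{v}. \tag{148} $$
Since dΦ = 0 it follows that
$$ \begin{aligned} d\bar{v} &= -\left( \frac{\partial \Phi}{\partial \alpha} \Big/ \frac{\partial \Phi}{\partial \bar{v}} \right) d\alpha - \left( \frac{\partial \Phi}{\partial \beta} \Big/ \frac{\partial \Phi}{\partial \bar{v}} \right) d\beta \\ &= \left( \frac{\partial \bar{v}}{\partial \alpha} \right)_\beta d\alpha + \left( \frac{\partial \bar{v}}{\partial \beta} \right)_\alpha d\beta, \end{aligned} \tag{149} $$
where we have used the general identity:
$$ \left( \frac{\partial \alpha}{\partial \beta} \right)_\gamma \left( \frac{\partial \beta}{\partial \gamma} \right)_\alpha \left( \frac{\partial \gamma}{\partial \alpha} \right)_\beta = -1. \tag{150} $$
On the other hand, from (147) we have the relations
$$ \frac{\partial \Phi}{\partial \alpha} = 1, \quad \frac{\partial \Phi}{\partial \beta} = \frac{a}{\bar{v}^2} \quad \text{and} \quad \frac{\partial \Phi}{\partial \bar{v}} = \frac{1}{(\bar{v} - b)^2} - \frac{2a\beta}{\bar{v}^3}. \tag{151} $$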
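Therefore, substituting these into (149) we deduce that
$$ \frac{\partial \bar{v}}{\partial \alpha} = \frac{1}{D} \quad \text{and} \quad \frac{\partial \bar{v}}{\partial \beta} = \frac{1}{D} \frac{a}{\bar{v}^2}, \tag{152} $$
where D is defined by
$$ D(\alpha, \beta) = \frac{2a\beta}{\bar{v}^3} - \frac{1}{(\bar{v} - b)^2}. \tag{153} $$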
The derivatives of v̄ with respect to the parameters α and β are required in order to determine
the components of the Fisher–Rao metric on the van der Waals manifold. Specifically we
obtain the following result:
Proposition 7. In terms of the pressure–temperature coordinates (α, β) the Fisher–Rao metric
on the van der Waals manifold is given by
$$ G_{ij} = \frac{1}{D} \begin{pmatrix} -1 & -a/\bar{v}^2 \\ -a/\bar{v}^2 & \tfrac{3}{2} \beta^{-2} D - (a/\bar{v}^2)^2 \end{pmatrix}. \tag{154} $$

In particular, in the ideal gas limit a → 0 and b → 0, the metric (154) reduces to the metric
(99) for the ideal gas.

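Proof. The components of the metric are determined by the matrix ∂i∂jψ(α, β). We have, in
proposition 6, established that ∂ψ/∂α = −v̄, and, using (145), we have
$$ \frac{\partial \psi}{\partial \beta} = -\alpha \frac{\partial \bar{v}}{\partial \beta} + \frac{\partial \ln z}{\partial \bar{v}} \frac{\partial \bar{v}}{\partial \beta} + \frac{\partial \ln z}{\partial \beta} = \frac{\partial \ln z}{\partial \beta}. \tag{155} $$
Therefore, we obtain
$$ \frac{\partial^2 \psi}{\partial \alpha^2} = -\frac{\partial \bar{v}}{\partial \alpha}, \quad \frac{\partial^2 \psi}{\partial \alpha \partial \beta} = -\frac{\partial \bar{v}}{\partial \beta} \quad \text{and} \quad \frac{\partial^2 \psi}{\partial \beta^2} = \frac{\partial^2 \ln z}{\partial \beta^2} + \frac{\partial^2 \ln z}{\partial \beta\, \partial \bar{v}} \frac{\partial \bar{v}}{\partial \beta}, \tag{156} $$
whence the desired expression for the metric follows from the formula (127) for the canonical
partition function. In the ideal gas limit a → 0 and b → 0, we have $D \to -\bar{v}^{-2}$. However,
from the ideal gas equation of state we have $\bar{v} = \alpha^{-1}$, so that $D \to -\alpha^2$ and we recover (99)
at once in this limit. □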
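The second derivatives in (156) can be checked against a direct numerical Hessian of the potential. In the sketch below the mean volume v̄(α, β) is obtained by Newton iteration on the stationarity condition (145), and the per-particle value of ln z is taken from (127) with Stirling's formula, dropping constants that do not depend on α or β. The constants a = 3, b = 1/3, the units with kB = 1 and the sample point are assumptions chosen so that the state is supercritical and the root is unique.

```python
import math

A, B = 3.0, 1.0 / 3.0        # hypothetical constants a and b (kB = 1)

def v_bar(alpha, beta, v=1.5):
    """Solve alpha = 1/(v - b) - beta*a/v^2 for v by Newton's method."""
    for _ in range(100):
        f = alpha - 1.0 / (v - B) + beta * A / v**2
        df = 1.0 / (v - B) ** 2 - 2.0 * beta * A / v**3
        v -= f / df
    return v

def psi(alpha, beta):
    """Potential (143) up to an additive constant: -alpha*v + ln z(beta, v) per particle."""
    v = v_bar(alpha, beta)
    return 1.0 - alpha * v - 1.5 * math.log(beta) + math.log(v - B) + beta * A / v

def metric_from_156(alpha, beta):
    """(G_aa, G_ab, G_bb) assembled from (156), using the derivatives (152)."""
    v = v_bar(alpha, beta)
    d = 2.0 * A * beta / v**3 - 1.0 / (v - B) ** 2      # D of (153)
    u = A / v**2
    dv_da, dv_db = 1.0 / d, u / d                        # (152)
    return -dv_da, -dv_db, 1.5 / beta**2 - u * dv_db

def metric_numeric(alpha, beta, eps=1e-4):
    """Hessian of psi by central differences."""
    f = psi
    g_aa = (f(alpha + eps, beta) - 2.0 * f(alpha, beta) + f(alpha - eps, beta)) / eps**2
    g_bb = (f(alpha, beta + eps) - 2.0 * f(alpha, beta) + f(alpha, beta - eps)) / eps**2
    g_ab = (f(alpha + eps, beta + eps) - f(alpha + eps, beta - eps)
            - f(alpha - eps, beta + eps) + f(alpha - eps, beta - eps)) / (4.0 * eps**2)
    return g_aa, g_ab, g_bb

alpha0, beta0 = 0.52, 0.25
print(metric_from_156(alpha0, beta0))
print(metric_numeric(alpha0, beta0))
```

In `metric_from_156` the last entry uses ∂² ln z/∂β² = 3/(2β²) and ∂² ln z/∂β∂v̄ = −a/v̄², both of which follow from (127).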

To describe the geometry of the van der Waals gas it will be convenient to introduce
the concept of a spinodal curve. In general a spinodal curve consists of the points in the
thermodynamic phase space at which the second derivative of the free energy with respect to
an order parameter vanishes. For a gas of interacting molecules, the mean volume v̄ constitutes
the order parameter of the system, and the vanishing of the second derivative of ln z(β, v̄) with
respect to v̄ thus determines the spinodal curve in the phase diagram. For the van der Waals
gas, by (145), we have the relation
$$ \frac{\partial \alpha}{\partial \bar{v}} = \frac{\partial^2 \ln z}{\partial \bar{v}^2} = 0 \tag{157} $$
that determines the spinodal curve. In other words, the locus of points at which the derivative
of pressure with respect to volume vanishes for some temperature determines the spinodal
curve. This is schematically illustrated for the van der Waals equation in figure 6.
It is evident from figure 6 that the region in the phase diagram enveloped by the spinodal
curve is unstable because (∂P /∂V )T > 0 in this region, i.e. the pressure increases with
increasing volume. The spinodal curve thus forms the boundary of a semi-stable region in
the phase diagram. In view of the first relation in (152), the spinodal curve is determined
by the condition D = 0. On the other hand, from expression (154) for the Fisher–Rao metric on
the van der Waals manifold we see that each component of the metric Gij as well as its
determinant −3/(2β²D) is singular along the spinodal curve. Is this singular behaviour
merely due to the specific choice of coordinates or is it an intrinsic feature of the van der Waals
manifold? We can answer this question by calculating the scalar curvature of the manifold.
The exact expression for the curvature is as follows.

Proposition 8. The scalar curvature of the van der Waals manifold is given by
$$ R = \frac{4}{3 D^2}\, \frac{a\beta}{\bar{v}^3} \left( \frac{a\beta}{\bar{v}^3} - D \right). \tag{158} $$
In particular, R diverges along the entire spinodal curve specified by D = 0, which includes
the critical point (Pc , Vc , Tc ). The scalar curvature vanishes in the ideal gas limit for which
a → 0 and b → 0.

Proof. Since we have chosen the canonical parameterization (α, β), we can use the determinant
in (48) to calculate the curvature. To compute the entries in the determinant we differentiate


Figure 6. Schematic illustration of equations of state for van der Waals gas in the P–V plane,
showing isotherms for T > Tc, T = Tc and T < Tc, Maxwell’s curve, the spinodal curve and the
unphysical region, with the critical point at (Vc, Pc). The scalar curvature
on the parameter space diverges along the spinodal boundary which envelops the unphysical
region. The critical point is that where the spinodal curve is tangent to the Maxwell equal area
boundary. The divergence of the curvature along the spinodal boundary may be interpreted as
‘preventing’, in some sense, entry into the unphysical domain in the phase diagram.

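(156) and use the chain rule, together with (152), thereby obtaining
$$ \begin{aligned} \psi_{111} &= \frac{2}{D^3} \left[ \frac{1}{(\bar{v} - b)^3} - 3 \frac{a\beta}{\bar{v}^4} \right] \equiv X, & \psi_{112} &= \frac{a}{\bar{v}^2} \left( X + \frac{2}{\bar{v} D^2} \right), \\ \psi_{122} &= \frac{a^2}{\bar{v}^4} \left( X + \frac{4}{\bar{v} D^2} \right), & \psi_{222} &= \frac{a^3}{\bar{v}^6} \left( X + \frac{6}{\bar{v} D^2} \right) - \frac{3}{\beta^3}. \end{aligned} \tag{159} $$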
Substituting these results into (48) we obtain, after some algebra, the desired expression in
(158). The fact that the curvature diverges along the spinodal curve D = 0 is evident from
expression (158). Also, from the definition (157) of the spinodal curve and the condition
(110) for the critical point, it is clear that the critical point lies on the spinodal curve. The
vanishing of the curvature in the ideal gas limit a, b → 0 is also evident from the expression
in (158). 
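The algebra leading to the curvature can be cross-checked numerically. The sketch below evaluates the second and third derivatives of ψ in terms of (β, v̄) and inserts them into the determinant expression R = −det M / (2 (det G)²), which is the usual curvature formula for a two-dimensional Hessian metric; it is assumed here, without access to the earlier section, that this coincides with the formula (48) cited in the proof. The result is compared with the closed form (4/3D²)(aβ/v̄³)(aβ/v̄³ − D). The function names and sample values are hypothetical.

```python
def d_factor(beta, v, a, b):
    """D of (153) expressed through beta and the mean volume v."""
    return 2.0 * a * beta / v**3 - 1.0 / (v - b) ** 2

def second_derivatives(beta, v, a, b):
    """psi_11, psi_12, psi_22 obtained from (156) with (152)."""
    d = d_factor(beta, v, a, b)
    u = a / v**2
    return -1.0 / d, -u / d, 1.5 / beta**2 - u * u / d

def third_derivatives(beta, v, a, b):
    """psi_111, psi_112, psi_122, psi_222 obtained by differentiating (156)."""
    d = d_factor(beta, v, a, b)
    x = (2.0 / d**3) * (1.0 / (v - b) ** 3 - 3.0 * a * beta / v**4)
    y = 1.0 / (v * d * d)
    u = a / v**2
    return x, u * (x + 2.0 * y), u * u * (x + 4.0 * y), u**3 * (x + 6.0 * y) - 3.0 / beta**3

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return a1 * (b2 * c3 - b3 * c2) - a2 * (b1 * c3 - b3 * c1) + a3 * (b1 * c2 - b2 * c1)

def curvature_from_determinant(beta, v, a, b):
    """R = -det(M) / (2 det(G)^2); ASSUMED form of the determinant formula (48)."""
    p11, p12, p22 = second_derivatives(beta, v, a, b)
    p111, p112, p122, p222 = third_derivatives(beta, v, a, b)
    det_g = p11 * p22 - p12**2
    m = [[p11, p12, p22], [p111, p112, p122], [p112, p122, p222]]
    return -det3(m) / (2.0 * det_g**2)

def curvature_closed_form(beta, v, a, b):
    """Closed form (4 / 3D^2) * w * (w - D) with w = a*beta / v^3."""
    d = d_factor(beta, v, a, b)
    w = a * beta / v**3
    return 4.0 * w * (w - d) / (3.0 * d * d)

a, b = 3.0, 1.0 / 3.0            # hypothetical constants (kB = 1)
beta, v = 0.25, 1.5              # a stable state with D < 0
print(curvature_from_determinant(beta, v, a, b), curvature_closed_form(beta, v, a, b))
```

With these inputs the two evaluations should agree to rounding error, and the closed form grows without bound as D approaches zero.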
The van der Waals manifold possesses the structure of a Riemann surface over a planar
base space with coordinates (α, β), branched around the singularities specified by the spinodal
curve. Now, suppose we slowly change the variables (α, β) along a closed contour C in the
planar base space. Then, the lifted curve in the van der Waals manifold does not, in general,
return to the same sheet (i.e. to the same thermodynamic state) if C encloses the critical
point or crosses the spinodal curve, while Maxwell’s relation ensures that an infinitesimal
closed contour enclosing no singularities is thermodynamically trivial. Thus, the presence
of singularities may give rise to changes in the thermodynamic state v̄ of the system upon
following a closed contour in the parameter base space which encloses a point of divergency.
Conversely, if a closed contour in the parameter space does not enclose the critical point or
cross the spinodal curve, then the corresponding curve in M is closed and thus gives rise
to a well-defined holonomy. This leads naturally to the following open problem: what is
the physical interpretation or relevance of the holonomy (analogue of the geometric phase in
quantum mechanics) in classical statistical mechanics?

Figure 7. Geometric phase diagram for the van der Waals gas in the (v̄, β) plane, showing the
vanishing curvature curve (R = 0), the regions R > 0 and R < 0, Maxwell’s boundary and the
spinodal curve (D = 0), with the critical point at v̄ = 3b and β = 27b/8a. The gas is divided into positive
and negative curvature phases by the vanishing curvature R = 0 curve. The change of phase along
R = 0 is analytic, while the curvature exhibits singular behaviour along the spinodal curve D = 0.

Finally we note that the scalar curvature R on the van der Waals manifold vanishes along
the curve specified by
$$ \beta = \frac{\bar{v}^3}{a (\bar{v} - b)^2}, \tag{160} $$
and its sign changes smoothly from positive to negative as one decreases the temperature
in the (v̄, β) plane. The sign of the scalar curvature in the (v̄, β) plane and its relation to
the spinodal and Maxwell boundaries are schematically illustrated in figure 7. One might
refer to the smooth change in the sign of the scalar curvature in the phase diagram as a
geometric phase transition. Unlike the conventional phase transitions associated with singular
behaviour, however, such geometric phase transitions are not associated with any divergence.
There are indications that the scalar curvature can be viewed as a measure of stability of the
system. However, the precise correspondence between physical characteristics of the system
and properties of the curvature, apart from the singular behaviour along the spinodal boundary,
remains an open research problem.

4. Bibliographical notes

When the first author was invited to write a survey article on a topic of interest, it seemed
appropriate to utilize this opportunity by briefly reviewing some applications of information
geometry in physics. This reflects the fact that interest in this area has continued to grow: two
international conferences on applications of information geometry have been held in recent
years, while diverse new applications continue to emerge—for example, in the area of shape
recognition in computer science (Maybank 2005, Peter and Rangarajan 2006), in connection
with out of equilibrium measures (Crooks 2007), in the characterization of quantum phase
transitions (Zanardi et al 2007), in various mathematical extensions (e.g., Cena and Pistone
2007), in anyon statistics (Mirza and Mohammadzadeh 2008) or in black hole thermodynamics
(Ruppeiner 2008). The application of information geometry to statistical physics, however, has

generated a vast amount of literature, and it seemed neither feasible nor helpful to cover all the
results which have emerged in this area. What seemed more appropriate was to focus attention
upon one specific topic that nevertheless incorporates all the essential ingredients. It also
appeared desirable to explain the basic ideas of information geometry and its emergence from
probabilistic and statistical considerations in a language accessible to a graduate student in
theoretical physics. For these reasons we begin the paper with rather elementary background
material, and then consider its application to the theory of vapour–liquid equilibrium. This
leads to the analysis of the van der Waals gas model, which, in our opinion, not only embodies
all the essential features of a phase transition in statistical mechanics but also admits an
elegant geometric characterization. To keep the exposition at a fairly elementary level, we
have excluded ad hoc citations from the main text so as to avoid impeding the continuity of
the exposition. Instead, references to the literature are presented more informally in these
bibliographical notes.
To the authors’ best knowledge, the use of geometric methods in statistical analysis
was first introduced by P C Mahalanobis, the founder of the Indian Statistical Institute and
also the founding editor of Sankhyā (The Indian Journal of Statistics), in the early 1930s
(Mahalanobis 1930, 1936). Mahalanobis was a physicist and statistician who taught relativity
and wrote an introduction to the translation by M Saha of Minkowski’s work on relativity.
His articles on relativity, written jointly with S N Bose, were published by the Calcutta
University. He introduced a measure of mutual separation in the study of statistical data arising
from anthropometric measurements. An alternative measure of separation was subsequently
introduced by Bhattacharyya (1943, 1946), and was defined in section 1 as the Bhattacharyya
spherical distance between two probability densities.
The geometric description of the parameter-space manifold was initiated at around the
same time by Rao (1945, 1947, 1954). The seminal paper by Rao (1945) is significant in two
respects: on the one hand, it introduced the so-called Cramér–Rao inequality as a lower bound
for the variance, while on the other hand it pointed out that the information measure introduced
previously by Fisher (1925) defines a Riemannian metric on the parameter-space manifold
of a statistical model. Rao then proposed the associated geodesic distance as a measure of
dissimilarity between probability distributions. The formulation considered by Rao was based
on the Hilbert space embedding $p_\theta(x) \to \xi_\theta(x) = \sqrt{p_\theta(x)}$. As a consequence, many of
the constructions in Rao (1945) bear a formal resemblance to the geometric formulation of
quantum mechanics, developed by physicists some time later in the 1980s and 1990s (see
Brody and Hughston 2001 and references cited therein). Concurrently with Rao’s work
on the application of geometry in statistics, Jeffreys (1946) also introduced the concept of
uninformative priors based on the use of a Riemannian metric.
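The connection between the square-root embedding and Fisher's information measure can be sketched in two lines. The calculation below assumes the common normalization in which the metric is four times the Hilbert space inner product of the tangent vectors; the symbols G_{ij} and \partial_i = \partial/\partial\theta^i are notation chosen here for illustration, and the normalization used in the main text may differ by a constant factor.

```latex
\begin{align}
\xi_\theta(x) &= \sqrt{p_\theta(x)}, \qquad
\partial_i \xi_\theta(x) = \frac{\partial_i p_\theta(x)}{2\sqrt{p_\theta(x)}}, \\
G_{ij}(\theta) &= 4 \int \partial_i \xi_\theta(x)\, \partial_j \xi_\theta(x)\, \mathrm{d}x
 = \int p_\theta(x)\, \partial_i \ln p_\theta(x)\, \partial_j \ln p_\theta(x)\, \mathrm{d}x .
\end{align}
```

The right-hand side is the expectation of the product of the score functions, which is the usual expression for the Fisher information matrix.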
It is worth noting incidentally that the inner product of probability measures via the
square-root embedding was introduced earlier by Hellinger (1909) in connection with unitary
invariants of self-adjoint operators in Hilbert space. The distance measure of Hellinger was
subsequently extended by von Neumann and Kakutani (see Kakutani 1948), who introduced
an inner product of probability measures in an abstract measure-theoretic context and applied
this to investigate equivalence and orthogonality relations between product measures. The
Kakutani inner product was used by Brody (1971) to provide a simple proof of the Gaussian
dichotomy theorem, and has also been extended by Bures (1969) in the context of operator
algebras.
Interest in the application of geometrical techniques to statistical inference appears to have
somewhat diminished subsequently, but reemerged with the appearance of Efron’s seminal
paper (1975) on information loss and statistical curvature. Efron considered the logarithmic
embedding pθ (x) → lθ (x) = ln pθ (x) and demonstrated that in the space of log-likelihood
density functions the curvature of the curve lθ measures the deviation of the density function
from the exponential family. Furthermore, the squared statistical curvature was shown to
determine the loss of information resulting from the use of the maximum likelihood estimator
for the unknown parameter θ .
The work of Efron (1975, 1978)—and to some extent that of Čencov (1982) who showed
that the Fisher information metric is unique under certain assumptions including invariance—
evoked considerable interest in the geometrical approach to asymptotic inference and related
topics in statistics. Numerous research papers (for example, Atkinson and Mitchell 1981), as
well as review papers (for example, Kass 1989) and monographs (for example, Amari 1985,
Amari et al 1987, Murray and Rice 1993, Amari and Nagaoka 2000, and Arwini and Dodson
2008), were subsequently published.
In parallel with these developments in statistics, the application of information geometry
to the description of the equilibrium properties of thermodynamic systems was considered by a
number of authors. One of the initiators was Ingarden (1981), who considered various Banach
space embeddings of probability density functions and their relation to the concepts of entropy
and Fisher information, and suggested their application to statistical physics. The works of
Ingarden and his collaborators led to the establishment of a Polish School of researchers
systematically investigating various geometric aspects of physical systems described by
equilibrium distributions. Janyszek and Mrugała (1989a), for instance, clarified the relation
of the Fisher–Rao geometry to contact geometry (the latter characterizes the geometry of
Legendre transformations), and investigated the physical interpretation of the metric tensor for
a system of gas molecules characterized by the pressure–temperature distribution. Janyszek
and Mrugała (1989b) calculated the scalar curvatures of the parameter space manifolds of
the one-dimensional Ising model in the thermodynamic limit and of the mean-field model.
Janyszek and Mrugała (1990) also extended information geometric analysis to the investigation
of the stability of ideal quantum gases. Some of these ideas were further extended by others
in the context of various spin models in statistical mechanics (see, for example, Janke et al
2002, Brody and Ritz 2003, Johnston et al 2003).
An independent line of investigation on the various geometric properties of
thermodynamic state spaces was initiated by Weinhold (1975) and also by Ruppeiner (1979).
Weinhold proposed the existence of a metric structure for the thermodynamic state space
arising from empirical laws of thermodynamics. This line of thinking, which led to the notion
of the so-called thermodynamic length, was extended in a variety of ways by various authors
(see, for example, Salamon and Berry 1983, Schlögl 1985, Mrugała et al 1990).
Ruppeiner went a step further and considered the second derivative of the entropy,
discussed briefly here in section 1.5, as a Riemannian metric on the thermodynamic state
space. The metric considered by Ruppeiner, based upon the Shannon–Wiener entropy, agrees
with Rao’s entropy derivative metric (cf Rao 1984), and is also related to the Fisher–Rao metric
through a Legendre transformation. The key idea is that the convexity of entropy implies the
positive-definiteness of the associated Hessian matrix, which can therefore be used to define a
Riemannian metric on the thermodynamic state space. The simplest system to consider in this
respect is naturally that of a noninteracting gas of classical particles. This was investigated
by Ruppeiner (1979), and also by Mijatović et al (1987). Geometric aspects of various other
physical systems have also been investigated along these lines (Ruppeiner 1990, 1991). For
a comprehensive bibliography on this topic, see the reference list in the review article by
Ruppeiner (1995).
Ingarden and Tamassy (1993) also applied an entropic measure of divergence to explain
the thermodynamic arrow of time. While the infinitesimal form of the entropic measure of
divergence (relative entropy) gives rise to a Riemannian structure characterized in general by
the Burbea–Rao metric (85), the metrics arising from entropies generally possess Finslerian
structure. In particular, the Finsler metric arising from relative entropy is not symmetric. The
idea of Ingarden and Tamassy (1993) consists in exploiting the lack of symmetry in such
metrics to explain the thermodynamic arrow of time, without the introduction of dissipation.
An important application of information geometry to the properties of renormalization
group flow in statistical physics—motivated in part by the observation of Dawid (1975) that
Efron’s results might be represented more concisely in terms of Hilbert space geometry—was
proposed by Brody (1987). Closely related ideas were developed further by O’Connor and
Stephens (1993), Dolan (1998), Brody and Ritz (1998), and Brody (2000). (See also Diósi
et al (1984) for an alternative approach to analysis of renormalization group flow using Rao’s
entropy derivative metric.)
The ideas of information geometry have also been extended to the quantum domain by
replacing the density functions of classical probability theory by density matrices. Substantial
work has been done on quantum information geometry (for example, Petz and Sudar 1996,
Uhlmann 1996, Grasselli and Streater 2001, Petz 2002, Streater 2004, Jenčová and Petz 2006,
Gibilisco and Isola 2007, and Gibilisco et al 2007), as well as its application to quantum
statistical inference (Brody and Hughston 1998, Barndorff-Nielsen and Gill 2000).
Turning more specifically to the subject matter of the present review, as indicated above,
the spherical distance (7) representing the dissimilarity of probability densities was introduced
by Bhattacharyya (1943), and the concept of a statistical manifold represented by the metric
(23) was introduced by Rao (1945). The uniqueness of the Einstein metric on a complex
projective space was conjectured by Calabi, and later proven by Yau (1977). The implication
of this result in quantum mechanics—that the solution to the vacuum Einstein equation in
the space of pure quantum states determines transition probabilities—was pointed out to the
first author by G W Gibbons in the late 1990s. The relevance of the projective space and
the associated metric (47) to statistical mechanics was demonstrated by Brody and Hughston
(1999).
The expression in proposition 2 for the scalar curvature in terms of the determinant of a
3 × 3 matrix, valid for the two-dimensional statistical manifold associated with a canonical
density function, is given in Janyszek and Mrugała (1989b). The fact that the statistical
manifold associated with the normal density function possesses constant negative curvature,
shown in equation (57), was observed by Amari (1982). However, the significance of the
scalar curvature (or in fact the Riemann tensor itself) in statistical analysis remains somewhat
obscure.
A survey article by Burbea (1986) deals systematically with expressions for geodesic
curves associated with a number of standard density functions used in statistics (see also the
article by Rao in Amari et al 1987). This includes, in particular, the distance between Gaussian
density functions as given in (69) and (71). Analogous results for gamma-distributed densities
have been calculated in some detail by Burbea et al (2002).
The fact that the leading-order term in the Taylor expansion of the relative entropy of
a neighbouring pair of parametric density functions gives rise to the Fisher–Rao metric was
observed by Ingarden (1981). A more detailed and thorough analysis was given by Burbea
and Rao (1984), and constitutes the basis for the discussion in section 1.5. The specific form
of entropy defined in (80), sometimes referred to as the α-order entropy, was introduced by
Havrda and Charvát (1967) in the context of quantifying classification schemes. See also
Burbea and Rao (1982a, 1982b) for details concerning various properties of this entropy. The
use of the α-order entropy in statistical mechanics has been proposed by Tsallis (1988).
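The leading-order statement about the relative entropy can be made explicit with a short expansion. The sketch below uses the Shannon-type relative entropy only; the symbols D and G_{ij} are notation chosen here, and the conventions of the main text (in particular for the α-order generalization) may differ by constant factors.

```latex
\begin{align}
D(p_\theta \,\|\, p_{\theta+\mathrm{d}\theta})
 &= \int p_\theta \ln \frac{p_\theta}{p_{\theta+\mathrm{d}\theta}}\, \mathrm{d}x \\
 &= -\,\mathrm{d}\theta^i \int \partial_i p_\theta\, \mathrm{d}x
    \;-\; \tfrac{1}{2}\, \mathrm{d}\theta^i \mathrm{d}\theta^j
      \int p_\theta\, \partial_i \partial_j \ln p_\theta\, \mathrm{d}x
    \;+\; O(\mathrm{d}\theta^3) \\
 &= \tfrac{1}{2}\, G_{ij}(\theta)\, \mathrm{d}\theta^i \mathrm{d}\theta^j + O(\mathrm{d}\theta^3),
 \qquad
 G_{ij} = \int p_\theta\, \partial_i \ln p_\theta\, \partial_j \ln p_\theta\, \mathrm{d}x .
\end{align}
```

The first-order term vanishes because the density remains normalized for every θ, and the second-order term follows from the identity \int p_\theta\, \partial_i \partial_j \ln p_\theta\, \mathrm{d}x = -G_{ij}.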
In section 2 we considered the information geometry of the pressure–temperature
distribution representing the equilibrium state of a gas of noninteracting particles (classical
ideal gas). The flatness of the information manifold of a classical ideal gas was pointed out by
Ruppeiner (1979) using the entropy derivative metric. The solution to the associated geodesic
equations, in the form expressed in proposition 5, does not seem to appear elsewhere. (An
alternative representation of the geodesics appears in Mijatović et al (1987).)
In sections 3.1–3.4 we have provided a brief account of the classical theory of the van der
Waals gas, as a background for the subsequent geometric description. Our exposition follows
closely the classic treatise of Mayer and Mayer (1940). There is a series of inspiring papers
by Kac et al (1963), Uhlenbeck et al (1963), Hemmer et al (1964), and also Hemmer (1964),
analysing the vapour–liquid equilibrium of the van der Waals gas in great detail. These papers
extend the earlier work of Kac (1959), which provides a method for determining the partition
function of interacting gas molecules.
Other related work on systems of interacting gas molecules includes the following: Tonks
(1936) determined the equations of state for gases composed of hard elastic spheres with
finite radius. van Hove (1950) calculated the free energy of a system of molecules with
nonvanishing incompressible radii, interacting according to a finite range force. He showed
that in one dimension the system exhibits no phase transition. Lebowitz and Percus (1963)
studied the properties of the correlation functions. Van Kampen (1964) showed that a gas
of molecules with hard sphere repulsive forces and long-range attractive interactions exhibits
condensation, and calculated the density fluctuations. Rigorous bounds for the free energy of
the van der Waals gas were derived by Lebowitz and Penrose (1966).
Detailed analyses of the curvature of some of these classical systems of interacting
molecules were presented by Ruppeiner and Chance (1990). The geometry of the van der
Waals gas associated with the entropy derivative metric is considered in papers by Diósi and
Lukács (1986) and in Diósi et al (1989), wherein the authors determine the scalar curvature
using the density and temperature as coordinates. Using these coordinates, they have also
shown that on this statistical manifold there exists no solution to the Killing equations (i.e. no
vector field such that the associated flow preserves geodesic distances).
The description of the geometry of the van der Waals manifold presented in section 3.5
follows closely the analysis outlined by Janyszek (1990) and also by Brody and Rivier (1995).
In particular, the expressions for the metric tensor in proposition 7 and for the scalar curvature
in proposition 8 were derived by Brody and Rivier (1995), who also suggested that the
curvature of the statistical manifold might play a role in statistical mechanics analogous to
that of geometric phases in quantum mechanics. This remains an open issue, although recent
work on quantum phase transitions indicates that there is indeed a close analogy between these
two concepts.
Finally, the present authors regret that owing to the huge volume of literature on this
subject there are many other valuable contributions which have not been mentioned in these
brief bibliographical notes.

References

Amari S 1982 Differential geometry of curved exponential families—curvatures and information loss Ann. Stat.
10 357–85
Amari S 1985 Differential-Geometrical Methods in Statistics (Lecture Notes in Statistics vol 28) (New York: Springer)
Amari S, Barndorff-Nielsen O E, Kass R E, Lauritzen S L and Rao C R 1987 Differential Geometry in Statistical
Inference (Institute of Mathematical Statistics Lecture Notes. Monograph Series vol 10) (Hayward, CA: Institute
of Mathematical Statistics)
Amari S and Nagaoka H 2000 Methods of Information Geometry (AMS Translations of Mathematical Monographs
vol 191) (Oxford: Oxford University Press)
Arwini K A and Dodson C T J 2008 Information Geometry (Lecture Notes in Mathematics vol 1953) (Berlin: Springer)
Atkinson C and Mitchell A F S 1981 Rao’s distance measure Sankhyā 43 345–65
Barndorff-Nielsen O E and Gill R D 2000 Fisher information in quantum statistics J. Phys. A: Math. Gen. 33 4481–90
Bhattacharyya A 1943 On a measure of divergence between two statistical populations defined by their probability
distributions Bull. Calcutta Math. Soc. 35 99–109
Bhattacharyya A 1946 On a measure of divergence between two multinomial populations Sankhyā 7 401–6
Brody D C 2000 Differential renormalisation flow in random lattice gauge theories Phys. Lett. B 485 422–8
Brody D C and Hughston L P 1998 Statistical geometry in quantum mechanics Proc. R. Soc. Lond. A 454 2445–75
Brody D C and Hughston L P 1999 Geometrisation of statistical mechanics Proc. R. Soc. Lond. A 455 1683–715
Brody D C and Hughston L P 2001 Geometric quantum mechanics J. Geom. Phys. 38 19–53
Brody D C and Ritz A 1998 On the symmetry of real-space renormalisation Nucl. Phys. B 522 588–604
Brody D C and Ritz A 2003 Information geometry of finite Ising models J. Geom. Phys. 47 207–20
Brody D C and Rivier N 1995 Geometrical aspects of statistical mechanics Phys. Rev. E 51 1006–11
Brody E J 1971 An elementary proof of the Gaussian dichotomy theorem Z. Wahrscheinlichkeitstheor. Verwandte
Geb. 20 217–26
Brody E J 1987 Applications of the Kakutani metric to real-space renormalization Phys. Rev. Lett. 58 179–82
Burbea J 1986 Informative geometry of probability spaces Expositiones Math. 4 347–78
Burbea J, Oller J M and Reverter F 2002 Some remarks on the information geometry of the Gamma distribution
Commun. Stat. Theory Methods 31 1959–75
Burbea J and Rao C R 1982a On the convexity of some divergence measures based on entropy functions IEEE Trans.
Inf. Theory IT-28 489–95
Burbea J and Rao C R 1982b On the convexity of higher order Jensen differences based on entropy functions IEEE
Trans. Inf. Theory IT-28 961–3
Burbea J and Rao C R 1984 Differential metrics in probability spaces Probab. Math. Stat. 3 241–58
Bures D 1969 An extension of Kakutani’s theorem on infinite product measures to the tensor product of semifinite
w*-algebras Trans. Am. Math. Soc. 135 199–212
Cena A and Pistone G 2007 Exponential statistical manifold Ann. Inst. Stat. Math. 59 27–56
Čencov N N 1982 Statistical Decision Rules and Optimal Inference (Translations of Mathematical Monographs
vol 53) (Providence, RI: American Mathematical Society) (Originally published as Statisticheskie
Reshayushchie Pravila i Optimal’nye Vyvody (Moscow: Nauka) 1972.)
Crooks G E 2007 Measuring thermodynamic length Phys. Rev. Lett. 99 100602
Dawid A P 1975 Discussion on Professor Efron’s paper Ann. Stat. 3 1231–4
Diósi L, Forgács G, Lukács B and Frisch H L 1984 Metricization of thermodynamic-state space and the renormalization
group Phys. Rev. A 29 3343–5
Diósi L and Lukács B 1986 Spatial correlations in diluted gases from the viewpoint of the metric of the thermodynamic
state space J. Chem. Phys. 84 5081–4
Diósi L, Lukács B and Rácz A 1989 Mapping the van der Waals state space J. Chem. Phys. 91 3061–7
Dolan B P 1998 Geometry and thermodynamic fluctuations of the Ising model on a Bethe lattice Proc. R. Soc. Lond.
A 454 2655–65
Efron B 1975 Defining the curvature of a statistical problem (with applications to second order efficiency). With a
discussion by C R Rao, D A Pierce, D R Cox, D V Lindley, L LeCam, J K Ghosh, J Pfanzagl, N Keiding,
A P Dawid, J Reeds and with a reply by the author Ann. Stat. 3 1189–242
Efron B 1978 The geometry of exponential families Ann. Stat. 6 362–76
Fisher R A 1925 Theory of statistical estimation Proc. Camb. Phil. Soc. 22 700–25
Gibilisco P and Isola T 2007 Uncertainty principle and quantum Fisher information Ann. Inst. Stat. Math. 59 147–59
Gibilisco P, Imparato D and Isola T 2007 Uncertainty principle and quantum Fisher information: II J. Math.
Phys. 48 072109
Grasselli M R and Streater R F 2001 On the uniqueness of the Chentsov metric in quantum information geometry
Infinite Dimens. Anal. Quantum Probab. Relat. Top. 4 173–82
Havrda J and Charvát F 1967 Quantification method of classification processes Kybernetika 3 30–5
Hellinger E 1909 Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen J. Reine
Angew. Math. 136 210–71
Hemmer P C 1964 On the van der Waals theory of the vapour–liquid equilibrium: IV. The pair correlation function
and equation of state for long-range forces J. Math. Phys. 5 75–84
Hemmer P C, Kac M and Uhlenbeck G E 1964 On the van der Waals theory of the vapour–liquid equilibrium: III.
Discussion of the critical region J. Math. Phys. 5 60–74
Ingarden R S 1981 Information geometry in functional spaces of classical and quantum finite statistical systems Int.
J. Eng. Sci. 19 1609–33
Ingarden R S and Tamassy L 1993 On parabolic geometry and irreversible macroscopic time Rep. Math.
Phys. 32 11–33
Janke W, Johnston D A and Malmini R P K C 2002 Information geometry of the Ising model on planar random
graphs Phys. Rev. E 66 056119
Janyszek H 1990 Riemannian geometry and stability of thermodynamical equilibrium systems J. Phys. A: Math.
Gen. 23 477–90
Janyszek H and Mrugała R 1989a Geometrical structure of the state space in classical statistical and phenomenological
thermodynamics Rep. Math. Phys. 27 145–59
Janyszek H and Mrugała R 1989b Riemannian geometry and the thermodynamics of model magnetic systems Phys.
Rev. A 39 6515–23
Janyszek H and Mrugała R 1990 Riemannian geometry and the stability of ideal quantum gases J. Phys. A: Math.
Gen. 23 467–76
Jeffreys H 1946 An invariant form for the prior probability in estimation problems Proc. R. Soc. Lond. A 186 453–61
Jenčová A and Petz D 2006 Sufficiency in quantum statistical inference: a survey with examples Infinite Dimens.
Anal. Quantum Probab. Relat. Top. 9 331–51
Johnston D A, Janke W and Kenna R 2003 Information geometry, one, two, three (and four) Acta Phys. Pol. B 34
4923–37
Kac M 1959 On the partition function of a one-dimensional gas Phys. Fluids 2 8–12
Kac M, Uhlenbeck G E and Hemmer P C 1963 On the van der Waals theory of the vapour–liquid equilibrium: I.
Discussion of a one-dimensional model J. Math. Phys. 4 216–28
Kakutani S 1948 On equivalence of infinite product measures Ann. Math. 49 214–24
Kass R 1989 The geometry of asymptotic inference. With comments and a rejoinder by the author Stat. Sci. 4 188–234
Lebowitz J L and Penrose O 1966 Rigorous treatment of the van der Waals–Maxwell theory of liquid–vapour transition
J. Math. Phys. 7 98–113
Lebowitz J L and Percus J K 1963 Asymptotic behaviour of the radial distribution function J. Math. Phys. 4 248–54
Mahalanobis P C 1930 On tests and measures of group divergence J. Asiatic Soc. Bengal 26 541–88
Mahalanobis P C 1936 On the generalised distance in statistics Proc. Natl. Inst. Sci. India A 2 49–55
Maybank S 2005 The Fisher–Rao metric for projective transformations of the line Int. J. Comput. Vis. 63 191–206
Mayer J E and Mayer M G 1940 Statistical Mechanics (New York: Wiley)
Mijatović M, Veselinović V and Trenčevski K 1987 Differential geometry of equilibrium thermodynamics Phys. Rev.
A 35 1863–7
Mirza B and Mohammadzadeh H 2008 Ruppeiner geometry of anyon gas Phys. Rev. E 78 021127
Mrugała R, Nulton J D, Schön J C and Salamon P 1990 Statistical approach to the geometric structure of
thermodynamics Phys. Rev. A 41 3156–60
Murray M K and Rice J W 1993 Differential Geometry and Statistics (London: Chapman and Hall)
O’Connor D and Stephens C R 1993 Geometry, the renormalisation group and gravity Directions in General Relativity
Proc. 1993 Int. Symp., Maryland vol 1, ed B L Hu, M P Ryan Jr and C V Vishveshwara (Cambridge: Cambridge
University Press)
Peter A and Rangarajan A 2006 Shape analysis using the Fisher–Rao Riemannian metric: unifying shape
representation and deformation Proc. 3rd IEEE Int. Symp. on Biomedical Imaging: Nano to Macro pp 1164–7
Petz D 2002 Covariance and Fisher information in quantum mechanics J. Phys. A: Math. Gen. 35 929–39
Petz D and Sudar C 1996 Geometries of quantum states J. Math. Phys. 37 2662–73
Rao C R 1945 Information and the accuracy attainable in the estimation of statistical parameters Bull. Calcutta Math.
Soc. 37 81–91
Rao C R 1947 The problem of classification and distance between two populations Nature 159 30–1
Rao C R 1954 On the use and interpretation of distance functions in statistics Bull. Inst. Int. Stat. 34 90–7
Rao C R 1984 Convexity properties of entropy functions and analysis of diversity Inequalities in Statistics and
Probability Proc. Symp. on Inequalities in Statistics and Probability (Lincoln, Nebraska 1982) (Institute of
Mathematical Statistics Lecture Notes Monograph Series vol 5) ed Y L Tong (Hayward, CA: Institute of
Mathematical Statistics)
Ruppeiner G 1979 Thermodynamics: a Riemannian geometric model Phys. Rev. A 20 1608–13
Ruppeiner G 1990 Thermodynamic curvature of a one-dimensional fluid J. Chem. Phys. 92 3700–9
Ruppeiner G 1991 Riemannian geometric theory of critical phenomena Phys. Rev. A 44 3583–95
Ruppeiner G 1995 Riemannian geometry in thermodynamic fluctuation theory Rev. Mod. Phys. 67 605–59
Ruppeiner G 2008 Thermodynamic curvature and phase transitions in Kerr–Newman black holes Phys. Rev. D
78 024016
Ruppeiner G and Chance J 1990 Riemannian geometry in thermodynamic fluctuation theory J. Chem. Phys. 92 3700–9
Salamon P and Berry R S 1983 Thermodynamic length and dissipated availability Phys. Rev. Lett. 51 1127–30
Schlögl F 1985 Thermodynamic metric and stochastic measures Z. Phys. B 59 449–54
Streater R F 2004 Duality in quantum information geometry Open Syst. Inf. Dyn. 11 71–7
Tonks L 1936 The complete equation of state of one, two, and three-dimensional gases of hard elastic spheres Phys.
Rev. 50 955–63
Tsallis C 1988 Possible generalization of Boltzmann–Gibbs statistics J. Stat. Phys. 52 479–87
Uhlenbeck G E, Hemmer P C and Kac M 1963 On the van der Waals theory of the vapour–liquid equilibrium: II.
Discussion of the distribution functions J. Math. Phys. 4 229–47
Uhlmann A 1996 Spheres and hemispheres as quantum state space J. Geom. Phys. 18 76–92
van Hove L 1950 Sur l’intégrale de configuration pour les systèmes de particules à une dimension Physica 16 137–43
van Kampen N G 1964 Condensation of a classical gas with long-range attraction Phys. Rev. 135 A362–9
Weinhold F 1975 Metric geometry of equilibrium thermodynamics J. Chem. Phys. 63 2479–83
Yau S T 1977 Calabi’s conjecture and some new results in algebraic geometry Proc. Natl. Acad. Sci. USA 74 1798–9
Zanardi P, Giorda P and Cozzini M 2007 Information-theoretic differential geometry of quantum phase transitions
Phys. Rev. Lett. 99 100603