[go: up one dir, main page]

0% found this document useful (0 votes)
74 views264 pages

Collisionless Dynamics

Download as pdf or txt
Download as pdf or txt
Download as pdf or txt
You are on page 1/ 264

Dynamics of Collisionless Systems

Summer Semester 2005, ETH Zürich

Frank C. van den Bosch


Outline
Lecture 1: Introduction & General Overview
Lecture 2: Cancelled
Lecture 3: Potential Theory
Lecture 4: Orbits I (Introduction to Orbit Theory)
Lecture 5: Orbits II (Resonances)
Lecture 6: Orbits III (Phase-Space Structure of Orbits)
Lecture 7: Equilibrium Systems I (Jeans Equations)
Lecture 8: Equilibrium Systems II (Jeans Theorem in Spherical Systems)
Lecture 9: Equilibrium Systems III (Jeans Theorem in Spheroidal Systems)
Lecture 10: Relaxation & Virialization (Violent Relaxation & Phase Mixing)
Lecture 11: Wave Mechanics of Disks (Spiral Structure & Bars)
Lecture 12: Collisions between Collisionless Systems (Dynamical Friction)
Lecture 13: Kinetic Theory (Fokker-Planck Eq. & Core Collapse)
Lecture 14: Cancelled
Summary of Vector Calculus I
⃗·B
A ⃗ = scalar = |A|
⃗ |B|⃗ cos ψ = Ai Bi (summation convention)
⃗×B
A ⃗ = vector = ϵijk ⃗
ei Aj Bk (with ϵijk the Levi-Civita Tensor)

Useful to Remember
A⃗×A ⃗=0
A⃗×B ⃗ = −B ⃗ ×A ⃗
A⃗ · (A
⃗ × B)
⃗ =0
A⃗ · (B
⃗ × C)
⃗ = B·(⃗ C ⃗ × A)⃗ = C·(
⃗ A ⃗ × B)

A⃗ × (B⃗ × C)
⃗ = B(⃗ A ⃗ · C)
⃗ − C(⃗ A⃗ · B)

(A⃗ × B)
⃗ · (C⃗ × D)
⃗ = (A ⃗ · C)(
⃗ B ⃗ · D)
⃗ − (A⃗ · D)(
⃗ B ⃗ · C)

⃗ =vector operator= ( ∂ ,
∇ ∂
, ∂
)
∂x ∂y ∂z


∇S = gradS = vector
⃗ ·A
∇ ⃗ = divA⃗ = scalar
⃗ ×A
∇ ⃗ = curlA⃗ = vector
Summary of Vector Calculus II
⃗ ·∇
⃗ = scalar operator = ∂2 ∂2 ∂2
Laplacian: ∇ = ∇
2
∂x2
+ ∂y 2
+ ∂z 2

∇2 S ⃗ · (∇S)
= ∇ ⃗ = scalar

∇2 A ⃗ · ∇)
= (∇ ⃗ A
⃗ = vector
⃗ ∇
∇( ⃗ · A)
⃗ ̸= ∇2 A⃗ = vector

⃗ × (∇S)
∇ ⃗ = 0 curl(gradS) = 0
⃗ · (∇
∇ ⃗ × A)
⃗ = 0 ⃗ =0
div(curl A)
⃗ × (∇
∇ ⃗ × A)
⃗ ⃗ ∇
= ∇( ⃗ · A)
⃗ − ∇2 A ⃗


∇(ST ) = ⃗ + T ∇S
S ∇T ⃗
⃗ · (S A)
∇ ⃗ = S(∇⃗ · A)
⃗ +A ⃗ · ∇S

⃗ × (S A)
∇ ⃗ = S(∇⃗ × A)
⃗ −A ⃗ × ∇S⃗
⃗ · (A
∇ ⃗ × B)
⃗ = ⃗ · (∇
B ⃗ × A)
⃗ −A ⃗ · (∇
⃗ × B)

Integral Theorems I
Gradient Theorem: Let γ be a curve running from ⃗ x0 to ⃗ x1 , d⃗l is the
x) is a scalar field then:
directed element of length along γ , and φ(⃗

!1
x
⃗ !1
x

⃗ · d⃗l =
∇φ dφ = φ(⃗
x1 ) − φ(⃗
x0 )
x0
⃗ x0

It follows that
"
⃗ · d⃗l = 0
∇φ

Divergence Theorem (Gauss’ Theorem): Let V be a 3D volume


⃗ x) be a vector field, then:
bounded by a 2D surface S , and let A(⃗
! !
⃗ ⃗
∇·Ad ⃗ 3
x= SA ⃗ · d2 S

V

Curl Theorem (Stokes’ Theorem): Let S be a 2D surface bounded by a


⃗ x) be a vector field, then:
1D curve γ , and let A(⃗
! "
⃗ ⃗ 2⃗
(∇ × A) d S = γ A ⃗ · d⃗l
S
Integral Theorems II
NOTE: Since a conservative force F ⃗ can always be written as the gradient of
a scalar field φ, we have from the gradient theorem that
"
⃗ · d⃗l = 0
F
From the curl theorem we immediately see that

⃗ ×F
∇ ⃗ =0

We immediately infer that a conservative force is curl free, and that the
⃗ · d⃗
amount of work done (dW = F r ) is independent of the path taken.

From the divergence theorem we infer that


! ! !
⃗ ·A
φ∇ ⃗ d3 ⃗
x= ⃗ · d2 S
φA ⃗− ⃗ · ∇φ
A ⃗ d3 x

V S V

which is the three-dimensional analog of integration by parts


! dv
! !
u dx dx = d(uv) − v du
dx
dx
Curvi-Linear Coordinate Systems I
In addition to the Cartesian coordinate system (x, y, z), we will often work
with cylindrical (R, φ, z) or spherical (r, θ, φ) coordinate systems
Let (q1 , q2 , q3 ) denote the coordinates of a point in an arbitrary coordinate
system, defined by the metric tensor hij . The distance between (q1 , q2 , q3 )
and (q1 + dq1 , q2 + dq2 , q3 + dq3 ) is

ds2 = hij dqi dqj (summation convention)

We will only consider orthogonal systems for which hij = 0 if i ̸= j , so


that ds2 = h2
i dq 2
i with
∂⃗
x
hi ≡ hii = | ∂q i
|
The differential vector is
∂⃗
x ∂⃗
x ∂⃗
x
d⃗
x= ∂q1
dq1 + ∂q2
dq2 + ∂q3
dq3
The unit directional vectors are
∂⃗
x ∂⃗
x 1 ∂⃗x
ei =
⃗ ∂qi
/| ∂qi
| = hi ∂qi
#
so that d⃗
x= hi dqi ⃗
ei and d3 ⃗
x = h1 h2 h3 dq1 dq2 dq3 .
i
Curvi-Linear Coordinate Systems II
The gradient:

⃗ =
∇ψ 1 ∂ψ

e
hi ∂qi i

The divergence:
$ %
⃗ ·A
∇ ⃗= 1 ∂
(h2 h3 A1 ) + ∂
(h3 h1 A2 ) + ∂
(h1 h2 A3 )
h1 h2 h3 ∂q1 ∂q2 ∂q3

The curl (only one component shown):


$ %
⃗ × A)
(∇ ⃗ 3= 1 ∂
(h2 A2 ) − ∂
(h1 A1 )
h1 h2 ∂q1 ∂q2

The Laplacian:
$ & ' & ' & '%
1 ∂ h2 h3 ∂ψ ∂ h3 h1 ∂ψ ∂ h1 h2 ∂ψ
∇2 ψ = h1 h2 h3 ∂q1 h1 ∂q1
+ ∂q2 h2 ∂q2
+ ∂q3 h3 ∂q3
Cylindrical Coordinates
For cylindrical coordinates (R, φ, z) we have that
x = R cos φ y = R sin φ z=z
The scale factors of the metric are:
hR = 1 hφ = R hz = 1

x = R⃗
and the position vector is ⃗ eR + z⃗
ez
⃗ = AR ⃗
Let A eR + Aφ⃗
eφ + Az ⃗
ez an arbitrary vector, then

AR = Ax cos φ − Ay sin φ
Aφ = −Ax sin φ + Ay cos φ
Az = Az

v = Ṙ⃗
Velocity: ⃗ e˙ R + ż⃗
eR + R⃗ ez = Ṙ⃗
eR + Rφ̇⃗
eφ + ż⃗
ez
Gradient & Laplacian:
⃗ ·A
⃗= 1 ∂ 1 ∂Aφ ∂Az
∇ R ∂R
(RAR ) + R ∂φ
+ ∂z
& '
1 ∂ ∂ψ 1 ∂2ψ ∂2ψ
∇ ψ=
2
R ∂R
R ∂R + R2 ∂φ2
+ ∂z 2
Spherical Coordinates
For spherical coordinates (r, θ, φ) we have that
x = r sin θ cos φ y = r sin θ sin φ z = r cos θ
The scale factors of the metric are:
hr = 1 hθ = r hφ = r sin θ

x = r⃗
and the position vector is ⃗ er
⃗ = Ar ⃗
Let A er + Aθ ⃗
eθ + Aφ⃗
eφ an arbitrary vector, then

Ar = Ax sin θ cos φ + Ay sin θ sin φ + Az cos θ


Aθ = Ax cos θ cos φ + Ay cos θ sin φ − Az sin θ
Aφ = −Ax sin φ + Ay cos φ

v = ṙ⃗
Velocity: ⃗ e˙ r = ṙ⃗
er + r ⃗ er + r θ̇⃗
eθ + r sin θ φ̇⃗

Gradient & Laplacian:
⃗ ·A
⃗= 1 ∂ 1 ∂ 1 ∂Aφ
∇ 2
r ∂r
(r 2
Ar ) + r sin θ ∂θ
(sin θAθ ) + r sin θ ∂φ
& ' & '
∂2ψ
∇ ψ=
2 1 ∂
r 2 ∂r
r 2 ∂ψ
∂r
+ 2
1 ∂
r sin θ ∂θ
sin θ ∂φ
∂θ
+ 1
r 2 sin2 θ ∂ψ 2
Introduction
COLLISIONLESS DYNAMICS: The study of the motion of large numbers of
point particles orbiting under the influence of their mutual self-gravity

EAMPLES OF COLLISIONLESS SYSTEMS

• Galaxies (ellipticals & disk galaxies) N ∼ 106 − 1011


• Globular clusters N ∼ 104 − 106
• Galaxy clusters N ∼ 102 − 103
• Cold Dark Matter haloes N ≫ 1050

MAIN GOALS

• Infer mass distribution from observed kinematics. Comparison with


light distribution ⇒ learn about dark matter and black holes
• Understand observed structure of galaxies:
1. Galaxies formed this way ⇒ learn about Galaxy Formation
2. Galaxies evolved this way ⇒ learn about Stability of galaxies
Newtonian Gravity
Newton’s First Law: A body acted on by no forces moves with
uniform velocity in a straight line
Newton’s Second Law: ⃗ = m d⃗v =
F d⃗
p
(equation of motion)
dt dt
Newton’s Third Law: ⃗ij = −F
F ⃗ji (action = reaction)
⃗ij = −G mi mj 3 (⃗
Newton’s Law of Gravity: F xj )
xi − ⃗
|⃗
xi −⃗
xj |

G = 6.67 × 10−11 N m2 kg−2 = 4.3 × 10−9 ( km s−1 )2 M⊙ −1 Mpc

Gravity is a conservative Force. This implies that:

• ∃ scalar field V (⃗ ⃗ = −∇V


x) (potential energy), so that F ⃗ (⃗
x)
• The total energy E = 1
mv 2
+ V (⃗
x) is conserved
2

• Gravity is a curl-free field: ∇


⃗ ×F
⃗ =0

Gravity is a central Force. This implies that:

• The moment about the center vanishes: ⃗ ⃗ =0


r×F

• Angular momentum J⃗ = m⃗ v is conserved: ( ddtJ = ⃗
r×⃗ ⃗ = 0)
r×F
The Gravitational Potential
Potential Energy: ⃗ (⃗
F ⃗ (⃗
x) = −∇V x)
V (⃗
x)
Gravitational Potential: Φ(⃗
x) = m
⃗ (⃗
F x)
Gravitational Field: g (⃗
⃗ x) = ⃗
= −∇Φ(⃗
x)
m

⃗ is the force per unit mass so that F


From now on F ⃗ (⃗ ⃗
x) = −∇Φ(⃗
x)

For a point mass M at x


⃗ 0: x) = − |⃗xGM
Φ(⃗ −⃗
x0 |
! ρ(⃗x′ ) 3 ′
x) :
For a density distribution ρ(⃗ Φ(⃗
x) = −G x′ −⃗
|⃗ x|
d ⃗
x

x) and gravitational potential Φ(⃗


The density distribution ρ(⃗ x) are related to
each other by the Poisson Equation

∇2 Φ = 4πGρ
For ρ = 0 this reduces to the Laplace equation: ∇2 Φ = 0.
see B&T p.31 for derivation of Poisson Equation
Gauss’s Theorem & Potential Theorem
If we integrate the Poisson Equation, we obtain
! ! !
4πG ρd x
3
⃗ = 4πGM = ∇ Φd x
2
⃗ = 3 ⃗
∇Φd 2

s
V S
!
Gauss’s Theorem: ⃗ d2 ⃗
∇Φ s = 4πGM
S

Gauss’s Theorem states that the integral of the normal component of the
g (⃗
gravitational field [⃗ ⃗ ] over any closed surface S is equal to 4πG
r) = ∇Φ
times the total mass enclosed by S .
!
cf. Electrostatics: ⃗ ·⃗
E n d 2
s=
⃗ Qint
S ϵ0

x) the total potential energy is:


For a continuous density distribution ρ(⃗

1
!
W = 2
x) Φ(⃗
ρ(⃗ x) d3 ⃗
x
(see B&T p.33 for derivation)

NOTE: Here we follow B&T and use the symbol W instead of V .


The Discrete N -body Problem
The gravitational force on particle i due to particle j is:

⃗i,j = Gmi mj
F |⃗ xj | 3
xi −⃗
(⃗
xi xj )
−⃗

(Newton’s Inverse Square Law)


Equations of Motion: F ⃗ = m d⃗v
dt
For particle i, the equations of motion are:

dvk,i #
N
mj
dt
= G (xk,i −xk,j )2
j=1,j̸=i
dxk,i
dt
= vk,i (k = 1, 3)
This corresponds to a closed set of 6N equations, for a total of 6N
unknowns (x, y, z, vx , vy , vz )

Since N is typically very, very large, we can’t make progress studying the
dynamics of these systems by solving the 6N equations of motion.
Even with the most powerful computers to date, we can only run N -body
simulations with N ∼< 106
From Discrete to Smooth
The density distribution and gravitational potential of N -body system are:

#
N
ρN (⃗
x) = ⃗ i)
x−x
mi δ(⃗
i=1

x) the Dirac delta function (B&T p.652), and


with δ(⃗

#
N
Gmi
ΦN (⃗
x) = − |⃗
x−⃗
xi |
i=1

#
N
mj
⃗i
F = G (⃗
xj ⃗ i)
−x
|⃗ xi | 3
xj −⃗
j=1,j̸=i

#
N ! (⃗ xi )
xj −⃗
= G |⃗ xi | 3
xj −⃗
mj δ(⃗
xj ⃗ )d3 x
−x ⃗
j=1,j̸=i

! (⃗ xi )
xj −⃗
= G |⃗
xj −⃗
xi | 3 ρN (⃗
x )d 3
x

We will replace ρN (⃗
x) and ΦN (⃗
x) with smooth and continuous functions
x) and Φ(⃗
ρ(⃗ x)
From Discrete to Smooth
For systems with large N , it is useful to try to use statistical descriptions of
the system (cf. Thermodynamics)
Replacing a discrete density distribution by a continuous density distribution
is familiar to us from fluid dynamics and plasma physics

However, there is one important difference:

Plasma & Fluid ⇐⇒ short range forces


Gravitational System ⇐⇒ long range forces

Plasma: electrostatic forces are long-range forces, but because of Debye


schielding the total charge → 0 at large r : short-range forces dominate.
Plasma may be collisionless.
Fluid: collisional system dominated by short-range van der Waals forces
between dipoles of molecules. Always attractive, but for large r dipoles
vanish. For very small r force becomes strongly repulsive.
For both plasma and fluid energy is an extensive variable: total energy is
sum of energies of subsystems.
For gravitational systems, energy is a non-extensive variable: sub-systems
influence each other by long-range gravitational interaction.
From Discrete to Smooth
FLUID
• mean-free path of molecules ≪ size of system
• molecules collide frequently, giving rise to a well defined collisional
pressure. This pressure balances gravity in hydrostatic equilibrium.
• Pressure related to density by equation of state. Ie, the EOS determines
the (hydrostatic) equilibrium.

GRAVITATIONAL SYSTEM
• mean-free path of particles ≫ size of system
• No collisional pressure, although kinetic energy of particles act as a
source of ‘pressure’, balancing the potential energy in virial equilibrium.
• No equivalent of equation of state. Pressure follows from kinetic
energy, but kinetic energy follows from the actual orbits within
gravitational potential, which in turn follows from the spatial distribution
of the particles (Self-Consistency Problem)
The Self-Consistency Problem
Given a density distribution ρ(⃗x), the Poisson equation yields the
gravitational potential Φ(⃗
x). In this potential I can integrate orbits using
Newton’s equations of motion. The self-consistency problem is the problem
of finding that combination of orbits that reproduces ρ(⃗ x).

Poisson Eq.
Density Potential

aw
dl
2n
?

’s
on
wt
Orbits

Ne
Think of self-consistency problem as follows: Given Φ(⃗ x), integrate all
possible orbits Oi (⃗
x), and find the orbital weights wi such that
#
x) =
ρ(⃗ wi Oi (⃗
x). Here Oi (⃗x) is the density contributed to ⃗
x by orbit i.
Timescales for Collisions
Following fluid dynamics and plasma physics, we replace our discrete
ρN (⃗x) with a smooth, continuous ρ(⃗
x). Orbits are then integrated in the
corresponding smooth potential Φ(⃗x).
In reality, the true orbits will differ from these orbits, because the true
potential is not smooth.

In addition to direct collisions (‘touching’ particles), we also have long-range


collisions, in which the long-range gravitational force of the granularity of the
potential causes small deflections.
Over time, these deflections accumulate to make the description based on
the smooth potential inadequate.

It is important to distinguish between long-range interactions, which only


cause a small deflection per interaction, and short-range interactions, which
cause a relatively large deflection per interaction.
Direct Collisions
Consider a system of size R consisting of N identical bodies of radius r
The cross section for a direct collision is σ = 4πr 2
1 3N
The mean free path of a particle is λ = nσ , with n = 4πR 3 the number

density of bodies

λ 4πR3
( R )2 1
R
= 3 N 4πr 2 R
≃ r N

It takes a crossing time tcross ∼ R/v to cross the system, so that the time
scale for direct collisions is
( R )2 1
tcoll = r
t
N cross

Example: A Milky Way like galaxy has R = 10 kpc = 3.1 × 1017 km,
v ≃ 200 km s−1 , N ≃ 1010 , and r is roughly the radius of the Sun
(r = 6.9 × 105 km). This yields λ = 2 × 1013 R. In other words, a direct
collision occurs on average only once per 2000 billion crossings! The
crossing time is tcross = R/v = 5 × 107 yr, so that tcoll ≃ 1021 yr.
This is about 1011 times the age of the Universe!!!
Relaxation Time I
Now that we have seen that direct collisions are completely negligble, let’s
focus on encounters
Consider once again a system of size R consisting of N identical particles
of mass m. Consider one such particle crossing the system with velocity v .
As we will see later, a typical value for the velocity is
* *
GM GN m
v= R
= R

We want to calculate how long it takes before the cumulative effect of many
encounters has given our particle a kinetic energy Ekin ∝ v 2 in the
direction perpendicular to its original motion of the order of its its initial
kinetic energy.
Note that for a sufficiently close encounter, this may occur in a single
encounter. We will treat this case seperately, and call such an encounter a
close encounter.
Relaxation Time II
First consider a single encounter

x v

F b
θ
m

Here b is the impact parameter, x = v t, with t = 0 at closest approach,


$ ( vt )2 %−1/2
and cos θ = √ 2b 2 = 1+ b
x +b

At any given time, the gravitational force in the direction perpendicular to the
direction of the particle is
2
$ ( vt )2 %−3/2
Gm2
F⊥ = G x2m+b2 cos θ = b2
1+ b

This force F⊥ causes an acceleration in the ⊥-direction: F⊥ = m dv⊥


dt
We now compute the total ∆v⊥ integrated over the entire encounter, where
we make the simplifying assumption that the particle moves in a straight line.
This assumption is OK as long as ∆v⊥ ≪ v
Relaxation Time III
!
∞ $ ( vt )2 %−3/2
Gm
∆v⊥ = 2 b2
1+ b dt
0
2Gm b
!

= b2 v
(1 + s2 )−3/2 ds
0
2Gm
= bv
As discussed above, this is only valid as long as ∆v⊥ ≪ v . We define the
minimum impact parameter bmin , which borders long- and short-range
interactions as: ∆v⊥ (bmin ) = v

2Gm
bmin = v2
≃ R/N

For a MW-type galaxy, with R = 10 kpc and N = 1010 we have that


bmin ≃ 3 × 107 km ≃ 50 R⊙
In a single, close encounter ∆Ekin ∼ Ekin . The time scale for such a close
encounter to occur can be obtained from the time scale for direct collisions,
by simply replacing r by bmin .
& '2
R tcross
tshort = bmin N
= N tcross
Relaxation Time IV
Now we compute the number of long-range encounters per crossing. Here
we use that (∆v⊥ )2 adds linearly with the number of encounters. (Note:
this is not the case for ∆v⊥ because of the random directions).
When the particle crosses the system once, it has n(< b) encounters with
an impact parameter less than b, where
πb2
( b )2
n(< b) = N πR2
=N R
Differentiating with respect to b yields
2N b
n(b)db = R2
db
Thus the total (∆v⊥ )2 per crossing due to encounters with impact
parameter b, b + db is
( 2Gm )2 2N b
( Gm )2 db
(∆v⊥ ) (b)db =
2
bv R2
db = 8N Rv b
Integrating over the impact parameter yields
( Gm )2 !R db
( Gm )2
(∆v⊥ ) = 8N
2
Rv b
≡ 8N Rv
lnΛ
bmin
& '
R
with lnΛ = ln bmin
= lnN the Coulomb logarithm
Relaxation Time V
We thus have that
( GN m )2 1 8lnN
(∆v⊥ ) =2
R v2 N

Substituting the characteristic value for v then yields that

(∆v⊥ )2 10lnN
v2
≃ N

Thus it takes of the order of N/(10lnN ) crossings for (∆v⊥ )2 to become


comparable to v 2 . This defines the relaxation time

N
trelax = t
10lnN cross
Summary of Time Scales
Let R be the size of the system, r the size of a particle (e.g., star), v the
typical velocity of the particles, and N the number of particles in the system.

Hubble time: The age of the Universe. tH ≃ 1/H0 ≃ 1010 yr



Formation time: The time it takes the system to form. tform = M
≃ tH
Crossing time: The typical time needed to cross the system. tcross = R/v
Collision time: The typical time between two direct collisions.
( R )2 tcross
tcoll = r N
Relaxation time: The time over which the change in kinetic energy due to the
long-range collisions has accumulated to a value that is
comparable to the intrinsic kinetic energy of the particle.
N
trelax = t
10lnN cross
Interaction time: The typical time between two short-range interactions that
cause a change in kinetic energy comparable to the
intrinsic kinetic energy of the particle.
tshort = N tcross
For Trully Collisionless systems:
tcross ≪ tH ≃ tform ≪ trelax ≪ tshort ≪ tcoll
Some other useful Time Scales
*
GM 3M
NOTE: Using that v = R
and ρ̄ = 4πR 3 we can write

*
3
tcross = 4πGρ̄

Dynamical time: the time required to travel halfway across the system.
*
3π π
tdyn = 16Gρ
= t
2 cross

Free-fall time: the time it takes a sphere with zero pressure to collapse to
a point.
* √

tff = 32Gρ
= tdyn / 2
Orbital time: the time it takes to complete a (circular) orbit.
*

torb = Gρ
= 2πtcross

NOTE: All these timescales are the same as the crossing time, except for
some pre-factors

tcross ∼
< t
ff ∼ tdyn ∼ torb
< <
Example of Time Scales
System Mass Radius Velocity N tcross trelax
M⊙ kpc km s−1 yr yr
Galaxy 1010 10 100 1010 108 > 1015
DM Halo 1012 200 200 > 1050 109 > 1060
Cluster 1014 1000 1000 103 109 ∼ 1010
Globular 104 0.01 2 104 5 × 106 5 × 108
• Dark Matter Haloes and Galaxies are collisionless
• Collisions may or may not be important in clusters of galaxies
• Relaxation is expected to have occured in (some) globular clusters
*
GM
NOTE: For a self-gravitating system, the typical velocities are v ≃ R
* *
R R3 3
For the crossing time this implies: tcross = v = GM
= 4πGρ

Useful to remember: 1 km/ s ≃ 1 kpc/ Gyr


1 yr ≃ π × 107 s
1 M⊙ ≃ 2 × 1030 kg
1 pc ≃ 3.1 × 1013 km
The Distribution Function I
We have seen that the dynamics of our discrete system of N point masses is
given by 6N equations of motion, which allow us to compute 6N unknowns
(⃗ v ) as function of time t.
x, ⃗
The system is completely specified by 6N initial conditions (⃗ v0 )
x0 , ⃗
We can specify these initial conditions by defining the distribution function
(DF), also called the phase-space density
#N #N
f (⃗ v , t0 ) =
x, ⃗ i=1 δ(⃗ ⃗ i,0 )
x−x i=1 vi,0 )
v −⃗
δ(⃗

Once f (⃗ v , t) is specified at any time t, we can infer f (⃗


x, ⃗ v , t′ ) at any
x, ⃗
other time t′

The DF f (⃗
x, ⃗
v , t) completely specifies a collisionless system
In the case of our smooth density distribution we define the 6 dimensional
phase-space density :

f (⃗ xd3⃗
v , t)d3 ⃗
x, ⃗ v

NOTE: A necessary, physical condition is that f ≥ 0


The Distribution Function II
x) follows from f (⃗
The density ρ(⃗ v ) by integrating over velocity space:
x, ⃗
!!!
x, t) =
ρ(⃗ f (⃗ v , t)d3⃗
x, ⃗ v
while the total mass follows from
!!! ! !
M (t) = ρ(⃗ x=
x, t)d ⃗ 3
d x
⃗ 3
d3⃗
v f (⃗
x, ⃗
v , t)

It is useful to think about the DF as a probability function (once normalized


by M ), which expresses the probability of finding a star in a phase-space
volume d3 x ⃗ d3 ⃗
v . This means we can compute the expectation value for any
quantity Q as follows:

1
!
x, t)⟩ =
⟨Q(⃗ ρ(⃗x)
d3⃗v Q(⃗ v ) f (⃗
x, ⃗ x, ⃗
v , t)
1
! ! 3
⟨Q(t)⟩ = M
d ⃗
3
x d ⃗v Q(⃗ v ) f (⃗
x, ⃗ x, ⃗v , t)

EXAMPLES:
1
! !
RMS velocity: ⟨vi2 ⟩ = M
d ⃗
3
x d3⃗
v vi2 f (⃗
x, ⃗
v , t)
!!!
Velocity Profile: L(x, y, vz ) = f (⃗ v , t) dz dvx dvy
x, ⃗
!!!!
Surface Brightness: Σ(x, y) = f (⃗
x, ⃗v , t) dz dvx dvy dvz
The Distribution Function III
Each particle (star) follows a trajectory in the 6D phase-space (⃗ v ), which
x, ⃗
is completely governed by Newtonian Dynamics (for a collisionless system).
This trajectory projected in the 3D space ⃗
x is called the orbit of the particle.
As we will see later, the Lagrangian time-derivative of the DF, i.e. the
time-derivative of f (⃗x, ⃗
v , t) as seen when travelling through phase-space
along the particle’s trajectory, is
df
dt
=0
This simple equation is the single most important equation for collisionless
dynamics. It completely specifies the evolution of a collisionless system, and
is called the Collisionless Boltzmann Equation (C.B.E.) or Vlasov equation.

The flow in phase-space is incompressible.

∂f
NOTE: Don’t confuse this with ∂t = 0!!!!
This is the Eulerian time-derivative as seen from a fixed phase-space
location, which is only equal to zero for a system in steady-state equilibrium.
Collisionless Dynamics in a Nutshell
!
x)
ρ(⃗ = f (⃗ v ) d3 ⃗
x, ⃗ v
∇2 Φ(⃗
x) = 4πGρ(⃗ x)
df
dt
= 0

x) is
The self-consistency problem of finding the orbits that reproduce ρ(⃗
equivalent to finding the DF f (⃗ v ) which yields ρ(⃗
x, ⃗ x).

Problem: For most systems we only have constraints on a 3D projection of


the 6D distribution function.
!!!
Recall: L(x, y, vz ) = f (⃗ v , t) dz dvx dvy
x, ⃗
Circular & Escape Velocities I
Consider a spherical density distribution ρ(r) for which the Poisson
Equation reads
1 ∂
( )
r 2 ∂r
r 2 ∂Φ
∂r
= 4πGρ(r)
from which we obtain that
!r
r 2 ∂Φ
∂r
= 4πG 0
ρ(r) r 2 dr = GM (r)
with M (r) the enclosed mass. This allows us to write
⃗grav (⃗ ⃗ GM (r)
F r ) = − dΦ
r) = −∇Φ(⃗ dr

er = − r2

er
Because gravity is a central, conservative force, both the energy and angular
momentum are conserved, and a particle’s orbit is confined to a plane.
Introducing the polar coordinates (r, θ) we write

E = 12 (ṙ 2 + r 2 θ̇ 2 ) + Φ(r)
J = r 2 θ̇

Eliminating θ̇ we obtain the Radial Energy Equation:

1 2 J2
2
ṙ + 2r 2
+ Φ(r) = E
Circular & Escape Velocities II
In the co-rotating frame, the equation of motion reduces to a
one-dimensional radial motion under influence of the effective potential
J2
U (r) = 2r 2
+ Φ(r). The ‘extra’ term arises due to the non-inertial nature
of the reference frame, and corresponds to the centrifugal force
& '
⃗cen = d J2 J2 vθ2
F − dr 2r 2
er =
⃗ ⃗
e
r3 r
= r r

e

⃗cen = −F
For a circular orbit we have that F ⃗grav , so that we obtain the
circular speed.
* *
GM (r)
vc (r) = r dΦ
dr
= r

Thus, rvc2 (r) measures the mass enclosed within radius r (in spherical
symmetry). Note that for a point mass vc (r) ∝ r −1/2 , which is called a
Keplerian rotation curve
Escape Speed: The speed a particle needs in order to ‘escape’ to infinity
+
vesc (r) = 2|Φ(r)|

Recall: The energy per unit mass is E = 12


v 2 + Φ(r). In order to escape to
infinity we need E ≥ 0, which translates into v 2 ≥ 2|Φ(r)|
Projected Surface Density
Consider a spherical system with intrinsic, 3D luminosity distribution ν(r).
An observer, at large distance, observes the projected, 2D surface brightness
distribution Σ(R)

z To observer
R r

z = distance along line−of−sight


R = projected radius from center

!
∞ !

Σ(R) = 2 ν(r)dz = 2 ν(r) √ r2dr
0 R r −R2

This is a so-called Abel Integral, for which the inverse is:


!

ν(r) = − π1 dΣ
dR
√ dR
r
2R −r 2

Thus, an observed surface brightness distribution Σ(R) of a spherical


system can be deprojected to obtain the 3D light distribution ν(r). However,
because it requires the determination of a derivative, it can be fairly noisy.
Spherical Potential-Density Pairs
To compute the potential of a spherical density distribution ρ(r) we can
make use of Newton’s Theorems

First Theorem A body inside a spherical shell of matter experiences


no net gravitational force from that shell.

Second Theorem The gravitational force on a body that lies outside a


closed spherical shell of mass M is the same as that
of a point mass M at the center of the shell.

Based on these two Theorems, we can compute Φ(r) by splitting ρ(r) in


spherical shells, and adding the potentials of all these shells:
, -
1
!r !

Φ(r) = −4πG r
ρ(r ′ ) r ′2 dr ′ + ρ(r ′ ) r ′ dr ′
0 r
!r
Using the definition of the enclosed mass M (r) = 4π 0 ρ(r ′ ) r ′2 dr ′
this can be rewritten as

!

Φ(r) = − GMr (r) − 4πG ρ(r ′ ) r ′ dr ′
r
Power-law Density Profiles I
Consider a spherical system with a simple power-law density distribution
& '−α
r
ρ(r) = ρ0 r0

!

Σ(R) = 2 ρ(r) √ r2dr = ρ0 r0α B( α
2
− 1 1
,
2 2
) R 1−α
R r −R2

!r α
4πρ0 r0
M (< r) = 4π ρ(r ) r dr =
′ ′2 ′
3−α
r 3−α
(α < 3)
0
!
∞ α
4πρ0 r0
M (> r) = 4π ρ(r ) r dr =
′ ′2 ′
α−3
r 3−α
(α > 3)
r

NOTE: For α ≥ 3 the enclosed mass is infinite, while for α ≤ 3 the total
mass (r → ∞) is infinite: A pure power-law system can not exist in nature!

A more realistic density distribution consists of a double power-law:


At small radii: ρ ∝ r −α with α < 3
At large radii: ρ ∝ r −β with β > 3
B(x, y) is the so-called Beta-Function, which is related to the Gamma Function Γ(x)
Γ(x) Γ(y)
B(x, y) = Γ(x+y)
= B(y, x)
Power-law Density Profiles II
The potential of a power-law density distribution is:
. α
4πGρ0 r0
r 2−α
if 2 < α < 3
Φ(r) = (α−3)(α−2)
∞ otherwise

The circular and escape velocities of a power-law density distribution are:


α
G M (r) 4πGρ0 r0
vc2 (r) = r dΦ
dr
= r
= 3−α
r 2−α

2
2
vesc (r) = α−2
v 2
c (r)

α = 2: Singular Isothermal Sphere vc = constant (flat rotation curve)


α = 0: Homogeneous Sphere vc ∝ r (solid body rotation)

NOTE: For α > 3 you find that vc (r) falls off more rapidly than Keplerian.
How can this be? After all, a Keplerian RC corresponds to a delta-function
density distribution (point mass), which is the most concentrated mass
distribution possible....
Power-law Density Profiles II
The potential of a power-law density distribution is:
. α
4πGρ0 r0
r 2−α
if 2 < α < 3
Φ(r) = (α−3)(α−2)
∞ otherwise

The circular and escape velocities of a power-law density distribution are:


α
G M (r) 4πGρ0 r0
vc2 (r) = r dΦ
dr
= r
= 3−α
r 2−α

2
2
vesc (r) = α−2
v 2
c (r)

α = 2: Singular Isothermal Sphere vc = constant (flat rotation curve)


α = 0: Homogeneous Sphere vc ∝ r (solid body rotation)

NOTE: For α > 3 you find that vc (r) falls off more rapidly than Keplerian.
How can this be? After all, a Keplerian RC corresponds to a delta-function
density distribution (point mass), which is the most concentrated mass
distribution possible....

answer: the circular velocity is defined via the gradient of the potential. As shown above,
Φ is only defined for 2 < α < 3, and therefore so does vc
Power-law Density Profiles: Summary
It is very useful to remember the following scaling relations:

ρ(r) ∝ r −α
Σ(R) ∝ R1−α
Φ(r) ∝ r 2−α (2 < α < 3)
vc2 (r) ∝ r 2−α (2 < α < 3)
M (< r) ∝ r 3−α (α < 3)
M (> r) ∝ r 3−α (α > 3)
Double Power-law Density Profiles
As we have seen, no realistic system can have a density distribution that is
described by a single power-law. However, many often used density
distributions have a double power-law.
C
ρ(r) = r γ (1+r 1/α )(β−γ)α

At small radii, ρ ∝ r −γ , while at large radii ρ ∝ r −β . The parameter α


determines the ‘sharpness’ of the break.
NOTE: In order for the mass to be finite, γ < 3 and β > 3

(α, β, γ) Name Reference


(1, 3, 1) NFW Profile Navarro, Frenk & White, 1997, ApJ, 490, 493
(1, 4, 1) Hernquist Profile Hernquist, 1990, ApJ, 356, 359
(1, 4, 2) Jaffe Profile Jaffe, 1983, MNRAS, 202, 995
(1, 4, 32 ) Moore Profile Moore et al., 1999, MNRAS, 310, 1147
( 12 , 2, 0) Modified Isothermal Sphere Sacket & Sparke, 1990, ApJ, 361, 409
( 12 , 3, 0) Modified Hubble Profile Binney & Tremaine, p. 39
( 12 , 4, 0) Perfect Sphere de Zeeuw, 1985, MNRAS, 216, 273
( 12 , 5, 0) Plummer Model Plummer, 1911, MNRAS, 71, 460
Ellipsoids I
Thus far we have only considered spherical systems. However, only very few
systems in nature are trully spherical. A more general, though still not fully
general, form to consider is the ellipsoid.
Without loosing generality, we will use the following definition of the
ellipsoidal radius

#
3
x2
m =
2
a21 i
a2
a1 ≥ a 2 ≥ a 3
i
i=1

Note that we have taken the three principal axes to be aligned with our
Cartesian coordinate system (x, y, z). If a1 > a2 > a3 then the ellipsoid
is said to be triaxial
A body whose isodensity surfaces are concentric ellipsoids is called an
ellipsoidal body.
1−(a2 /a1 )2
Triaxiality Parameter: T ≡ 1−(a3 /a1 )2

A spheroid is an axisymmetric ellipsoid with two equal principal axes:

• Oblate Spheroid: a1 = a2 > a3 (T = 0) (i.e. Earth)


• Prolate Spheroid: a1 > a2 = a3 (T = 1) (i.e. Cigar)
Ellipsoids II Sphere

Oblate Spheroid
d
oi
er
a3 /a1

ph
eS
at
ol
Pr
T=1 T=2/3 T=1/3 T=0

Needle Circular Disk


Elliptic Disk
a 2 /a1

For an oblate spheroid with axis ratio q = a3 /a1 , we define:


Ellipticity: ε=1−q
+
Eccentricity: e = 1 − q2
Ellipsoids III
A shell of similar, concentric ellipsoids is called a homoeoid. Note that the
perpendicular distance d between the two ellipsoids is a function of the
angular position.
In what follows we consider the family of ellipsoidal bodies whose density
distribution is the sum of thin homoeoids.

Homoeoid Theorem: The exterior isopotential surface of a homoeoidal


shell of negligble thickness are the spheroids that are confocal with the shell
itself. Insider the shell the potential is constant.
This implies that:

• The equipotentials of a homoeoid become spherical at large radii.


• The equipotential of a thin homoeoid has the same shape as the
homoeoid at the location of the homoeoid.

NOTE: the Homoeoid Theorem applies only to thin homoeoids. However, for
any homoeoid of any thickness we have:

Newton’s Third Theorem: A mass that is inside a homoeoid experiences


no net gravitational force from the homoeoid. Φinside = constant
Ellipsoids IV
Consider a spheroidal density distribution ρ(R, z) = ρ(m2 ) with
m2 = R2 + z 2 /(1 − e2 ), then the potential is:
√ , -
1−e2 a0 e
!

ψ(m) dτ
Φ(R, z) = −2πG e
ψ(∞)arcsine − 2

0 (τ +a2
0) τ +b2
0

Here
m2 R2 z2
a2
= τ +a2
+ τ +b2
0 0 0

with a0 any constant and b0 = 1 − e2 a0 , and
! m2
ψ(m) ≡ 0 ρ(m2 ) dm2

The corresponding circular velocity in the equatorial plane z = 0 is

∂Φ
√ !R ρ(m2 ) m2 dm
vc2 (R) = R ∂R = 4πG 1− e2 √ 2 2 2
0 R −m e
Ellipsoids IV
Consider a spheroidal density distribution ρ(R, z) = ρ(m2 ) with
m2 = R2 + z 2 /q 2 , then the potential is:

!

ψ(m) dτ
Φ(R, z) = −2πGq arcsine
e
ψ(∞) + πGq √
0 (τ +1) τ +q 2

+
Here e = 1 − q 2 is the eccentricity,
R2 z2
m =
2
τ +1
+ τ +q 2

and
! m2
ψ(m) ≡ 0
ρ(m′2 ) dm′2

The corresponding circular velocity in the equatorial plane z = 0 is

∂Φ
!R ρ(m2 ) m2 dm
vc2 (R) = R ∂R = 4πGq √ 2 2 2
0 R −m e
Ellipsoids V
• In general one finds that vc (R) increases with larger flattening q :
Flatter systems with the same spheroidal, enclosed mass have larger
circular speeds at given R.
• Let ερ = 1 − q the ellipticity of the density distribution. One always
has that εΦ ≤ ερ . At a few characteristic radii, a reasonable rule of
thumb is that εΦ ∼ 1 ε
3 ρ

We can generalize the equations on the previous page for a triaxial,


x) = ρ(m2 ) with
ellipsoidal density distribution ρ(⃗
#
3
x2
m = 2
a21 i
a2
i
i=1
The corresponding potential is
& '∞
!
a2 a3 [ψ(∞)−ψ(m)] dτ
Φ(⃗
x) = −πG a1

0 (τ +a2 2 2
1 )(τ +a2 )(τ +a3 )

with
m2 #
3
x2
a2
= i
τ +a2
1 i
i=1
Multipole Expansion I
In order to calculate the potential of an arbitrary density distribution, it is
useful to consider a Multipole expansion.
Using separation of variables, Φ(r, θ, φ) = R(r) P (θ) Q(φ), one can
write

#
∞ #
l
Ylm (θ,φ)
Φ(r, θ, φ) = −4πG 2l+1
, l=0 m=−l -
1
!
r !

dr ′

r (l+1) ρ lm (r ′
)r ′(l+2)
dr ′
+ r l
ρlm (r ′
) r ′(l−1)
0 r

Here

!π !

ρlm (r) = sin θdθ Ŷlm (θ, φ)ρ(r, θ, φ)
0 0

and
*
2l+1 (l−|m|)! |m|
Ylm (θ, φ) = 4π (l+|m|)!
Pl (cos θ)eimφ

with Pl (x) the associated Legendre functions, and Ŷlm (θ, φ) the complex
conjugate of Ylm (θ, φ)
Multipole Expansion II
Monopole l = 0 1 term
Dipole l = 1 3 terms
Quadrupole l = 2 5 terms
Octopole l = 3 7 terms
Hexadecapole l = 4 9 terms
The monopole term describes the potential of a spherical system with
√ √
ρ(r, θ, φ) = ρ(r). Since Y00 (θ, φ) = 1/ 4π and ρ00 = 4πρ(r), the
(l = m = 0)-term of the multipole expansion is simply the equation for the
potential of a spherical system:
, -
1
!r !

Φ(r) = −4πG r
ρ(r ′ ) r ′2 dr ′ + ρ(r ′ ) r ′ dr ′
0 r

In electrostatics you have both positive and negative charges. Consequently,


the monopole term of the electrostatic potential often vanishes at large radii,
while the dipole terms comes to dominate.
In gravity we have only positive charges (mass). Consequently, the monopole
term always dominates at large radii, while the dipole term vanishes. The
quadrupole term depends on the flattening of the density distribution.
Potentials of Disks
Since many galaxies have a dominant, thin disk component, it is useful to
consider the potentials of infinitessimally thin disks.
There are three methods to compute the potential of an infinitesimally thin
disk:

• Use the formalism for ellipsoids, and apply the limit q → 0.


Cumbersome! involving complicated double integrals...this method is
seldomly used.
• Use the general definition of the potential, which results in an
expression in terms of Elliptic Integrals.
• Use the Laplace equation subject to appropriate boundary conditions
on the disk and at infinity.
Disk Potentials via Elliptic Integrals
The potential of a thin disk with surface density Σ(R) can be written as

! x′ ) 3 ′
ρ(⃗ !
∞ !

dφ′
Φ(⃗
x) = −G |⃗
x−⃗x′ |
d ⃗
x = −G Σ(R ) R dR
′ ′ ′
|⃗
x−⃗x′ |
0 0

Expressing |⃗ x′ | in (R, φ = 0, z) and (R′ , φ′ , z ′ = 0) yields


x−⃗

2G
!
∞ √
Φ(R, z) = −√R
K(k) k Σ(R ) ′
R′ dR′
0

with k2 ≡ 4 R R′ /[(R + R′ )2 + z 2 ]. The corresponding circular velocity


can be obtained from

∂φ G
!
∞ √
R ∂R (R, z) = √
R
dR k Σ(R ) R′ ×
′ ′

$ 0 & 2 '& ′ 2
' %
K(k) − 14 1−k
k
2
R
R
− R
R′
+ z
RR′
E(k)

with K(k) and E(k) so called complete elliptic integrals. In principle the
evaluation at z = 0 is complicated (contains integrable singularity); in
practice it often suffices to approximate the above at small z
Disk Potentials via Bessel Functions
The potential of a thin disk with surface density Σ(R) can be written as

!

Φ(R, z) = S(k) J0 (kR) e−k|z| dk
0

with
!

S(k) = −2πG J0 (kR) Σ(R) R dR
0

Here J0 (x) is the cylindrical Bessel function of order zero.


The corresponding circular velocity is given by

( ∂Φ ) !

vc2 (R) = R ∂R z=0
= −R S(k) J1 (kR) k dk
0

This method is simple, and most of the time well behaved. For an exponential
disk with Σ(R) = Σ0 e−R/Rd one finds

vc2 (R) = 4πGΣ0 Rd y 2 [I0 (y)K0 (y) − I1 (y)K1 (y)]


R
with y = 2R and In (x) and Kn (x) modified Bessel functions of the first
d
and second kinds
Orbits in Central Force Fields I
Consider the central force field F (r) associated with a spherical density
distribution ρ(r).
As we have seen before, the orbits are planar, so that we consider the polar
coordinates (r, θ)
2
The equations of motion are: d
dt

r
2 = F (r)⃗
er

Solving these requires a careful treatment of the unit vectors in polar


coordinates:


er = − cos θ⃗
ex + sin θ⃗
ey

eθ = − sin θ⃗
ex + cos θ⃗
ey
d⃗
r d
dt
= dt
(r cos θ⃗ex + r sin θ⃗ey )
= ex − r θ̇ sin θ⃗
ṙ cos θ⃗ ex + ṙ sin θ⃗
ey + r θ̇ cos θ⃗
ey
= er + r θ̇⃗
ṙ⃗ eθ
and similarly one obtains that
d2 ⃗
r
dt 2 = (r̈ − r θ̇ 2 )⃗
er + (2ṙ θ̇ + r θ̈)⃗

Orbits in Central Force Fields II
We thus obtain the following set of equations of motions:

r̈ − r θ̇ 2 = F (r) = − dΦ
dr
2ṙθ̇ + r θ̈ = 0

Multiplying the second of these equations with r yields, after integration, that
d
dt
(r 2
θ̇) = 0. This simply expresses the conservation of the orbit’s angular
momentum L = r 2 θ̇ , i.e., the equations of motion can be written as

r̈ − r θ̇ 2 = − dΦ
dr
r 2 θ̇ = L = constant

In general these equations have to be solved numerically. Despite the very


simple, highly symmetric system, the equations of motion don’t provide
much insight. As we’ll see later, more direct insight is obtained by focussing
on the conserved quantities. Note also that the equations of motion are
different in different coordinate systems: in Cartesian coordinates (x, y):

ẍ = Fx = − ∂Φ
∂x
∂Φ
ÿ = Fy = − ∂y
Orbits in Central Force Fields III
As shown before, one can use the second equation of motion (in polar
coordinates) to eliminate θ̇ in the first, which yields the radial energy
equation

1 2 J2
2
ṙ + 2r 2
+ Φ(r) = E
which can be rewritten as
*
dr J2
dt
= ± 2[E − Φ(r)] − 2r 2

where the ± sign is required because r can both increase and decrease.
Solving for the turn-around points, where dr/dt = 0, yields

1 2[E−Φ(r)]
r2
= −J 2

which has two solutions: the apocenter r+ and the pericenter r− ≤ r+ .


These radii reflect the maximum and minimum radial extent of the orbit.
It is customary to define the orbital eccentricity as
r+ − r−
e= r+ + r−

where e = 0 and e = 1 correspond to circular and radial orbits, resp.


The Lagrangian
The equations of motion as given by Newton’s second law depend on the
choice of coordinate system
Their derivation involves painful vector calculus when curvi-linear
coordinates are involved
In the Lagrangian formulation of dynamics, the equations of motion are valid
for any set of so-called generalized coordinates (q1 , q2 , .., qn ), with n the
number of degrees of freedom
Generalized coordinates are any set of coordinates that are used to describe
the motion of a physical system, and for which the position of every particle
in the system is a function of these coordinates and perhaps also time:
r=⃗
⃗ r(qi , t). If ⃗
r=⃗
r(qi ) the system is said to be natural.
Define the Lagrangian function: L = T − V
with T and V the kinetic and potential energy, respectively.
In Cartesian coordinates, and setting the mass m = 1, we have
1
L= 2
(ẋ2
+ ẏ 2 + ż 2 ) − Φ(x, y, z)
In Generalized coordinates we have that L = L(qi , q̇i ).
Actions and Hamilton’s Principle
Define the action integral (also just called the action)
! t1
I = t0
L dt

which is the integral of the Lagrangian along a particle’s trajectory as it


moves from time t0 to t1 .

Hamilton’s Principle, also called Principle of least action: The equations of


motion are such that the action integral is stationary (i.e., δI = 0) under
arbitrary variations δqi which vanish at the limits of integration t0 to t1 .

Note that these stationary points are not necessarily minima. They may also
be maxima or sadle points.

In order to derive these equations of motion, we first familiarize ourselves


with the calculus of variations
Calculus of Variations I
We are interested in finding the stationary values of an integral of the form
!1
x
I= f (y, ẏ)dx
x0

where f (y, ẏ) is a specified function of y = y(x) and ẏ = dy/dx.

y x(1)

δy y(x)

x(0)

x
Consider a small variation δy(x), which vanishes at the endpoints of the
integration interval: δy(x0 ) = δy(x1 ) = 0
Calculus of Variations II
Using that
∂f ∂f
δf = ∂y
δy + ∂ ẏ
δ ẏ
d
with δ ẏ = dx δy(x), the stationary values obey
!1 $
x
∂f ∂f d
%
δI = ∂y
δy + ∂ ẏ dx
δy dx = 0
x0

Using integration by parts, and δy(x0 ) = δy(x1 ) = 0, this reduces to


!1 $
x
∂f d
&
∂f
'%
δI = ∂y
+ dx ∂ ẏ
δy dx = 0
x0
which yields the so-called Euler-Lagrange equations
& '
∂f d ∂f
∂y
− dx ∂ ẏ
=0

These are second-order differential equations for y(x), whose solutions


contain two arbitrary constants that may be determined form the known
values of y at x0 and x1 .
The Lagrangian Formulation I
Application of the Euler-Lagrange equations to the Lagrangian L(qi , q̇i )
yields
& '
∂L d ∂L
∂qi
− dt ∂ q̇i
=0

which are the Lagrange equations (one for each degree of freedom), which
represent the equations of motion according to Hamilton’s principle. Note
that they apply to any set of generalized coordinates
In addition to the generalized coordinates we also define the generalized
momenta pi (also called conjugate momenta) and the generalized forces Fi :
∂L ∂L
pi ≡ ∂ q̇i
Fi ≡ ∂qi

With these definitions the Lagrange equations reduce to


∂L
ṗi = ∂qi
= Fi

NOTE: in general pi and Fi are not components of the momentum vector p



⃗ !!! Whenever qi is an angle, the conjugate momentum
or the force vector F
pi is an angular momentum.
The Lagrangian Formulation II
As an example, let’s consider once again motion in a central force field. Our
generalized coordinates are the polar coordinates (r, θ), and the Lagrangian
is
1 2
L= 2
ṙ + 12 r 2 θ̇ 2 − Φ(r)

The Lagrange equations are

∂L d
( ∂L ) ∂Φ d
∂r
− dt ∂ ṙ
=0 ⇒ r θ̇ 2 − ∂r
− dt
(ṙ) =0 ⇒ r̈ − r θ̇ 2 = − ∂Φ
∂r

& '
∂L d ∂L d
∂θ
− dt ∂ θ̇
=0 ⇒ − dt (r 2 θ̇) = 0 ⇒ r 2 θ̇ = L = cst

Note that the Lagrangian formulation allows you to write down the equations
of motion much faster than using Newton’s second law!
The Hamiltonian Formulation I
The Hamiltonian H(qi , pi ) is related to the Lagrangian L(qi , q̇i ) via a
Legendre Transformation
In general, a Legendre Transformation is a transformation of a function
∂f ∂g
f (x, y) to g(u, y), where u = ∂x
and ∂u = x

g(u, y) = f − u x
NOTE: You might be familiar with Legendre Transformations from
Thermodynamics where they are used to compute different thermodynamic
potentials from the internal energy U = U (S, V ), such as
enthalpy: H = H(S, p) = U + p V
Helmholtz free energy: F = F (T, V ) = U − T S

Using a similar Legendre transformation we write the Hamiltonian as

#
n
H(⃗ ⃗, t) =
q, p pi q̇i (⃗ ⃗) − L(⃗
q, p ⃗˙(⃗
q, q ⃗), t)
q, p
i=1

To compute H(⃗ ⃗, t), first compute L(⃗


q, p ⃗˙, t), next compute the
q, q
conjugate momenta pi = ∂L/∂ q̇i , compute H = p ⃗˙ − L(⃗
⃗·q ⃗˙, t) and
q, q
finally express the q̇i in terms of p
⃗ and q

The Hamiltonian Formulation II
Differentiating H with respect to the conjugate momenta yields

∂H #
n
∂ q̇i #
n
∂L ∂ q̇i
∂pj
= q˙j + pi ∂p j
− ∂ q̇i ∂pj
i=1 i=1

The second and third terms vanish since pi = ∂L/∂ q̇i , so that we obtain
that ∂H/∂pj = q̇j . Similarly we obtain that

∂H #
n
∂ q̇i ∂L #
n
∂L ∂ q̇i
∂qj
= pi ∂q j
− ∂qj
− ∂ q̇i ∂qj
i=1 i=1
Here the first and third terms cancel, and since the Lagrange equations tell
us that ∂L/∂qj = ṗj , we obtain that ∂H/∂qj = −ṗj .

This yields the Hamiltonian equations of motion


∂H ∂H
∂pi
= q̇i ∂qi
= −ṗi

Note that whereas Lagrange’s equations are a set of n second-order


differential equations, Hamilton’s equations are a set of 2n first-order
differential equations. Although they are easier to solve, deriving the
Hamiltonian itself is more involved.
The Hamiltonian Formulation III
The Hamiltonian description is especially useful for finding conserved
quantities, which will play an important role in describing orbits.
If a generalized coordinate, say qi , does not appear in the Hamiltonian, then
the corresponding conjugate momentum pi is a conserved quantity!!!

In the case of motion in a fixed potential, the Hamiltonian is equal to the total
energy, i.e., H = E
DEMONSTRATION: for a time-independent potential Φ = Φ(⃗
x) the
Lagrangian is equal to L = 1 ˙ 2 − Φ(⃗
x
⃗ x). Since p x˙ = ⃗
⃗ = ∂L/∂ ⃗ x˙ we
2
have that H = x⃗˙ · ⃗
x˙ − 12 ⃗
x˙ 2 + Φ(⃗ x˙ 2 + Φ(⃗
x) = 12 ⃗ x) = E

The 2n-dimensional phase-space of a dynamical system with n degrees of


freedom can be described by the generalized coordinates and momenta
(⃗ ⃗). Since Hamilton’s equations are first order differential equations, we
q, p
can determine q ⃗(t) and p
⃗(t) at any time t once the initial conditions
(⃗ ⃗0 ) are given. Therefore, through each point in phase-space there
q0 , p
passes a unique trajectory Γ[⃗ q (⃗q0 , p ⃗(⃗
⃗0 , t), p q0 , p
⃗0 , t)]. No two trajectories
Γ1 and Γ2 can pass through the same (⃗ ⃗0 ) unless Γ1 = Γ2 .
q0 , p
The Hamiltonian Formulation IV
As an example, let’s consider once more the motion in a central force field.
Our generalized coordinates are the polar coordinates (r, θ), and, as we
have seen before the Lagrangian is L = 1
2
ṙ 2
+ 1 2 2
2
r θ̇ − Φ(r)
The conjugate momenta are pr = ∂L
∂ ṙ
= ṙ and pθ = ∂L
= r 2
θ̇
∂ θ̇
so that the Hamiltonian becomes
2
1 2 1 pθ
H= p
2 r
+ 2 r2
+ Φ(r)
Hamilton’s equations now become

∂H p2 ∂Φ ∂H
∂r
= − r3θ + ∂r
= −ṗr ∂θ
= 0 = −ṗθ

∂H ∂H pθ
∂pr
= pr = ṙ ∂pθ
= r2
= θ̇

which reduce to

r̈ − r θ̇ 2 = − ∂Φ
∂r
pθ = r 2 θ̇ = cst

Note that θ does not appear in the Hamiltonian: consequently pθ is a


conserved quantity
Noether’s Theorem
In 1915 the German mathematician Emmy Noether proved an important
theorem which plays a trully central role in theoretical physics.

Noether’s Theorem: If an ordinary Lagrangian posseses some continuous,


smooth symmetry, then there will be a conservation
law associated with that symmetry.

• Invariance of L under time translation → energy conservation


• Invariance of L under spatial translation → momentum conservation
• Invariance of L under rotational translation → ang. mom. conservation
• Gauge Invariance of electric potential → charge conservation

Some of these symmetries are immediately evident from the Lagrangian:

• If L does not explicitely depend on t then E is conserved


• If L does not explicitely depend on qi then pi is conserved
Poisson Brackets I
DEFINITION: Let A(⃗ ⃗) and B(⃗
q, p ⃗) be two functions of the generalized
q, p
coordinates and their conjugate momenta, then the Poisson bracket of A
and B is defined by
n $
# %
∂A ∂B ∂A ∂B
[A, B] = ∂qi ∂pi
− ∂pi ∂qi
i=1

Let f = f (⃗
q, p
⃗, t) then
∂f ∂f ∂f
df = ∂qi
dqi + ∂pi
dpi + ∂t
dt
where we have used the summation convention. This differential of f ,
combined with Hamilton’s equations, allows us to write
df ∂f ∂f ∂H ∂f ∂H
dt
= ∂t
+ ∂qi ∂pi
− ∂pi ∂qi

which reduces to
df ∂f
dt
= ∂t
+ [f, H]
This is often called Poisson’s equation of motion. It shows that the
time-evolution of any dynamical variable is governed by the Hamiltonian
through the Poisson bracket of the variable with the Hamiltonian.
Poisson Brackets II
Using the Poisson brackets we can write
dH ∂H ∂H ∂L
dt
= ∂t
+ [H, H] = ∂t
= ∂t

where the latter equality follows from H = p ⃗˙ − L.


⃗·q
For an equilibrium system with a time-independent potential, ∂Φ/∂t = 0,
we have that ∂H/∂t = 0 and thus also dH/dt = 0. Since in this case the
Hamiltonian is equal to the total energy, this simply reflects the energy
conservation. Note that for any conservative system, H does not explicitely
depend on time, and thus dH/dt = 0

With the help of the Poisson brackets we can write Hamilton’s equations in a
more compact form

q̇i = [qi , H] ṗi = [pi , H]


Note that it is explicit that these equations of motion are valid in any system
of generalized coordinated (q1 , q2 , .., qn ) and their conjugate momenta
(p1 , p2 , .., pn ). As we will see next, in fact Hamilton’s equations hold for
any so-called canonical coordinate system.
Canonical Coordinate Systems
If we write wi = qi and wn+i = pi with i = 1, .., n and we define the
symplectic matrix c as
.
±1 if α = β ∓ n
cαβ ≡ [wα , wβ ] =
0 otherwise

with α, β ∈ [1, 2n], then


#
2n
∂A ∂B
[A, B] = cαβ ∂w α ∂wβ
α,β=1

DEFINITION: Any set of 2n phase-space coordinates {wα , α = 1, .., 2n}


is called canonical if [wα , wβ ] = cαβ .
Hamilton’s equations can now be written in the extremely compact form:

ẇα = [wα , H]
which makes it explicit that they hold for any canonical coordinate system.
Note that the generalized coordinates and momenta (⃗ ⃗) form a canonical
q, p
coordinate system, since they obey the canonical commutation relations

[qi , qj ] = [pi , pj ] = 0 [pi , qj ] = δij


Canonical Transformations I
Canonical Transformation: a transformation (⃗ ⃗ P
⃗) → (Q,
q, p ⃗ ) between two
canonical coordinate systems that leaves the equations of motion invariant.

In order to reveal the form of these transformations, we first demonstrate the


non-uniqueness of the Lagrangian.

Consider a transformation L → L′ = L + dF
dt
where F = F (⃗
q , t)
Under this transformation the action integral becomes
!t1 !t1 !t1 dF
I′ = L′ dt = Ldt + dt
dt = I + F (t1 ) − F (t0 )
t0 t0 t0

Recall that the equations of motion correspond to δI = 0 (i.e., the action is


stationary). Since the addition of dF /dt only adds a constant, namely
F (t1 ) − F (t0 ) to the action, it leaves the equations of motion invariant.
Canonical Transformations II
Now consider our transformation (⃗ ⃗ P
⃗) → (Q,
q, p ⃗ ) with corresponding
˙
Lagrangians L(⃗ ⃗˙, t) and L′ (Q,
q, q ⃗ Q,
⃗ t).
We start by writing the Lagrangians in terms of the corresponding
Hamiltonians:

L(⃗q, p
⃗, t) = p
⃗·q⃗˙ − H(⃗q, p
⃗, t)
⃗ P
L′ (Q, ⃗ , t) = P
⃗ ·Q ⃗˙ − H′ (Q,
⃗ P ⃗ , t)

In order for the equations of motion to be invariant, we have the requirement


that
L(⃗ ⃗ P
⃗, t) = L′ (Q,
q, p ⃗ , t) + dF
$dt %
˙
dF
⇔ dt = p ⃗·q⃗˙ − H(⃗ ⃗, t) − P
q, p ⃗ ·Q
⃗ − H′ (Q,
⃗ P⃗ , t)
⇔ dF = pi dqi − Pi dQi + (H′ − H)dt

If we take F = F (⃗ ⃗ t) then we also have that


q , Q,
∂F ∂F ∂F
dF = ∂qi
dqi + ∂Qi
dQi + ∂t
dt
Canonical Transformations III
Equating the two expressions for the differential dF yields the
transformation rules
∂F ∂F ∂F
pi = ∂qi
Pi = − ∂Qi
H′ = H + ∂t

The function F (⃗ ⃗ t) is called the generating function of the canonical


q , Q,
transformation (⃗ ⃗ P
⃗) → (Q,
q, p ⃗)

In order to transform (⃗ ⃗ P
⃗) → (Q,
q, p ⃗ ) one proceeds as follows:
• Find a function F (⃗ ⃗ so that pi = ∂F /∂qi . This yields Qi (qj , pj )
q , Q)
• Substitute Qi (qj , pj ) in Pi = ∂F /∂Qi to obtain Pi (qj , pj )

As an example consider the generating function F (⃗ ⃗ = qi Oi .


q , Q)
According to the transformation rules we have that
∂F ∂F
pi = ∂qi
= Qi Pi = − ∂Qi
= −qi
We thus have that Qi = pi and Pi = −qi : the canonical transformation
has changed the roles of coordinates and momenta, eventhough the
equations of motion have remained invariant! This shows that there is no
special status to either generalized coordinates or their conjugate momenta
Canonical Transformations IV
For reasons that will become clear later, in practice it is more useful to
consider a generating function of the form S = S(⃗ ⃗ , t), i.e., one that
q, P
depends on the old coordinates and the new momenta.
To derive the corresponding transformation rules, we start with the
generating function F = F (⃗ ⃗ t), and recall that
q , Q,
dF = pi dqi − Pi dQi + (H′ − H)dt
using that Pi dQi = d(Qi Pi ) − Qi dPi , we obtain

d(F + Qi Pi ) = pi dqi + Qi dPi + (H′ − H)dt

Defining the new generator S(⃗ ⃗ P


q , Q, ⃗ , t) ≡ F (⃗ ⃗ t) + Q
q , Q, ⃗ ·P
⃗ , for which
∂S ∂S ∂S ∂S
dS = ∂qi
dqi + ∂Qi
dQi + ∂Pi
dPi + ∂t
dt

Equating this to the above we find the transformation rules

∂S ∂S ∂S ∂S
pi = ∂qi
Qi = ∂Pi ∂Qi
=0 H′ = H + ∂t

Note that the third of these rules implies that S = S(⃗ ⃗ , t) as intended.
q, P
Canonical Transformations V
The potential strength of canonical transformations becomes apparent from
the following: Suppose one can find a canonical transformation
(⃗ ⃗ P
⃗) → (Q,
q, p ⃗ ) such that H(⃗ ⃗) → H′ (P
q, p ⃗ ), i.e., such that the new
Hamiltonian does not explicitely depend on the new coordinates Qi .
Hamilton’s equation of motion then become

∂H′ ∂H′
∂Qi
= −Ṗi = 0 ∂Pi
= −Q̇i

Thus, we have that all the conjugate momenta Pi are constant, and this in
turn implies that none of Q̇i can depend on time either. The equations of
motion in our new, canonical coordinate system are therefore extremely
simple:

Qi (t) = Ωi t + ki Pi = constant
Here Ωi = ∂H′ /∂Pi are constants and ki are integration constants. Any
generalized coordinate whose conjugate momentum is a conserved quantity,
is called a cyclic variable. The question that remains now is how to find the
generator S(q, P, t) of the canonical transformation (⃗ ⃗ P
⃗) → (Q,
q, p ⃗)
which leads to only cyclic variables Qi .
The Hamilton-Jacobi Equation I
Recall the transformation rules for the generator S(⃗ ⃗ , t):
q, P
∂S ∂S ∂S
pi = ∂qi
Qi = ∂Pi
H′ = H + ∂t

If for simplicity we consider a generator that does not explicitely depend on


time, i.e., ∂S/∂t = 0 then we have that H(⃗ ⃗) = H′ (P
q, p ⃗ ) = E . If we
now substitute ∂S/∂qi for pi in the original Hamiltonian we obtain
& '
∂S
H ∂qi
, qi =E

This is the Hamilton-Jacobi equation, which is a partial differential equation.


If it can be solved for S(⃗ ⃗ ) than, as we have seen above, basically the
q, P
entire dynamics are solved.
Thus, for a dynamical system with n degrees of freedom, one can solve the
dynamics in one of the three following ways:

• Solve n second-order differential equations (Lagrangian formalism)


• Solve 2n first-order differential equations (Hamiltonian formalism)
• Solve a single partial differential equation (Hamilton-Jacobi equation)
The Hamilton-Jacobi Equation II
Although it may seem an attractive option to try and solve the
Hamilton-Jacobi equation, solving partial differential equations is in general
much more difficult than solving ordinary differential equations, and the
Hamilton-Jacobi equation is no exception.
However, in the specific case where the generator S is separable, i.e., if
#
n
S(⃗ ⃗) =
q, P fi (qi )
i=1

with fi a set of n independent functions, then the Hamilton-Jacobi equation


splits in a set of n ordinary differential equations which are easily solved by
quadrature. The integration constants are related to the (constant) conjugate
momenta Pi .

A Hamiltonian is called ‘integrable’ if the Hamilton-Jacobi equation is separable

Integrable Hamiltonians are extremely rare. Mathematically speaking they


form a set of measure zero in the space of all Hamiltonians. In what follows,
we establish the link between so-called isolating integrals of motion and
whether or not a Hamiltonian is integrable.
Summary of the Above
d2 ⃗
r ⃗
• Newton’s second law: dt2
= −∇Φ(⃗
r)

− Complicated vector arithmetic & coordinate system dependence


& '
∂L d ∂L
• Lagrangian Formalism: ∂q i
− dt ∂ q̇i
=0

− n second-order differential equations


∂H ∂H
• Hamiltonian Formalism: ∂pi
= q̇i ∂qi
= −ṗi

− 2n first-order differential equations


& '
∂S
• Hamilton-Jacobi equation: H ∂q i
, qi = E

S(⃗ ⃗) is generator of canonical transformation (⃗


q, p q, p⃗) → (Q,⃗ P
⃗ ) for
which H(⃗ ⃗) → H′ (P
q, p ⃗ ). If S(⃗ ⃗) is separable then the Hamilton-Jacobi
q, p
equation breaks up in n ordinary differential equations which can be solved
by simple quadrature. The resulting equations of motion are:
& ′
'
∂H
Pi (t) = Pi (0) Qi (t) = ∂Pi
t + ki
Constants of Motion
Constants of Motion: any function C(⃗ q, p
⃗, t) of the generalized coordinates,
⃗(t)
conjugate momenta and time that is constant along every orbit, i.e., if q
⃗(t) are a solution to the equations of motion, then
and p
q (t1 ), p
C[⃗ ⃗(t1 ), t1 ] = C[⃗
q (t2 ), p
⃗(t2 ), t2 ]
for any t1 and t2 . The value of the constant of motion depends on the orbit,
but different orbits may have the same numerical value of C
A dynamical system with n degrees of freedom always has 2n independent
constants of motion. Let qi = qi [⃗ ⃗0 , t] and pi = pi [⃗
q0 , p q0 , p
⃗0 , t] describe
the solutions to the equations of motion. In principle, these can be inverted
to 2n relations qi,0 = qi,0 [⃗
q (t), p
⃗(t), t] and pi,0 = pi,0 [⃗ q (t), p⃗(t), t].
By their very construction, these are 2n constants of motion.
If Φ(⃗
x, t) = Φ(⃗x), one of these 2n relations can be used to eliminate t.
This leaves 2n − 1 non-trivial constants of motion, which restricts the
system to a 2n − (2n − 1) = 1-dimensional surface in phase-space,
namely the phase-space trajectory Γ(t)
Note that the elimination of time reflects the fact that the physics are invariant
to time translations t → t + t0 , i.e., the time at which we pick our initial
conditions can not hold any information regarding our dynamical system.
Integrals of Motion I
Integrals of Motion: any function I(⃗ v ) of the phase-space coordinates
x, ⃗
(⃗ v ) alone that is constant along every orbit, i.e.
x, ⃗
x(t1 ), ⃗
I[⃗ v (t1 )] = I[⃗
x(t2 ), ⃗
v (t2 )]
for any t1 and t2 . The value of the integral of motion can be the same for
different orbits. Note that an integral of motion can not depend on time.
Thus, all integrals are constants, but not all constants are integrals.

Integrals of motion come in two kinds:

Isolating Integrals of Motion: these reduce the dimensionality of the


trajectory Γ(t) by one. Therefore, a trajectory in a dynamical system with n
degrees of freedom and with i isolating integrals of motion is restricted to a
2n − i dimensional manifold in the 2n-dimensional phase-space. Isolating
integrals of motion are of great practical and theoretical importance.

Non-Isolating Integrals of Motion: these are integrals of motion that do not


reduce the dimensionality of Γ(t). They are of essentially no practical value
for the dynamics of the system.
Integrals of Motion II
A stationary, Hamiltonian system (i.e., H(⃗ ⃗, t) = H(⃗
q, p ⃗)) with n
q, p
degrees of freedom always has 2n − 1 independent integrals of motion,
which restrict the motion to the one-dimensional phase-space trajectory
Γ(t). The number of isolating integrals of motion can, depending on the
Hamiltonian, vary between 1 and 2n − 1.

DEFINITION: Two functions I1 and I2 of the canonical phase-space


coordinates (⃗ q, p⃗) are said to be in involution if their Poisson bracket
vanishes, i.e., if
∂I1 ∂I2 ∂I1 ∂I2
[I1 , I2 ] = ∂qi ∂pi
− ∂pi ∂qi
=0

A set of k integrals of motion that are in involution forms a set of k isolating


integrals of motion.

Liouville’s Theorem for Integrable Hamiltonians


A Hamiltonian system with n degrees of freedom which posseses n
integrals of motion in involution, (and thus n isolating integrals of motion) is
integrable by quadrature.
Integrable Hamiltonians I
LEMMA: If a system with n degrees of freedom has n constants of motion
Pi (⃗ ⃗, t) [or integrals of motion Pi (⃗
q, p ⃗)] that are in involution, then there
q, p
will also be a set of n functions Qi (⃗ ⃗, t) [or Qi (⃗
q, p q, p⃗)] which together
with the Pi constitute a set of canonical variables.

Thus, given n isolating integrals of motion Ii (⃗ ⃗) we can make a


q, p
canonical transformation (⃗ ⃗ P
⃗) → (Q,
q, p ⃗ ) with Pi = Ii (⃗ ⃗) =constant
q, p
and with Qi (t) = Ωi t + ki

An integrable, Hamiltonian system with n degrees of freedom always has a


set of n isolating integrals of motion in involution. Consequently, the
trajectory Γ(t) is confined to a 2n − n = n-dimensional manifold
phase-space.
The surfaces specified by (I1 , I2 , .., In ) =constant are topologically
equivalent to n-dimensional tori. These are called invariant tori, because any
orbit originating on one of them remains there indefinitely.
In an integrable, Hamiltonian system phase-space is completely filled (one
says ‘foliated’) with invariant tori.
Integrable Hamiltonians II
To summarize: if, for a system with n degrees of freedom, the
Hamilton-Jacobi equation is separable, the Hamiltonian is integrable and
there exist n isolating integrals of motion Ii in involution. In this case there
exist canonical transformations (⃗ ⃗ P
⃗) → (Q,
q, p ⃗ ) such that equations of
motion reduce to:

Pi (t) = P
& i (0) '

∂H
Qi (t) = ∂Pi
t + ki

One might think at this point, that one has to take Pi = Ii . However, this
choice is not unique. Consider an integrable Hamiltonian with n = 2
degrees of freedom and let I1 and I2 be two isolating integrals of motion in
involution. Now define Ia = 1 2
(I 1 + I 2 ) and I b = 1
2
(I1 − I2 ), then it is
straightforward to proof that [Ia , Ib ] = 0, and thus that (Ia , Ib ) is also a
set of isolating integrals of motion in involution. In fact, one can construct
infinitelly many sets of isolating integral of motion in involution. Which one
should we choose, and in particular, which one yields the most meaningful
description of the invariant tori?
Answer: The Action-Angle variables
Action-Angle Variables I
ω2

ω2

Γ(t) .
A2

A1 .

γ1
ω1 γ2

Let’s be guided by the idea of our invariant tori. The figure illustrates a
2D-torus (in 4D-phase space), with a trajectory Γ(t) on its surface. One can
specify a location on this torus by the two position angles ω1 and ω2 . The
torus itself is characterized by the areas of the two (hatched) cross sections
labelled A1 and A2 . The action variables J1 and J2 are intimitally related to
A1 and A2 , which clearly are two integrals of motion.
The action variables are defined by:
1
"
Ji = 2π γi
⃗ · d⃗
p q

with γi the closed loop that bounds cross section Ai .


Action-Angle Variables II
The angle-variables ωi follow from the canonical transformation rule
ωi = ∂S
∂Ji
q , J⃗) the generator of the canonical transformation
with S = S(⃗
(⃗ ⃗) → (⃗
q, p ω , J⃗). Since the actions Ji are isolating integrals of motion we
have that the corresponding conjugate angle coordinates wi obey
& ′'
ωi (t) = ∂H∂Ji
t + ω0

⃗) the Hamiltonian in action-angle variables (⃗


with H′ = H′ (J ω , J⃗).
We now give a detailed description of motion on invariant tori:
Orbits in integrable, Hamiltonian systems with n degrees of freedom are
characterized by n constant frequencies

∂H′
Ωi ≡ ∂Ji

This implies that the motion along each of the n degrees of freedom, qi , is
periodic in time, and this can occur in two ways:
• Libration: motion between two states of vanishing kinetic energy
• Rotation: motion for which the kinetic energy never vanishes
The Pendulum
To get insight into libration and rotation consider a pendulum, which is a
integrable Hamiltonian system with one degree of freedom, the angle q ..

The figures below shows the


corresponding phase-diagram.
l
q
p
F

r
s l=libration
l
r=rotation
−π +π q s=separatrix

• Libration: q(ω + 2π) = q(ω).


• Rotation: q(ω + 2π) = q(ω) + 2π
To go from libration to rotation, one needs to cross the separatrix
Action-Angle Variables III
Why are action-angle variables the ideal set of isolating integrals of motion to
use?

• They are are the only conjugate momenta that enjoy the property of
adiabatic invariance (to be discussed later)
• The angle-variables are the natural coordinates to label points on
invariant tori.
• They are ideally suited for perturbation analysis, which is used to
investigate near-integrable systems (see below)
• They are ideally suited to study the (in)-stability of a Hamiltonian system
Example: Central Force Field
As an example, to get familiar with action-angle variables, let’s consider once
again motion in a central force field.
As we have seen before, the Hamiltonian is
2
1 2 1 pθ
H= p
2 r
+ 2 r2
+ Φ(r)

where pr = ṙ and pθ = r 2 θ̇ = L.
In our planar description, we have two integrals of motion, namely energy
I1 = E = H and angular momentum I2 = L = pθ .
These are classical integrals of motion, as they are associated with
symmetries. Consequently, they are also isolating.
Let’s start by checking whether they are in involution
$ % $ %
∂I1 ∂I2 ∂I1 ∂I2 ∂I1 ∂I2 ∂I1 ∂I2
[I1 , I2 ] = ∂r ∂pr
− ∂pr ∂r
+ ∂θ ∂pθ
− ∂pθ ∂θ

∂I2 ∂I2 ∂I1 ∂I2


Since ∂p
r
= ∂r
= ∂θ
= ∂θ
= 0 one indeed finds that the two
integrals of motion are in involution.
Example: Central Force Field
The actions are defined by
1
" 1
"
Jr = 2π γr
pr dr Jθ = 2π γθ
pθ dθ

In the case of Jθ the θ -motion is one of rotation. Therefore the


closed-line-integral is over an angular interval [0, 2π].

1
!

Jθ = 2π
I2 dθ = I2
0

In the case of Jr , we need to realize that the r -motion is a libration between


apocenter r+ and pericenter r− . Using that I1 = E = H we can write
+
pr = 2[I1 − Φ(r)] − I22 /r 2
The radial action then becomes
r!+ +
1
Jr = π
2[I1 − Φ(r)] − I22 /r 2 dr
r−

Once we make a choise for the potential Φ(r) then Jr can be solved as
function of I1 and Jθ . Since I1 = H, this in turn allows us to write the
Hamiltonian as function of the actions: H(Jr , Jθ ).
Example: Central Force Field
As an example, let’s consider a potential of the form
β
Φ(r) = − α
r
− r2
with α and β two constants. Substituting this in the above, one finds:
& '1/2 +
1
Jr = α 2|I1 |
− Jθ2 − 2β
Inverting this for I1 = H yields
& + '−2
α2
H(Jr , Jθ ) = 2
Jr + Jθ2 − 2β

Since the actions are isolating integrals of motion, and we have an


expression for the Hamiltonian in terms of these actions, the generalized
coordinates that correspond to these actions (the angles wr and wθ ) evolve
as wi (t) = Ωi t + wi,0
The radial and angular frequencies are
& + '−3
∂H
Ωr = ∂J r
= −α 2
J r + J θ − 2β
2

& + '−3
∂H Jθ
Ωθ = ∂J = −α 2
J r + J 2
θ − 2β √ 2
θ Jθ −2β
Example: Central Force Field
The ratio of these frequencies is
& '1/2
Ωr 2β
Ωθ
= 1− Jθ2

Note that for β = 0, for which Φ(r) = − α


r
, and thus the potential is of the
Kepler form, we have that Ωr = Ωθ independent of the actions (i.e., for each
individual orbit).
In this case the orbit is closed, and there is an additional isolating integral of
motion (in addition to E and L). We may write this ‘third’ integral as
I3 = wr − wθ = Ωr t + wr,0 − Ωθ t − wθ,0 = wr,0 − wθ,0
Without loosing generality, we can pick the zero-point of time, such that
wr,0 = 0. This shows that we can think of the third integral in a Kepler
potential as the angular phase of the line connecting apo- and peri-center.
Quasi-Periodic Motion
In general, in an integrable Hamiltonian system with the canonical
transformation (⃗ ⃗) → (Q,
q, p ⃗ P ⃗ ) one has that qk = qk (ω1 , .., ωn ) with
k = (1, .., n). If one changes ωi by 2π , while keeping the other ωj (j ̸= i)
fixed, then qi then performs a complete libration or rotation.
The Cartesian phase-space coordinates (⃗ v ) must be periodic functions of
x, ⃗
the angle variables ωi with period 2π . Any such function can be expressed
as a Fourier series
#

ω , J⃗) =
x(⃗
⃗ Xlmn (J⃗) exp [i(lω1 + mω2 + nω3 )]
l,m,n=−∞

Using that ωi (t) = Ωi t + ki we thus obtain that

#

x(t) =
⃗ X̃lmn exp [i(lΩ1 + mΩ2 + nΩ3 )t]
l,m,n=−∞

with X̃lmn = Xlmn exp [i(lk1 + mk2 + nk3 )]


x(t) are said to be quasi-periodic functions of time.
Functions of the form of ⃗
Hence, in an integrable systems, all orbits are quasi-periodic, and confined
to an invariant torus.
Integrable Hamiltonians III
When one integrates a trajectory Γ(t) in an integrable system for sufficiently
long, it will come infinitessimally close to any point ω
⃗ on the surface of its
torus. In other words, the trajectory densely fills the entire torus.
Since no two trajectories Γ1 (t) and Γ2 (t) can intersect the same point in
phase-space, we thus immediately infer that two tori are not allowed to
intersect.

In an integrable, Hamiltonian system phase-space is


completely foliated with non-intersecting, invariant tori
Integrable Hamiltonians IV
In an integrable, Hamiltonian system with n degrees of freedom, all orbits
are confined to, and densely fill the surface of n-dimensional invariant tori.
These orbits, which have (at least) as many isolating integrals as spatial
dimensions are called regular
Regular orbits have n frequencies Ωi which are functions of the
corresponding actions Ji . This means that one can always find suitable
values for Ji such that two of the n frequencies Ωi are commensurable, i.e.
for which
l Ωi = m Ωj
with i ̸= j and l, m both integers.
A regular orbit with commensurable frequencies is called a resonant orbit
(also called closed or periodic orbit), and has a dimensionality that is one
lower than that of the non-resonant, regular orbits. This implies that there is
an extra isolating integral of motion, namely
In+1 = lωi − mωj
Note: since ωi (t) = Ωi t + ki , one can obtains that In+1 = lki − mkj ,
Note: and thus is constant along the orbit.
Near-Integrable Systems I
Thus far we have focussed our attention on integrable, Hamiltonian systems.
Given a Hamiltonian H(⃗ ⃗), how can one determine whether the system is
q, p
integrable, or whether the Hamilton-Jacobi equation is separable?
Unfortunately, there is no real answer to this question: In particular, there is
no systematic method for determining if a Hamiltonian is integrable or not!!!
However, if you can show that a system with n degrees of freedom has n
independent integrals of motion in involution then the system is integrable.
Unfortunately, the explicit expression of the integrals of motion in terms of
the phase-space coordinates is only possible in a very so called classical
integrals of motion, those associated with a symmetry of the potential and/or
with an invariance of the coordinate system.
In what follows, we only consider the case of orbits in ‘external’ potentials for
which n = 3. In addition, we only consider stationary potentials Φ(⃗ x), so
that the Hamiltonian does not explicitely on time and
H(⃗ ⃗) = E =constant. Therefore
q, p

Energy is always an isolating integral of motion.


Note: this integral is related to the invariance of the Lagrangian L under time
translation, i.e., to the homogeneity of time.
Near-Integrable Systems II
Integrable Hamiltonians are extremely rare. As a consequence, it is
extremely unlikely that the Hamiltonian associated with a typical galaxy
potential is integrable.
One can prove that even a slight perturbation away from an integrable
potential will almost always destroy any integral of motion other than E .
So why have we spent so must time discussing integrable Hamiltonians?
◃ Because most galaxy-like potentials turn out to be near-integrable.
Definition: A Hamiltonian system is near-integrable if a large fraction of
Definition: phase-space is still occupied by regular orbits
Definition: (i.e., by orbits on invariant tori).

The dynamics of near-integrable Hamiltonians is the subject of the


Kolmogorov-Arnold-Moser (KAM) Theorem which states:

If H0 is an integrable Hamiltonian whose phase-space is completely foliated


with regular orbits on invariant tori, then in a perturbed Hamiltonian
H = H0 + εH1 most orbits will still lie on such tori for sufficiently small ε.
The fraction of phase-space covered by these tori → 1 for ε → 0 and the
perturbed tori are deformed versions of the unperturbed ones.
Near-Integrable Systems III
The stability of the original tori to a perturbation can be proven everywhere
except in small regions around the resonant tori of H0 . The width of these
regions depends on ε and on the order of the resonance.
According to the Poincaré-Birkhoff Theorem the tori around unstable
resonant tori break up and the corresponding regular orbits become irregular
and stochastic.
Definition: A resonant orbit is stable if an orbit starting close to it remains
Definition: close to it. They parent orbit families (see below)
Definition: An irregular orbit is an orbit that is not confined to a
Definition: n-dimensional torus. In general it can wander through
Definition: the entire phase-space permitted by conservation of energy.
Consequently, an irregular orbit is restricted to a higher-dimensional
manifold than a regular orbit. Irregular orbits are stochastic in that they are
extremely sensitive to initial conditions: two stochastic trajectories Γ1 (t)
and Γ2 (t) which at t = t0 are infinitesimally close together will diverge with
time.
Increasing ε, increases the widths of the stochastic zones, which may
eventually ‘eat up’ a large fraction of phase-space.
Near-Integrable Systems IV
Note that unperturbed resonant tori form a dense set in phase-space, just
like the rational numbers are dense on the real axis.
Just like you can always find a rational number in between two real numbers,
in a near-integrable system there will always be a resonant orbit in between
any two tori. Since many of these will be unstable, they create many
stochastic regions
As long as the resonance is of higher order (i.e, 16 : 23 rather than 1 : 2)
the corresponding chaotic regions are very small, and tightly bound by their
surrounding tori.
Since two trajectories can not cross, an irregular orbit is bounded by its
neighbouring regular orbits. The irregular orbit is therefore still (almost)
confined to a n-dimensional manifold, and it behaves as if it has n isolating
integrals of motion.
◃ It may be very difficult to tell whether an orbit is regular or irregular.
However, iff n > 2 an irregular orbit may slip through a ‘crack’ between two
confining tori, a process know as Arnold diffusion.
Because of Arnold diffusion the stochasticity will be larger than ‘expected’.
However, the time-scale for Arnold diffusion to occus is long, and it is
unclear how important it is for Galactic Dynamics.
Near-Integrable Systems V
A few words on nomenclature:
Recall that an (isolating) integral of motion is defined as a function of
phase-space coordinates that is constant along every orbit.
A near-integrable systems in principle has only one isolating integral of
motion, namely energy E .
Nevertheless, according to the KAM Theorem, many orbits in a
near-integrable system are confined to invariant tori.
Although in conflict with the definition, astronomers often say that the
regular orbits in near-integrable systems admit n isolating integrals of
motion.
Astronomers also often use KAM Theorem to the extreme, by assuming that
they can ignore the irregular orbits, and that the Hamiltonians that
correspond to their ‘galaxy-like’ potentials are integrable. Clearly the validity
of this approximation depends on the fraction of phase-space that admits
three isolating integrals of motion. For most potentials used in Galactic
Dynamics, it is still unclear how large this fraction really is, and thus, how
reliable the assumption of integrability is.
Surfaces of Section I
Consider a system with n = 2 degrees of freedom (e.g., planar motion), and
with a Hamiltonian
1
H(⃗ ⃗) =
x, p 2
(p2x + p2y ) + Φ(x, y)
Conservation of energy, E = H, restricts the motion to a three-dimensional
hyper-surface M3 in four-dimensional phase-space.
To investigate whether the orbits admit any additional (hidden) isolating
integrals of motion, Poincaré introduced the surface-of-section (SOS)
Consider the intersection of M3 with the surface y = 0. Integrate the orbit,
and everytime it crosses the surface y = 0 with ẏ > 0, record the position
in the (x, px )-plane. After many orbital periods. the accumulated points
begin to show some topology that allows one to discriminate between
regular, irregular and resonance orbits.
Given (x, px ) and the condition y = 0, we can determine py from
+
py = + 2[E − Φ(x, 0)] − p2x
where the +-sign is chosen because ẏ > 0.
To get insight, and relate orbits to their SOSs, see JAVA-Applet at:
http://burro.astr.cwru.edu/JavaLab/SOSweb/backgrnd.html
Surfaces of Section II
px
= energy surface
= regular box orbit
= regular loop orbit
= irregular (stochastic) orbit
= periodic (resonance) orbit

NOTE: Each resonance orbit creates a


family of regular orbits.
x
Loop orbit: has fixed sense of rotation
about the center; never has x−0

Box orbit: no fixed sense of rotation


about the center. Orbit comes
arbitrarily close to center.

This figure is only an illustration of the topology of various orbits in a SOS. It


does not correspond to an existing Hamiltonian.
Orbits in Spherical Potentials I
A spherical potential has four classical, isolating integrals of motion: Energy
E , associated with time-invariance of the Lagrangian, and the three
components of the angular momentum vector, Lx , Ly , and Lz , associated
with rotational invariance of the potential.
NOTE: Since
[Lx , Ly ] = Lz , [Ly , Lz ] = Lx , [Lz , Lx ] = Ly
the set (Lx , Ly , Lz ) is not in involution. However, the absolute value of the
angular momentum
*
|L| = L2x + L2y + L2z
is in involution with any of its components. We can thus define a set of three
isolating integrals of motion in involution, e.g., (E, |L|, Lz ). The values for
these three integrals uniquely determine the motion, and specify a unique
invariant torus.
Orbits in Spherical Potentials II
We have seen before that for motion in a central force field:
*
dr L2 dθ L
dt
= ± 2[E − Φ(r)] − r2 dt
= r2

From this we immediately infer the nature of the motion:


• θ -motion is rotation (dθ/dt is never zero)
• r -motion is libration (dr/dt = 0 at apo- and pericenter)
The radial period is
r!+ r!+
dr dr
Tr = 2 dr/dt
=2 √
r− r− 2[E−Φ(r)]−L2 /r 2

In the same period the polar angle θ increases by an amount


r!+ r!+ r!+
dθ dθ dt Ldr
∆θ = 2 dr
dr =2 dt dr
dr =2 2

r− r− r− r 2[E−Φ(r)]−L2 /r 2


The azimuthal period can thus be written as Tθ = ∆θ Tr . From this we see
that the orbit will be closed (resonant) if
Tθ 2π n
Tr
= ∆θ
= m
with n and m integers
Orbits in Spherical Potentials III
In general ∆θ/2π will not be a rational number. ◃ orbit not closed.
Instead, a typical orbit resembles a rosette and eventually passes through
every point in between the annuli bounded by the apo- and pericenter.
However, there are two special potentials for which all orbits are closed:
1 2 2
• Spherical Harmonic Oscillator Potential: Φ(r) = 2
Ω r
−Orbits are ellipses centered on center of attraction
− Tθ : Tr = 2 : 1
• Kepler Potential: Φ(r) = − GrM
−Orbits are ellipses with attracting center at one focal point
− Tθ : Tr = 1 : 1
Since galaxies are less centrally concentrated than point masses and more
centrally concentrated than homogeneous spheres, a typical star in a
spherical galaxy changes its angular coordinate by ∆θ during a radial
libration, where π < ∆θ < 2π
QUESTION: What is the resonance that corresponds to a circular orbit?
Orbits in Spherical Potentials III
In general ∆θ/2π will not be a rational number. ◃ orbit not closed.
Instead, a typical orbit resembles a rosette and eventually passes through
every point in between the annuli bounded by the apo- and pericenter.
However, there are two special potentials for which all orbits are closed:
1 2 2
• Spherical Harmonic Oscillator Potential: Φ(r) = 2
Ω r
−Orbits are ellipses centered on center of attraction
− Tθ : Tr = 2 : 1
• Kepler Potential: Φ(r) = − GrM
−Orbits are ellipses with attracting center at one focal point
− Tθ : Tr = 1 : 1
Since galaxies are less centrally concentrated than point masses and more
centrally concentrated than homogeneous spheres, a typical star in a
spherical galaxy changes its angular coordinate by ∆θ during a radial
libration, where π < ∆θ < 2π
QUESTION: What is the resonance that corresponds to a circular orbit?
ANSWER: Although closed, a circular orbit is NOT a resonance orbit
Orbits in Spherical Potentials IV

An example of a rosette orbit with non-commensurable frequencies. Virtually


all orbits in spherical potentials are of this form. The more general name for
this type of orbits is loop orbits. They have a net sense of rotation around the
center.
Orbits in Planar Potentials I
Before we discuss orbits in less symmetric, three-dimensional potentials, we
first focus our attention on Planar Potentials Φ(x, y). This is useful for the
following reasons:

• There are cases in which the symmetry of the potential allows a


reduction of the number of degrees of freedom by means of the
effective potential Φeff . Examples are axisymmetric potentials where
Φeff allows a study of motion in the so-called meridional plane.
• Motion confined to the symmetry-planes of ellipsoidal, spheroidal and
spherical potentials is planar.
• There are mass distributions of astronomical interest with potentials
that are reasonably well approximated by planar potentials (disks).
• To get insight into the various orbit families.
Orbits in Planar Potentials II
As an example, we consider motion in the planar, logarithmic potential
& 2
'
1 y
ΦL (x, y) = 2
v02 ln Rc2 + x2 + q2
(q ≤ 1)

This potential has the following properties:


(i) Equipotentials have constant axial ratio q so that influence of
non-axisymmetry is similar at all radii
+
(ii) For R = x2 + y 2 ≪ Rc a power-series expansion gives
2
& 2
'
v0
ΦL ≃ 2R2 x2 + yq2
c

which is similar to that of a two-dimensional harmonic oscillator, which


corresponds to a homogeneous density distribution.

(iii) For R ≫ Rc and q = 1 we have that ΦL = 1 v


2 0
2
lnR. One can easily
verify that this corresponds to a circular velocity curve vcirc (R) = v0 ; i.e.,
at large radii ΦL yields a flat rotation curve, similar to that of disk galaxies.
Orbits in Planar Potentials III
We start our investigation of orbits in ΦL (x, y) with those that are confined
to R ≪ Rc , i.e., those confined to the constant density core.
Using series expansion, we can approximate the potential by
2
& 2
'
v0 y
ΦL ≃ 2R2
x2 + q2
= Φ1 (x) + Φ2 (y)
c

Note that we can separate the potential. This allows us to immediately


identify two isolating integrals of motion in involution from the Hamiltonian:
I1 = p2x + 2Φ1 (x) I2 = p2y + 2Φ2 (y)
The motion of the system is given by the superposition of the librations along
the two axes, which are the solutions of the decoupled system of equations
ẍ = −ωx2 x ÿ = −ωy2 y
which corresponds to a two-dimensional harmonic oscillator with
frequencies ωx = v0 /Rc and ωy = v0 /qRc . Unless these are
incommensurable (i.e., unless ωx /ωy = n/m for some integers n and m)
the star passes close to every point inside a rectangular box.
These orbits are therefore known as box orbits. Such orbits have no net
sense of circulation about the center.
Orbits in Planar Potentials IV
> R one has to resort to numerical integration.
For orbits at larger radii R ∼ c

This reveals two major orbit families: The first is the family of box orbits,
which have no net sense of cirulation about the center, and which, in the
course of time, will pass arbitrarily close to the center of the potential

Note that the orbit completes a filled curve in the SOS, indicating that it
admits a second isolating integral of motion, I2 . This is not a classical
integral, as it is not associated with a symmetry of the system. We can, in
general, not express I2 in the phase-space coordinates.
Orbits in Planar Potentials V
The second main family is that of loop orbits. These do have a net sense of
circulation, and always maintain a minimum distance from the center of the
potential. Any star launched from R ≫ Rc in the tangential direction with a
speed of the order of v0 will follow such a loop orbit.

Once again, the fact that the orbit completes a filled curve in the SOS,
indicates that it admits a second, (non-classical) isolating integral of motion.
Since we don’t know what this integral is (in terms of the phase-space
coordinates) it is simply called I2 .
Orbits in Planar Potentials VI
In ΦL (x, y) there are two main orbit families: loop orbits and box orbits
Each family of orbits is closely associated with a corresponding closed orbit.
This closed orbit is called the parent of the orbit family. All closed orbits that
are parents to families are said to be stable, in that members of their family
that are initially close to them remain close to them at all times.
Unstable, closed orbits also exist, but they don’t parent an orbit family.
Modulo the irregular orbits, one can obtain a good consensus of the orbits in
a system by finding the stable periodic orbits at each energy.
For our planar, logarithmic potential, the parent of the loop orbits is the
closed loop orbit (which intersects SOS at a single point on ẋ = 0 axis).
The parent of the box orbits is the closed long-axis orbit. This is the orbit that
is confined to the x-axis with y = ẏ = 0. In the SOS (x vs. px ) this is the
orbit associated with the boundary curve which corresponds to
1 2
2
ẋ + ΦL (x, 0) = E i.e. y = ẏ = 0
Finally, there is the closed short-axis orbit. This is the orbit that is confined to
the y -axis x = ẋ = 0. In the SOS this is the orbit associated with a single
dot exactly at the center of the SOS. Clearly, this orbit is unstable: rather than
parenting an orbit family, it marks the transition between loop and box orbits.
Orbits in Planar Potentials VII
If we set the core radius Rc = 0 in our planar, logarithmic potential, we
remove the homogeneous core and introduce a singular R−2 cusp.
This singular logarithmic potential admits a number of new orbit families,
which are associated with resonant parents, which we ID by the frequency
ratio ωx : ωy .
The family associated with the 2 : 1 resonance are called the banana orbits
The family associated with the 3 : 2 resonance are called the fish orbits
The family associated with the 4 : 3 resonance are called the pretzel orbits
All these families together are called boxlet orbits.
The (singular) logarithmic potential is somewhat special in that it shows a
surprisingly regular orbit structure. Virtually the entire phase-space admits
two isolating integrals of motion in involution (E and I2 ). Clearly, the
logarithmic potential must be very near-integrable.
However, upon introducing a massive black hole in the center of the
logarithmic potential, many of the box-orbits become stochastic. One says
that the BH destroys the box-orbits. Recall that each box orbit comes
arbitrarily close to the BH.
Orbits in Singular Logarithmic Potentials

Here is an example of a banana-orbit (member of the 2 : 1 resonance family).


Orbits in Singular Logarithmic Potentials

Here is an example of a fish-orbit (member of the 3 : 2 resonance family).


Orbits in Singular Logarithmic Potentials

Here is an example of a pretzel-orbit (member of the 4 : 3 resonance family).


Orbits in Logarithmic Potentials with BH

Here is an example of a stochastic orbit in a (cored) logarithmic potential


with a central black hole.
Centrophobic versus Centrophilic

The closed boxlets comes in two kinds: centrophobic, which avoid the center
and are stable, and centrophilic, which go through the center and are
unstable. The centrophilic versions of the banana and fish orbits are called
the antibanana and antifish orbits, etc. Because they are unstable, they don’t
parent any families.
Orbits in Planar Potentials: Summary
If Φ(x, y) has a homogeneous core, at sufficiently small E potential is
indistinguishable from that of 2D harmonic oscillator
• x-axial and y -axial closed orbits are stable
• they parent a family of box orbits (filling a rectangular box)

At larger radii, for orbits with larger E :


• y -axial orbit becomes unstable and bifurcates into two families of loop
• orbits (with opposite sense of rotation)
• close to this unstable, closed y -axial orbit a small layer of stochastic
• orbits is present
• x-axial orbit still stable, and parents family of box orbits

If Φ(x, y) is scale-free and/or has central cusp that is sufficiently steep


• x-axial orbit may become unstable. Its family of box orbits then
• becomes family of boxlets associated with (higher-order) resonances.

If central BH is present, discrete scattering events can turn box orbits into
stochastic orbits
Orbits in Axisymmetric Potentials I
Axisymmetric potentials (oblate or prolate) are far more realistic examples to
consider in astronomy. Elliptical galaxies might well be spheroidal (but could
also be ellipsoidal), while disk galaxies almost certainly are axisymmetric
(though highly flattened).
For axisymmetric systems the coordinate system of choise are the
cylindrical coordinates (R, φ, z), and Φ = Φ(R, z).
Solving Newton’s equation of motion in cylindrical coordinates yields:

∂Φ
R̈ −
& R φ̇ 2
' = − ∂R
d
dt
R2 φ̇ = 0
z̈ = − ∂Φ
∂z

The second of these expresses conservation of the component of angular


momentum about the z -axis; Lz = R2 φ̇, while the other two equations
describe the coupled oscillations in the R and z -directions.
NOTE: For stars confined to equatorial plane z = 0, equations of motion are
identical to that of motion in spherical density distribution (not surprising,
since in this case the motion is once again central). Therefore, orbits
confined to equatorial plane are rosette orbits.
Orbits in Axisymmetric Potentials II
As for the spherical case, we can reduce the equations of motion to

R̈ = − ∂Φeff
∂R
z̈ = − ∂Φeff
∂z

L2
with Φeff (R, z) = Φ(R, z) + z
2R2
the effective potential. The
L2z /R2 -term serves as a centrifugal barrier, only allowing orbits with
Lz = 0 near the symmetry-axis.
This allows us to reduce the 3D motion to 2D motion in Meridional Plane
(R, z), which rotates non-uniformly around the symmetry axis according to
φ̇ = Lz /R2 .
In addition to simplifying the problem, it also allows the use of
surfaces-of-section to investigate the orbital properties.
For the energy we can write
$ % & '
1 1
E= 2
Ṙ + (Rφ̇) + ż + Φ =
2 2 2
2
Ṙ + ż
2 2
+ Φeff
so that the orbit is restricted to the area in the meridional plane satisfying
E ≥ Φeff . The curve bounding this area is called the zero-velocity curve
(ZVC) (since for a point on it ⃗v = 0).
Epicycle Approximation I
L2
We have defined the effective potential Φeff = Φ + z
2R2
. This has a
minimum at (R, z) = (Rg , 0), where
∂Φeff ∂Φ L2
∂R
= ∂R
− z
R3
=0
The radius R = Rg corresponds to the radius of a circular orbit with energy
1 2 L2
E = Φ(Rg , 0) + v
2 c
= Φ(Rg , 0) + z
2R2
= Φeff .
g

If we define x = R − Rg and expand Φeff around the point


(x, y) = (0, 0) in a Taylor series we obtain

Φeff = Φeff (Rg , 0) + (Φx )x + (Φy )y + (Φxy )xy + 12 (Φxx )x2 +


1
2
(Φyy )y 2
+ O(xz 2
) + O(x2
z) + etc

where
( ∂Φeff ) & 2
' & 2
'
∂ Φeff ∂ Φeff
Φx = ∂x (Rg ,0)
Φxx = ∂x2
Φxy = ∂x∂y
(Rg ,0) (Rg ,0)

By definition of Rg , and by symmetry considerations, we have that


Φx = Φy = Φxy = 0
Epicycle Approximation II
In the epicycle approximation only terms up to second order are considered:
all terms of order xz 2 , x2 z or higher are considered negligble. Defining

κ2 ≡ Φxx ν 2 ≡ Φyy
we thus have that, in the epicycle approximation,

Φeff = Φeff (Rg , 0) + 12 κ2 x2 + 12 ν 2 y 2


so that the equations of motion in the meridional plane become

ẍ = −κ2 x ÿ = −ν 2 y
Thus, the x- and y -motions are simple harmonic oscillations with the
epicycle frequency κ and the vertical frequency ν .
In addition, we have the circular frequency
* ( ∂Φ )
vc (R) 1 Lz
Ω(R) = R
= R ∂R (R,0)
= R2

which allows us to write


& ' & '
∂ 2 Φeff 2
κ =
2
∂R2
= R dΩ
dR
+ 4Ω 2
(Rg ,0) Rg
Epicycle Approximation III
As we have seen before, for a realistic galactic potential Ω < κ < 2Ω,
where the limits correspond to the homogeneous mass distribution
(κ = 2Ω) and the Kepler potential (κ = Ω)
In the epicycle approximation the motion is very simple:

R(t) = A cos(κt + a) + Rg
z(t) = B cos(νt + b)
2Ω A
φ(t) = Ωg t + φ0 − κRgg sin(κt + a)

with A, B , a , b, and φ0 all constants. The φ-motion follows from


& '−2 & '
Lz Lz x 2x
φ̇ = R2
= R2
1+ Rg
≃ Ωg 1 − Rg
g

Note that there are three frequencies (Ω, κ, ν) and also three isolating
integrals of motion in involution: (ER , Ez , Lz ) with ER = 1
2
(ẋ2
+ κ x )
2 2

and Ez = 1 2
(ż 2
+ ν z )
2 2
◃ all orbits are regular.
The motion in (R, φ) can be described as retrogate motion on an ellipse (the
epicycle), whose guiding center (or epicenter) is in prograde motion around
the center of the system.
Epicycle Approximation IV
An important question is: “When is the epicycle approximation valid?”
First consider the z -motion: The equation of motion, z̈ = −ν 2 z implies a
constant density in the z -direction. Hence, the epicycle approximation is
valid as long as ρ(z) is roughly constant. This is only approximately true
very close to equatorial plane. In general, however, epicycle approx. is poor
for motion in z -direction.
In the radial direction, we have to realize that the Taylor expansion is only
accurate sufficiently close to R = Rg . Hence, the epicycle approximation is
only valid for small librations around the guiding center; i.e., for orbits with
an angular momentum that is close to that of the corresponding circular
orbit.

Epicyclic Motion
Orbits in Axisymmetric Potentials III
thin tube
parent ZVC

Typical orbit in axisymmetric potential. If orbit admits two isolating integrals


of motion, it would (ultimately) fill entire area within ZVC. Rather, orbit is
restricted to sub-area within ZVC, indicating that orbit admits a third isolating
integral of motion.
Since this is not a classical integral of motion, and we don’t know how to
express it in terms of the phase-space coordinates, it is simply called I3 .
Note that the point where orbit touches ZVC can be used to ‘label’ I3 : The
set (E, Lz , I3 ) uniquely defines an orbit.
Orbits in Axisymmetric Potentials IV

The orbit shown on the previous page is a so-called short-axis tube orbit.
This is the main orbit family in oblate potentials, and is associated with
(parented by) the circular orbits in equatorial plane.
Orbits (c), (e) and (f) above are from the same orbit family. Orbits (a), (b) and
(d) are special in that Lz = 0.
Orbits in Axisymmetric Potentials V
Because of the centrifugal barrier only orbits with Lz = 0 will be able to
come arbitrarily close to the center.
However, not all orbits with Lz = 0 are box orbits. There is another family of
zero-angular momentum orbits, namely the two-dimensional loop orbits (e.g.,
orbit (d) on previous page). Their meriodional plane is stationary (i.e.,
φ̇ = 0) and their angular momentum vector is perpendicular to the z -axis.
Hence, I3 = L; note that [L, Lz ] = 0.

Numerous authors have investigated orbits in axisymmetric potentials using


numerical techniques. The main conclusions are:
• Most orbits in axisymmetric potentials designed to model elliptical
galaxies are regular and appear to respect an effective third integral I3 .
• The principal orbit family in oblate potentials is the short-axis tube family,
while two families of inner and outer long-axis tube orbits dominate in
prolate potentials.
• In scale-free or cusped potentials several minor orbits families become
important. These are the (boxlets) associated with resonant parents.
• The fraction of phase-space occupied by stochastic, irregular orbits is
generally (surprisingly) small.
Orbits in Triaxial Potentials I
Consider a triaxial density distribution with the major, intermediate, and
minor axes aligned with the x, y , and z axes, respectively.
In general, triaxial galaxies have four main orbit families: box orbits, and
three tube orbits: short axis tubes, inner long-axis tubes, and outer long-axis
tubes.
Orbit structure different in cusp, core, main body, and outer part (halo).
In central core, potential is harmonic, and motion is that of a 3D harmonic
oscillator. ◃ all orbits are box orbits, parented by stable long-axial orbit
Outside of core region, frequencies become strongly radius (energy)
dependent. There comes an energy where ωx = ωy . At this
1 : 1-resonance the y -axial orbit becomes unstable and bifurcates into
short-axis tube family (two subfamilies with opposite sense of rotation).
At even higher E the ωy : ωz = 1 : 1 resonance makes z -axial orbit
unstable → inner and outer long-axis tube families (each with two
subfamilies with opposite sense of rotation).
At even larger radii (in ‘halo’ of triaxial system) the x-axial orbit becomes
unstable ◃ box orbits are replaced by boxlets and stochastic orbits. The
three families of tube orbits are also present
Orbits in Triaxial Potentials II
box orbit short−axis tube orbit

outer long−axis tube orbit inner long−axis tube orbit


Orbits in Triaxial Potentials III
If center is cusped rather than cored, resonant orbits families (boxlets) and
stochastic orbits take over part of phase-space formerly held by box orbits.
The extent to which this happens depends on cusp slope.
Short-axis tubes contribute angular momentum in z -direction; Long-axis
tubes contribute angular momentum in x-direction ◃ total angular
momentum vector may point anywhere in plane containing long and short
axes. NOTE: this can serve as kinematic signature of triaxiality.
The closed loop orbit around the intermediate y -axis is unstable ◃ no family
of intermediate-axis tubes.
Gas moves on closed, non-intersecting orbits. The only orbits with these
properties are the stable loop orbits around x- and z -axes. Consequently,
gas and/or dust disks in triaxial galaxies can exist in xy -plane and yz -plane,
but not in xz -plane. NOTE: these disks must be ellipsoidal rather than
circular, and the velocity varies along ellipsoids.
Stäckel Potentials I
Useful insight may be obtained from separable Stäckel models. These are the
only known triaxial potentials that are completely integrable.
In Stäckel potentials all orbits are regular and part of one of the four main
families.
Stäckel potentials are separable in ellipsoidal coordinates (λ, µ, ν)

(λ, µ, ν) are the roots for τ of


x2 y2 z2
τ +α
+ τ +β
+ τ +γ
=1

Here α < β < γ are constants and −γ ≤ ν ≤ −β ≤ µ ≤ −α ≤ λ


Surfaces of constant λ are ellipsoids
Surfaces of constant µ are hyperboloids of one sheet
Surfaces of constant ν are hyperboloids of two sheets

Stäckel potentials are of the form:


F1 (λ) F2 (µ) F3 (ν)
Φ(⃗
r) = Φ(λ, µ, ν) = − (λ−µ)(λ−ν) − (µ−ν)(µ−λ)
− (ν−λ)(ν−µ)

with F1 , F2 and F3 arbitrary functions


Stäckel Potentials II
The figure below shows contours of constant (λ, µ, ν) plotted in the three
planes (from left to right) xy , xz and yz
At large distances, the ellipsoidal coordinates become close to spherical.
Near the origin they are close to cartesian. For more details, see de Zeeuw,
1985, MNRAS, 216, 273

In triaxial Stäckel potentials all three integrals (E, I2 , I3 ) are analytical, and
the orbits are confined by contours of constant ellipsoidal coordinates (see
next page).
Although Stäckel are a very special class, the fact that they are separable
makes them ideally suited to get insight. Most triaxial potentials that do not
have a Stäckel form have orbital structures that are similar to that of Stäckel
potentials.
Stäckel Potentials III
box orbits

inner long axis tube

outer long axis tube

short axis tube


Orbits in Ellipsoid Land; Summary

System Dim Orbit Families


Oblate 3D S
Prolate 3D I +O
Triaxial 3D S+I+O+B
Elliptic Disk 2D S+B

B = box orbits
S = short-axis tubes
I = inner long-axis tubes
O = outer long-axis tubes
Libration versus Rotation

Three-dimensional orbits
All tube orbits are build up from 2 librations and 1 rotation.
All box orbits are build up from 3 librations.
All boxlets are build up from 2 librations and 1 rotation.

Two-dimensional orbits
All loop orbits are build up from 1 libration and 1 rotation.
All box orbits are build up from 2 librations.
All boxlets are build up from 2 librations.
Rotating Potentials I
The figures of non-axisymmetric potentials may rotate with respect to inertial
space.
The example of interest for astronomy are barred potentials, which are
rotating with a certain pattern speed.
⃗ p = Ωp⃗
We express the pattern speed in angular velocity Ω ez
In what follows we denote by d⃗
a/dt the rate of change of a vector ⃗
a as
measured by an inertial observer, and by ⃗ a˙ the rate of change as measured
by an observer corotating with the figure.
It is straightforward to show that
d⃗
a
dt
a˙ + Ω
=⃗ ⃗p ×⃗
a

Applying this twice to the position vector ⃗


r, we obtain
& '
d2 ⃗
dt
r
2 = d
dt
r˙ + Ω
⃗ ⃗p ×⃗ r
r¨ + Ω
= ⃗ ⃗p ×⃗ r˙ + Ω⃗ p × d⃗r
&dt '
r¨ + Ω
= ⃗ ⃗p ×⃗ r˙ + Ω⃗p × ⃗ r˙ + Ω⃗p ×⃗
r
& ' & '
r¨ + 2 Ω
= ⃗ ⃗p ×⃗ r˙ + Ω⃗p × Ω ⃗p ×⃗
r
Rotating Potentials II
Since Newton’s laws apply to inertial frames we have that
& ' & '
r¨ = −∇Φ
⃗ ⃗ −2 Ω r˙ − Ω
⃗p ×⃗ ⃗p × Ω
⃗p ×⃗
r
& '
r˙ represents the Coriolis force and
⃗p ×⃗
Note the two extra terms: −2 Ω
& '
⃗p × Ω
−Ω ⃗p ×⃗
r the centrifugal force.
The energy is given by
& '2
1 d⃗
r
E = 2 dt
+ Φ(⃗r)
& '2
= 1 ˙+Ω

r ⃗p ×⃗r + Φ(⃗ r)
2
& ' & '2
1 ˙2 ˙· Ω ⃗p ×⃗ 1 ⃗
= 2

r + ⃗
r r + 2
Ωp × ⃗ r + Φ(⃗ r)
& '
1 ˙2 ˙· Ω ⃗p ×⃗ 1 ⃗
= 2

r + ⃗
r r + 2
|Ωp × ⃗ r|2 + Φ(⃗
r)
& '
= EJ + ⃗ r˙ · Ω⃗p ×⃗ r + |Ω ⃗p ×⃗ r |2
Where we have defined Jacobi’s Integral

r˙ 2 + Φ(⃗
EJ ≡ 12 ⃗ ⃗p ×⃗
r ) − 12 |Ω r |2
Rotating Potentials III
The importance of EJ becomes apparent from the following:
& '
dEJ
dt
r˙ dt
= ⃗ d
r˙ +
⃗ dΦ
dt
− ( ⃗p ×⃗
Ω r ) · d ⃗
dt
(Ωp × ⃗ r)
$ %
r˙ ⃗
= ⃗ r¨ + (Ω⃗p ×⃗ r˙ ) + ∇Φ⃗ ·⃗ r˙ − (Ω ⃗p ×⃗ r˙ )
⃗p ×⃗
r ) · (Ω
r˙ · ⃗
= ⃗ r¨ + ⃗
r˙ · ∇Φ
⃗ − (Ω ⃗p ×⃗
r ) · (Ω⃗p ×⃗ r˙ )
⃗ · (A
Here we have used that A ⃗ × B)
⃗ = 0 and that dΩ
⃗ p /dt = 0. If we
r˙ we obtain that
multiply the equation of motion with ⃗
$ %
r˙ · ⃗
⃗ r¨ + ⃗
r˙ · ∇Φ r˙ · (Ω
⃗ + 2⃗ ⃗p ×⃗ r˙ ) + ⃗
r˙ · Ω⃗ p × (Ω
⃗p ×⃗
r) = 0
r˙ · ⃗
⇔ ⃗ r¨ + ⃗
r˙ · ∇Φ
⃗ + (Ω
⃗p ×⃗ r˙ × Ω
r ) · (⃗ ⃗ p) = 0

⃗ · (B
Where we have used that A ⃗ × C)
⃗ =C
⃗ · (A
⃗ × B)
⃗ . Since
⃗×B
A ⃗ = −B
⃗ ×A
⃗ we have that
dEJ
dt
=0
The Jacobi Integral is a conserved quantity, i.e. an integral of motion.
Rotating Potentials IV
For comparison, since Φ = Φ(t) the Hamiltonian is explictely
time-dependent; consequently, the total energy E is not a conserved
quantity (i.e., is not an integral of motion).
The angular momentum is given by
⃗ =⃗
L r× d⃗
r
dt
=⃗ r˙ + ⃗
r×⃗ ⃗p ×⃗
r × (Ω r)
This allows us to write
$ %
⃗p ·L
Ω ⃗ = Ω
⃗ p · (⃗
r×⃗r˙ ) + Ω
⃗p · ⃗
r × (Ω⃗p ×⃗
r)
r˙ · (Ω
= ⃗ ⃗p ×⃗ ⃗p ×⃗
r ) + |Ω r |2
from which we obtain that

⃗p ·L
EJ = E − Ω ⃗

Thus, in a rotating, non-axisymmetric potential neither energy E not angular


⃗p ·L
momentum L are conserved, but the Jacobi integral EJ = E − Ω ⃗ is.

Note that EJ is the sum of 1 ˙ 2 + Φ, which would be the energy if the frame

r
2
were not rotating, and the quantity − 1 | ⃗p ×⃗
Ω r | 2
= − 1 2 2
Ω R , which can
2 2 p
be thought of as the potential energy corresponding to the centrifugal force.
Rotating Potentials V
If we now define the effective potential

Φeff = Φ − 12 Ω2p R2

the equation of motion becomes

r¨ = −∇Φ
⃗ ⃗ eff − 2(Ω r˙ )
⃗b ×⃗

and the Jacobi integral is EJ = 1 |⃗


r˙ |2 + Φeff
2
An orbit with a given value for it’s Jacobi Integral is restricted in its motion to
regions in which EJ ≤ Φeff . The surface Φeff = EJ is therefore often
called the zero-velocity surface.
The effective potential has five points at which both ∂Φeff /∂x and
∂Φeff /∂y vanish. These points, L1 to L5 , are called the Lagrange Points
(cf. restricted three-body problem).
• Motion around L3 (minimum of Φeff ) always stable.
• Motion around L1 and L2 (saddle points of Φeff ) always unstable.
• Motion around L4 and L5 (maxima of Φeff ) can be stable or unstable
• depending on potential.
NOTE: stable/unstable refers to whether orbits remain close to Lagrange
points or not.
Lagrange Points

Illustration of Lagrange points (L1 to L5 ) in Sun-Earth-Moon system.


Lagrange Points

L5

L2 L3 L1

L4

Illustration of Lagrange points (L1 to L5 ) in logarithmic potential. The


annulus bounded by circles through L1 , L2 and L3 , L4 (depicted as red
circle) is called the region of corotation.
Lindblad Resonances I
Let (R, θ) be the polar coordinates that are corotating with the planar
potential Φ(R, θ). If the non-axisymmetric distortions of the potential,
which has a pattern speed Ωp , is sufficiently small then we may write
Φ(R, θ) = Φ0 (R) + Φ1 (R, θ) |Φ1 /Φ0 | ≪ 1
It is useful to consider the following form for Φ1
Φ1 (R, θ) = Φp (R) cos(mθ)
where m = 2 corresponds to a (weak) bar.
In the epicycle approximation the motion in Φ0 (R) is that of an epicycle,
with frequency κ(R), around a guiding center which rotates with frequency
*
1 dΦ0
Ω(R) = R dR
.

In presence of Φ1 (R, θ), movement of guiding center is


θ0 (t) = [Ω(R) − Ωp ] t. In addition to natural frequencies Ω(R) and
κ(R) there is new frequency Ωp . Because Φ1 (R, θ) has m-fold symmetry,
guiding center at R finds itself at effectively same location in (R, θ)-plane
with frequency m [Ω(R) − Ωp ].
Lindblad Resonances II
Motion in R-direction becomes that of harmonic oscillator of natural
frequency κ(R) that is driven by frequency m [Ω(R) − Ωp ].
At several R the natural and driving frequencies are in resonance.
(1) Corotation: Ω(R) = Ωp
(1) (Guiding center corotates with potential).
(2) Lindblad Resonances: m [Ω(R) − Ωp ] = ±κ(R)
(2) Most important of these are:
κ
Ω(R) − 2
= Ωp : Inner Lindblad Resonance
κ
Ω(R) + 2
= Ωp : Outer Lindblad Resonance
κ
Ω(R) − 4
= Ωp : Ultra Harmonic Resonance
Depending on Φ(R, θ) and Ωp one can have 0, 1, or 2 ILRs. If there are
two, we distinguish between Inner Inner Lindblad Resonance (IILR) and Outer
Inner Lindblad Resonance (OILR).
If cusp (or BH) is present there is always 1 ILR, because Ω(R) − κ(R)/2
increases monotically with decreasing R.
Lindblad Resonances III
Frequency

Ωp
Ω+κ/2

Ω−κ/2
IILR OILR CR OLR Radius
Lindblad Resonances play important role for orbits in barred potentials.
Lindblad Resonances IV
As an example, we discus the orbital families in a planar, rotating,
logarithmic potential

(a) Long-axial orbit → stable, oval, prograde, and oriented ∥ to Φeff .


(x1 -family).
(b) Short-axial orbit → stable, oval, retrograde, and oriented ⊥ to Φeff .
At E > E1 (at IILR), family (b) becomes unstable and bifurcates into two
prograde loop families that are oriented perpendicular to Φeff . The stable
(unstable) family is called the x2 (x3 ) family. At the same energy the
x1 -orbits develop self-intersecting loops.
At E > E2 (at OILR) the x2 and x3 families dissapear. The x1 family looses
its self-intersecting loops.
In vicinity of corotation annulus there are families of orbits around L4 and
L5 (if these are stable).
At large radii beyond CR Ωp ≫ Ω(R). Consequently, the orbits effectively
see a circular potential and the orbits become close to circular rosettes.
The Distribution Function
As we have seen before the distribution function (or phase-space density)
f (⃗ v , t) d3 ⃗
x, ⃗ x d3 ⃗
v gives a full description of the state of any collisionless
system.
Here f (⃗ v , t) d3 ⃗
x, ⃗ x d3 ⃗
v specifies the number of stars having positions in
the small volume d3 ⃗ x centered on ⃗x and velocities in the small range d3⃗
v
centered on ⃗ v , or, when properly normalized, expresses the probability that a
star is located in d3 x⃗ d3 ⃗
v.
Define the 6-dimensional phase-space vector

⃗ = (⃗
w v ) = (w1 , w2 , ..., w6 )
x, ⃗
The velocity of the flow in phase-space is then

⃗˙ = (⃗
w x˙ , ⃗
v˙ ) = (⃗ ⃗
v , −∇Φ)

⃗˙ has the same relationship to w


Note that w ⃗ as the 3D fluid flow velocity
u ⃗˙ has to ⃗
⃗ =x x in an ordinary fluid.
In the absence of collisions (long-range, short-range, or direct) and under the
assumption that stars are neither created nor destroyed, the flow in
phase-space must conserve mass.
The Continuity Equation
Consider an ordinary fluid in an arbitrary closed volume V bounded by a
surface S .
!
The mass of fluid within V is M (t) = V ρ(⃗
x, t)d3 x
⃗ and
dM
! & ∂ρ '
dt
= V ∂t
d3 ⃗
x

The mass flowing out of V through an area element d2 S per unit time is
given by ρ⃗ ⃗ with d2 S
v · d2 S ⃗ an outward pointing vector normal to the
surface S . Thus
!
dM
=− ⃗
v · d2 S
ρ⃗
dt S
so that we obtain ! !
∂ρ
d x⃗ + S ρ⃗
3 ⃗=0
v · d2 S
V ∂t
! !
⃗ ⃗
Using the divergence theorem V ∇ · F d x 3
⃗ = SF ⃗ · d2 S
⃗ we obtain
! $ ∂ρ %
+ ∇⃗ · (ρ⃗v ) d3
⃗ =0
x
V ∂t

Since this must hold for any volume V we obtain the continuity equation:
∂ρ ⃗ · (ρ⃗
+∇ v) = ∂ρ ⃗ ·⃗
+ ρ∇ v +⃗ ⃗ =0
v · ∇ρ
∂t ∂t
The Collisionless Boltzmann Equation I
Similarly, for our 6D flow in phase-space the continuity equation is given by

∂f
∂t
+∇ ⃗˙ = 0
⃗ · (f w)

and is called the Collisionless Boltzmann Equation (hereafter CBE).


To simplify this equation we first write out the second term:
#
6 3 $
# % 3 $
# %
∇ ⃗˙ =
⃗ · (f w) ∂(f ẇi )
∂wi
=f ∂vi
∂xi
+ ∂ v̇i
∂vi
+ ∂f
vi ∂x i
+ v̇ i
∂f
∂vi
i=1 i=1 i=1

Since ∂vi /∂xi = 0 (xi and vi are independent phase-space coordinates)


& '
∂ ∂Φ
and ∂ v˙i /∂vi = ∂v − ∂xi
= 0 (gradient of potential does not depend
i

on velocities), we may (using summation convention rewrite the CBE as


∂f ∂f ∂Φ ∂f
∂t
+ vi ∂x i
− ∂xi ∂vi
=0

or in vector notation
∂f
+⃗ ⃗ − ∇Φ
v · ∇f ⃗ · ∂f
=0
∂t ∂⃗
v
The Collisionless Boltzmann Equation II
Note: Since f = f (⃗
x, ⃗
v , t) we have that
∂f ∂f ∂f
df = ∂t
dt + ∂xi
dxi + ∂vi
dvi
and thus
df ∂f ∂f ∂Φ ∂f
dt
= ∂t
+ v
∂xi i
− ∂xi ∂vi
Using this we can write the CBE in its compact form:
df
dt
=0

df /dt expresses the Lagrangian derivative along trajectories through


phase-space, and the CBE expresses that this flow is incompressible. In
other words, the phase-space density f around the phase-point of a give star
always remains the same.

v˙ = −∇Φ
In the presence of collisions it is no longer true that ⃗ ⃗ , and the
CBE no longer holds. Rather, collisions result in an additional collision term:
df
dt
= Γ(t)

This equation is called the Master Equation. If Γ(t) describes long-range


collisions only, then we call it the Fokker-Planck Equation.
Coarse-Grained Distribution Function
We defined the DF as the phase-space density of stars in a volume d3 x⃗ d3 ⃗
v.
However, in our assumption of a smooth ρ(⃗ r ) and Φ(⃗r), the only
meaningful interpretation of the DF is that of a probability density.
Note that this probability density is also well defined in the discrete case,
even though it may vary rapidly. Since it has an infinitely high resolution, it is
often called the fine-grained DF.
Just as the wave-functions in quantum mechanics, the fine-grained DF is not
measurable. However, we can use it to compute the expectation value of any
phase-space function Q(⃗ v ).
x, ⃗
A measurable DF, one that is actually related to counting objects in a given
phase-space volume, is the so-called coarse-grained DF, f¯, defined as the
average value of the fine-grained DF, f , in some specified small volume:
!!
f¯(⃗ v0 ) =
x0 , ⃗ x−x
w(⃗ v0 ) f (⃗
v−⃗
⃗ 0, ⃗ v ) d3 x
x, ⃗ ⃗ d3 ⃗
v

with w(⃗ v ) some (properly normalized) kernel which rapidly falls to zero
x, ⃗
for |⃗
x| > ϵx and |⃗
v | > ϵv .
NOTE: the fine-grained DF does satisfy the CBE.
NOTE: the coarse-grained DF does not satisfy the CBE.
Moment Equations I
Although the CBE looks very simple (df /dt = 0), solving it for the DF is
virtually impossible. It is more practical to consider moment equations.
The resulting Stellar-Hydrodynamics Equations are obtained by multiplying
the CBE by powers of velocity and then integrating over all of velocity space.
Consider moment equations related to vil vjm vk
n
where the indices (i, j, k)
refer to one of the three generalized coordinates, and (l, m, n) are integers.
Recall that
! !
ρ= fd ⃗
3
v ρ⟨vil vjm vkn ⟩ = vil vjm vkn f d3⃗
v
The (l + m + n)th moment equation of the CBE is
! ! !
vil vjm vkn ∂f∂t
d ⃗
v+
3 ∂f
vil vjm vkn va ∂x d3 ∂Φ ∂f
vil vjm vkn ∂x
v−
⃗ a ∂va
d3
v=0


! ∂
! l m n a
∂Φ
! l m n ∂f 3
⇔ ∂t
vil vjm vkn f d3⃗ v+ ∂xi
vi vj vk va f d3⃗
− ∂xa vi vj vk ∂va d ⃗
v v=0
$ % $ % ! l m n ∂f 3
∂ ∂ ∂Φ
⇔ ∂t
ρ⟨vil vjm vkn ⟩ + ∂x i
ρ⟨vi vj vk va ⟩ − ∂xa vi vj vk ∂va d ⃗
l m n
v=0

1st term: integration range doesn’t depend on t so ∂t
may be taken outside

2nd term: ∂xi
doesn’t depend on vi , so derivative may be taken outside.
∂Φ
3rd term: ∂xi
doesn’t depend on vi , so may be taken outside.
Moment Equations II
Let’s consider the zeroth moment: l = m = n = 0
∂ρ ∂(ρ⟨vi ⟩) ∂Φ
! ∂f 3
∂t
+ ∂xi
− ∂xi ∂vi
d ⃗
v =0
Using the divergence theorem we can write
! !
∂f 3
d ⃗
v = ⃗=0
f d2 S
∂vi

where the last equality follows from the fact that f → 0 if |v| → ∞. The
zeroth moment of the CBE therefore reduces to
∂ρ ∂(ρ⟨vi ⟩)
∂t
+ ∂xi
=0

Note that this is the continuity equation, identical to that of fluid dynamics.
Just a short remark regarding notation:
!
⟨vi ⟩ = vi f d3⃗
v
is used as short-hand for
!
⟨vi (⃗
x)⟩ = vi (⃗
x)f (⃗ v )d3⃗
x, ⃗ v
Thus, ⟨vi ⟩ is a local expectation value. For brevity we do not explicitely write
the ⃗
x-dependence.
Moment Equations III
Next we consider the first-order moment equations (l, m, n) = (1, 0, 0) or
(0, 1, 0) or(0, 0, 1)
! ! !
vj ∂f
∂t
d3

v + ∂f 3
vj vi ∂x i
d ⃗
v − ∂Φ ∂f 3
vj ∂x i ∂vi
d ⃗
v=0
∂(ρ⟨vj ⟩) ∂(ρ⟨vi vj ⟩) ∂Φ
! ∂f 3
⇔ ∂t
+ ∂xi
− ∂xi
vj ∂v i
d ⃗v=0
Using integration by parts we write
! ∂f 3
! ∂(vj f ) 3 ! ∂vj
vj ∂v d ⃗
v = ∂vi
d ⃗v − ∂vi f d3⃗
v
i ! !
= vj f d2 S − δij f d3⃗v
= −δij ρ
so that we obtain
∂(ρ⟨vj ⟩) ∂(ρ⟨vi vj ⟩) ∂Φ
∂t
+ ∂xi
+ ρ ∂x j
=0

These are called the momentum equations. Note that this represents a set of
three equations (for j = 1, 2, 3), and that a summation over i is implied.
The Jeans Equations I
We can obtain the so-called Jeans Equations by subtracting ⟨vj ⟩ times the
continuity equation from the momentum equations:
First we write ⟨vj ⟩ times the continuity equation:
∂(ρ⟨vi ⟩)
⟨vj ⟩ ∂ρ
∂t
+ ⟨v j ⟩ ∂xi
=0
∂(ρ⟨vj ⟩) ∂⟨vj ⟩ ∂(ρ⟨vi ⟩⟨vj ⟩) ∂⟨vj ⟩
⇔ ∂t
−ρ ∂t
+ ∂xi
− ρ⟨vi ⟩ ∂xi
=0
Subtracting this from the momentum equations yields
∂(ρ⟨vi vj ⟩) ∂Φ ∂⟨vj ⟩ ∂(ρ⟨vi ⟩⟨vj ⟩) ∂⟨vj ⟩
∂xi
+ ρ ∂x j
+ρ ∂t
− ∂xi
+ ρ⟨vi ⟩ ∂xi
=0

2
If we define σij = ⟨vi vj ⟩ − ⟨vi ⟩ · ⟨vj ⟩ then we obtain

2
∂⟨v ⟩ ∂⟨v ⟩ ∂Φ ∂(ρσij )
ρ ∂tj + ρ⟨vi ⟩ ∂xji = −ρ ∂x j
− ∂xi

These are called the Jeans Equations. Once again, this represents a set of
three equations (for j = 1, 2, 3), and a summation over i is implied.
The Jeans Equations II
We can derive a very similar equation for fluid dynamics. The equation of
motion of a fluid element in the fluid is
ρ d⃗
v
= − ⃗ − ρ∇Φ
∇P ⃗ ext
dt
with P the pressure and Φext some external potential.

v=⃗
Since ⃗ v (⃗ v = ∂⃗
x, t) we have that d⃗ v
∂t
dt + ∂⃗
v
∂xi
dxi , and thus
& '
d⃗
v
= ∂v
+ ⃗ ⃗ ⃗
v·∇ v
dt ∂t

which allows us to write the equations of motion as


& '
ρ ∂v +ρ ⃗ ⃗ ⃗
v·∇ ⃗ − ρ∇Φ
v = −∇P ⃗ ext
∂t

These are the so-called Euler Equations. A comparison with the Jeans
2
Equations shows that ρσij has a similar effect as the pressure. However,
now it is not a scalar but a tensor.
2
ρσij is called the stress-tensor
The stress-tensor is manifest symmetric (σij = σji ) and there are thus 6
independent terms.
The Jeans Equations III
2
The stress tensor σij measures the random motions of the stars around the
streaming part ⟨vi ⟩⟨vj ⟩.
2
Note that the stress-tensor is a local quantity σij = σij
2
(⃗
x). At each point ⃗
x
it defines the velocity ellipsoid; an ellipsoid whose principal axes are defined
2
by the orthogonal eigenvectors of σij with lengths that are proportional to
the square roots of the respective eigenvalues.
The incompressible stellar fluid experiences anisotropic pressure-like forces.
Note that the Jeans Equations have 9 unknowns (3 streaming motions ⟨vi ⟩
and 6 terms of the stress-tensor). With only three equations, this is not a
closed set.
For comparison, in fluid dynamics there are only 4 unknowns (3 streaming
motions and the pressure). The 3 Euler Equations combined with the
Equation of State forms a closed set.
The Jeans Equations IV
One might think that adding higher-order moment equations of the CBE will
allow to obtain a closed set of equations. However, adding more equations
also adds more unknowns such as ⟨vi vj vk ⟩, etc. The set of CBE moment
equations never closes!

In practice one therefore makes some assumptions, such as assumptions


regarding the form of the stress-tensor, in order to be able to solve the Jeans
Eequations.

If, with this approach, a solution is found, the solution may not correspond to
a physical (i.e., everywhere positive) DF. Thus, although any real DF obeys
the Jeans equations, not every solution to the Jeans equations cprresponds
to a physical DF!!!
Cylindrically Symmetric Jeans Equations
As a worked out example we derive the Jeans equations under cylindrical
symmetry. We therefore write the Jeans equations in the cylindrical
coordinate system (R, φ, z).
The first step is to write the CBE in cylindrical coordinates
df ∂f ∂f ∂f
dt
= ∂t
+ Ṙ ∂R + φ̇ ∂φ + ż ∂f
∂z
+ v̇ R
∂f
∂vR
+ v̇ φ
∂f
∂vφ
+ v̇ z
∂f
∂vz

First we recall from vector calculus that


v = Ṙ⃗
⃗ eR + Rφ̇⃗
eφ + ż⃗
ez = vR⃗
eR + vφ⃗
eφ + vz ⃗
ez
from which we obtain that
a=
⃗ d⃗
v
dt
= R̈⃗ e˙ R + Ṙφ̇⃗
eR + Ṙ⃗ eφ + Rφ̈⃗ e˙ φ + z̈⃗
eφ + Rφ̇⃗ e˙ z
ez + ż⃗

e˙ R = φ⃗
Using that ⃗ e˙ φ = −φ̇⃗
eφ , ⃗ e˙ z = 0 we have that
eR , and ⃗
$ % $ %
a = R̈ − Rφ̇2
⃗ eR + 2Ṙφ̇ + Rφ̈ ⃗
⃗ eφ + z̈⃗
ez

vR = Ṙ ⇒ v̇R = R̈
vφ = Rφ̇ ⇒ v̇φ = Ṙφ̇ + Rφ̈
vz = ż ⇒ v̇z = z̈
Cylindrically Symmetric Jeans Equations
This allows us to write
$ 2

% / vR vφ 0
a = v̇R −
⃗ R
eR +
⃗ R
+ v̇φ ⃗
eφ + v̇z ⃗
ez

Newton’s equation of motion in vector form reads

⃗ ⃗ =
a = −∇Φ ∂Φ

e + 1 ∂Φ

e + ∂Φ

e
∂R R R ∂φ φ ∂z z

Combining the above we obtain


2
∂Φ vφ
v̇R = − ∂R + R
1 ∂Φ vR vφ
v̇φ = − R ∂R
+ R
∂Φ
v̇z = − ∂z

Which allows us to the write the CBE in cylindrical coordinates as


2

$ %
∂f ∂f vφ ∂f ∂f ∂Φ ∂f
∂t
+ vR ∂R + R ∂φ + vz ∂z + R − ∂R ∂v

$ % R
1 ∂Φ ∂f ∂Φ ∂f
R
v R v φ + ∂φ ∂vφ
− ∂z ∂vz
=0
Cylindrically Symmetric Jeans Equations
The Jeans equations follow from multiplication with vR , vφ , and vz and
integration over velocity space. Note that the symmetry requires that all
derivatives with respect to φ must vanish.
The remaining terms are:
! !
vR ∂f
∂t
d3

v = ∂
∂t
vR f d3⃗v = ∂(ρ⟨v ∂t
R ⟩)

! 2 ∂f 3 ∂
! 2 ∂(ρ⟨vR2
⟩)
vR ∂R d ⃗v = ∂R vR f d ⃗ 3
v= ∂R
! !
∂f 3
vR vz ∂z d ⃗ ∂
v = ∂z vR vz f d3⃗ v = ∂(ρ⟨v∂z R vz ⟩)

! 2 $! 2 ! 2 % 2
vR vφ ∂f 1 ∂(v v f ) ∂(v v ) ⟨vφ ⟩
d 3
= d 3
d 3
=
R R
R ∂vR

v R ∂vR
φ

v − ∂vR
φ
f ⃗
v −ρ R
! $! ! %
∂Φ ∂f ∂Φ ∂(vR f ) 3 ∂vR ∂Φ
vR ∂R ∂vR
d 3

v = ∂R ∂vR
d ⃗
v − ∂vR
f d 3

v = −ρ ∂R
! 2
$! 2 ! 2
% 2
vR vφ ∂f 1 ∂(v v f ) ∂(v v ) ⟨vR ⟩
R ∂vφ
d3

v = R
R φ
∂vφ
d3

v − R φ
∂vφ
f d 3

v = −ρ R
! $! ! ∂vz %
∂Φ ∂f 3 ∂Φ ∂(vR f ) 3
vR ∂z ∂vz d ⃗ v = ∂z ∂vz
d ⃗v − ∂vR f d3⃗ v =0
Cylindrically Symmetric Jeans Equations
Working out the similar terms for the other Jeans equations we obtain the
Jeans Equations in Cylindrical Coordinates
2
$ 2
⟨vR ⟩−⟨vφ 2

%
∂(ρ⟨vR ⟩) ∂(ρ⟨vR ⟩) ∂(ρ⟨vR vz ⟩) ∂Φ
∂t
+ ∂R
+ ∂z
+ ρ R
+ ∂R
=0
∂(ρ⟨vφ ⟩) ∂(ρ⟨vR vφ ⟩) ∂(ρ⟨vφ vz ⟩) ⟨vR vφ ⟩
∂t
+ ∂R
+ ∂z
+ $2ρ R
= 0%
2
∂(ρ⟨vz ⟩) ∂(ρ⟨vR vz ⟩) ∂(ρ⟨vz ⟩) ⟨vR vz ⟩ ∂Φ
∂t
+ ∂R
+ ∂z
+ ρ R
+ ∂z
= 0

Note that there are indeed 9 unknowns in these 3 equations. Only if we make
additional assumptions can we solve these equations. In particular, one
often makes the following assumptions:

(1) System is static ⇒ ∂t
-terms are zero and ⟨vR ⟩ = ⟨vz ⟩ = 0

(2) Stress Tensor is diagonal ⇒ ⟨vi vj ⟩ = 0 if i ̸= j


(3) Meridional Isotropy ⇒ ⟨vR
2
⟩ = ⟨vz2 ⟩ = σR
2
= σz2 ≡ σ 2
2
Under these assumptions we have 3 unknowns left: ⟨vφ ⟩, ⟨vφ ⟩, and σ 2 .
Cylindrically Symmetric Jeans Equations
Under the assumptions listed on the previous page the Jeans equation
reduce to
$ σ2 −⟨v2 ⟩ %
∂(ρσ 2 ) ∂Φ
∂R
+ρ R
φ
+ ∂R
=0
∂(ρσ 2 )
∂z
+ ρ ∂Φ
∂z
=0

Note that we have only 2 equations left: the system is still not closed.
If from the surface brightness we can estimate the mass density ρ(R, z)
and hence the potential Φ(R, z), we can solve the second of these Jeans
equations for the meridional velocity dispersion

1
!

σ (R, z) =
2
ρ
ρ ∂Φ
∂z
dz
z
and the first Jeans equation then gives the mean square azimuthal velocity
2
⟨vφ ⟩ = ⟨vφ ⟩2 + σφ
2

2
∂Φ R ∂(ρσ )
2
⟨vφ ⟩(R, z) = σ (R, z) +
2
R ∂R + ρ ∂R
2
Thus, although ⟨vφ ⟩ is uniquely specified by the Jeans equations, we don’t
know how it splits in the actual azimuthal streaming ⟨vφ ⟩ and the azimuthal
2
dispersion σφ . Additional assumptions are needed for this.
Spherically Symmetric Jeans Equations
A similar analysis but for a spherically symmetric system, using the spherical
coordinate system (r, θ, φ), gives the following set of Jeans equations
$ %
∂(ρ⟨vr ⟩) ∂(ρ⟨vr2 ⟩) ρ
∂t
+ ∂r
+ r
2⟨vr2 ⟩ − ⟨vθ2 ⟩ − ⟨vφ
2
⟩ + ρ ∂Φ
∂r
=0
$ & ' %
∂(ρ⟨vθ ⟩) ∂(ρ⟨vr vθ ⟩) ρ
∂t
+ ∂r
+ r 3⟨vr vθ ⟩ + ⟨vθ2 ⟩ − ⟨vφ 2
⟩ cotθ = 0
∂(ρ⟨vφ ⟩) ∂(ρ⟨vr vφ ⟩) ρ
∂t
+ ∂r
+ r
[3⟨vr vφ ⟩ + 2⟨vθ vφ ⟩cotθ] = 0

If we now make the additional assumptions that the system is static and that
also the kinematic properties of the system are spherical symmetric then
there can be no streaming motions and all mixed second-order moments
vanish. Consequently, the stress tensor is diagonal with σθ2 = σφ
2
. Under
these assumptions only one of the three Jeans equations remains:

∂(ρσr2 ) 2ρ
/ 0
∂r
+ r
σr2 − σθ2 + ρ ∂Φ
∂r
=0
Notice once again how the spherical Jeans equation is not sufficient to
determine the dynamics: if the density and potential are presumed known, it
contains two unknown functions σr2 (r) and σθ2 (r) which can therefore not
be determined both.
Spherically Symmetric Jeans Equations
It is useful to define the anisotropy parameter

σθ2 (r)
β(r) ≡ 1 − σr2 (r)

With β thus defined the Jeans equation can be written as


2
1 ∂(ρ⟨vr ⟩) β⟨vr2 ⟩
ρ ∂r
+ 2 r = − dΦ
dr

If we now use that dΦ/dr = GM (r)/r then we obtain


$ %
r⟨vr2 ⟩ d ln ρ d ln⟨vr2 ⟩
M (r) = − G d ln r
+ d ln r
+ 2β

Thus, if we can measure ρ(r), ⟨vr2 ⟩(r), and β(r), we can use the Jeans
equations to infer the mass profile M (r).
Consider an external, spherical galaxy. Observationally, we can measure the
projected surface brightness profile, Σ(R), which is related to the
luminosity density ν(r) = ρ(r)/Υ(r) as
!

Σ(R) = 2 √ν r2 dr
R r −R2

with Υ(r) the mass-to-light ratio.


Spherically Symmetric Jeans Equations

vr

α To Observer
R r

Similarly, the line-of-sight velocity dispersion is an observationally


accessible quantity. As the figure illustrates, it is related to both ⟨vr2 ⟩(r) and
β(r) according to
!

Σ(R)σp2 (R) = 2 ⟨(vr cos α − vθ sin α)2 ⟩ √ν r2 dr
R r −R2
! (
∞ ) ν r dr
= 2 ⟨vr2 ⟩ cos2
α+ ⟨vθ2 ⟩ sin2 α √ 2
R r −R2
! &

R 2
'
ν ⟨vr2 ⟩ r dr
= 2 1 − β r2 √ 2 2
R r −R
Spherically Symmetric Jeans Equations
The 3D luminosity density is trivially obtained from the observed Σ(R):
!

ν(r) = − π1 dΣ
dR
√ dR
r
2R −r 2

In general, we have three unknowns: M (r) (or equivalently ρ(r) or Υ(r)),


⟨vr2 ⟩(r) and β(r).
With our two observables Σ(R) and σp
2
(R) these can only be determined if
we make additional assumptions.
EXAMPLE 1: Assume isotropy (β(r) = 0). In this case we can use the Abel
inversion technique to obtain
!

d(Σσp2
)
ν(r)⟨vr2 ⟩(r) = − π1 dR
√ dR
r R −r 2
2

and the enclosed mass follows from the Jeans equation


$ %
r⟨vr2 ⟩ d ln ν d ln⟨vr2 ⟩
M (r) = − G d ln r
+ d ln r

Note that the first term uses the luminosity density ν(r) rather than the
2
mass density ρ(r), because σp is weighted by light rather than mass.
Spherically Symmetric Jeans Equations
The mass-to-light ratio now follows from

Υ(r) = R rM (r)
4π 0 ν(r) r 2 dr

which can be used to investigate whether system contains dark matter halo
or central black hole, but always under assumption that system is isotropic.
EXAMPLE 2: Assume a constant mass-to-light ratio: Υ(r) = Υ0 . In this
case the luminosity density ν(r) immediately yields the enclosed mass:
!r
M (r) = 4πΥ0 ν(r) r 2 dr
0

We can now use the Jeans Equation to write β(r) in terms of M (r), ν(r)
and ⟨vr2 ⟩(r). Substituting this in the equation for Σ(R)σp
2
(R) allows a
solution for ⟨vr2 ⟩(r), and thus for β(r). As long as 0 ≤ β(r) ≤ 1 the
model is said to be self-consistent witin the context of the Jeans equations.
Almost always, radically different models (based on radically different
assumptions) can be constructed, that are all consistent with the data and
the Jeans equations. This is often referred to as the mass-anisotropy
degeneracy. Note, however, that none of these models need to be physical:
they can still have f < 0.
The Virial Equations I
We can obtain an important tensor equation relating global properties of the
system, by multiplying the CBE by both vj and xk and then integrating over
the entire phase-space.
The first step of this has already been performed in our derivation of the
Jeans equations, and yielded the momentum equations
∂(ρ⟨vj ⟩) ∂(ρ⟨vi vj ⟩) ∂Φ
∂t
+ ∂xi
+ ρ ∂x j
=0

Multiplying all terms with xk and integrating over real space yields

! ! ∂(ρ⟨vi vj ⟩) 3 ! ∂Φ 3
∂t
ρxk ⟨vj ⟩d x = −
3
xk ∂xi
d ⃗x − ρ xk ∂xj
d ⃗
x
Using integration by parts the first term on the r.h.s. becomes
! ∂(ρ⟨vi vj ⟩) 3 !∂(ρxk ⟨vi vj ⟩) 3 !
xk ∂xi
d ⃗x = d x⃗− ρ⟨vi vj ⟩ ∂x k
d3

x
! ∂xi ∂xi
= − δki ρ⟨vi vj ⟩d3 x ⃗
!
= − ρ⟨vk vj ⟩d3 ⃗ x
= −2Kkj
where we have defined the kinetic energy tensor
1
!
Kij = 2
ρ⟨vi vj ⟩d3 x

The Virial Equations II
It is customary to split the kinetic energy tensor into contributions from
ordered and random motions:

Kij ≡ Tij + 12 Πij


where
1
! !
Tij ≡ 2
ρ ⟨vi ⟩ ⟨vj ⟩ d x
3
⃗ Πij ≡ 2
ρ σij d3 x

In addition to the K we also define the potential energy tensor


! ∂Φ 3
Wij ≡ − ρxi ∂x j
d ⃗
x

Combining the above we obtain



!
∂t
ρ xk ⟨vj ⟩d3 x = 2Kkj + Wkj
which allows us to write
1 d
!
2 dt
ρ [xk ⟨vj ⟩ + xj ⟨vk ⟩] = 2Kjk + Wjk

where we have used that K and W are symmetric.


The Virial Equations III
Finally we also define the moment of inertia tensor
!
Iij ≡ ρ xi xj d3 ⃗
x
Differentiating with respect to time, and using the continuity equation (i.e.,
the zeroth moment equation of the CBE) yields
dIjk ! ∂ρ
dt
= ∂t
xj x k d 3

x
! ∂ρ⟨vi ⟩
= − ∂x
xj x k d3

x
! ∂(ρ⟨vi ⟩xj xk ) 3
i ! ∂(xj xk ) 3
= − ∂xi
d ⃗
x+ ρ⟨vi ⟩ ∂xi
d x⃗
!
= ρ⟨vi ⟩ [xj δik + xk δij ] d3 x

!
= ρ [xj ⟨vk ⟩ + xk ⟨vj ⟩] d3 x

so that
1 d
! 2
1 d Ijk
2 dt
ρ [xk ⟨vj ⟩ + xj ⟨vk ⟩] = 2 dt2
which allows us to write the Tensor Virial Theorem as
2
1 d Ijk
2 dt2
= 2Tjk + Πjk + Wjk
which relates the gross kinematic and structural properties of gravitational
systems.
The Virial Equations IV
If the system is in a steady-state the moment of inertia tensor is stationary,
and the Tensor Virial Theorem reduces to 2Kij + Wij = 0.
Of particular interest is the trace of the Tensor Virial Theorem, which relates
the total kinetic energy K = 1 2
M ⟨v 2
⟩ to the total potential energy
1
!
W = 2 ρ(⃗ x) Φ(⃗ x) d3 ⃗
x.
#
3
1
! / 2 0 3
tr(K) ≡ Kii = 2
x) ⟨v1 ⟩(⃗
ρ(⃗ x) + ⟨v2 ⟩(⃗
2
x) + ⟨v3 ⟩(⃗
2
x) d ⃗x
i=1 !
1
= 2
ρ(⃗x ) ⟨v 2
⟩(⃗
x )d 3
x

1
= 2
M ⟨v 2 ⟩ = K
where we have used that !
1
⟨v ⟩ =
2
M
x)⟨v 2 ⟩(⃗
ρ(⃗ x)d3 ⃗
x
Similarly, the trace of the potential energy tensor is equal to the total
potential energy (see next page for derivation):
1
!
tr(W) = W = 2
x) Φ(⃗
ρ(⃗ x) d3 ⃗
x
We thus obtain the scalar virial theorem
2K + W = 0
The Potential Energy Tensor I
We have defined the potential energy tensor as
! ∂Φ 3
Wij ≡ − ρxi ∂x j
d ⃗
x
! ρ(⃗ x)
Using that Φ(⃗
x) = −G |⃗ ′
x −⃗ x|
d3
x
⃗ we obtain
!! ′
′ xi (xj −xj ) 3 ′ 3
Wij = G x) ρ(⃗
ρ(⃗ x ) |⃗x′ −⃗x|3 d ⃗xd x ⃗
Using that ⃗ x′ are dummy variables, we may relabel them, and write
x and ⃗
!! x′j (xk −x′k ) 3
Wij = G ρ(⃗
x ′
) ρ(⃗
x) |⃗x−⃗x′ |3 d ⃗ xd3 ⃗
x′
Interchanging the order of integration and summing the above two equations
yields the manifestly symmetric expression
!! ′ ′
′ (xj −xj )(xk −xk ) 3 ′ 3
Wij = −G
2
x) ρ(⃗
ρ(⃗ x) x′ −⃗
|⃗ x| 3
d x
⃗ d ⃗x
This expression allows us to write
#
3 !! ′ |⃗x′ −⃗x| 2 3 ′ 3
tr(W) ≡ Wii = −G
2
x)ρ(⃗
ρ(⃗ x ) |⃗x′ −⃗x|3 d ⃗
xd x ⃗
i=1
G
! ! ρ(⃗x′ ) 3 ′ 3 1
!
= − 2 ρ(⃗ x) x′ −⃗
|⃗ x|
d ⃗
xd ⃗x = 2
x)Φ(⃗
ρ(⃗ x)d3 x
⃗ =W
The Surface Pressure Term
In our derivation on the previous pages we obtained
! ∂(ρ⟨vi vj ⟩) 3 !∂(ρxk ⟨vi vj ⟩) 3 ! ∂xk 3
xk ∂xi
d ⃗x = d x
⃗ − ρ⟨v i v j ⟩ d ⃗
x
! ∂xi ∂xi
= − ρ⟨vk vj ⟩d3 ⃗ x = −2Kkj
where we have used that
! ∂(ρxk ⟨vi vj ⟩) 3 !
∂xi
d x⃗ = ρxk ⟨vk vj ⟩d2 S = 0
based on the assumption that ρ(r) = 0 when r → ∞. However, this is
only true for an isolated system with ‘vacuum’ boundary conditions.
In reality, a halo or galaxy is embedded in a cosmological density field, often
with ongoing infall. This yields a non-zero surface pressure. In its most
general form the scalar virial theorem therefore reads

2K + W + Sp = 0
with the surface pressure term
!
Sp = − ⟨v 2 ⟩⃗ ⃗
nd 2 S
r·⃗
As long as Sp ̸= 0 we thus expect that 2K/|W | ̸= 1.
See Shapiro et al. (astro-ph/0409173) for a detailed discussion.
The Virial Equations V
From a simple dimensional analysis one finds that |W | ∝ GM 2 /R with
M the system’s mass and R a characteristic radius.
A useful characteristic radius is the so-called gravitational radius defined by

GM 2
rg ≡ |W |

One can relate the gravitational radius to the half-mass radius rh , defined as
radius enclosing half the total mass. As shown by Spitzer (1969), typical
stellar systems have rg ≃ 2.5rh .
Combining this with the scalar virial theorem we can write that

rh ⟨v2 ⟩
M ≃ 2.5 G
which is a useful equation to obtain a (rough) estimate of the virial mass from
a measure of the half-mass radius and the rms motion
The Virial Equations VI
Using the scalar virial theorem we obtain
1
E = K + W = −K = 2
W

Consider the formation of a virialized object. If the system forms by


collecting material from large radii, the initial conditions are well
approximated by Kinit = Winit = Einit = 0.
Because of gravity the matter starts to collapse. Since W = −GM 2 /rg
this makes W more negative. At the same time K increases. Initially, during
the early collapse, E = T + W = 0.
After the first shell crossing, the system starts to virialize. When virialization
is complete, 2T + W = 0 and E = W/2.
Therefore, half the gravitational energy released by collapse is invested in
kinetic form. The system somehow disposes of the other half in order to
achieve a binding energy Eb = −E .
QUESTION Where does the other half of the energy go?
Application: M/L of Spherical Systems
As an application of the Virial Theorem, consider spherical, non-rotating
systems (spherical galaxies or globulars)

If the mass-to-light ratio Υ does not depend on radius then


! 1 Υ
!
Kxx = 2
ρ(⃗
x )⟨v 2
x ⟩d 3
x
⃗ = 2
x)⟨vx2 ⟩d3 ⃗
ν(⃗ x
where ν(⃗ x) = ρ(⃗x)/Υ is the 3D luminosity distribution, and Kxx is the
kinetic energy associated with motion in the x-direction

Since a spherical, non-rotating system is isotropic we have that


K = Kxx + Kyy + Kzz = 3Kxx
If one has observationally determined the surface brightness profile Σ(R)
2
and the line-of-sight velocity dispersion σp (R) then it is easy to see that

!
2π !
∞ !

K= 3Υ
2
dφ dRRΣ(R)σp2 (R) = 3πΥ dRRΣ(R)σp2 (R) ≡ ΥJ
0 0 0

where we defined the observationally accessible J = J (Σ, σp


2
)
Application: M/L of Spherical Systems
!

M 2 (r)
As seen in exersizes, for spherical system: W = −G
2 r2
dr
0
!r
Using that M (r) = 4π ρ(r ′ )r ′2 dr ′
0

where the density profile is related to Σ(R) according to


!

ρ(r) = −Υ
π

dR
√ dR
r R −r 2
2

we obtain that
,r -2
!
∞ ! !

W = −8Υ2 dr
r2
dr ′ r ′2 dΣ
dR
√ dR ≡ Υ2 J˜
0 0 r′ R −r 2
2

where we have defined the observationally accessible integral J˜ = J˜(Σ)


According to the virial theorem 2K + W = 0, and thus −2K/W = 1.
Substituting K = ΥJ and W = Υ2 J˜ we thus obtain that

Υ = − 2J

Flattening of Oblate Spheroids I
As another application of the virial theorem we relate the flattening of an
oblate spheroid to its kinematics.
Consider an oblate system with it’s symmetry axis along the z -direction.
Because of symmetry considerations we have that

⟨vR ⟩ = ⟨vz ⟩ = 0 ⟨vR vφ ⟩ = ⟨vz vφ ⟩ = 0


If we write that
⟨vx ⟩ = ⟨vφ ⟩ sin φ ⟨vy ⟩ = ⟨vφ ⟩ cos φ
we obtain
1
!
Txy = 2
ρ⟨vx ⟩⟨vy ⟩d3 x

1
!
2π !
∞ !

= 2
dφ sin φ cos φ dR dzρ(R, z)⟨vφ ⟩2 (R, z)
0 0 −∞
= 0
A similar analysis shows that all other non-diagonal elements of T , Π, and
W have to be zero.
In addition, because of symmetry considerations we must have that
Txx = Tyy , Πxx = Πyy , and Wxx = Wyy .
Flattening of Oblate Spheroids II
Given these symmetries, the only independent, non-trivial virial equations are
2Txx + Πxx + Wxx = 0, 2Tzz + Πzz + Wzz = 0
Taking the ratio we find that
2Txx +Πxx Wxx
2Tzz +Πzz
= Wzz

The usefulness of this equation lies in the fact that, for density distributions
that are constant on similar concentric spheroids, i.e., ρ = ρ(m2 ), the ratio
Wxx /Wzz depends only on the axis ratio c/a of the spheroids, and is
independent of the density profile! For an oblate body, to good approximation

Wxx
( c )−0.9
Wzz
≃ a

Let us start by considering isotropic, oblate rotators.

Then Πxx = Πzz = M σ̃ 2 , Tzz = 0 and Txx + Tyy = 2Txx = 1


2
M ṽ 2
.

Here M is the total mass, σ̃ 2 is the mass-weighted rms-average of the


intrinsic one-dimensional velocity dispersion, and ṽ 2 is the mass-weighted
rms rotation velocity.
Flattening of Oblate Spheroids III
Thus, for an isotropic, oblate rotators we have that
1 2
2 M ṽ +M σ̃
2 ( c )−0.9
M σ̃ 2
≃ a
which reduces to

+
σ̃
≃ 2[(c/a)−0.9 − 1]
This specifies the relation between the flattening of the spheroid and the
ratio of streaming motion to random motion. Note that you need a rather
large amount of rotation to achieve only modest flattening: c/a = 0.7
requires ṽ ∼ 0.9σ̃
Next consider a non-rotating, anisotropic, oblate system:
In this case Πxx = M σ̃xx
2
and Πzz = M σ̃zz
2
, and the virial theorem
gives that

σ̃zz
( c )0.45
σ̃xx
≃ a

Now a flattening of c/a = 0.7 requires only a small anisotropy of


σ̃zz /σ̃xx ≃ 0.85
Flattening of Oblate Spheroids IV
Finally, consider the general case of rotating, anisotropic, oblate systems
Now we have Πzz = (1 − δ)Πxx = (1 − δ)M σ̃ 2 , Tzz = 0 and
1
2Txx = 2
M ṽ 2
, where we have introduced the anistropy parameter δ < 1.
In this case the virial theorem gives

+
σ̃
≃ 2[(1 − δ) (c/a)−0.9 − 1]

This shows that observations of ṽ/σ̃ and the ellipticity ε = 1 − (c/a)


allow us to test whether elliptical galaxies are supported by rotation or by
anisotropic presssure.
A potential problem is that we can not directly measure ṽ nor σ̃ . Rather, we
measure properties that are projected along the line-of-sight. Furthermore, in
general we don’t see a system edge-on but under some unknown inclination
angle i. Note that i also affects the measured v and σ . As shown in Binney
& Tremaine, the overall effect is to move a point on the oblate rotator line
mainly along that line.
Flattening of Oblate Spheroids V
Open Circles: luminous ellipticals
Solid Circles: low−luminosity ellipticals

Oblate Isotropic Rotators

(from: Davies et al. 1983)

Observations reveal a dichotomy: luminous ellipticals are supported by


anisotropic pressure, while fainter ellipticals (and bulges) are consistent with
being oblate, isotropic rotators.
NOTE: If luminous ellipticals are anisotropic, there is no good reason why
they should be axisymmetric: massive ellipticals are triaxial
The Jeans Theorem I
RECALL: An integral of motion is a function I(⃗ v ) of the phase-space
x, ⃗
coordinates that is constant along all orbits, i.e.,
dI
= ∂I dxi
+ ∂I dvi
=⃗ ⃗ − ∇Φ
v · ∇I ⃗ · ∂I
=0
dt ∂xi dt ∂vi dt ∂⃗
v
Compare this to the CBE for a steady-state (static) system:
⃗ − ∇Φ
v · ∇f
⃗ ⃗ · ∂f
=0
∂⃗
v
Thus the condition for I to be an integral of motion is identical with the
condition for I to be a steady-state solution of the CBE. Hence:

Jeans Theorem Any steady-state solution of the CBE depends on the


phase-space coordinates only through integrals of motion. Any function of
these integrals is a steady-state solution of the CBE.

PROOF: Let f be any function of the n integrals of motion I1 , I2 , ...In then

df #
n
∂f dIk
dt
= ∂Ik dt
=0
k=1

which proofs that f satisfies the CBE.


The Jeans Theorem II
More useful than the Jeans Theorem is the Strong Jeans Theorem, which is
due to Lynden-Bell (1962).

Strong Jeans Theorem The DF of a steady-state system in which almost all


orbits are regular can be written as a function of the independent isolating
integrals of motion, or of the action-integrals.
Note that a regular orbit in a system with n degrees of freedom is uniquely,
and completely, specified by the values of the n isolating integrals of motion
in involution. Thus the DF can be thought of as a function that expresses the
probability for finding a star on each of the phase-space tori.
We first consider an application of the Jeans Theorem to Spherical Systems
As we have seen, any orbit in a spherical potential admits four isolating
integrals of motion: E, Lx , Ly , Lz .

Therefore, according to the Strong Jeans Theorem, the DF of any†


⃗ .
steady-state spherical system can be expressed as f = f (E, L)


except for point masses and uniform spheres, which have five isolating
integrals of motion
Jeans Theorem & Spherical Systems
If the system is spherically symmetric in all its properties, then
⃗ : ie., the DF can only depend on
f = f (E, L2 ) rather than f = f (E, L)
the magnitude of the angular momentum vector, not on its direction.
Contrary to what one might naively expect, this is not true in general. In fact,
as beautifully illustrated by Lynden-Bell (1960), a spherical system can rotate
without being oblate.

Consider a spherical system with f (E, L) ⃗ = f (E, −L) ⃗ . In such a system,


for each star S on a orbit O , there is exactly one star on the same orbit O
but counterrotating with respect to S . Consequently, this system is perfectly
spherically symmetric in all its properties.
Now consider all stars in the z = 0-plane, and revert the sense of all those
stars with Lz < 0. Clearly this does not influence ρ(r), but it does give the
system a net sense of rotation around the z -axis.
Thus, although a system with f = f (E, L2 ) is not the most general case,
⃗ are rarely considered in galactic dynamics.
systems with f = f (E, L)
Isotropic Spherical Models I
An even simpler case to consider is the one in which f = f (E).

r) + 12 [vr2 + vθ2 + vφ
Since E = Φ(⃗ 2
] we have that
! & '
1 1 2
⟨vr2 ⟩ = ρ
dvr dvθ dvφ vr2 f Φ + [v
2 r
+ vθ2 + vφ
2
]
! & '
1 1 2
⟨vθ2 ⟩ = ρ
dvr dvθ dvφ vθ f Φ + 2 [vr + vθ + vφ ]
2 2 2

! & '
1
2
⟨vφ ⟩= ρ
dvr dvθ dvφ vφ2
f Φ + 12 [vr2 + vθ2 + vφ
2
]
Since these equations differ only in the labelling of one of the variables of
integration, it is immediately evident that ⟨vr2 ⟩ = ⟨vθ2 ⟩ = ⟨vφ
2
⟩.

Assuming that f = f (E) is identical to assuming that the system is isotropic

Note that from


! & '
1 1 2
⟨vi ⟩ = ρ
dvr dvθ dvφ vi f Φ + [v
2 r
+ vθ2 + vφ
2
]

it is also immediately evident that ⟨vr ⟩ = ⟨vθ ⟩ = ⟨vφ ⟩ = 0. Thus, similar


as for a system with f = f (E, L2 ) a system with f = f (E) has no net
sense of rotation.
Isotropic Spherical Models II
In what follows we define the relative potential Ψ ≡ −Φ + Φ0 and relative
energy E = −E + Φ0 = Ψ − 1
2
v 2
. In general one chooses Φ0 such that
f > 0 for E > 0 and f = 0 for E ≤ 0
Now consider a self-consistent, spherically symmetric system with
f = f (E). Here self-consistent means that the potential is due to the
system itself, i.e.,
!
∇ Ψ = −4πGρ = −4πG
2
f (E)d3⃗
v

(note the minus sign in the Poisson equation), which can be written as

1 d
( ) !Ψ +
r 2 dr
r 2 dΨ
dr
= −16π G
2
f (E) 2(Ψ − E) dE
0

Note: Here we have chosen Φ0 so that Ψ(r → ∞) = 0. In systems with


infinite total mass, such as the logarithmic potential or the isothermal
sphere, the system is more conveniently normalized such that
!Ψ !Ψ
Ψ(r → ∞) = −∞. In that case 0
dE needs to be replace by −∞
dE .
Isotropic Spherical Models III
This relation may be regarded either as non-linear equation for Ψ(r) given
f (E), or as linear equation for f (E) given Ψ(r).

“from ρ to f ”
As an example, consider a stellar-dynamical system with a DF
& 2
'
ρ1 Ψ− 1
2v
f (E) = 2 )3/2
(2πσ0
exp σ02

The corresponding density is



!2Ψ !Ψ + 2
ρ(Ψ) = 4π f (E)v dv = 4π
2
f (E) 2(Ψ − E)dE = ρ1 eΨ/σ0
0 0
The Poisson equation reads
( ) 2 !r 2
1 d
r 2 dr
r 2 dΨ
dr
= −4πGρ1 e Ψ/σ0
⇒ dΨ
dr
= − 4πGρ
r2
1
r 2 eΨ/σ0 dr
0

Inspection shows that the solution for Ψ(r) and the corresponding ρ(r) are
2
σ0
Ψ(r) = −2σ02 lnr ρ(r) = 2πGr 2

which is the potential-density pair of a singular isothermal sphere.


Isotropic Spherical Models IV
Note that the DF of the singular isothermal sphere implies that
v2

f (v) ∝ e 2σ02

which is identical to a Maxwell-Boltzmann distribution, if we set σ02 = km BT


.
Therefore, the structure of a singular isothermal sphere is identical to that of
an isothermal self-gravitating sphere of gas.
The isothermal nature of this system becomes evident if we consider the
Jeans equation. For a system with f = f (E) there is only one non-trivial
Jeans equation:
1 dρσ 2 dΨ
ρ dr
= dr

where σ 2 ≡ ⟨vr2 ⟩ = ⟨vθ2 ⟩ = ⟨vφ


2
⟩.
Substituting the expressions for ρ and Ψ this yields

σ 2 (r) = σ02
thus the local velocity dispersion, which is related to the “temperature”, is
independent of r .
Eddington’s Formula
“from f to ρ”
Using that Ψ is a monotonic function of r , so that ρ can be regarded as a
function of Ψ, we have
! !Ψ +
ρ(Ψ) = fd ⃗
v = 4π
3
f (E) 2(Ψ − E)dE
0

differentiating both sides with respect to Ψ yields


!Ψ f (E) dE
√1 = √
8π dΨ Ψ−E
0
which is an Abel integral equation, whose solution is

!E dρ
f (E) = √1 d √dΨ
8π 2 dE dΨ E−Ψ
0

This is called Eddington’s formula, which may also be written in the form
1 2
!E & '
d2 ρ dρ
f (E) = √1 2
√dΨ + √1
8π 2 dΨ E−Ψ E dΨ Ψ=0
0
Eddington’s Formula
Given a spherically symmetric density distribution, which can be written as
ρ = ρ(Ψ) (which is not always possible), Eddington’s formula yields a
corresponding DF f = f (E).
Note, however, that there is no guarantee that the solution for f (E) satisfies
the physical requirement that f ≥ 0 for all E .
Using Eddington’s formula

!E dρ
f (E) = √1 d √dΨ
8π dE
2 dΨ E−Ψ
0

we see that the requirement f (E) ≥ 0 is identical to the the requirement


that the function
!E dρ
√dΨ
dΨ E−Ψ
0

is an increasing function of E .
If a density distribution ρ(r) does not satisfy this requirement, then the
model obtained by setting the anisotropy parameter β = 0 [i.e., by
assuming that f = f (E)] and solving the Jeans Equations is unphysical.
Anisotropic Spherical Models I
In the more general case, spherical systems (with spherical symmetry in all
their properties) have f = f (E, L2 ).

These models are anisotropic, in that ⟨vr2 ⟩ ̸= ⟨vθ2 ⟩ = ⟨vφ


2
⟩.
Anisotropic spherical models are non-unique: many different f (E, L2 ) can
correspond to a given ρ(r) and Ψ(r). These models differ, though, in their
dynamic properties. No equivalent of the Edddington Formula thus exists,
that allows to compute f (E, L2 ) given ρ(r).
Additional assumptions need to be made. For example, Kent & Gunn (1982)
discussed models with f (E, L2 ) = g(E)L−2β , which have a constant
anisotropy, i.e., β(r) = β .
An other example are the so called Osipkov-Merritt models (Osipkov 1979;
Merritt 1985) were the assumption is made that f (E, L2 ) = f (Q) with

L2
Q=E− 2
2ra

Here ra is the so-called anisotropy radius.


Anisotropic Spherical Models II
The usefulness of the Osipkov-Merritt models becomes apparent from
& ' !Ψ +
r2
ρQ (r) ≡ 1 + 2
ra
ρ(r) = 4π f (Q) 2(Ψ − Q)dQ
0

Thus [ρQ (r), f (Q)] are similarly related as [ρ(r), f (E)] so that we may
use Eddington’s formula to write

!Q dρQ
f (Q) = √1 d √ dΨ
8π dQ
2 dΨ Q−Ψ
0

For Osipkov-Merritt models one can show that


r2
β(r) = r 2 +ra
2

Thus, these models are isotropic for r ≪ ra , become radially anisotropic at


around ra , and become competely radial at large r .
Since purely radial orbits contribute density at the center, models with
constant density cores can only have DFs of the Osipkov-Merritt form, i.e.,
f = f (E, L2 ) = f (Q), for sufficiently large ra . Alternatively, if ra is
relatively small, the (self-consistent) ρ(r) needs to have a central cusp.
Anisotropic Spherical Models III
Next we consider the family of Quasi-Separable DFs (Gerhard 1991):
L
f (E, L2 ) = g(E) h(x) x= L0 +Lc (E)

with L0 a constant, and Lc (E) the angular momentum of the circular orbit
with energy E .
Here g(E) controls the distribution of stars between energy surfaces, while
the circularity function h(x) describes the distribution of stars over orbits of
different angular momenta on surfaces of constant E .
Depending on the choise for h(x) one can construct models with different
anisotropies. If h(x) decreases with increasing x, the model will be radially
anisotropic, and vice versa.
Once a choise for h(x) is made, one can (numerically) obtain g(E) for a
given ρ(r).
All these various models are useful to explore how different orbital
anisotropies impact on observables, such as the line-of-sight velocity
distributions (LOSVDs).
Anisotropic Spherical Models IV

Velocity profiles, L(v), for the outer parts of spherical f (E, L2 ) models.
Results are shown for β = ∞ (circular orbits), −1, 0 (isotropic model), 0.5,
and 1 (radial orbits). The unit of velocity is the velocity dispersion, which is
different for each curve. (from: van der Marel & Franx 1993)

Velocity profiles are not expected to be Gaussian


Spherical Models: Summary I
In its most general form, the DF of a static, spherically symmetric model has
⃗ . From the symmetry of individual orbits one can see
the form f = f (E, L)
that one always has to have

⟨vr ⟩ = ⟨vθ ⟩ = 0 ⟨vr vφ ⟩ = ⟨vr vθ ⟩ = ⟨vθ vφ ⟩ = 0

This leaves four unknowns: ⟨vφ ⟩, ⟨vr2 ⟩, ⟨vθ2 ⟩, and ⟨vφ


2

If one makes the assumption that the system is spherically symmetric in all
⃗ → f (E, L2 ) and
its properties then f (E, L)

⟨vφ ⟩ = 0 ⟨vθ2 ⟩ = ⟨vφ


2

In this case the only non-trivial Jeans equation is


2
1 ∂(ρ⟨vr ⟩) β⟨vr2 ⟩
ρ ∂r
+ 2 r = − dΦ
dr

with the anisotropy parameter defined by

⟨vr2 ⟩(r)
β(r) = 1 − ⟨vr2 ⟩(r)
Spherical Models: Summary II
Many different models, with different orbital anisotropies, can correspond to
the same density distribution. Examples of models are:
• f (E, L2 ) = f (E) isotropic model, i.e., β = 0

• f (E, L2 ) = g(E)δ(L) radial orbits only, i.e. β = 1

• f (E, L2 ) = g(E)δ[L − Lc (E)] circular orbits only, i.e., β = −∞

• f (E, L2 ) = g(E)L−2β constant anisotropy, i.e. β(r) = β

• f (E, L2 ) = g(E)h(L) anisotropy depends on circularity function h

• f (E, L2 ) = f (E + L2 /2ra2 ) center isotropic, outside radial

Suppose I have measured the surface brightness profile Σ(R) and the
2
line-of-sight velocity dispersion σp (R). Depending on the assumption
regarding β(r) these data imply very different mass distributions M (r).
One can (partially) break this mass-anisotropy degeneracy by using
information regarding the LOSVD shapes.
Axisymmetric Models
Next we consider axisymmetric systems. If we only consider systems for
which most orbits are regular, then the strong Jeans Theorem states that, in
the most general case, f = f (E, Lz , I3 ).
From the symmetries of the individual orbits, it is evident that in this case

⟨vR ⟩ = ⟨vz ⟩ = 0 ⟨vR vφ ⟩ = ⟨vz vφ ⟩ = 0

Note that, in this case, ⟨vR vz ⟩ ̸= 0, which is immediately evident when


considering a thin tube orbit. In other words, in general the velocity ellipsoid
is not aligned with (R, φ, z).
Thus, in a three-integral model with f = f (E, Lz , I3 ) the stress tensor
2 2
contains four unknowns: ⟨vR ⟩, ⟨vφ ⟩, ⟨vz2 ⟩, and ⟨vR vz ⟩.
In this case there are two non-trivial Jeans Equations:
2
$ ⟨v2 ⟩−⟨v2 ⟩ %
∂(ρ⟨vR ⟩) ∂(ρ⟨vR vz ⟩) ∂Φ
∂R
+ ∂z
+ρ R
R
φ
+ =0
2
$ %∂R
∂(ρ⟨vR vz ⟩) ∂(ρ⟨vz ⟩) ⟨vR vz ⟩ ∂Φ
∂R
+ ∂z
+ρ R
+ ∂z
=0

which clearly doesn’t suffice to solve for the four unknowns.


f (E, Lz ) Models I
To make progress, one therefore often makes the additional assumption that
the DF has the two-integral form f = f (E, Lz ).
!
In this case we have that ρ = v2 <2Ψ f (Ψ − 1
2
v 2 , Rvφ )d3⃗
v
where we have imposed the usual condition f = 0 for E < 0.
Let ⃗
vm be the meridional component of ⃗ v and define the cylindrical
coordinates (vm , vφ , ψ) in velocity space, with
vR = vm cos ψ , vz = vm sin ψ
then

!
2π !2Ψ !
ρ= dψ vm dvm 2 <(2Ψ−v 2 )

f [Ψ − 12 (vm
2
+ vφ
2
), Rvφ ]dvφ
m
0 0

Note that one can see now that


1
!
⟨vR vz ⟩ = ρ
vR vz f (Ψ − 12 v 2 , Rvφ )d3⃗
v2 <2Ψ
v=0
! 2π
which follows from the fact that 0 sin ψ cos ψ dψ = 0. Thus, models
with f = f (E, Lz ) have their velocity ellipsoid aligned with (R, φ, z).
f (E, Lz ) Models II
! 2π 2
! 2π
In addition, since 0 sin ψdψ = 0
cos2 ψdψ = π we have that

!2Ψ !
2
⟨vR ⟩ = ⟨vz2 ⟩ =π 3
vm dvm 2 <(2Ψ−v 2 )

f [Ψ − 12 (vm
2
+ vφ
2
), Rvφ ]dvφ
m
0
Thus, we have that

f = f (E, Lz ) =⇒ 2
⟨vR ⟩ = ⟨vz2 ⟩ and ⟨vR vz ⟩ = 0
2 2
Now we have two unknowns left, ⟨vR ⟩ and ⟨vφ ⟩, and the Jeans equations
reduce to
2
$ ⟨v2 ⟩−⟨v2 ⟩ %
∂(ρ⟨vR ⟩) ∂Φ
∂R
+ρ R
R
φ
+ ∂R
=0
2
∂(ρ⟨vz ⟩)
∂z
+ ρ ∂Φ
∂z
=0

which can be solved. Note, however, that the Jeans equations provide no
2
information regarding how ⟨vφ ⟩ splits in streaming and random motions.
In practice one often follows Satoh (1980), and writes that
$ %
⟨vφ ⟩2 = k ⟨vφ
2 2
⟩ − ⟨vR ⟩ . Here k is a free parameter, and the model is
isotropic for k = 1.
f (E, Lz ) Models III
Now let us return to our expression for the density ρ. If we change the
variables of integration from (vm , vφ ) to (E, Ψ), we obtain


!Ψ !
ρ = 0
dE L2 <2(Ψ−E)R2
f (E, Lz )dLz
R z√

!Ψ ! R 2(Ψ−E)
= R 0
dE 0 √
[f (E, Lz ) + f (E, −Lz )] dLz

!Ψ ! R 2(Ψ−E)
= R 0
dE 0
f+ (E, Lz )dLz

where we have defined f+ as the part of the DF that is even in Lz , i.e.,

f (E, Lz ) = f+ (E, Lz ) + f− (E, Lz )


f± (E, Lz ) ≡ 12 [f (E, Lz ) ± f (E, −Lz )]

We thus see that the density depends only on the even part of the DF (i.e., the
density contributed by a star does not depend on its sense of rotation). This
also implies that there are infinitely many DFs f (E, Lz ) that correspond to
exactly the same ρ(R, z), namely all those that only differ in f− (E, Lz ).
f (E, Lz ) Models IV
Thus, given a density distribution ρ(R, z), one can in principle obtain a
unique f+ (E, Lz ). In practice, the computation of f+ (E, Lz ) was
considered very difficult.
It was thought that one needs to (i) be able to express ρ explicitely as a
function of R and Ψ, and (ii) perform a complicated integral transform.
However, this situation changed drastically when Hunter & Qian (1993)
derived an axisymmetric analogue of Eddington’s Formula based on a
complex contour integral, which does not require explicit knowledge of
ρ(R, Ψ).
In addition, Evans (1993, 1994) has found a large and useful family of models
for which all relevant calculations can be done analytically. This is the family
of power-law models, which we already encoutered in the exersizes.
So for a wide range of ρ(R, z), we can compute the unique, corresponding
f+ (E, Lz ). But what about the odd part of the DF, f− (E, Lz )?
While f− (E, Lz ) has no influence on the density distribution, it specifies
the asymmetry between clockwise and counter-clockwise orbits. Hence, it is
responsible for the mean streaming velocity.
f (E, Lz ) Models V
In fact, it is straightforward to show that


!Ψ ! R√2(Ψ−E)
⟨vφ ⟩ = ρR2 0
dE 0
f− (E, Lz ) Lz dLz

In principle, if we were to know ⟨vφ ⟩(R, z), we could solve for f− (E, Lz )
using the Hunter & Qian complex contour integral method, just like we can
recover ρ(R, z) from f+ (E, Lz ).
In practice, the observationally accessible quantities are Σ(x, y) and
vlos (x, y). Unless the system is seen edge-on, one can not uniquely
deproject these for ρ(R, z) and ⟨vφ ⟩(R, z).
f (E, Lz ) Models VI
Most f (E, Lz )-modelling therefore uses the following methodology:
(1a) Assume functional form for ν(R, z) and value for inclination angle i.
Project ν along line-of-sight and adjust free parameters by fitting Σ(x, y).
(1b) Assume value for i and deproject Σ(x, y) using some assumptions.
Examples are the Richardson-Lucy algorithm (Binney, Davies & Illingworth
1990) and the Multi-Gaussian-Expansion method (Emsellem, Monnet &
Bacon 1994). Warning, unless i = 90o these are not unique!
(2) Assume mass-to-light ratio, Υ(R, z), and compute Ψ(R, z) from
ρ(R, z) = Υ(R, z)ν(R, z) using the Poisson equation.
2 2
(3) Solve Jeans Equations for ⟨vR ⟩(R, z) and ⟨vφ ⟩(R, z).
2
(4) Make assumptions regarding split of ⟨vφ ⟩ in streaming motion and
random motion (i.e., pick a value for the Satoh k-parameter).
(5) Project model and compare resulting vlos (x, y) and σlos (x, y) to data
obtained from spectroscopy.
(6) Repeat analysis to constrain i, k and Υ(R, z).

For examples, see Binney, Davies & Illingworth (1990), van der Marel (1991),
Cretton & van den Bosch (1999)
f (E, Lz ) Models VII
The results of f (E, Lz ) modeling can be summarized as follows

• Because σR = σz , flattening of oblate models must come from large


φ-motions; f must be biased towards high-Lz orbits.
• Prolate models require a deficit of high Lz orbits. Strongly elongated,
prolate systems with f = f (E, Lz ) > 0 do not exist.
• Isotropic oblate models, i.e. those with k = 1, in general give poor fits
to the data.
• Anisotropic models fit some galaxies, but not all; those must have
f (E, Lz , I3 ) or be triaxial.
• There is a degeneracy between the mass-to-light ratio and the
anisotropy (e.g., Binney & Mammon 1982). Can be broken by using
higher-order Jeans Equations plus observational constraints on LOSVD
shape (e.g., skewness & kurtosis).
• Several studies have used these models to argue in favor of massive
black holes. However, if f (E, Lz ) model can only fit data with BH, this
is still no proof: need to show that f ≥ 0, and that no f (E, Lz , I3 )
models without BH can fit data equally well.
Example: NGC 4342
Photometry (ground−based + HST) plus MGE fit.

Rotation velocities and velocity dispersions along major axis (WHT + HST)

(from: Cretton & van den Bosch 1999)


Example: NGC 4342
Jeans Models

k = 1, i = 90o

σp in center requires BH
Note: vrms is not well fitted
This is independent of the
assumed value for k, which
only determines how vrms
splits in vrot and σp

⇒ f = f (E, Lz , I3 )

(Cretton & van den Bosch 1999)


Three-Integral Models
In our discussion on orbits we have seen that most orbits in realistic,
axisymmetric galaxy potentials are regular, quasi-periodic, and confined to
the surfaces of invariant tori.
Therefore, most orbits admit three isolating integrals of motion in involution.
According to strong Jeans Theorem, we thus expect f = f (E, Lz , I3 ).
In this case the two non-trivial Jeans equations have four unknowns and can
not be solved. In addition, there is no equivalent of Eddington’s formula to
obtain “f from ρ”.
In the past, axisymmetric three integral models have been constructed using
• special, separable potentials (Stäckel potentials), for which I3 is known
exactly (Bishop 1986; de Zeeuw & Hunter 1990)
• approximate third integrals (Petrou 1983). The most detailed work along
this direction is due to Dehnen & Gerhard (1993), who evaluated the
approximate I3 from resonant perturbation theory.
• orbit superposition techniques. These are based on integrating large
numbers of orbits, and then to find the combination of orbits that best
matches the data (Schwarzschild 1979; Richstone 1984; Cretton et al. 1999).
This method has recently received much attention.
Three-Integral Models
Schwarzschild’s orbit superposition technique for modelling axisymmetric
galaxies is based on the following steps:
(1) Use techniques described above to obtain model for 3D light distribution
ν(R, z), under an assumed value for the inclination angle i.
(2) Assume a value for the stellar mass-to-light ratio Υ, and compute the
corresponding potential Ψ(R, z) from the Poisson equation. To this
potential one may add that of a central BH and/or a dark matter halo.
(3) Integrate a large sample of orbits in the total potential. Make sure to cover
full allowed ranges of E , Lz , and I3 : sampling (E, Lz )-space is trivial,
while I3 is sampled by location of turning point on zero-velocity curve.
(4) Compute for each orbit k it’s contribution akj to each observable j , such
as the value of the velocity profile L(vj ) at the projected location (xj , yj ).
(5) Find the orbital weights wk ≥ 0 that minimize the quantity
# # 2
j [L(xj , yj , vj ) − k w a ]
k kj . Since the number of orbits is typically
much larger than the number of observational constraints, this is typically
done using a Non-Negative Least Squares algorithm.
(6) Repeat entire analysis for different i, Υ, MBH , etc. in order to constrain
these free parameters.
Example: NGC 4342

(Cretton & van den Bosch 1999)


Models without BH are now clearly ruled out. Note though, that unlike the 2I
Jeans models, the 3I models can accurately fit the ground-based V (R) and
σ(R). This owes to the larger amount of freedom of these models.
Triaxial Systems I
The dynamics of triaxial systems is much more complicated than that of
axisymmetric or spherical systems. The main reasons are the lack of
symmetry, and the presence of four main orbit families (as opposed to one).

Motivation for considering Triaxial Systems


• Slow rotation of (massive) ellipticals (Bertola & Capaccioli 1975; Illingworth
1977) implies that they are inconsistent with isotropic oblate rotators. They
are supported by anisotropic pressure, and, as argued by Binney (1978), are
therefore more likely to be triaxial then axisymmetric.
• Some ellipticals reveal zero-velocity curves that are misaligned with
principal axes. This implies triaxiality, as the presence of both long- and
short-axis tubes means that the total angular momentum vector may point
anywhere in the plane containing both the long and the short axes (Franx,
Illingworth & de Zeeuw 1991).
• Some galaxies reveal isophotal twists, in which the position angle of the
isophotes changes with radius. This has a most natural explanation if these
systems are triaxial with axis ratios that vary with radius (e.g., Stark 1977)
• Numerical simulations show that collapsed haloes are often triaxial (van
Albada 1982; Warren et al. 1992)
Observational Hints for Triaxiallity
Isophotal Twist Kinematically Decoupled Core

VCC1010

NGC4365

Ryden et al 1999 The Sauron Team


Triaxial Systems II
But, case for triaxiality may not be that strong:
• N -body simulations that include a dissipative component often reveal
evolution towards axisymmetry (Udry 1993; Dubinsky 1994)
• Pressence of central BH tends to drive system towards axisymmetry
(Norman, May & van Albada 1985; Merritt & Quinlan 1998). In both cases, this
is due to fact that box orbits become stochastic in steep potential.
• In realistic systems with steep central cusp large fraction of phase-space is
occupied by stochastic orbits. Precludes stationary triaxial solutions
(Schwarzschild 1993; Merritt 1997).
• Anistropic pressure in axisymmetric systems can also explain
slow-rotators if system has sufficient amount of counter-rotation. Some
axisymmetric systems like this are known (e.g., NGC 4550; Rix et al. 1992).
• Low luminosity ellipticals, in general, lack isophotal twists, are strongly
cusped, and are rotationally supported: they are perfectly consistent with
being axisymmetric (Bender et al. 1989; Ferrarese et al. 1994; Faber et
al. 1997).
Self-Consistent Triaxial Models
In triaxial systems that are close to separable (i.e., most orbits are regular),
the strong Jeans Theorem implies that f = f (E, I2 , I3 ).
The seminal work of Schwarzschild (1979, 1982) showed that self-consistent
models of triaxial systems exist, both with stationary figures and with slowly
tumbling figures. To this extent he used orbit superposition techniques.
In particular, Schwarzschild’s work has shown that many different orbital
configurations, i.e., many different f (E, I2 , I3 ), can produce the same
density distribution. However, these can have very different kinematical
structures (Statler 1987, 1991).
The orbit superposition technique only works well when most orbits are
regular (in separable potentials, or in potentials with large cores).
In more realistic systems with central cusps, large fraction of phase-space is
occupied by stochastic orbits. These are difficult to deal with in
orbit-superposition techniques: depends on rate of stochastic diffusion.
Most work has focussed on class of separable (Stäckel) potentials (Kuzmin
1973; de Zeeuw 1985) and on scale-free potentials (Gerhard & Binney 1985).
Probably these are not very realistic, but they provide useful insight
Kinematics I
We now focus shortly on how to extract kinematic information from spectra
of galaxies.
The spectrum at a given location (x, y) of the projected galaxy is the sum of
the spectra of all stars along the line-of-sight (los) Doppler shifted according
to their velocity along the los
For a source with los-velocity v the observed and emitted frequencies are
related according to

νobs = (1 − β)γνem

with β ≡ v/c and γ ≡ (1 − β 2 )−1/2 . To lowest order in v/c we have that


γ = 1 and thus
( v
)
νobs = 1 − c
νem
Using that the wavelength λ = c/ν we obtain, again to lowest order in v/c
that
( v
)
λobs = 1 + c
λem
Kinematics II
If we now define x ≡ lnλ then
& ' ( )
λobs ∆v ∆v
∆x = ln λem
= ln 1 + c
≃ c

where we have used the first term of the series expansion of ln(1 + x)
Let S(x) correspond to an observed spectrum, rebinned linearly in x. Then
we may write that

S(x) = T (x) ⊗ B(x)


Here T (x) is the template spectrum, which is luminosity weighted spectrum
of all the various stars along the los, B(x) is the broadening function, which
gives the probability distribution of the los velocities of all these stars, and ⊗
denotes convolution.
Since convolution in real space corresponds to multiplication in Fourier
space we have that

Ŝ(k) = T̂ (k) · B̂(k)

where F̂ indicates the Fourier Transform of F , and k is the wavenumber.


Kinematics III
This gives us immediately an unparameterized method to obtain the
broadening function from the observed spectrum
$3%

B(x) = T̂

This method has the advantage that no assumption is made regarding the
functional form of B(x). However, the disadvantage is that one needs to
adopt some noise filtering which introduces correlations between different
points in the B(x) estimate. This complicates comparison with models.
The alternative is to use a parameterized method, by assuming a functional
form B̃(x) for the broadening function which has n free parameters. The
best-fit parameters are obtained by minimizing
! $ %2
χ2 = S(x) − B̃(x) ⊗ T (x) dx

Rather than talking about the broadening function B(x) one often prefers to
talk about the velocity profile L(v) ≡ B(v/c), also called the LOSVD
Kinematics IV
A typical functional form to assume for L(v) is a simple Gaussian
1 2
L(v) = √1 e −2w
2πσ

where w ≡ (v − V )/σ . Note that this parameterization has two free


parameters: V and σ .
However, L(v) is generally not Gaussian, and a more general
parameterization is required. Van der Marel & Franx (1993) and Gerhard
(1993) have suggested using the Gauss-Hermite series
1 2
−1 w2
#
N
L(v) = √1 e 2 1+ hj Hj (w)
2πσ
j=3

where Hl (x) are Hermite Polynomials of degree l, and hj are additional free
parameters that describe the deviation of L(v) from a pure Gaussian
(i.e., hj = 0 for a Gaussian LOSVD).
Typically one truncates the series expansion at N = 4 so that the LOSVD
has four free parameters: V , σ , h3 , and h4 . Their best-fit values are
determined by minimizing the χ2 defined above.
Kinematics V

(from: van der Marel & Franx 1993)


Note that the odd Gauss-hermite coefficients (h3 , h5 , etc.) characterize
anti-symmetric deviations of L(v) from a Gaussian, while the even
coefficients (h4 , h6 , etc.) characterize the symmetric deviations.
The LOSVD shape contains useful information to break the degeneracy
between mass and anisotropy
The Relaxation Puzzle
Relaxation: the process by which a physical system acquires equilibrium or
Relaxation: returns to equilibium after a disturbance.
Often, but not always, relaxation erases the system’s ‘knowledge’ of its initial
conditions.
In the first lecture we defined the two-body relaxation time as the time
required for a particle’s kinetic energy to change by its own amount due to
long-range collisions. We found that
N
trelax = t
10lnN cross

For galaxies and DM haloes one always finds that trelax ≫ tHubble . As first
pointed out by Zwicky (1939), how is it possible that galaxies appear relaxed?
In particular, if galaxies did not form near equilibrium, two-body relaxation
certainly did not have the time to get them there. This would imply that
galaxies all form in (virial) equilibrium, which seems very contrived.

The solution to this puzzle is that alternative relaxation mechanisms exist.


Relaxation Mechanisms
In gravitational N -body systems the four most important relaxation
mechanisms are:

• Phase Mixing The spreading of neighboring points in phase-space due to


• the difference in frequencies between neighboring tori.
• Chaotic Mixing The spreading of neighboring points in phase-space due
• to the chaotic nature of their orbits.
• Violent Relaxation The change of energy of individual particles due to
• the change of the overall potential.
• Landau Damping The damping (and decay) of perturbations in the density
• and/or potential.

We will discuss each of these in turn.


As we will see later, Violent Relation and Landau Damping are basically
specific examples of Phase Mixing!!.
Phase Mixing I

Consider circular motion in a disk with Vc (R) = V0 = constant. The


frequency of a circular orbit at radius R is then
1 V0
ω= T
= 2πR
Thus, points in the disk that are initially close will separate according to
( V0 )
∆φ(t) = ∆ R t = 2π∆ωt
We thus see that the timescale on which the points are mixed over their
accessible volume in phase-space is of the order of
1
tmix ≃ ∆ω
Phase Mixing II
In this example the phase-mixing occurs purely in ‘real space’. In the more
general case, however, the phase-mixing occurs in phase-space.
Phase mixing is the simplest mechanism that causes relaxation in
gravitational N -body systems. Because the frequencies of regular motion
on adjacent invariant tori are generally similar but different, two points on
adjacent tori that are initially close together in phase-space, will seperate
linearly with time. However, two points on the same torus do not phase-mix;
their separation remains invariant.
Note that phase-mixing decreases the coarse-grained DF around a point, by
mixing in ‘vacuum’ (i.e., unpopulated regions of phase-space). Nevertheless,
as assured by the CBE, the flow in phase-space of a collisionless system is
perfectly incompressible: unlike the coarse-grained DF, the fine-grained DF
is not influenced by phase-mixing and is perfectly conserved.
Although phase-mixing is a relaxation mechanism, in that it drives the
system towards a state in which the phase-space density is more and more
uniform, it does not cause any loss of information: the system preserves all
knowledge of the initial conditions.
In other words, in an integrable, Hamiltonian system phase mixing is a
time-reversible relaxation mechanism!
The Lyapunov Exponent I
The Lyapunov exponent of a dynamical system is a measure that determines,
for a given point in phase space, how quickly trajectories that begin in this
point diverge over time.
Actually, for each point in a 2N -dimensional phase space (N is the number
of degrees of freedom) there are 2N Lyaponov exponents λi , but it is
common to just refer to the largest one.
Consider a small 2N -dimensional sphere with radius r centered on a
phase-space point ⃗ x. Different points on the sphere’s surface evolve
differently with time, causing the sphere to deform into a 2N -dimensional
ellipsoid with principal axes Li (t).
The Lyapunov exponents for ⃗
x are defined as
& '
dLi (t)
λi = lim 1t ln dr
t→∞

In a collisionless system
#
2N
λi = 0
i=1
which expresses the incompressibility of the flow (conservation of volume).
The Lyapunov Exponent II
x is regular then λi = 0 for i = 1, ..., 2N .
If the trajectory through ⃗
On the other hand, if the trajectory is stochastic then λ ≡ max λi > 0.

Note that a positive Lyapunov exponent implies that the neighboring


trajectories diverge exponentially:
δΓ(t) ∝ eλt

The inverse of the largest Lyapunov exponent is called the Lyapunov time,
and defines the characteristic e-folding time.
For a chaotic (stochastic) orbit the Lyapunov time is finite, while it is infinite
for regular orbits.
Chaotic Mixing I

In the parts of phase-space that are not filled with regular, but with stochastic
orbits, mixing occurs naturally due to the chaotic behavior of the obits.
Chaos implies a sensitivity to initial conditions: two stars intially close
together separate exponentially with time.
After a sufficiently long time, the group of stars that were initially close
together will have spread over the entire accessible phase-space (ie., the
Arnold web). As for phase-mixing, chaotic mixing thus smooths out (i.e.,
relaxes) the coarse-grained DF, but leaves the fine-grained DF invariant.
Chaotic Mixing II
Unlike for phase-mixing, chaotic mixing is irreversible, in the sense that an
infinitely precise fine-tuning of the phase-space coordinates is required to
undo its effects.
Chaotic mixing occurs on the Lyapunov timescale.
However, the mixing rate of chaotic ensembles typically falls below the
Lyapunov rate once the trajectories separate, because stochastic orbits are
often confined over long periods of time to restricted parts of phase-space.
The time scale on which the orbits are uniformly spread over their accessible
phase-space then becomes dependent on the efficiency of Arnold diffusion.
All in all, the effective rates of phase mixing and chaotic mixing might
therefore be comparable in real galaxies.

For a nice review, see Merritt (1999)


Violent Relaxation I
Since E = 1
2
v 2 + Φ and Φ = Φ(⃗
x, t) we can write that
dE d⃗ ∂E dΦ
dt
= ∂E
∂⃗
v
· v
dt
+ ∂Φ dt
= −⃗ ⃗ + dΦ
v · ∇Φ dt
= −⃗ ⃗ +
v · ∇Φ ∂Φ
+ ∂Φ
· d⃗
x
∂t ∂⃗x dt
= −⃗ ⃗ + ∂Φ + ⃗
v · ∇Φ ⃗
v · ∇Φ
∂t
= ∂Φ
∂t
Thus we see that the only way in which a particle’s energy can change in a
collisionless system, is by having a time-dependent potential.
An interesting case to consider is the collapse of a dark matter halo, or that
of a galaxy. In this case the potential will vary as function of time, and the
particles thus change their energy
Exactly how a particle’s energy changes depends in a complex way on the
particle’s initial position and energy: particles can both gain or loose energy
(see fig. on next page). Overall, however, the effect is to widen the range of
energies.
Violent Relaxation II
No Relaxation

Particle looses energy

Particle gaines energy

time
Violent Relaxation III
A few remarks regarding Violent Relaxation:
• During the collapse of a collisionless system the CBE is still valid, i.e., the
fine-grained DF does not evolve (df /dt = 0). However, unlike for a
‘steady-state’ system, ∂f /∂t ̸= 0.
• A time-varying potential does not guarantee violent relaxation. One can
construct oscillating models that exhibit no tendency to relax: although the
energies of the individual particles change as function of time, the relative
distribution of energies is invariant (cf. Sridhar 1989).
Although fairly artificial, this demonstrates that violent relaxation requires
both a time-varying potential and mixing to occur simultaneous.
• The time-scale for violent relaxation is
4 5−1/2 4 5−1/2
(dE/dt)2 (∂Φ/∂t)2 3
tvr = E2
= E2
= 4
⟨ Φ̇2
/Φ 2 −1/2

where the last step follows from the time-dependent virial theorem (see
Lynden-Bell 1967).
Note that tvr is thus of the order of the time-scale on which the potential
changes by its own amount, which is basically the collapse time. ◃ the
relaxation mechanism is very fast. Hence the name ‘violent relaxation’
Violent Relaxation IV
0 0.2 0.4 0 0.2 0.4 0 0.2 0.4

-2

-4

-2

-4

0 0.2 0.4 0 0.2 0.4 0.6 0.8 0 1 2 3

(from: Henriksen & Widrow 1997)

Collapse of a spherical system with ρinit ∝ r −3/2


Violent Relaxation V
In the collapse simulation of Henriksen & Widrow (1997), phase-mixing is the
dominant relaxation mechanism during the initial phases of the collapse.
After some time, there is a transition to a more ‘chaotic’ flow: due to the
time-varying potential particles on neighboring phase-space streams start to
mix rapidly (i.e., violent relaxation kicks in).

Violent relaxation leads to efficient fine-grained mixing of the DF, and erases
the system’s memory of its initial conditions.
For comparison: phase-mixing only leads to a relaxation of the
coarse-grained DF and is reversible.

Another important aspect of violent relaxation is the fact that it changes a


particle’s energy in a way that is independent of the particle’s mass. Thus
violent relaxation drives the system to a relaxed state that is very different
from the one promoted by collional relaxation, where momentum
conservation in collisions causes equipartition of energy (ie. mass
segregation).
Landau Damping I
In 1946 Landau showed that waves in a collsionless plasma can be damped,
despite the fact that there is no dissipation.
This damping mechanism, called Landau Damping, also operates in
gravitational, collisionless systems, and is thus another relaxation
mechanism.
The physical reason for this collisionless damping arises from the detailed
interaction of the wave with the orbits of the background particles which are
not part of the wave (i.e., particle-wave interactions).
To gain insight, it is useful to start by considering a fluid. Perturbation
analysis of the fluid shows that if the perturbation has a wavelength
λ < λJ , with λJ the Jeans length, then the perturbation is stable, and the
wave propagates with a phase velocity
+
vp = c 1 − λ2 /λ2J
with c the sound speed. Note that larger waves move slower, which owes to
the fact that self-gravity becomes more and more important.
When λ = λJ the wave no longer propagates. Rather, the perturbation is
unstable: self-gravity overpowers the pressure, causing the perturbation to
grow.
Landau Damping II
One can apply a similar perturbation analysis to collisionless, gravitational
systems (see B&T, Section 5.1). This yields a similar Jeans criterion, but with
the velocity dispersion of the stars, σ playing the role of the sound speed.
Once again, perturbations with λ > λJ are unstable and cause the
perturbation to grow.
For λ < λJ , however, the situation is somewhat different. While the fluid
supports gravity-modified sound waves that are stable, the equivalent waves
in gravitational systems experience Landau Damping.
Consider a density wave with λ < λJ . While for a fluid the phase velocity
vp < c, in a gravitational system we have that vp < σ .
Stars that move faster than the wave (i.e., with v > vp ) loose energy and
tend to be captured by the wave: they tend to amplify the wave. Inversely,
stars with v < vp gain energy and thus tend to damp the wave.
If, for simplicity, we assume a Gaussian distribution of velocities, centered
on v = 0 and with a velocity dispersion of σ , we see immediately that there
will be more stars with v < vp than with v > vp . Consequently, the net
effect will be to damp the wave.
Landau Damping III

Particles initially at A and D gain energy during the first quarter cycle of the
wave (i.e., their net velocity increases), while those at B and C loose energy.
Since there are more particles at A than at C and more at D than at B (see
distribution at left), the particles experience a net gain. Consequently, the
wave has to experience a net loss.
The End-State of Relaxation I
The various relaxation mechanism discussed will drive the system to an
equilibrium configuration. However, there are many different equilibrium
configurations for a collisionless system of mass M and energy E . So to
which of these configurations does a system settle?
This is a very complicated problem which is still largely unsolved.
One might expect the system to evolve to a most probable state.
This actually happens in a collisional gas, where the collisions cause a rapid
exchange of energy between the particles. The most probable distribution is
obtained by maximizing the entropy, which results in the Maxwell-Boltzmann
velocity distribution.
If one applies the same logic to collisionless systems, and (somewhat
naively) defines the systems entropy by
!
S=− f lnf d3 ⃗
xd3⃗
v
then one finds (not surprisingly) that the functional form of f which
maximizes S subject to given values of the system’s mass and energy is that
of a singular isothermal sphere, for which the DF again has a Maxwellian
velocity distribution (see Lynden-Bell 1967).
The End-State of Relaxation II
However, a singular isothermal sphere has infinite mass, and is thus
inconsistent with our constraint. Thus, no DF that is compatible with finite
M and E maximizes S !
In fact, one can show that one can always increase the system’s entropy (as
defined above) by increasing the systems central concentration (see
Tremaine et al. 1986).
Thus, violent relaxation will drive the system towards its equilibrium state by
making the system more concentrated, but can never reach it. In other
words, there is no end state. This seems unsatisfactory.
There are two possible ‘solutions’, both of which are probably correct:
(1) The relaxation is not complete.
(2) The expression for the entropy is not appropriate.
As an example of the latter, since gravitational systems do not obey
extensive statistics, a more appropriate definition for the entropy may be
!
Sq = − f q lnq f d3 ⃗
xd3⃗
v
where lnq (x) = (x1−q − 1)/(1 − q) with q ̸= 1 (see Hansen et al. 2004).
Note that for q = 1 one recovers the normal Boltzmann-Gibbs entropy S .
The End-State of Relaxation III
Despite the uncertainties regarding the definition of entropy, it has become
clear that relaxation is in general not complete. There are various reasons for
this:
• The time-scales for phase-mixing and chaotic mixing may not be small
enough compared to the Hubble time.
• Violent relaxation is only efficient as long as the potential fluctuates. This
requires global, coherent modes, which, however, as strongly damped by
Landau Damping. This occurs roughly on the collapse time-scale, which is
also the time-scale on which violent relaxation works. Thus it is not too
suprising if it is not complete.
• This is strongly supported by numerical simulations, which show that the
end-state of violent relaxation depends on the clumpiness of the initial
conditions (van Albada 1982). This illustrates that the final state is not
completely governed by statistical mechanics alone, but also by the details
of the collapse.

Much work is still required before we have a proper understanding of why


dark matter haloes and galaxies have the (semi)-equilibrium states they have.
Collisions & Encounters I
A

rAS

rAB
r0 rBS b

Let A encounter B with an initial velocity v∞ and an impact parameter b.


A star S (red dot) in A gains energy wrt the center of A due to the fact that
the center of A and S feel a different gravitational force due to B .
Let ⃗
v be the velocity of S wrt A then
& '
dES
=⃗ g [⃗
v ·⃗ rBS (t)] ≡ ⃗ ⃗ B [⃗
v · −∇Φ ⃗ B [⃗
rAB (t)] − ∇Φ rBS (t)]
dt

We define ⃗ r0 as the position vector ⃗


rAB of closest approach, which occurs
at time t0 .
Collisions & Encounters II
If we increase v∞ then |⃗
r0 | → b and the energy increase
!t0
∆ES (t0 ) ≡ g [⃗
v ·⃗
⃗ rBS (t)] dt
0
dimishes, simply because t0 becomes smaller. Thus, for a larger impact
velocity v∞ the star S withdraws less energy from the relative orbit between
A and B .
This implies that we can define a critical velocity vcrit , such that for
v∞ > vcrit galaxy A reaches ⃗ r0 with sufficient energy to escape to infinity.
If, on the other hand, v∞ < vcrit then systems A and B will merge.
If v∞ ≫ vcrit then we can use the impulse approximation to analytically
calculate the effect of the encounter.
In most cases of astrophysical interest, however, v∞ ∼ < v
crit and we have
to resort to numerical simulations to compute the outcome of the encounter.
However, in the special case where MA ≪ MB or MA ≫ MB we can
describe the evolution with dynamical friction, for which analytical estimates
are available.
Dynamical Friction I
Consider the motion of a system with mass M through a medium consisting
of many individual ‘particles’ of mass m ≪ M . As an example, think of a
satellite galaxy moving through the dark matter halo of its parent galaxy.

M
vM

Due to gravitational focussing M creates an overdensity of particles behind


its path (the wake). The backreaction of this wake on M is called dynamical
friction and causes M to slow down. Consequently, energy is transferred
from the massive to the less massive bodies: dynamical friction is a
manifestation of mass segregation.
Dynamical Friction II
Assuming, for simplicity, a uniform density medium with an isotropic velocity
distribution f (vm ) of the particles m ≪ M , then

⃗ d⃗
vM 4πG2 M 2
Fdf = M dt = − v2 lnΛ ρ(< vM )
M

with lnΛ the Coulomb logarithm and


v!M
ρ(< vM ) = 4π f (vm )vm
2
dvm
0
the mass density of background particles with velocities vm < vM .
The derivation of this equation (see B&T Sect. 7.1) is due to Chandrasekhar
(1943), and one therefore often speaks of Chandrasekhar dynamical friction.

Note that F⃗df ∝ M 2 : the amount of material that is deflected (i.e., the
‘mass’ of the wake) is proportional to M and the gravitational force that this
wake exerts on M is proportional to M times its own mass.
−2
⃗df ∝ v in the limit of large vM , but F ⃗df ∝ vM in the limit of
Note that F M
small vM [i.e., for sufficiently small vM one may replace f (vm ) with f (0)].
⃗df is independent of m!
Note that F
The Coulomb Logarithm
One has that Λ = bmax /bmin with bmin and bmax the minimum and
maximum impact parameters for which encounters can be considered
effective:
Encounters with b > bmax don’t cause a significant deflection, and these
therefore do not contribute significantly to the wake. Encounters with
b < bmin cause a very strong deflection so that these also do not contribute
to the wake.
We can estimate bmin as the impact parameter that corresponds to a close
encounter (see first lecture), and thus bmin ≃ GM
⟨v2 ⟩
with ⟨v 2 ⟩1/2 the rms
velocity of the background particles.
The maximum impact parameter, bmax , is much harder to estimate (see
White 1976), and one typically simply takes bmax ≃ L with L the size of the
system.
Typical values that one encounters for the Coulomb Logarithm are
3∼< lnΛ < 30.

Dynamical Friction: Local vs. Global
Note that Chandrasekhar Dynamical Friction is a purely local phenomenon:
⃗df depends only on the local density ρ(< v),
The dynamical friction force F
and the backreaction owes to a local phenomenon, namely wake-creation
due to gravitational focussing.
However, a system A can also experience dynamical friction due to a system
B when it is located outside of B (Lin & Tremaine 1983). Clearly, this friction
can not arise from a wake. Instead, it arises from torques between A and
stars/particles in B that are in resonance with A (Tremaine & Weinberg
1984).

(Weinberg 1989)
The extent to which dynamical friction is a local (wake) versus a global
(resonant-coupling) effect is still being debated.
Orbital Decay I
Consider a singular, isothermal sphere with density and potential given by

Vc2
ρ(r) = 4πGr 2
Φ(r) = Vc2 lnr
If we further assume that this sphere has, at each point, an isotropic and
Maxwellian velocity distribution, then
$ 2
%
ρ(r) vm
f (vm ) = (2πσ 2 )3/2
exp − 2σ2

with σ = Vc / 2. Now consider a test-particle of mass M moving on a
circular orbit (i.e., vM = Vc ) through this sphere. The Chandrasekhar
dynamical friction that this particle experiences is
$ %
4πlnΛG2 M 2 ρ(r) GM 2
Fdf = − erf (1) − √2 e−1 ≃ −0.428 lnΛ
Vc2 π r2

The test-particle has an angular momentum L = rvM , which it looses, due


to dynamical friction, at a rate
dL
dt
= r ∂vM
∂t
= r Fdf
M
= −0.428 lnΛ GM
r
Due to this angular momentum loss the test-particle moves to a smaller
radius, while it continues on circular orbits with vM = VC .
Orbital Decay II
The rate at which the radius changes follows from

Vc dr
dt
= −0.428 lnΛ GM
r

Solving this differential equation subject to the initial condition r(0) = ri


one finds that the test-particle reaches the center after a time
2
1.17 ri Vc
tdf = lnΛ GM

As an example, consider the LMC. Assume for simplicity that the LMC moves
on a circular orbit at ri = 50 kpc, that the mass of the LMC is
M = 2 × 1010 M⊙ , and that the MW can be approximated as a singular
isothermal sphere with Vc = 220 km s−1 and with a radius of
r = 200 kpc.
We then find that the LMC will reach the center of the MW halo after a time
tdf ≃ 7.26
lnΛ
Gyr. Using the approximation for Λ discussed before we find
that lnΛ ≃ 6, and thus tdf ≃ 1.2 Gyr.
Orbital Decay III
The derivation on the previous pages was for a circular orbit. We now focus
on the orbital decay of an eccentric orbit, whose eccentricity is defined as
r+ − r−
e= r+ + r−

with r+ and r− the apo- and pericenter, respectively.


For simplicity, we once again focus on a singular isothermal sphere, for
which the radius of a circular orbit with energy E is given by
& '
1 E
rc (E) = √
e
exp Vc2

We can express the angular momentum of an eccentric orbit in terms of the


orbit’s circularity
L L
η≡ Lc (E)
= rc (E)Vc

The circularity η is uniquely related to the orbital eccentricity e, with


de/dη < 0:
Circular orbit: η = 1 and e = 0
Radial orbit: η = 0 and e = 1
We now investigate how dynamical friction influences the orbit’s evolution.
Orbital Decay IV
Dynamical friction transfers both energy and angular momentum from the
test-particle to the particle’s that make up the halo. Let’s examin how this
influences the orbit’s eccentricity
de de dη
dt
= dη dt

Using the definition of the orbital circularity we obtain


& ' $ %
dη d L 1 dL L ∂Lc (E) dE 1 dL 1 dE
dt
= dt Lc (E)
= Lc (E) dt
− L2
=η − Vc2 dt
c (E) ∂E dt L dt

where we have used that Lc (E) = rc (E)Vc . Using that L = rv⊥ , with
v⊥ the velocity in the direction perpendicular to the radial vector, we find that
dE
dt
= v dv
dt
dL
dt
= r dv⊥
dt
= L dv
v dt
Combining all the above we finally find that
, & '2 -
de η de v dv
dt
= v dη
1− Vc dt

where dv/dt = Fdf /M < 0 (see van den Bosch et al. 1999).
Orbital Decay V
de
At pericenter we have that v > Vc . Since η > 0, dη < 0, and dv
dt
< 0 we
thus have that de
dt
< 0; the eccentricity decreases and the orbit becomes
more circular.
However, at apocenter v < Vc and therefore de
dt
> 0: the orbit becomes
more eccentric during an apocentric passage.
The overall effect of dynamical friction on the orbit’s eccentricity, integrated
over an entire orbit, can not be obtained from inspection: numerical
simulations are required.

For realistic density distributions one finds that de


dt
∼ 0: contrary to what is
often claimed in the literature, dynamical friction does (in general) not lead to
circularization of the orbit (see van den Bosch et al. 1999).
As an example of an orbit that circularizes, consider a space-ship on an
eccentric orbit around the Earth. It only experiences a friction, due to the
Earth’s atmosphere, during a pericentric passage, and this causes the
‘grazing’ orbit of the space-ship to circularize.
Numerical simulations have shown that tdf ∝ η 0.53 .
Orbital Decay VI

van den Bosch et al. (1999)


Orbital Decay VII

van den Bosch et al. (1999)


The Impulse Approximation I
There are two kinds of encounters between collisionless systems that can be
treated analytically:
• Encounters of very unequal mass ◃ Dynamical Friction
• Encounters of very high speed ◃ Impulse Approximation
As we have seen before, when v∞ becomes larger, the effect of the
encounter diminishes. Therefore, for sufficiently large v∞ we can treat the
encounter as a perturbation.
The crucial assumption of the impulse approximation is that the tidal forces
due to the perturber act on a timescale ≪ orbital time scale of the perturbed
stars, so that we may consider the star stationary during the encounter.
◃ No resonant effects
◃ Instantaneous change in velocity of each star
◃ Magnitude of ∆⃗
v depends on location of star but not on its velocity
◃ If the encounter speed is sufficiently large then perturber moves in
◃ straight line with vp (t) = v∞⃗
ey ≡ vp⃗ ⃗
ey and R(t) = (b, vp t, 0).

Note that the equations for vp (t) and R(t) define the coordinate system
that we will adopt in what follows.
The Impulse Approximation II
Consider a system P , which we call the perturber, encountering another

system S with an impact parameter b and an initial velocity v∞ . Let R(t) be
the position vector of P from S and vp (t) the velocity of P wrt S .
P

b
b’ R x
θ

S z

In the large-v∞ limit we have the b′ ≃ b and vp (t) ≃ v∞⃗


ey ≡ vp⃗
ey so

that R(t) = (b, vp t, 0).
A star in S experiences a gravitational force due to P given by

GMp f (R)R
a∗ (t) =
⃗ R3
with f (R) the fraction of P ’s mass that falls within R.
The Impulse Approximation III
We consider the case with b > max[Rp , Rs ] with Rp and Rs the sizes of
P and S , respectively.
In this distant encounter approximation we have that f (R) = 1, and
!

∆⃗
v∗ = a(t)dt

−∞
!

(b,v t,0)dt
= GMp p
(b2 +vp 2 t2 )3/2
−∞
6 7
GMp !

b ds
!

s ds
= vp (s2 +b2 )3/2
, (s2 +b2 )3/2
,,0
−∞ −∞
(
GMp 2 )
= vp b
, 0, 0
2GMp
= vp b

ex
The ratio Mp /vp is called the collision strength. In impulse approximation
the mass and velocity of the perturber only enter through this ratio.
We can split this ∆⃗
v∗ in two components: the component ∆⃗ vS which
describes change in center of mass velocity of S , and the component ∆⃗
v
wrt the systematic velocity of S .
The Impulse Approximation IV
Since we are interested in how P modifies the internal structure of S , we are
only interested in ∆⃗
v.
Note that ∆⃗ v arises due to the tidal forces on S , which arise from the fact
that the gravitational attraction of P is not uniform over S .
We define a rotating coordinate frame (x′ , y ′ , z ′ ) centered on the center of
S , and with the x′ -axis pointing towards the instantaneous location of P .
Let ⃗ ⃗ = R⃗
r ′ be the position vector of a star in S , and R ex′ the position
vector of P .
GMp
r ′ due to P is Φ(⃗
The potential at ⃗ r′ ) = − ⃗ .
r ′ −R|
|⃗

r’−R
r’
ϕ
R
From the above figure one can see that
⃗ 2 = (R − r ′ cos φ)2 + r ′2 sin2 φ = R2 − 2rR cos φ + r ′2
r ′ − R|
|⃗
The Impulse Approximation V
Using the series expansion √ 1 = 1 − 12 x + 1·3 2
2·4
x − 1·3·5 3
2·4·6
x + ... this
1+x
yields
, & ' & '2 -
1 1 1 r′ ′2
r 3 r′ r ′2


r ′ −R|
|⃗
= R
1− 2
−2 R cos φ + R2
+ 8
−2 R cos φ + R2
+ ...

which allows us to write


GM GMp r ′ GMp r ′2 (3 1
)
Φ(⃗
r)= ′
− Rp − R2
cos φ − R3 2
cos φ −
2
2
− ...
The first term is a constant and foes not yield any force.
GM
The second term yields a uniform acceleration R2 p ⃗ ex′ directed towards P .
This is the term that causes the center of mass of S to change its velocity,
and is not of interest to us.
In the tidal approximation one considers the third term:
GM (3 1 ′2
)
Φ3 (⃗
r)= ′
− R3 p 2
r cos φ −
′2 2
2
r
Using that r ′ cos φ = x′ and that r ′2 = x′2 + y ′2 + z ′2 we obtain
GM ( )
Φ3 (x , y , z ) =
′ ′ ′
− 2R3p 2x ′2
−y ′2
−z ′2
The Impulse Approximation VI
The above allows us to write the tidal forces on S as
2GMp x′ GMp y ′ GMp z ′
F x′ = R3
Fy ′ = − R3 Fz ′ = − R3

These are related to the tidal forces in the (x, y, z) coordinate system:

Fx = Fx′ cos θ − Fy′ sin θ


Fy = Fx′ sin θ + Fy′ cos θ
Fz = Fz ′
while (x′ , y ′ , z ′ ) are related to (x, y, z) according to

x′ = x cos θ − y sin θ
y′ = −x sin θ + y cos θ
z′ = z
so that we obtain, after some algebra

dvx GMp / 2
0
Fx = = R3 /
x (2 − 3 sin θ) + 3 y sin θ cos θ
dt
dvy GMp 0
Fy = dt
= R3
y (2 − 3 cos θ) + 3 x sin θ cos θ
2
dvz GM z
Fz = dt
= − R3p
The Impulse Approximation VII
Integrating these equations over time yields the cumulative velocity changes

wrt the center of S . Using that R(t) = (b, vp t, 0), and thus cos θ = b/R
and sin θ = vp t/R we obtain

2GMp x 2GMp z
∆vx = vp b2
∆vy = 0 ∆vz = − vp b2

2GM
We thus have that ∆⃗
v = vp b2p (x, 0, −z), and

∆E = 1
2
(⃗ v )2 + Φ(⃗
v + ∆⃗ r ′ ) − 12 ⃗
v 2 − Φ(⃗
r′ ) = ⃗ v + 12 (∆v)2
v · ∆⃗
Note that, in the impulse approximation, the potential energy does not
change during the encounter.
We are interested in computing ∆Etot which is obtained by integrating ∆E
over the entire system S .
First we note that the integral of the first term of ∆E typically is equal to
zero, by symmetry. Therefore
1
!
∆Etot = 2
ρ(⃗ r ′
)|∆⃗ v | d ⃗
2 3 ′
r
2G2 Mp2 !
= 2
vp b 4 ρ(⃗ r ′
)(x 2
+ z 2 )d3⃗
r′
2G2 Mp2
= 2
vp b 4 M s ⟨x 2
+ z 2

The Impulse Approximation VIII
Assuming spherical symmetry for S , so that
2 2
⟨x2 + z 2 ⟩ = 3
⟨x2
+ y2 + z2⟩ = 3
⟨r 2
⟩ we finally obtain
& '2
4 2 Mp ⟨r 2 ⟩
∆Etot = 3
G Ms vp b4

As shown by Aguilar & White (1985), this derivation, which is originally due
to Spitzer (1958), is surprisingly accurate for encounters with
b∼ > 5max[r , r ] (with r and r the median radii of P and S ), even for
p s p s
relatively slow collisions with v∞ ≃ ⟨vs2 ⟩1/2 .
The above shows that fast encounters pump energy into the systems
involved. This energy originates from the kinetic energy associated with the
orbit of P wrt S . Note that ∆Etot ∝ b−4 , so that close encounters are far
more important that distant encounters.
As soon as the amount of energy pumped into S becomes comparable to its
binding energy, the system S will become tidally disrupted.
Some stars can be accelerated to velocities that exceed the local escape
velocity ◃ encounters, even those that do not lead to tidal disruption, may
cause mass loss of S . In this case, the first terms of ∆E is not zero, and the
above impulse approximation has to be handled with care.
Return to Equilibrium
As we have seen, a fast encounter transfers orbital energy to the two systems
involved in the encounter, whose kinetic energy has subsequently increased.
After the encounter the systems are therefore no longer in virial equilibrium.
The systems now need to readjust themselves to find a new virial
equilibrium. Interestingly, this process changes the internal kinetic energy
more than did the encounter itself.
Let the initial kinetic and total energies of a system be T0 and E0 ,
respectively. According to the virial theorem we have that E0 = −T0 .
Due to the encounter T0 → T0 + δT , and thus also E0 → E0 + δT .
Applying the virial theorem we obtain that after the relaxation the new kinetic
energy is
T1 = −E1 = −(E0 + δT ) = T0 − δT
Thus, the relaxation process decreases the kinetic energy by 2δT from
T0 + δT to T0 − δT0 .
Similarly, the gravitational energy becomes less negative:
W1 = 2E1 = 2E0 + 2δT = W0 + 2δT
Since the gravitational radius rg = GM 2 /|W | the system will expand!
Heat Capacity of Gravitating Systems
As we have seen on the previous page, by pumping energy (‘heat’) into the
system, it has actually grown ‘colder’. This is a consequence of the negative
heat capacity of gravitational systems.
By analogy with an ideal gas we defined the temperature of a self-gravitating
system as
1 3
2
m⟨v 2 ⟩ = k T
2 B
Unlike an isothermal gas, the temperature in a self-gravitating system is
typically a function of position. Therefore, we define the mean temperature as
1
!
⟨T ⟩ ≡ M
x)T (⃗
ρ(⃗ x)d3 ⃗
x
and the total kinetic energy of a system of N particles is then
K = 32 N kB ⟨T ⟩. According to the virial theorem we thus have that
E = − 32 N kB ⟨T ⟩.
This allows us to define the heat capacity of the system as
dE
C≡ d⟨T ⟩
= − 32 N kB

which is thus negative: by losing energy the system becomes hotter!


Heat Capacity of Gravitating Systems
Note that all systems in which the dominant forces are gravitational have a
negative heat capacity. This includes the Sun, where the stability of nuclear
burning is a consequence of C < 0: If the reaction rates become ‘too high’,
the excess energy input into the core makes the core expand and cool. This
makes the reaction rates drop, bringing the system back to equilibrium.
The negative specific heat also results in fascinating phenomena in
stellar-dynamical systems.
Consider a central density cusp. If the cusp is sufficiently steep one has that
σ(r) increases with radius: the center is colder than its surroundings.
Two-body interactions tend towards thermal equilibrium, which means that
they transport heat from outside to inside.
Since C < 0 ⇒ σ0 ↓ , i.e., the center becomes colder!

As a consequence ∇T ↑ , and the heat flow becomes larger.
This leads to run-away instability, known as Gravothermal Catastrophe.
Thus, if radial temperature gradient exists, and two-body relaxation time is
sufficiently short (e.g., in globular clusters), the system can undergo core
collapse.

You might also like