Lec 2 - Random Variables and their Functions
9th Jan, 2023
Statistical mechanics is about the statistical properties of a system; it is thus inherently probabilistic.
Probability enters from two sources: a lack of complete knowledge of the system (classical) and quantum measurements/interactions.
We therefore average and consider statistical behaviour in order to eliminate the large number of degrees of freedom.
Probabilities
Probabilities can be
Objective : measured experimentally as a frequency,
$$p(\text{outcome}) = \frac{\#\,\text{occurrences of the outcome}}{\#\,\text{trials}}$$
Subjective : determined through an understanding of the system,
e.g. given an unweighted coin, we can say that the odds of heads/tails are 50-50 without conducting a million trials.
Stat. Mech. is the subjective assignment of probabilities to system configurations which closely matches the objective (measured) probabilities.
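As a quick illustration (not from the lecture), here is a minimal Python sketch comparing the objective, measured frequency of heads against the subjective assignment of 1/2 for an unweighted coin:

```python
import random

# Simulate flips of an unweighted coin and compare the measured
# frequency (objective) with the subjective assignment p(heads) = 0.5.
trials = 1_000_000
heads = sum(random.random() < 0.5 for _ in range(trials))
print(f"measured p(heads) = {heads / trials:.4f}  vs  subjective 0.5")
```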
Random Variables
Discrete Random Variables
A random variable $A$ can be derived from a sample space $\{a_i\}_{i=1}^{N}$, where each element has a given chance $p(a_i)$ of occurring.
$$0 \le p(a_i) \le 1, \qquad \sum_{i=1}^{N} p(a_i) = 1$$
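A minimal sketch of these constraints in Python (the sample space and probabilities below are hypothetical, chosen only for illustration):

```python
import numpy as np

# A discrete random variable over a hypothetical sample space {a_1, ..., a_N}.
outcomes = np.array([1, 2, 3, 4])
p = np.array([0.1, 0.2, 0.3, 0.4])

# The defining constraints: 0 <= p(a_i) <= 1 and sum_i p(a_i) = 1.
assert np.all((0 <= p) & (p <= 1)) and np.isclose(p.sum(), 1.0)

# Sampling reproduces each p(a_i) as an empirical frequency.
samples = np.random.choice(outcomes, size=100_000, p=p)
for a, pa in zip(outcomes, p):
    print(a, pa, np.mean(samples == a))
```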
Continuous Random Variables
The sample space is now an infinite set with an associated measure, typically $\mathbb{R}$ with the usual measure.
The space is now described by a cumulative probability function
$$P(x) = \text{prob}(-\infty \le \text{outcome} \le x)$$
$$P(-\infty) = 0, \qquad P(\infty) = 1, \qquad P(x) \text{ is monotone non-decreasing.}$$
The chance of the random variable being close to a given point, i.e. the probability density, is given as
$$p(x)\,dx = \text{prob}(x \le \text{outcome} \le x + dx)$$
$$= \text{prob}(-\infty \le \text{outcome} \le x + dx) - \text{prob}(-\infty \le \text{outcome} \le x)$$
$$p(x)\,dx = P(x + dx) - P(x) = \left( P(x) + \partial_x P(x) \cdot dx + O(dx^2) \right) - P(x)$$
$$\implies p(x) = \partial_x P(x) + O(dx)$$
Since $P(x)$ is monotone, $0 \le p(x) < \infty$, and
$$\int_{\text{sample space}} p(x)\, dx = 1$$
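To make the relation $p(x) = \partial_x P(x)$ concrete, a small numerical check (assuming scipy is available; the standard normal is an arbitrary choice of distribution):

```python
import numpy as np
from scipy.stats import norm

# Recover the density p(x) by finite-differencing the cumulative P(x)
# for a standard normal distribution.
x = np.linspace(-4, 4, 2001)
P = norm.cdf(x)                                  # P(x) = prob(outcome <= x)
p_numeric = np.gradient(P, x)                    # p(x) ~ dP/dx
print(np.max(np.abs(p_numeric - norm.pdf(x))))   # small discretization error
```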
Joint Distributions :
A single outcome can be described by multiple variables associated with it, giving a joint probability distribution over the two variables $A, B$, written $p(a, b)$.
Discrete :
Variables $A, B$ take on a finite set of values from $\{a_i\}$ and $\{b_i\}$.
$$0 \le p(a, b) \le 1, \qquad \sum_a \sum_b p(a, b) = 1$$
Continuous :
The sample space can be described by a vector $\vec{x} = (x_1, x_2, \cdots)$. The sample space then is
$$S_{\vec{x}} = \{ -\infty \le x_1, x_2, \cdots \le \infty \}$$
$$p(\vec{x})\, dx_1\, dx_2 \cdots = \text{prob}\left( \text{outcome} \in \text{small neighbourhood around } \vec{x} \right)$$
Marginal/Unconditional Probabilities :
If we ignore one of the variables (summing or integrating over all of its outcomes), we obtain a probability distribution for the outcomes of just the other variable.
Discrete :
$$p_A(a) = \sum_b p(a, b), \qquad p_B(b) = \sum_a p(a, b)$$
Continuous :
e.g. the distribution over just the velocities in a gas:
$$p_V(v_1, v_2, \cdots) = \int p(x_1, x_2, \cdots, v_1, v_2, \cdots)\, dx_1\, dx_2 \cdots$$
(integrating over all positions $\vec{x}$).
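A sketch of marginalization for a discrete joint distribution (the 2x2 table of $p(a,b)$ values is made up for illustration):

```python
import numpy as np

# Hypothetical joint distribution p(a, b): rows index a, columns index b.
p_ab = np.array([[0.10, 0.20],
                 [0.30, 0.40]])
assert np.isclose(p_ab.sum(), 1.0)

p_A = p_ab.sum(axis=1)   # p_A(a) = sum_b p(a, b)
p_B = p_ab.sum(axis=0)   # p_B(b) = sum_a p(a, b)
print(p_A, p_B)          # [0.3 0.7] [0.4 0.6]
```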
Conditional Probabilities
- Given the value of one of the variables associated with an event, we can define an updated probability distribution
$$p(a|b) = \text{prob}(A = a \text{ given that } B = b)$$
$$\implies p(a, b) = p(a|b)\, p_B(b) = p(b|a)\, p_A(a)$$
$$\implies p(a|b) = \frac{p(a, b)}{p_B(b)} = \frac{p(b|a)\, p_A(a)}{p_B(b)}$$
But $p_B(b) = \sum_a p(a, b) = \sum_a p(b|a)\, p_A(a)$, so
$$\text{Bayes' Theorem}: \quad p(a|b) = \frac{p(b|a)\, p_A(a)}{\sum_{a'} p(b|a')\, p_A(a')}$$
A test for a disease with prevalence 0.5% comes back positive. The test is 99% specific (it correctly identifies 99% of people without the disease) and 99% sensitive (it correctly identifies 99% of people with the disease). What is the probability of having the disease given the result?
When the test result is positive (denoted as $+$), the chances of being healthy or diseased (denoted as $h$/$d$) are
$$p(d|+) = \frac{p(+|d)\, p(d)}{p(+|d)\, p(d) + p(+|h)\, p(h)} = \frac{0.99 \cdot 0.005}{0.99 \cdot 0.005 + 0.01 \cdot 0.995}$$
$$p(d|+) \simeq 0.33$$
When the test result is negative (denoted as $-$),
$$p(d|-) = \frac{p(-|d)\, p(d)}{p(-|d)\, p(d) + p(-|h)\, p(h)} = \frac{0.01 \cdot 0.005}{0.01 \cdot 0.005 + 0.99 \cdot 0.995}$$
$$p(d|-) \simeq 5 \cdot 10^{-5}$$
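The arithmetic above is easy to reproduce; a minimal sketch (the helper `posterior` is mine, not from the lecture):

```python
# Reproduce the disease-test posteriors with Bayes' theorem.
prevalence = 0.005        # p(d)
sensitivity = 0.99        # p(+|d): true-positive rate
specificity = 0.99        # p(-|h): true-negative rate

def posterior(p_result_given_d, p_result_given_h):
    """p(d|result) given the likelihoods of the observed test result."""
    num = p_result_given_d * prevalence
    return num / (num + p_result_given_h * (1 - prevalence))

print(posterior(sensitivity, 1 - specificity))   # p(d|+) ~ 0.33
print(posterior(1 - sensitivity, specificity))   # p(d|-) ~ 5e-5
```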
Function of a random variable:
We can define functions that take the various outcomes of a random variable and produce an output. The output is thus itself a random variable.
Discrete :
- Let $F : S_A \to S_F = \{F(a_i)\}$. Then
$$p_F(f) = \sum_a \delta_{f, F(a)}\, p(a) = \sum_{F(a) = f} p(a)$$
- If $G(x, y)$ is a function of two variables,
$$p_G(g) = \sum_{x, y} \delta_{G(x, y), g}\, p(x, y)$$
Rolling 2d6
Given two dice whose outcome values are represented by $x, y \in \{1, 2, 3, 4, 5, 6\}$, each with probability $\frac{1}{6}$:
Let $G(x, y) = x + y$. Then the most probable value is 7.
$$p_G(7) = p(1, 6) + p(2, 5) + p(3, 4) + p(4, 3) + p(5, 2) + p(6, 1) = \frac{6}{36} = \frac{1}{6}$$
Overall, the probability distribution looks like a triangle. As more and more dice are rolled, the distribution's peak gets relatively thinner, until the mean is immensely more likely than any other outcome.
[Figure: distribution $p_G$ of the sum of two dice over the outcomes 2 through 12, a triangle peaking at $p_G(7) = 1/6 \approx 0.17$.]
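The exact distribution of the sum can be enumerated directly; a sketch (the helper `dice_sum_distribution` is hypothetical, written for this note):

```python
from collections import Counter
from itertools import product

def dice_sum_distribution(n):
    """Exact distribution of G = sum of n fair dice, by enumeration."""
    counts = Counter(sum(roll) for roll in product(range(1, 7), repeat=n))
    return {s: c / 6**n for s, c in sorted(counts.items())}

print(dice_sum_distribution(2)[7])            # 6/36 ~ 0.1667, the peak at G = 7
# With more dice the peak flattens in absolute height but narrows
# relative to the growing range of possible sums.
print(max(dice_sum_distribution(6).values()))
```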
Continuous :
Let $S_X = \{-\infty < x < \infty\}$ and $F : S_X \to S_F$. Thus
$$p_F(f)\, df = \sum_{F(x_i) = f} p(x_i)\, dx$$
$$\implies p_F(f) = \sum_{F(x_i) = f} p(x_i) \left| \frac{dx}{dF} \right|_{x = x_i}$$
An alternative derivation:
$$p_F(f) = \int_{S_X} \delta(F(x) - f)\, p(x)\, dx = \sum_{F(x_i) = f} \int_{x_i - h}^{x_i + h} \delta(F(x) - f)\, p(x)\, dx$$
Let $u = (F(x) - f) \cdot \text{sgn}\!\left(\frac{dF}{dx}\right)$. We added the $\text{sgn}\!\left(\frac{dF}{dx}\right)$ because the defining property of the delta function is stated for an increasing argument. Then $du = \left|\frac{dF}{dx}\right| dx$, and
$$p_F(f) = \sum_{F(x_i) = f} \int \delta(u)\, p(x)\, \frac{du}{\left|\frac{dF}{dx}\right|} = \sum_{F(x_i) = f} \frac{p(x_i)}{\left|\frac{dF}{dx}\right|_{x = x_i}}$$
Note that this means that, for a well-defined probability density, solutions to $f = F(x)$ may not be local extrema (where $\frac{dF}{dx} = 0$ and the sum diverges).
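As a sanity check on the change-of-variables formula, a Monte Carlo sketch for the hypothetical choice $F(x) = x^2$ with $x$ standard normal (so the roots of $f = F(x)$ are $x = \pm\sqrt{f}$ and $\left|\frac{dF}{dx}\right| = 2\sqrt{f}$):

```python
import numpy as np
from scipy.stats import norm

# Compare a histogram of F(x) = x^2, x ~ N(0, 1), with the prediction
# p_F(f) = sum over roots x_i of p(x_i) / |dF/dx|_{x=x_i}.
x = np.random.normal(size=1_000_000)
counts, edges = np.histogram(x**2, bins=np.linspace(0.1, 4.0, 40))
centers = 0.5 * (edges[:-1] + edges[1:])
hist = counts / (x.size * (edges[1] - edges[0]))   # empirical density of F

# Analytic density from the formula, summing over the roots +/- sqrt(f):
p_F = (norm.pdf(np.sqrt(centers)) + norm.pdf(-np.sqrt(centers))) / (2 * np.sqrt(centers))
print(np.max(np.abs(hist - p_F)))   # small: simulation agrees with the formula
```

Here $x = 0$ is a local minimum of $F$, so $p_F(f)$ diverges as $f \to 0^+$, illustrating the caveat about extrema above.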