Chapter 3: Random variables and distributions
Conditional distributions
Chapter 3.6 3.7 3.8
Columbia University
February 20, 2025
Outline
Chapter 3: Random variables and distributions
3.6 Conditional distributions
3.7 Multivariate distributions
3.8 Functions of a random variable
3.9 Functions of Two or more Random variables
▶ Conditioning X on Y
▶ Total probability theorem
▶ Total expectation theorem
▶ Independence
▶ Bayes' rule
▶ Functions of a random variable
3.6 Conditional distributions
Discrete conditional distributions
Definition
Conditional Distribution. Let random variables X and Y have a discrete joint p.f. f. Let f2 be the marginal p.f. of Y. For each y such that f2(y) > 0, define:
g1(x|y) = f(x, y)/f2(y)
g1 is called the conditional p.f. of X given Y, and the discrete distribution whose p.f. is g1(·|y) is called the conditional distribution of X given Y = y.
Continuous conditional distributions
Definition
Conditional p.d.f. Let random variables X and Y have a continuous joint p.d.f. f. Let f2 be the marginal p.d.f. of Y. For each y such that f2(y) > 0, define:
g1(x|y) = f(x, y)/f2(y), −∞ < x < ∞
For values of y such that f2(y) = 0, we can define g1(x|y) arbitrarily, as long as g1(x|y) is a p.d.f. as a function of x.
Theorem
For each y with f2(y) > 0, g1(x|y) is a p.d.f. as a function of x.
The proof is just a consequence of
∫_{−∞}^{∞} g1(x|y) dx = ∫_{−∞}^{∞} f(x, y) dx / f2(y) = f2(y)/f2(y) = 1
Generalizing the multiplication rule
Remember the multiplication rule for conditional probabilities
P(A ∩ B) = P(A)P(B|A)
Theorem
Let X, Y be two random variables with joint p.d.f. f, where the marginal p.d.f. of X is f1(x) and the marginal p.d.f. of Y is f2(y). Given the conditional distribution of X given Y, we can recover the joint distribution:
f(x, y) = f2(y) g1(x|y)
And, equivalently,
f(x, y) = f1(x) g2(y|x)
Example
Break a stick of length l at a point X uniform on [0, l]. Then break the piece from 0 to X one more time, uniformly. What is the joint density of the first break X and the second break Y?
▶ X is uniform on [0, l]
▶ given X, Y is uniform on [0, X]
f(x, y) = fX(x) g2(y|x) = 1/(lx), 0 ⩽ y ⩽ x ⩽ l
Bayes’s theorem and Law of total probability for Random variables
Theorem
Given Y with p.d.f. fY(y) and the conditional distribution g1(x|y) of X given Y, the marginal p.f. of X is
fX(x) = Σ_y g1(x|y) fY(y),
if Y is discrete. If Y is continuous, then the p.d.f. of X is:
fX(x) = ∫_{−∞}^{∞} g1(x|y) fY(y) dy
By observing the values of an experiment that has measurement error, we can infer the true value of a parameter. For example, take measurements of the speed of a car using 10 different radars, and infer the actual car speed.
Example
Suppose that Z has p.d.f.
fZ(z) = 2e^{−2z} for z > 0, and 0 otherwise.
Given Z = z, one can draw random variables X1, X2, conditionally independent, each with distribution
fXi(xi|z) = z e^{−z xi} for xi > 0, and 0 otherwise.
Determine the marginal distribution:
fX1,X2(x1, x2) = ∫_{−∞}^{∞} fX1,X2(x1, x2|z) fZ(z) dz = ∫_0^{∞} z e^{−z x1} · z e^{−z x2} · 2e^{−2z} dz = 4/(2 + x1 + x2)^3
We can think of the random variable Z in the previous example as the rate at which customers are served in a queue.
With this interpretation, it is useful to find the conditional p.d.f. of Z given X1 and X2:
f(z|x1, x2) = f(z, x1, x2)/f(x1, x2) = (1/2)(2 + x1 + x2)^3 z^2 e^{−z(2+x1+x2)}, z > 0
One can evaluate the probability that the service rate is less than 1, given that we observed one service time of 1 and one of 4:
P(Z ⩽ 1|X1 = 1, X2 = 4) = ∫_0^1 f(z|1, 4) dz = ∫_0^1 171.5 z^2 e^{−7z} dz ≈ .97
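As a quick numerical check (a sketch, not part of the original notes), we can integrate the conditional density on [0, 1] with a simple midpoint rule and compare with the exact antiderivative value:

```python
import math

# Conditional density of Z given X1 = 1, X2 = 4, where 2 + x1 + x2 = 7:
# f(z | 1, 4) = (1/2) * 7**3 * z**2 * e^{-7z} = 171.5 z^2 e^{-7z}, z > 0
def f_cond(z):
    return 0.5 * 7**3 * z**2 * math.exp(-7 * z)

# Midpoint Riemann sum of the density over [0, 1]
n = 100_000
h = 1.0 / n
p_numeric = sum(f_cond((i + 0.5) * h) for i in range(n)) * h

# Exact value via the antiderivative of z^2 e^{-7z}:
# P(Z <= 1 | X1 = 1, X2 = 4) = 1 - e^{-7} (1 + 7 + 7**2 / 2)
p_exact = 1 - math.exp(-7) * (1 + 7 + 7**2 / 2)

print(round(p_numeric, 4), round(p_exact, 4))  # both approximately 0.97
```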
Example (Continuation)
Break a stick of length l at a point X uniform on [0, l]. Then break the piece from 0 to X one more time, uniformly. What is the marginal density of the second break Y?
f(y) = ∫ fX(x) g2(y|x) dx = ∫_y^l 1/(lx) dx = (1/l) ln(l/y), 0 < y ⩽ l
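A short Monte Carlo sketch of the two breaks confirms the marginal (l = 1 and the seed are assumptions for illustration); the marginal density gives P(Y ⩽ l/2) = (1 + ln 2)/2:

```python
import math
import random

random.seed(0)
l = 1.0  # stick length (assumption: l = 1)

# Simulate the two breaks: X uniform on [0, l], then Y uniform on [0, X]
n = 200_000
count = 0
for _ in range(n):
    x = random.uniform(0, l)
    y = random.uniform(0, x)
    if y <= l / 2:
        count += 1
p_mc = count / n

# From the marginal f(y) = (1/l) ln(l/y):
# P(Y <= l/2) = integral_0^{l/2} (1/l) ln(l/y) dy = (1 + ln 2) / 2
p_exact = (1 + math.log(2)) / 2
print(round(p_mc, 3), round(p_exact, 3))
```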
Bayes Theorem
Theorem
Given Y with p.d.f. fY and the conditional distribution g1(x|y) of X given Y, the conditional p.d.f. of Y given X is
g2(y|x) = fY(y) g1(x|y) / fX(x)
Similarly, the distribution of X given Y is obtained by
g1(x|y) = fX(x) g2(y|x) / fY(y)
Example
Observe a discrete variable X which can take the two values 1 or 0, with probability Y and 1 − Y respectively, where Y is a uniform variable on [0, 1]:
X = 1 with prob. Y, and X = 0 with prob. 1 − Y
Given that the observed value of X is 1, what is the distribution of Y?
fY(y) = 1, 0 < y < 1, and P(X = 1|Y = y) = y
g(y|X = 1) = fY(y) P(X = 1|Y = y) / fX(1)
Example (Continuation)
We only need to derive the density of X (observe 1 or 0) by integrating out Y:
fX(1) = P(X = 1) = ∫_0^1 fY(y) P(X = 1|Y = y) dy = ∫_0^1 y dy = 1/2
The posterior density of Y, given that we have observed X = 1, is
g(y|X = 1) = fY(y) P(X = 1|Y = y) / fX(1) = 2y, 0 ⩽ y ⩽ 1
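One can also simulate this experiment directly (the seed and sample size are illustrative choices): draw Y uniform, draw X given Y, and keep only the runs with X = 1. The retained Y values should follow the posterior density 2y, whose mean is 2/3.

```python
import random

random.seed(1)

# Simulate: Y ~ Uniform[0, 1], then X = 1 with probability Y.
# Conditioning on X = 1 should reproduce the posterior density 2y on [0, 1].
n = 200_000
ys_given_x1 = []
for _ in range(n):
    y = random.random()
    x = 1 if random.random() < y else 0
    if x == 1:
        ys_given_x1.append(y)

posterior_mean = sum(ys_given_x1) / len(ys_given_x1)
print(round(posterior_mean, 3))  # close to 2/3, the mean of the density 2y
```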
Theorem
Independent Random Variables. Suppose that X and Y are two random variables with joint p.d.f. f(x, y). Then X and Y are independent if and only if the conditional distribution g1(x|y) of X given Y does not depend on y, namely
g1(x|y) = fX(x) whenever f2(y) > 0.
Multivariate distributions
Definition
The joint c.d.f. of n random variables X1, ..., Xn is the function F whose value at each point (x1, ..., xn) in n-dimensional space is
F(x1, ..., xn) = P(X1 ⩽ x1, ..., Xn ⩽ xn)
We say that n random variables have a discrete joint distribution with p.f. f if f(x1, ..., xn) = P(X1 = x1, ..., Xn = xn). For a discrete joint distribution, for every C ⊆ R^n,
P(X ∈ C) = Σ_{(x1,...,xn)∈C} f(x1, ..., xn)
Definition
A vector of random variables X1, ..., Xn has a continuous multivariate joint distribution if there exists a nonnegative function f defined on R^n such that for any C ⊆ R^n,
P((X1, ..., Xn) ∈ C) = ∫ ··· ∫_C f(x1, ..., xn) dx1 ··· dxn
The joint density can be derived from the joint c.d.f. F(x1, ..., xn) as follows:
f(x1, ..., xn) = ∂^n F(x1, ..., xn) / (∂x1 ··· ∂xn)
Marginal distributions
If f is the joint p.d.f. of X1, ..., Xn, then the marginal p.d.f. f1 of X1 is specified at every value x1 by
f1(x1) = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} f(x1, ..., xn) dx2 ··· dxn
More generally, we can find the joint distribution of k of the variables by integrating over the other n − k variables, e.g.
f(x2, x4) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x1, x2, x3, x4) dx1 dx3
Deriving the marginal C.D.F of one variable from the joint C.D.F.
FX1(x1) = P(X1 ⩽ x1) = P(X1 ⩽ x1, X2 < ∞, ..., Xn < ∞) = lim_{x2,...,xn→∞} F(x1, x2, ..., xn)
Independent Random Variables
Definition
X1, ..., Xn are independent if, for all sets A1, ..., An of real numbers,
P(X1 ∈ A1, ..., Xn ∈ An) = P(X1 ∈ A1) ··· P(Xn ∈ An)
Theorem
Given X1, ..., Xn independent, the joint c.d.f. is just the product of the individual c.d.f.s:
F(x1, ..., xn) = F1(x1) F2(x2) ··· Fn(xn)
The joint p.d.f. is also the product of the individual p.d.f.s:
f(x1, ..., xn) = f1(x1) ··· fn(xn)
Random sample
Definition
Given a probability distribution on the real line with p.d.f. f, the variables X1, ..., Xn form a random sample from this distribution if all the variables are independent and the marginal distribution of each variable is f. Such a sample is called independent and identically distributed, or i.i.d.
Conditional distributions
The concepts of joint and marginal distributions generalize to conditioning on a subset of the variables. The conditional distribution of Xk+1, ..., Xn given X1, ..., Xk is
g_{k+1,...,n}(xk+1, ..., xn | x1, ..., xk) = f(x1, ..., xn) / f(x1, ..., xk)
where f(x1, ..., xk) is the marginal p.d.f. of X1, ..., Xk.
Multivariate law of total probability and Bayes’s Theorem
Definition
Conditionally independent random variables. Let Z be a random variable with density fZ(z). The variables X1, ..., Xn are conditionally independent given Z if their conditional joint density given Z is just the product of the individual conditional densities:
f(x1, ..., xn|z) = ∏_{i=1}^{n} gi(xi|z)
Histograms
Let x1, ..., xn be a collection of numbers that all lie between a and b.
Choose an integer k, divide the interval [a, b] into k subintervals, and count the number ci of the xi that fall into each subinterval.
Draw a rectangular bar over each subinterval with height ci/n on the y-axis.
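The counting step can be sketched directly (the data points, endpoints, and number of bins below are illustrative):

```python
# Count how many x_i fall into each of k equal subintervals of [a, b],
# then report the bar heights c_i / n
def histogram_heights(xs, a, b, k):
    n = len(xs)
    counts = [0] * k
    width = (b - a) / k
    for x in xs:
        # index of the subinterval containing x; clamp x == b into the last bin
        i = min(int((x - a) / width), k - 1)
        counts[i] += 1
    return [c / n for c in counts]

heights = histogram_heights([0.1, 0.2, 0.25, 0.7, 0.9], a=0.0, b=1.0, k=2)
print(heights)  # [0.6, 0.4]: three points in [0, 0.5), two in [0.5, 1]
```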
3.8 Functions of a random variable
▶ given the distribution of X (the rate at which customers are served in a queue)
▶ find the distribution of 1/X, the average waiting time
▶ in general, given the distribution of X, find the distribution of Y = r(X)
Random variables with discrete distribution
Example
Let X be a random variable with the uniform discrete distribution on {1, ..., 9}. What is the distribution of the distance from the center, Y = |X − 5|?
P(Y = 0) = P(X ∈ {5}) = 1/9
P(Y = 1) = P(X ∈ {4, 6}) = 2/9
P(Y = 2) = P(X ∈ {3, 7}) = 2/9
P(Y = 3) = P(X ∈ {2, 8}) = 2/9
P(Y = 4) = P(X ∈ {1, 9}) = 2/9
Theorem
Given X with discrete p.f. fX, the p.f. of the random variable Y = r(X), for a function r(·) of X, is
gY(y) = P(r(X) = y) = Σ_{x: r(x)=y} fX(x)
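This sum over {x : r(x) = y} is a few lines of code; a sketch for the example above with Y = |X − 5|:

```python
from fractions import Fraction

# X uniform on {1, ..., 9}; build the p.f. of Y = |X - 5| by summing
# f_X(x) over all x with r(x) = y
fX = {x: Fraction(1, 9) for x in range(1, 10)}
gY = {}
for x, p in fX.items():
    y = abs(x - 5)
    gY[y] = gY.get(y, Fraction(0)) + p

print(dict(sorted(gY.items())))
# {0: 1/9, 1: 2/9, 2: 2/9, 3: 2/9, 4: 2/9}
```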
Random Variables of Continuous distribution
Example
Let Z be the rate at which customers are served in a queue. Assume the p.d.f. of Z is fZ(z) = λe^{−λz}, z ⩾ 0. What is the distribution of the average waiting time Y = 1/Z?
FY(y) = P(1/Z ⩽ y) = P(Z ⩾ 1/y) = 1 − FZ(1/y)
= 1 − (1 − e^{−λ/y})
= e^{−λ/y}, y > 0
In general, for any function r(·), one can compute the c.d.f.
FY(y) = P(Y ⩽ y) = P(r(X) ⩽ y) = ∫_{x: r(x)⩽y} fX(x) dx
and the p.d.f. at the points where FY is differentiable:
fY(y) = dFY(y)/dy
The probability integral transformation
Example
Let X be a variable with fX(x) = λe^{−λx}, x ⩾ 0. The c.d.f. of X is:
FX(x) = ∫_0^x λe^{−λy} dy = 1 − e^{−λx}
What is the c.d.f. of the transformed variable Y = FX(X), i.e. r(·) = FX(·)?
P(Y ⩽ y) = P(FX(X) ⩽ y) = P(X ⩽ FX^{−1}(y)) = FX(FX^{−1}(y)) = y, 0 < y < 1
Thus Y is uniform on [0, 1].
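A simulation sketch of this transform (λ = 3 and the seed are assumptions): samples of FX(X) should look uniform, with mean near 1/2 and P(Y ⩽ 0.3) near 0.3.

```python
import math
import random

random.seed(2)
lam = 3.0  # rate parameter (assumption: lambda = 3)

# Sample X ~ Exponential(lam), then transform Y = F_X(X) = 1 - e^{-lam X}
n = 200_000
ys = []
for _ in range(n):
    x = random.expovariate(lam)
    ys.append(1 - math.exp(-lam * x))

mean_y = sum(ys) / n
p_below_03 = sum(1 for y in ys if y <= 0.3) / n
print(round(mean_y, 3), round(p_below_03, 3))  # near 0.5 and 0.3
```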
Probability Integral Transformation
Theorem
Let X have a continuous c.d.f. FX and let Y = FX(X). The distribution of Y is uniform on [0, 1]:
P(Y ⩽ y) = P(FX(X) ⩽ y) = P(X ⩽ FX^{−1}(y)) = FX(FX^{−1}(y)) = y
Random variable sampling
In general, numerical algorithms are set up to generate uniform random variables on [0, 1].
In order to generate a variable with a distribution different from the uniform, the most common solution is to apply the inverse c.d.f.
Corollary
Let Y have the uniform distribution on [0, 1] and let F be a continuous c.d.f. with quantile function F^{−1}. Then X = F^{−1}(Y) has c.d.f. F.
Random Variables of Continuous distribution Sampling
Example
Let Z be the rate at which customers are served in a queue. Assume the p.d.f. of Z is fZ(z) = λe^{−λz}, z ⩾ 0. What is the inverse c.d.f. of Z, and how would we draw random samples of Z? First compute the c.d.f. of Z:
FZ(z) = ∫_0^z λe^{−λx} dx = 1 − e^{−λz}
The inverse of the c.d.f., FZ^{−1}: [0, 1) → [0, ∞), is
FZ^{−1}(y) = −log(1 − y)/λ
See jupyter notebook for the code.
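The notebook itself is not reproduced here, but a minimal version of the inverse-c.d.f. sampler might look like this (λ = 2 and the seed are assumptions):

```python
import math
import random

random.seed(3)
lam = 2.0  # rate parameter (assumption: lambda = 2)

def inverse_cdf(y, lam):
    # F_Z^{-1}(y) = -log(1 - y) / lam, the quantile function of Exponential(lam)
    return -math.log(1 - y) / lam

# Draw uniforms on [0, 1) and push them through the quantile function
n = 200_000
samples = [inverse_cdf(random.random(), lam) for _ in range(n)]

sample_mean = sum(samples) / n
print(round(sample_mean, 2))  # the Exponential(lam) mean is 1/lam = 0.5
```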
Direct Derivation of p.d.f when r is One to One and
differentiable.
Let r(·) be a differentiable one-to-one function on the open interval (a, b).
▶ Then r is either strictly increasing or strictly decreasing.
▶ The inverse of r, denoted s, exists on the image of (a, b).
▶ The derivative of s is related to the derivative of r by
ds(y)/dy = 1/r′(s(y))
Theorem
Let X be a random variable whose p.d.f. f is concentrated on (a, b). Let Y = r(X) with r one-to-one and differentiable on (a, b). Let (α, β) be the image of the interval (a, b) under r, and denote s(y) = r^{−1}(y). Then the p.d.f. of Y is
gY(y) = f(s(y)) |ds(y)/dy| for α < y < β, and 0 otherwise.
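Applying this theorem to the earlier waiting-time example Y = 1/Z with Z exponential: s(y) = 1/y and |s′(y)| = 1/y², so gY(y) = λe^{−λ/y}/y². A sketch checking this numerically (λ = 2, the seed, and the interval [1, 2] are assumptions):

```python
import math
import random

random.seed(4)
lam = 2.0  # rate parameter (assumption: lambda = 2)

# Change of variables for Y = 1/Z, Z ~ Exponential(lam):
# s(y) = 1/y, |s'(y)| = 1/y^2, so g_Y(y) = lam * e^{-lam/y} / y^2
def g_Y(y):
    return lam * math.exp(-lam / y) / y**2

# Numeric integral of g_Y over [1, 2] (midpoint rule)
m = 10_000
h = 1.0 / m
p_formula = sum(g_Y(1 + (i + 0.5) * h) for i in range(m)) * h

# Monte Carlo estimate of P(1 <= Y <= 2)
n = 200_000
hits = sum(1 for _ in range(n) if 1 <= 1 / random.expovariate(lam) <= 2)
p_mc = hits / n

print(round(p_formula, 3), round(p_mc, 3))
```

Both estimates agree with the exact value FY(2) − FY(1) = e^{−1} − e^{−2}.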
3.9 Functions of Two or more Random variables
Example
Three different firms advertize their holdings in 10 different
companies. Denote each company with Xi taking values 1 if
outperforms and zero if not. We are interested in
Y1 = X1 + · · · + X10
Y2 = X11 + · · · + X20
Y3 = X21 + · · · + X30
Assume that underperform and outperform is equally likely
What is the probability that firm Y1 = 3, Y2 = 5, Y3 = 8
Given n independent variable with Bernoulli and parameter p. The
sum Y = X1 + · · · + Xn is a binomial with parameter n and p.
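Since the three sums are independent Binomial(10, 1/2) variables, the joint probability factors into a product of binomial p.f. values; a short sketch:

```python
from math import comb

# P(Y = k) for a Binomial(n = 10, p = 1/2) sum of Bernoulli indicators
def binom_pmf(k, n=10, p=0.5):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The three firms' sums are independent, so the joint probability factors
p_joint = binom_pmf(3) * binom_pmf(5) * binom_pmf(8)
print(p_joint)  # C(10,3) * C(10,5) * C(10,8) / 2**30
```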
Random Variables with Continuous joint distribution
Example
Consider 2 customers in a queue whose waiting times are given by 2 independent random variables X1, X2, each with p.d.f. 2e^{−2x}. Since they leave together, they are interested in the total wait time Y = X1 + X2.
P(Y ⩽ y) = P(X1 + X2 ⩽ y)
= ∫∫_{x1+x2⩽y} 2e^{−2x1} · 2e^{−2x2} dx1 dx2
= ∫_0^y ( ∫_0^{y−x2} 2e^{−2x1} dx1 ) 2e^{−2x2} dx2
= 1 − e^{−2y} − 2ye^{−2y}
gY(y) = dP(Y ⩽ y)/dy = 4ye^{−2y}
Linear function of 2 variables
Theorem
Let X1, X2 have joint p.d.f. f(x1, x2) and let Y = a1X1 + a2X2 + b, with a1 ≠ 0. Then Y has a continuous distribution with p.d.f.
gY(y) = ∫_{−∞}^{∞} f((y − b − a2x2)/a1, x2) · (1/|a1|) dx2
Convolution
Definition
Given X1, X2 independent continuous random variables with p.d.f.s f1, f2, define Y = X1 + X2. The distribution of Y is called the convolution of f1 and f2, and its p.d.f. is
gY(y) = ∫_{−∞}^{∞} f1(y − z) f2(z) dz
For the i.i.d. waiting times with p.d.f. 2e^{−2x}, the distribution of the sum Y = X1 + X2 is
gY(y) = ∫_0^y 2e^{−2(y−z)} · 2e^{−2z} dz = 4ye^{−2y}
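The convolution result can be checked by simulation (the seed and the evaluation point y = 1 are illustrative): the c.d.f. implied by gY(y) = 4ye^{−2y} is 1 − e^{−2y}(1 + 2y).

```python
import math
import random

random.seed(5)

# Y = X1 + X2 with X1, X2 i.i.d. of p.d.f. 2 e^{-2x}; the convolution gives
# g_Y(y) = 4 y e^{-2y}, whose c.d.f. is F_Y(y) = 1 - e^{-2y} (1 + 2y)
def F_Y(y):
    return 1 - math.exp(-2 * y) * (1 + 2 * y)

# Monte Carlo estimate of P(Y <= 1)
n = 200_000
hits = sum(1 for _ in range(n)
           if random.expovariate(2.0) + random.expovariate(2.0) <= 1.0)
p_mc = hits / n

print(round(p_mc, 3), round(F_Y(1.0), 3))
```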