Master Probability - Lecture Notes - Quizzes & Solutions
MASTER PROBABILITY
LECTURE NOTES - QUIZZES WITH SOLUTIONS
EXCLUSIVE EDITION
WWW.AYOUBB.COM
ADVANCED PROBABILITY & STATISTICS
What to expect?
CONTENT
Chapter I
1. Probabilistic Models and Axioms
Probabilistic Models
Elements of a Probabilistic Model
The sample space
The sample space Ω is the set of all possible outcomes of an experiment.
Figure 1. Two rolls of a tetrahedral die. Discrete example: sample space vs. sequential description.
Exercise 1:
For the experiment of flipping a coin, and for each one of the following
choices, determine whether we have a legitimate sample space:
Exercise 2:
Paul checks the weather forecast. If the forecast is good, Paul will go out
for a walk. If the forecast is bad, then Paul will either stay home or go out. If
he goes out, he might either remember or forget his umbrella. In the tree
diagram below, identify the leaf that corresponds to the event that the
forecast is bad and Paul stays home.
Probability axioms
1. (Nonnegativity) P(A) ≥ 0, for every event A.
2. (Additivity) If A and B are two disjoint events, then the probability of their union satisfies P(A ∪ B) = P(A) + P(B).
Exercise 3:
Let A and B be events on the same sample space, with P(A)=0.6
and P(B)=0.7. Can these two events be disjoint?
No
Exercise 4:
Let A, B, and C be disjoint subsets of the sample space. For each one of the
following statements, determine whether it is true or false. Note: "False"
means "not guaranteed to be true."
a) P(A) + P(Aᶜ) + P(B) = P(A ∪ Aᶜ ∪ B)
b) P(A) + P(B) ≤ 1
c) P(Aᶜ) + P(B) ≤ 1
d) P(A ∪ B ∪ C) ≥ P(A ∪ B)
False, True, False, True
Exercise 5:
Let A, B, and C be subsets of the sample space, not necessarily disjoint. For
each one of the following statements, determine whether it is true or
false. Note: “False” means “not guaranteed to be true.”
a) P((A ∩ B) ∪ (C ∩ Aᶜ)) ≤ P(A ∪ B ∪ C)
b) P(A ∪ B ∪ C) = P(A ∩ Cᶜ) + P(C) + P(B ∩ Aᶜ ∩ Cᶜ)
True, True
Exercise 6:
Consider the same model of two rolls of a tetrahedral die, with all 16
outcomes equally likely. Find the probability of the following events:
a) The value in the first roll is strictly larger than the value in the
second roll.
b) The sum of the values obtained in the two rolls is an even number.
0.375, 0.5
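Both answers can be checked by enumerating the 16 equally likely outcomes directly. A minimal Python sketch:

```python
from itertools import product

# all 16 equally likely outcomes of two rolls of a tetrahedral die
outcomes = list(product(range(1, 5), repeat=2))
p_first_larger = sum(1 for a, b in outcomes if a > b) / len(outcomes)
p_even_sum = sum(1 for a, b in outcomes if (a + b) % 2 == 0) / len(outcomes)
print(p_first_larger, p_even_sum)  # 0.375 0.5
```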
Exercise 7:
Consider a sample space that is the rectangular region [0,1] × [0,2] i.e., the
set of all pairs (x,y) that satisfy 0≤x≤1 and 0≤y≤2. Consider a “uniform"
probability law, under which the probability of an event is half of the area of
the event. Find the probability of the following events:
Exercise 8:
Let the sample space be the set of positive integers and suppose that P(n) = 1/2ⁿ, for n = 1, 2, …. Find the probability of the set {3, 6, 9, …}, that is, of the set of positive integers that are multiples of 3.
0.1428
Exercise 9:
Let the sample space be the set of all positive integers. Is it possible to
have a “uniform" probability law, that is, a probability law that assigns the
same probability c to each positive integer?
No
Exercise 10:
Countably infinite sample space
Let the sample space be the two-dimensional plane. For any real number x, let Ax be the subset of the plane that consists of all points on the vertical line through the point (x,0), i.e., Ax = {(x,y) : y ∈ ℝ}.
Problem 1:
Venn diagrams
In this problem, you are given descriptions in words of certain events (e.g.,
"at least one of the events A,B,C occurs"). For each one of these
descriptions, identify the correct symbolic description in terms of A,B,C
from Events E1-E7 below. Also identify the correct description in terms of
regions (i.e., subsets of the sample space Ω) as depicted in the Venn
diagram below. (For example, Region 1 is the part of A outside of B and C.)
Symbolic descriptions:
E1: A ∩ B ∩ C
E2: (A ∩ B ∩ C)ᶜ
E3: A ∩ B ∩ Cᶜ
E4: B ∪ (Bᶜ ∩ Cᶜ)
E5: Aᶜ ∩ Bᶜ ∩ Cᶜ
Problem 2:
Set operations and probabilities
3. P(Aᶜ ∩ (Bᶜ ∪ Cᶜ)) = 0.7
P(A ∪ (Bᶜ ∪ Cᶜ)ᶜ) = ?
By De Morgan's laws, Aᶜ ∩ (Bᶜ ∪ Cᶜ) = (A ∪ (Bᶜ ∪ Cᶜ)ᶜ)ᶜ, so P(A ∪ (Bᶜ ∪ Cᶜ)ᶜ) = 1 − 0.7 = 0.3
Problem 3:
Three tosses of a fair coin
You flip a fair coin (i.e., the probability of obtaining Heads is 1/2) three times.
Assume that all sequences of coin flip results, of length 3, are equally likely.
Determine the probability of each of the following events.
1. {HHH}: 3 Heads
2. {HTH}: the sequence Heads, Tails, Heads
3. Any sequence with 2 Heads and 1 Tail (in any order):
4. Any sequence in which the number of Heads is greater than or
equal to the number of Tails:
0.125, 0.125, 0.375, 0.5
Problem 4:
Parking lot problem
Mary and Tom park their cars in an empty parking lot with n ≥ 2 consecutive parking spaces (i.e., n spaces in a row, where only one car fits in each space). Mary and Tom pick parking spaces at random; of course, they must each choose a different space. (All pairs of distinct parking spaces are equally likely.) What is the probability that there is at most one empty parking space between them?
(4·n − 6)/(n² − n)
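The formula can be checked by brute force over all ordered pairs of distinct spaces; "at most one empty space between them" means the chosen positions differ by at most 2. A small verification sketch:

```python
from itertools import permutations

def prob_at_most_one_gap(n):
    pairs = list(permutations(range(n), 2))  # ordered pairs of distinct spaces
    good = sum(1 for a, b in pairs if abs(a - b) <= 2)
    return good / len(pairs)

for n in range(2, 8):
    print(n, prob_at_most_one_gap(n), (4 * n - 6) / (n * n - n))  # columns agree
```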
Problem 5:
Probabilities on a continuous sample space
Alice and Bob each choose at random a real number between zero and
one. We assume that the pair of numbers is chosen according to the
uniform probability law on the unit square, so that the probability of an
event is equal to its area.
P(A)=?
P(B)=?
P(A∩B)=?
P(C)=?
P(D)?
P(A∩D)=?
0.4444, 15/16, 0.4444, 0, 0.75, 0.30903
Problem 6:
Upper and lower bounds on the probability of intersection
Given two events A, B with P(A)=3/4 and P(B)=1/3, what is the smallest possible value of P(A∩B)? The largest? That is, find a and b such that
a ≤ P(A∩B) ≤ b
Since P(A∪B) ≤ 1, we get P(A∩B) = P(A) + P(B) − P(A∪B) ≥ 3/4 + 1/3 − 1 = 1/12; and since A∩B ⊆ B, P(A∩B) ≤ 1/3. So a = 1/12 and b = 1/3.
Chapter II
2. Conditioning and Bayes' rule
The conditional probability of an event A, given an event B with P(B) > 0, is defined by
P(A∣B) = P(A∩B)/P(B)
and specifies a new (conditional) probability law on the same sample space Ω. In particular, all properties of probability laws remain valid for conditional probability laws.
If the possible outcomes are finitely many and equally likely, then:
P(A∣B) = (number of elements of A∩B) / (number of elements of B)
Exercise 1:
Are the following statements true or false?
True, False
Exercise 2:
Conditional probabilities in a continuous model
Let the sample space be the unit square, Ω=[0,1]2, and let the probability of
a set be the area of the set. Let A be the set of points (x,y)∈[0,1]2 for
which y≤x. Let B be the set of points for which x≤1/2. Find P(A∣B).
¼
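The answer P(A∣B) = (1/8)/(1/2) = 1/4 can also be approximated by Monte Carlo sampling of the unit square; a minimal sketch:

```python
import random

random.seed(0)
hits_B = hits_AB = 0
for _ in range(10**6):
    x, y = random.random(), random.random()  # uniform point in the unit square
    if x <= 0.5:        # event B
        hits_B += 1
        if y <= x:      # event A, intersected with B
            hits_AB += 1
print(hits_AB / hits_B)  # close to 0.25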
Exercise 3:
The multiplication rule
Are the following statements true or false? (Assume that all conditioning
events have positive probability.)
1. P(A∩B∩Cᶜ) = P(A∩B) P(Cᶜ∣A∩B)
2. P(A∩B∩Cᶜ) = P(A) P(Cᶜ∣A) P(B∣A∩Cᶜ)
3. P(A∩B∩Cᶜ) = P(A) P(B∣A) P(Cᶜ∣A∩B)
4. P(A∩B∣C) = P(A∣C) P(B∣A∩C)
True, True, True, True
P(B) = P(A₁∩B) + ⋯ + P(Aₙ∩B) = P(A₁)P(B∣A₁) + ⋯ + P(Aₙ)P(B∣Aₙ)
Exercise 4:
We have an infinite collection of biased coins, indexed by the positive integers. Coin i has probability 2⁻ⁱ of being selected. A flip of coin i results in Heads with probability 3⁻ⁱ. We select a coin and flip it. What is the probability that the result is Heads? The geometric sum formula may be useful here:
∑_{i=1}^{∞} αⁱ = α/(1−α), when |α| < 1
1/5
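By the total probability theorem, P(Heads) = ∑_{i≥1} 2⁻ⁱ·3⁻ⁱ = ∑_{i≥1} 6⁻ⁱ = (1/6)/(1 − 1/6) = 1/5, using the geometric sum formula above. A quick numerical check (truncating the rapidly decaying sum):

```python
# terms decay like 6**-i, so truncating at i = 60 is more than enough
p_heads = sum(2**-i * 3**-i for i in range(1, 60))
print(p_heads)  # 0.2
```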
Exercise 5:
A test for a certain rare disease is assumed to be correct 95% of the time: if
a person has the disease, the test result is positive with probability 0.95,
and if the person does not have the disease, the test result is negative with
probability 0.95. A person drawn at random from a certain population has
probability 0.001 of having the disease.
2.4. Independence
Two events A and B are said to be independent if:
𝑃(𝐴∩𝐵) = 𝑃(𝐴)𝑃(𝐵)
If in addition P(B) > 0, independence is equivalent to P(A∣B) = P(A).
A and B are conditionally independent, given an event C with P(C) > 0, if P(A∩B∣C) = P(A∣C)P(B∣C);
if in addition P(B∩C) > 0, this is equivalent to P(A∣B∩C) = P(A∣C).
Exercise 6:
We have a peculiar coin. When tossed twice, the first toss results in Heads
with probability 1/2. However, the second toss always yields the same
result as the first toss. Thus, the only possible outcomes for a sequence of
2 tosses are HH and TT, and both have equal probabilities. Are the two
events A={Heads in the first toss} and B={Heads in the second toss}
independent?
No, they are not independent: P(A∩B) = 1/2, while P(A)P(B) = 1/4.
Exercise: When is an event A independent of itself?
1. Always
2. If and only if P(A)=0
3. If and only if P(A)=1
4. If and only if P(A) is either 0 or 1
4
Exercise 7:
Conditional independence
Yes, No
Exercise 8:
Reliability
16/27, 22/27
Problem 1:
Two five-sided dice
You roll two five-sided dice. The sides of each die are numbered from 1 to
5. The dice are “fair"" (all sides are equally likely), and the two die rolls are
independent.
Part (a): Event A is “the total is 10" (i.e., the sum of the results of the two die
rolls is 10).
Problem 2:
A reliability problem
Problem 3:
Oscar's lost dog in the forest
Oscar has lost his dog in either forest A (with probability 0.4) or in forest B
(with probability 0.6).
If the dog is in forest A and Oscar spends a day searching for it in forest A,
the conditional probability that he will find the dog that day is 0.25.
Similarly, if the dog is in forest B and Oscar spends a day looking for it
there, he will find the dog that day with probability 0.15.
The dog cannot go from one forest to the other. Oscar can search only in
the daytime, and he can travel from one forest to the other only overnight.
The dog is alive during day 0, when Oscar loses it, and during day 1, when
Oscar starts searching. It is alive during day 2 with probability 2/3. In
general, for n≥1, if the dog is alive during day n−1, then the probability it is
alive during day n is 2/(n+1). The dog can only die overnight. Oscar stops
searching as soon as he finds his dog, either alive or dead.
1. In which forest should Oscar look on the first day of the search to
maximize the probability he finds his dog that day?
2. Oscar looked in forest A on the first day but didn't find his dog.
What is the probability that the dog is in forest A?
3. Oscar flips a fair coin to determine where to look on the first day
and finds the dog on the first day. What is the probability that he
looked in forest A?
4. Oscar decides to look in forest A for the first two days. What is the
probability that he finds his dog alive for the first time on the
second day?
5. Oscar decides to look in forest A for the first two days. Given that
he did not find his dog on the first day, find the probability that he
does not find his dog dead on the second day.
6. Oscar finally finds his dog on the fourth day of the search. He
looked in forest A for the first 3 days and in forest B on the fourth
day. Given this information, what is the probability that he found his
dog alive?
Forest A, 0.3333, 0.5263, 0.05, 0.9722, 0.1333
Problem 4:
Serap and her umbrella
Before leaving for work, Serap checks the weather report in order to
decide whether to carry an umbrella. On any given day, with
probability 0.2 the forecast is “rain" and with probability 0.8 the forecast is
“no rain". If the forecast is “rain", the probability of actually having rain on
that day is 0.8. On the other hand, if the forecast is “no rain", the probability
of actually raining is 0.1.
1. One day, Serap missed the forecast and it rained. What is the
probability that the forecast was “rain"?
2. Serap misses the morning forecast with probability 0.2 on any day
in the year. If she misses the forecast, Serap will flip a fair coin to
decide whether to carry an umbrella. (We assume that the result of
the coin flip is independent from the forecast and the weather.) On
any day she sees the forecast, if it says “rain" she will always carry
an umbrella, and if it says “no rain" she will not carry an umbrella.
Let U be the event that “Serap is carrying an umbrella", and let N be
the event that the forecast is “no rain". Are
events U and N independent?
3. Serap is carrying an umbrella and it is not raining. What is the
probability that she saw the forecast?
0.67, No, 8/27
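Part 1 is a direct application of Bayes' rule with the numbers given in the statement; a minimal sketch:

```python
# P(forecast "rain" | it rained), from the problem's numbers
p_forecast_rain = 0.2
p_rain = {"rain": 0.8, "no rain": 0.1}   # P(rain | forecast)
num = p_forecast_rain * p_rain["rain"]
den = num + (1 - p_forecast_rain) * p_rain["no rain"]
print(num / den)  # 0.666..., i.e. the 0.67 above
```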
Chapter III
3. Counting
(a) There are n₁ possible results at the first stage.
(b) For every possible result at the first stage, there are n₂ possible results at the second stage.
(c) More generally, for any sequence of possible results at the first i − 1 stages, there are nᵢ possible results at the ith stage.
Then, the total number of possible results of the r-stage process is n₁n₂⋯nᵣ.
Exercise 1:
You are given the set of letters {A, B, C, D, E}.
How many five-letter strings can be made if we require that each letter
appears exactly once and the letters A and B are next to each other, as
either “AB" or “BA"? (Hint: Think of a sequential way of producing such a
string.)
60, 32, 48
Exercise: Counting
You are given the set of letters {A, B, C, D, E}. What is the probability that in
a random five-letter string (in which each letter appears exactly once, and
with all such strings equally likely) the letters A and B are next to each
other? The answer to a previous exercise may also be useful here.
• Partitions of n objects into r groups, with the ith group having nᵢ objects:
(n choose n₁, n₂, …, nᵣ) = n! / (n₁! n₂! ⋯ nᵣ!)
c = ∑_{k=1}^{n} k·(n choose k)
Find the value of c (as a function of n) by thinking about a different way of forming a chaired committee: first choose the chairperson, then choose the other members of the committee. The answer is of the form
c = α + n^β · 2^{γn+δ}
What are the values of α, β, γ, and δ?
0, 1, 1, -1
Find the value of ∑_{k=0}^{n} (n choose k) pᵏ(1−p)ⁿ⁻ᵏ. By the binomial theorem, this equals (p + (1−p))ⁿ = 1.
Having given 2 items to Alice, we now give 3 items to Bob. This can be done in (e choose f) ways. Find e and f. (There are 2 possible values of f that are correct. Enter the smaller value.)
9, 2, 9, 2, 7, 3
a) (6 choose 2)·(1/4)²·(3/4)⁴
b) (6 choose 2)·(1/4)²
c) (6 choose 2)·(1/4)²·(6 choose 4)·(3/4)⁴
d) (6 choose 2)·(1/4)⁴·(3/4)²
We are told that exactly three of the rolls resulted in a 1 and exactly three rolls resulted in a 2. Given this information, find the probability that the six rolls resulted in the sequence (1,2,1,2,1,2).
Note: Your answer should be a number. Do not enter "!" or combinations in your answer.
Given the conditioning event, all (6 choose 3) = 20 orderings of three 1's and three 2's are equally likely, so the answer is 1/20 = 0.05.
For the remainder of the problem, assume that Alice and Bob are among
the ten people being considered.
1. n·(2n choose n) = 2n·(2n−1 choose n−1)
2. (2n choose n) = ∑_{i=0}^{n} (n choose i)² = ∑_{i=0}^{n} (n choose i)(n choose n−i)
3. 2²ⁿ = ∑_{i=0}^{2n} (2n choose i)
4. n·2ⁿ⁻¹ = ∑_{i=0}^{n} (n choose i)·i
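All four identities can be spot-checked numerically with math.comb; a small sketch:

```python
from math import comb

for n in range(1, 8):
    assert n * comb(2 * n, n) == 2 * n * comb(2 * n - 1, n - 1)
    assert comb(2 * n, n) == sum(comb(n, i) ** 2 for i in range(n + 1))
    assert 2 ** (2 * n) == sum(comb(2 * n, i) for i in range(2 * n + 1))
    assert n * 2 ** (n - 1) == sum(comb(n, i) * i for i in range(n + 1))
print("all four identities hold for n = 1..7")
```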
Now assume, in addition, that every hat thrown into the box has
probability p of getting dirty (independently of what happens to the other
hats or who has dropped or picked it up). Find the probability that:
Chapter IV
4. Discrete Random Variables
4.1. Main Concepts Related to Random Variables
Starting with a probabilistic model of an experiment:
1. Collect all the possible outcomes that give rise to the event {X = x}.
For each of the two cases below, indicate a statement that is true.
a) 𝐼𝐴 + 𝐼𝐵:
b) 𝐼𝐴 · 𝐼𝐵:
p_X(2.5) = ?
p_X(1) = ?
0, 80/243
P(X ≥ 10) = ?
(1 − p)⁹
4.4. Expectation
We define the expected value (also called the expectation or the mean) of
a random variable X, with PMF pX, by
E[X] = ∑_x x·p_X(x)
E[g(X)] = ∑_x g(x)·p_X(x)
𝐸[8 − 𝑋] =?
4.6. Variance
The variance var(X) of a random variable X is defined by
var(X) = E[(X − E[X])²]
var(X) = ∑_x (x − E[X])² p_X(x)
var(2 − 3X) = ?
18
Var(X) = ?
n·(n+2)/3
𝑝𝑋,𝑌(𝑥, 𝑦) = 𝑃(𝑋 = 𝑥, 𝑌 = 𝑦)
• The marginal PMFs of X and Y can be obtained from the joint PMF, using the formulas
p_X(x) = ∑_y p_{X,Y}(x,y),  p_Y(y) = ∑_x p_{X,Y}(x,y)
• The above have natural extensions to the case where more than two
random variables are involved.
𝑝𝑉,𝑊(𝑣, 𝑤) = 𝑐⋅(𝑣 + 𝑤)
b) Find pV(1)
1/9, 2/3
a) E[X²] = ∑_x x·p_X(x²)
b) E[X²] = ∑_x x²·p_X(x)
c) E[X²] = ∑_x x²·p_{X²}(x)
d) E[X²] = ∑_x x²·p_{X,Y}(x,y)
e) E[X²] = ∑_x ∑_y x²·p_{X,Y}(x,y)
f) E[X²] = ∑_z z·p_{X²}(z)
E[X₁ + 2X₂ − 3X₃] = ?
-4
We toss coin A until Heads is obtained for the first time. We then toss coin
B until Heads is obtained for the first time with coin B.
p_{X∣A}(x) = P(X = x ∣ A)
and satisfies
∑_x p_{X∣A}(x) = 1
• If A1, . . . , An are disjoint events that form a partition of the sample space,
with P(Ai) > 0 for all i, then
p_X(x) = ∑_{i=1}^{n} P(Aᵢ) p_{X∣Aᵢ}(x)
p_{X,Y}(x,y) = p_Y(y) p_{X∣Y}(x∣y)
p_X(x) = ∑_y p_Y(y) p_{X∣Y}(x∣y)
• There are natural extensions of the above involving more than two
random variables.
a) p_{X,Y,Z}(x,y,z) = p_Y(y) p_{Z∣Y}(z∣y) p_{X∣Y,Z}(x∣y,z)
b) p_{X,Y∣Z}(x,y∣z) = p_X(x) p_{Y∣Z}(y∣z)
c) p_{X,Y∣Z}(x,y∣z) = p_{X∣Z}(x∣z) p_{Y∣X,Z}(y∣x,z)
d) ∑_x p_{X,Y∣Z}(x,y∣z) = 1
e) ∑_x ∑_y p_{X,Y∣Z}(x,y∣z) = 1
f) p_{X,Y∣Z}(x,y∣z) = p_{X,Y,Z}(x,y,z) / p_Z(z)
g(x,y,z) = p_{X,Y,Z}(x,y,z) / p_{Y,Z}(y,z)
E[X∣A] = ∑_x x·p_{X∣A}(x)
E[g(X)∣A] = ∑_x g(x)·p_{X∣A}(x)
E[X∣Y=y] = ∑_x x·p_{X∣Y}(x∣y)
Furthermore, for any event B with P(Aᵢ ∩ B) > 0 for all i, we have
E[X∣B] = ∑_{i=1}^{n} P(Aᵢ∣B) E[X∣Aᵢ∩B]
• We have
E[X] = ∑_y p_Y(y) E[X∣Y=y]
5) E[g(X,Y)∣Y=2] = ∑_x g(x,2) · p_{X,Y}(x,2)/p_Y(2)
• X and Y are independent if for all pairs (x, y), the events {X = x} and {Y = y} are independent, or equivalently p_{X,Y}(x,y) = p_X(x) p_Y(y) for all x and y. In this case,
E[XY] = E[X]E[Y]
Furthermore, for any functions g and h, the random variables g(X) and h(Y) are independent, and we have
E[g(X)h(Y)] = E[g(X)]E[h(Y)]
var(X + Y) = var(X) + var(Y)
Exercise: Independence
Let X, Y, and Z be discrete random variables.
1. E[X/Y] = E[X]/E[Y]
2. E[X/Y] = E[X]·E[1/Y]
False, True
a) Find E[XY].
Let A be the event that there are 6 Heads in the first 8 tosses. Let B be the
event that the 9th toss results in Heads.
Find the probability that there are 3 Heads in the first 4 tosses and 2 Heads
in the last 3 tosses. Express your answer in terms of p using standard
notation. Remember not to use ! or combinations in your answer.
Given that there were 4 Heads in the first 7 tosses, find the probability that
the 2nd Heads occurred at the 4th toss. Give a numerical answer.
P(X=0)=?
P(X=1)=?
P(X=−2)=?
P(X=3)=?
E[X]=?
Var(X)=?
P(Y=0)=?
P(Y=1)=?
P(Y=2)=?
1/3, 2/9, 1/9, 0, 0, 4/3, 1/3, 4/9, 0
Find P(Y<X).
Find P(Y=X).
P(X=1)=?
P(X=2)=?
P(X=3)=?
P(X=4)=?
Find pX(1).
If k ≥ 1, ℓ ≥ 2, and k + ℓ ≤ n, then E[Iₖ Iₖ₊ₗ] = ?
Mid-term exam
True or False
Let A, B, and C be events associated with the same probabilistic model (i.e.,
subsets of a common sample space), and assume that P(C)>0.
For each one of the following statements, decide whether the statement is
True (always true), or False (not always true).
Expectation 1
Compute E(X) for the following random variable X:
X=Number of tosses until getting 4 (including the last toss) by tossing a fair
10-sided die.
10
Expectation 2
Compute E(X) for the following random variable X:
X=Number of tosses until all 10 numbers are seen (including the last toss)
by tossing a fair 10-sided die.
To answer this, we will use induction and follow the steps below:
Find E(10).
For i = 0, 1, …, 9, E(i) = E(i+1) + f(i), where f(i) = ?
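This is the coupon-collector computation: once i distinct numbers have been seen, each toss reveals a new number with probability (10−i)/10, so the natural choice is f(i) = 10/(10−i). A simulation sketch against the closed form (assuming a fair 10-sided die, as stated):

```python
import random

random.seed(1)

def tosses_until_all_seen(sides=10):
    seen, count = set(), 0
    while len(seen) < sides:
        seen.add(random.randint(1, sides))
        count += 1
    return count

est = sum(tosses_until_all_seen() for _ in range(100_000)) / 100_000
exact = sum(10 / (10 - i) for i in range(10))  # sum of the waits f(i) = 10/(10-i)
print(est, exact)  # both close to 29.29
```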
Conditional Independence 1
Suppose that we have a box that contains two coins:
A fair coin: P(H)=P(T)=0.5.
A two-headed coin: P(H)=1.
A coin is chosen at random from the box, i.e. either coin is chosen with
probability 1/2, and tossed twice. Conditioned on the identity of the coin,
the two tosses are independent.
For the following statements, decide whether they are true or false.
a) A and B are independent.
b) A and C are independent.
c) A and B are independent given D.
d) A and C are independent given D.
False, False, True, False
Conditional Independence 2
Suppose three random variables X, Y, Z have a joint distribution
P_{X,Y,Z}(x,y,z) = P_X(x) P_{Z∣X}(z∣x) P_{Y∣Z}(y∣z)
var(I_A − I_B) = ?
p − 2r + q − (p − q)²
Joint PMF
Write down an expression for the joint PMF p_{N,K}(n,k).
For n = 1, 2, … and k = 1, 2, …, 2n:
1/(4·2ⁿ)
Marginal Distribution
Find the marginal PMF pK(k) as a function of k. For simplicity, provide the
answer only for the case when k is an even number. (The formula for
when k is odd would be slightly different, and you do not need to provide it).
Hint: You may find the following helpful: ∑_{i=0}^{∞} rⁱ = 1/(1−r) for 0 < r < 1.
For k = 2, 4, 6, … :
p_K(k) = ?
(1/2)^{(k/2)+1}
Discrete PMFs
Let A be the event that K is even. Find P(A|N=n) and P(A).
P(A∣N=n)=?
P(A)=?
½, ½
Independence 2
Is the event A independent of N?
Yes
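A numeric sanity check of the PMF reconstructed above, p_{N,K}(n,k) = 1/(4·2ⁿ) for k = 1, …, 2n: it sums to 1, and conditioned on any N = n exactly half of the k values are even, which is why A and N come out independent. A sketch:

```python
def p(n, k):
    # joint PMF as reconstructed above (an assumption of this check)
    return 1 / (4 * 2**n) if 1 <= k <= 2 * n else 0.0

total = sum(p(n, k) for n in range(1, 60) for k in range(1, 2 * n + 1))
p_even_given_3 = sum(p(3, k) for k in (2, 4, 6)) / sum(p(3, k) for k in range(1, 7))
print(total, p_even_given_3)  # ~1.0 and 0.5
```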
Chapter V
5. Continuous Random Variables
5.1. Summary of PDF Properties
Let X be a continuous random variable with PDF f_X.
f_X(x) ≥ 0 for all x,  ∫_{−∞}^{∞} f_X(x) dx = 1
P(X ∈ B) = ∫_B f_X(x) dx
Exercise: PDFs
Let X be a continuous random variable with a PDF of the form
c=?
P(X=1/2)=?
P(X ∈ {1/k : k integer, k ≥ 2}) = ?
P(X ≤ 1/2) = ?
2, 0, 0, ¾
c=?
P(1/2≤X≤3/2)=?
¼, 3/8
𝑃(𝑎≤𝑋≤𝑏) =?
𝐸[𝑋] =?
E[X³] = ?
(b−a)/2, 2, 10
The expected value rule for a function g(X) has the form
∞
𝐸[𝑔(𝑋)] = ∫ 𝑔(𝑥)𝑓𝑋(𝑥)𝑑𝑥
−∞
We have
0 ≤ var(X) = E[X²] − (E[X])²
Properties of a CDF
• If X is discrete and takes integer values, the PMF and the CDF can be
obtained from each other by summing or differencing:
F_X(k) = ∑_{i=−∞}^{k} p_X(i),  p_X(k) = P(X≤k) − P(X≤k−1) = F_X(k) − F_X(k−1)
• If X is continuous, the PDF and the CDF can be obtained from each other by integration or differentiation:
F_X(x) = ∫_{−∞}^{x} f_X(t) dt,  f_X(x) = dF_X/dx (x)
(The second equality is valid for those x at which the PDF is continuous.)
𝑃(1≤𝑋≤2) =?
17/2, 0.12
For x > 0, F_X(x) = ?
0, 1 − e^{−2x}
For a normal random variable X with mean µ and variance σ2, we use a
two-step procedure.
(a) "Standardize": subtract μ and divide by σ to obtain the standard normal random variable Y = (X−μ)/σ.
(b) Read the CDF value from the standard normal table:
P(X ≤ x) = P((X−μ)/σ ≤ (x−μ)/σ) = P(Y ≤ (x−μ)/σ) = Φ((x−μ)/σ)
The standard normal table. The entries in this table provide the numerical values of Φ(y) =
P(Y ≤ y), where Y is a standard normal random variable, for y between 0 and 3.49. For
example, to find Φ(1.71), we look at the row corresponding to 1.7 and the column
corresponding to 0.01, so that Φ(1.71) = .9564. When y is negative, the value of Φ(y) can be
found using the formula Φ(y) = 1 − Φ(−y)
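When no printed table is at hand, Φ can be evaluated from the error function; a minimal sketch:

```python
from math import erf, sqrt

def Phi(y):
    # standard normal CDF via the error function
    return 0.5 * (1 + erf(y / sqrt(2)))

print(round(Phi(1.71), 4))      # 0.9564, matching the table entry above
print(round(1 - Phi(1.0), 4))   # Phi(-1) = 0.1587, via the symmetry formula
```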
a) always.
b) if and only if σ≠0.
c) if and only if μ≠0 and σ≠0.
b
P(X≤5.2)=?
P(X≥2.8)=?
P(X≤2.2)=?
0.6554, 0.6554, 0.2743
a) λδ
b) 2λδ
c) δe^{−4λ}
d) λδe^{−4λ}
e) λδe^{−2λ}
f) 2λδe^{−2λ}
g) None of the above
b, d, f
P(X ∈ B ∣ A) = ∫_B f_{X∣A}(x) dx
• Let A1, A2, . . . , An be disjoint events that form a partition of the sample
space, and assume that P(Ai) > 0 for all i. Then,
f_X(x) = ∑_{i=1}^{n} P(Aᵢ) f_{X∣Aᵢ}(x)
The conditional PDF fX|Y (x | y) is defined only for those y for which fY (y) > 0.
• We have
P(X ∈ A ∣ Y = y) = ∫_A f_{X∣Y}(x∣y) dx
𝑓𝑋(9. 5) =?
𝑓𝑋(10. 5) =?
1/8, ½
Find the values of a, b, c, d. Each one of your answers should be one of the
following: 0, x, y, or 1.
0, 1, y, 1
And
E[g(X)∣Y = y] = ∫_{−∞}^{∞} g(x) f_{X∣Y}(x∣y) dx
• Total expectation theorem: Let A1, A2, . . . , An be disjoint events that form
a partition of the sample space, and assume that P(Ai) > 0 for all i. Then,
E[X] = ∑_{i=1}^{n} P(Aᵢ) E[X∣Aᵢ]
Similarly,
E[X] = ∫_{−∞}^{∞} E[X∣Y=y] f_Y(y) dy
• There are natural analogs for the case of functions of several random
variables. For example,
And
1. E[g(Y)∣X = x] = ∫ g(y) f_{Y∣X}(y∣x) dy
2. E[g(Y)∣X = x] = ∫ g(y) f_{Y∣X}(y∣x) dy
6. E[g(X,Y)∣Y = y] = E[g(X,y)∣Y = y]
𝐸[𝑋𝑌] = 𝐸[𝑋]𝐸[𝑌]
Furthermore, for any functions g and h, the random variables g(X) and h(Y )
are independent, and we have
𝐸[𝑔(𝑋)ℎ(𝑌)] = 𝐸[𝑔(𝑋)]𝐸[ℎ(𝑌)]
var(X + Y) = var(X) + var(Y)
Definition of independence
Suppose that X and Y are independent, with a joint PDF that is uniform on a
certain set S: fX,Y(x,y) is constant on S, and zero otherwise. The set S
a) must be a square.
b) must be a set of the form {(x,y) : x∈A, y∈B} (known as the Cartesian product of two sets A and B).
c) can be any set.
b
Yes, Yes
Exercise: Stick-breaking
Consider the same stick-breaking problem as in the previous clip, and
let ℓ=1 Recall that fX,Y(x,y)=1/x when 0≤y≤x≤1.
f_{X,Y}(x,y) = c · exp{−(1/2)(4x² − 8x + y² − 6y + 13)}
E[X]=?
Var(X)?
E[Y]=?
Var(Y)=?
1, ¼, 3, 1
p_X(x) p_{Y∣X}(y∣x) = p_Y(y) p_{X∣Y}(x∣y)
and the terms on the two sides in this relation are both equal to
p_{X,Y}(x, y)
p_X(x) f_{Y∣X}(y∣x) = f_Y(y) p_{X∣Y}(x∣y)
and the terms on the two sides in this relation are both equal to
lim_{δ→0} P(X = x, y ≤ Y ≤ y+δ)/δ
f_X(x) f_{Y∣X}(y∣x) = f_Y(y) f_{X∣Y}(x∣y)
and the terms on the two sides in this relation are both equal to
lim_{δ→0} P(x ≤ X ≤ x+δ, y ≤ Y ≤ y+δ)/δ²
Let K be the total number of Heads in two independent tosses of the coin.
Find pQ|K(3/4|2).
3/8
f_X(x) = { 1/(b−a), if a ≤ x ≤ b; 0, otherwise }
E[X] = (a+b)/2,  var(X) = (b−a)²/12
F_X(x) = { 1 − e^{−λx}, if x ≥ 0; 0, otherwise }
E[X] = 1/λ,  var(X) = 1/λ²
5.11. Problems
Problem 1. Normal random variables
Let X and Y be two normal random variables, with means 0 and 3,
respectively, and variances 1 and 16, respectively. Find the following, using
the standard normal table.
P(X>−1)= ?
P(X≤−2)= ?
E[V]=?
Var(V)=?
P(−2<Y≤2)=?
0.8413, 0.0228, 1/3, 16/9, 0.2957
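These values follow by standardizing and reading Φ, e.g. P(X > −1) = Φ(1) and P(−2 < Y ≤ 2) = Φ((2−3)/4) − Φ((−2−3)/4). A computational check (Φ via the error function):

```python
from math import erf, sqrt

Phi = lambda y: 0.5 * (1 + erf(y / sqrt(2)))
print(round(Phi(1), 4))                                # P(X > -1) = 0.8413
print(round(Phi(-2), 4))                               # P(X <= -2) = 0.0228
print(round(Phi((2 - 3) / 4) - Phi((-2 - 3) / 4), 4))  # ~0.2956 (table: 0.2957)
```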
Problem 2. CDF
For each one of the following figures, identify whether it is a valid CDF. The
value of the CDF at points of discontinuity is indicated with a small solid
circle.
Figures 1-4: candidate CDF plots (not reproduced).
Not a valid CDF, Not a valid CDF, Valid CDF, Not a valid CDF
If 0≤y≤1:
𝑓𝑌(𝑦) =?
If 1<y≤2:
𝑓𝑌(𝑦) =?
4/15, 28/45, (4/45)·(8 − y³), 0.2976
If 0<x<40 and 0<y<3x:
𝑓𝑋,𝑌(𝑥, 𝑦) =?
If y<0 or y>3x:
𝑓𝑋,𝑌(𝑥, 𝑦) =?
P(Z>0)=?
If −40<z<0:
𝑓𝑍(𝑧) =?
If 0<z<80:
𝑓𝑍(𝑧) =?
If z<−40 or z>80:
𝑓𝑍(𝑧) =?
What is E[Z]?
1/2400, 0, 2/3, (40+z)/2400, (80−z)/4800, 0, 40/3
Using Bayes' rule, find the conditional PMF p_{K∣Y}(k∣y). Which of the following is the correct expression for p_{K∣Y}(2∣y), when y ≥ 0?
a) e^{−y/2} / (e^{−y} + e^{−y/2} + (1/3)e^{−y/3})
b) e^{−y} / (e^{−y} + (1/2)e^{−y/2} + (1/3)e^{−y/3})
c) (1/2)e^{−y/2} / (e^{−y} + (1/2)e^{−y/2} + (1/3)e^{−y/3})
d) e^{−y/3} / (e^{−y} + (1/2)e^{−y/2} + (1/3)e^{−y/3})
Are X and Y independent?
Find 𝑓𝑋(𝑥):
● If 0<x≤1
● If 1<x<2
● If x<0 or x≥2
● If y<0 or y>1/2
● If 1/2<x<1
● If 1<x<3/2
● If x<1/2 or x>3/2
Chapter VI
6. Further Topics on Random Variables
6.1. Calculation of the PDF of a Function Y = g(X) of a Continuous Random Variable X
Calculate the CDF F_Y of Y using the formula F_Y(y) = P(g(X) ≤ y), then differentiate. For the linear case
Y = aX + b,
f_Y(y) = (1/|a|) f_X((y−b)/a)
-1, 5
a) always.
b) a≠0.
c) a≠0 and b=0
d) a>0
e) a>0 and b=0
f) a=1
e
a) always.
b) a≠0.
c) a≠0 and b=0
d) a>0
e) a>0 and b=0
b
Let Y = X². For y ≥ 1, the PDF of Y takes the form f_Y(y) = a/y^b. Find the values of a and b.
½, 3/2
1, -1, -1, -2, -1, 1
which is the distance of the outcome (X,Y) from the origin. The PDF of Z, for z ∈ [0,1], takes the form f_Z(z) = a·z^b. Find a and b.
2, 1
∫_a^b h_X(x) h_Y(z − x) dx
a) Is 2X−4 always normal?
b) Is 3X−4Y always normal?
c) Is X2+Y always normal?
True, True, False
• We have ρ(X,Y) = cov(X,Y)/(σ_X σ_Y), and it satisfies
−1 ≤ ρ(X,Y) ≤ 1
var(X₁ + ⋯ + X₈) = ?
144
a) It follows that:
ρ(X, −Y) = ?
ρ(−X, −Y) = ?
ρ(X,Y) is close to ?
-½, ½, ½, -½, 1, 0
E[E[X∣Y]] = E[X]
a) E[X+Y∣X]=0.
b) E[X+Y∣X]=x.
c) E[X+Y∣X]=X.
d) E[X+Y∣X]=X+Y.
c
involved. For each one of the statements below, indicate whether it is true
or false.
E[E[X|Y,Z]|Z]=E[X|Z]
E[E[X|Y]|Z]=E[X|Z]
E[E[X|Y,Z]]=E[X|Z]
The quantity E[g(X,Y)|Y,Z] is:
a) a random variable
b) a number
c) a function of (X,Y)
d) a function of (Y,Z)
e) a function of Z only
True, True, False, True, False, False, True, False
a) q
b) Q
c) 1−q
d) 1−Q
b
a) ¼
b) q(1−q)
c) Q(1−Q)
d) q²
e) Q²
var(E[X∣Q]) = ?
E[var(X∣Q)] = ?
c, 1/12, 1/6
(a) If X=Y (i.e., the two random variables always take the same values),
then Var(X|Y)=0.
(b) If X=Y (the two random variables always take the same values),
then Var(X|Y)=Var(X).
a mean of: ?
a variance of: ?
(b) E[Var(X|Y)]= ?
(c) Var(X)=?
50, 66.666, 10, 76.66
E[M]=?
Var(M)=?
9, 24
Problems
Problem 1. The PDF of the logarithm of X
Let X be a non-negative random variable. Find the PDF of the random variable Y = ln X for each of the following cases:
For general f_X, f_Y(y) =
a) f_X(e^y)·e^y
b) f_X(e^y)/e^y
c) f_X(ln y)/y
d) none of the above
When f_X(x) = { 1/4, if 2 < x ≤ 6; 0, otherwise }
g(y)= ?
a=?
b=?
Give a formula for g(y), and the values of a and b, using standard notation.
g(y)= ?
a=?
b=?
a, e^y/4, 0.6931, 1.791, 2e^{2y} − 2e^y, 0, 0.6931
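A sampling check of the first case, where X is uniform on (2, 6] and Y = ln X should have density e^y/4 on [ln 2, ln 6] (a sketch; the bin (1.0, 1.1] is an arbitrary choice):

```python
import random
from math import exp, log

random.seed(2)
ys = [log(random.uniform(2, 6)) for _ in range(200_000)]
lo, hi = 1.0, 1.1
frac = sum(lo < y <= hi for y in ys) / len(ys)
exact = (exp(hi) - exp(lo)) / 4  # integral of e^y / 4 over (1.0, 1.1]
print(frac, exact)  # both ~0.0715
```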
Y = 5X − 7
c) f_Y(y) = 5·f_X((y+7)/5)
d) f_Y(y) = (1/5)·f_X((y−7)/5)
Y = X² − 2X. For y ≥ −1,
a) [f_X(1+√(y+1)) − f_X(1−√(y+1))] / (2√(y−1))
b) [f_X(1+√(y+1)) − f_X(1−√(y+1))] / (2√(y+1))
c) [f_X(1+√(y+1)) + f_X(1−√(y+1))] / (2√(y+1))
d) [f_X(1+√(y+1)) + f_X(1−√(y+1))] / (2√(y+1) − 2√(y−1))
b, c
Let Z=max{X,Y}. Find the PDF of Z. Express your answer in terms of z using
standard notation.
For 0<z<1:
𝑓𝑍(𝑧) =?
For 0<z<1:
𝑓𝑍(𝑧) =?
For 0<z<2:
𝑓𝑍(𝑧) =?
2⋅z, z, ½
Find a, b, c, d
a) X1 and X2 are uncorrelated.
b) X1 and X2 are positively correlated.
c) X1 and X2 are negatively correlated.
cov(X₁, X₂) = ?
Suppose now that the die is biased, with a probability pᵢ ≠ 0 that the result of any given die roll is i, for i = 1, 2, …, k. We still consider n independent rolls of this biased die and define Xᵢ to be the number of rolls that result in side i.
E[X] = E[Y] = E[Z] = 0,  E[X²] = E[Y²] = E[Z²] = 1
Find the correlation coefficients ρ(X−Y,X+Y), ρ(X+Y,Y+Z), and ρ(X,Y+Z).
0, 0.5, 0
E[X]=?
Var(X)=?
7.5, 18.75
Chapter VII
7. Bayesian Inference
7.1. Major Terms, Problems, and Methods in this Chapter
•Bayesian statistics treats unknown parameters as random variables with
known prior distributions.
(a) Guess the time at which the grocery store was robbed.
For each of the following questions, choose the most appropriate answer.
1) X2 is an
2) 25 is an
• Θ discrete, X continuous:
p_{Θ∣X}(θ∣x) = p_Θ(θ) f_{X∣Θ}(x∣θ) / ∑_{θ′} p_Θ(θ′) f_{X∣Θ}(x∣θ′)
• Θ continuous, X discrete:
f_{Θ∣X}(θ∣x) = f_Θ(θ) p_{X∣Θ}(x∣θ) / ∫ f_Θ(θ′) p_{X∣Θ}(x∣θ′) dθ′
• Θ continuous, X continuous:
f_{Θ∣X}(θ∣x) = f_Θ(θ) f_{X∣Θ}(x∣θ) / ∫ f_Θ(θ′) f_{X∣Θ}(x∣θ′) dθ′
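Each version is the same recipe, "prior times likelihood, renormalized"; the integrals can be approximated on a grid. An illustrative sketch (the flat prior and the data of 1 Heads in 10 tosses are assumptions of the example, chosen to match the coin exercise below):

```python
thetas = [i / 1000 for i in range(1, 1000)]   # grid over (0, 1)
prior = [1.0] * len(thetas)                    # flat prior (assumption)

def likelihood(theta, heads=1, tosses=10):
    # binomial likelihood; the combinatorial factor cancels after normalization
    return theta**heads * (1 - theta) ** (tosses - heads)

post = [p * likelihood(t) for p, t in zip(prior, thetas)]
norm = sum(post)
post = [v / norm for v in post]
map_theta = thetas[max(range(len(post)), key=post.__getitem__)]
print(map_theta)  # ~0.1: the posterior theta^1 (1-theta)^9 peaks at 1/10
```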
• If Θ takes only a finite number of values, the MAP rule minimizes (over all
decision rules) the probability of selecting an incorrect hypothesis. This is
true for both the unconditional probability of error and the conditional one,
given any observation value x.
b) We flip the coin 10 times independently and observe 1 Heads and 9 Tails.
The posterior PDF of Θ will be of the form c·θ^m(1−θ)^n, where c is a normalizing constant and where m = ?, n = ?
High, 10, 10
d) What is the MAP estimate of Θ₁ based on X, that is, the one that maximizes p_{Θ₁∣X}(θ₁∣x)?
1
a) Let us assume that the true value of Θ is 1. In this case, our estimator makes an error if and only if W has a low (negative) value. The conditional probability of error given that the true value of Θ is 1, that is, P(Θ̂ ≠ 1 ∣ Θ = 1), is equal to:
a) Φ(−1)
b) Φ(0)
c) Φ(1)
b) For this problem, the overall probability of error is easiest found using the formula
a) P(Θ̂ ≠ Θ) = ∫ P(Θ̂ ≠ Θ ∣ X = x) f_X(x) dx
b) P(Θ̂ ≠ Θ) = ∑_θ P(Θ̂ ≠ θ ∣ Θ = θ) p_Θ(θ)
a, b
b) The conditional mean squared error E[(Θ − Θ̂_LMS)² ∣ X = x] is
d) The conditional mean squared error E[(Θ − Θ̂_MAP)² ∣ X = x] is
a) a=?
b) E[Θ²] = ?
12, 0.2
a) E[X]=?
b) Var(X)=?
-3, 0.125
Where
ρ = cov(Θ, X)/(σ_Θ σ_X)
(1 − ρ²)σ_Θ²
1 / (1/σ₀² + 1/σ₁²)
where σ₀² and σ₁² are the variances of Θ and W, respectively.
𝑋1 = Θ1 + 𝑊1, 𝑋2 = Θ1 + Θ2 + 𝑊2
Find the MAP estimate θ̂ = (θ̂₁, θ̂₂) of (Θ₁, Θ₂) if we observe that X₁ = 1, X₂ = 3. (You will have to solve a system of two linear equations.)
θ̂₁ = ?, θ̂₂ = ?
1, 1
7.10. Problems
Problem 1. Defective Coin
A defective coin minting machine produces coins whose probability of
Heads is a random variable Q with PDF
f_Q(q) = { 5q⁴, if q ∈ [0,1]; 0, otherwise }
P(A)=?
Find the conditional PDF of Q given event A. Express your answer in terms
of q using standard notation.
P(B∣A)=?
5/6, 6·q⁵, 6/7
What is the probability that Bob will guess the coin correctly using the
decision rule from part 2?
Suppose instead that Bob tries to guess which coin he received without
tossing it. He still guesses the coin in order to minimize the probability of
error. What is the probability that Bob will guess the coin correctly under
this scenario?
Bob uses the decision rule of Part 2. If p is small, then Bob will always
decide in favor of the second coin, ignoring the results of the three tosses.
The range of such p's is [0,t). Find t.
c, a, 0.8437, 0.75, 0.035
Consider the MAP rule for deciding between the two hypotheses, given
that X=x.
Suppose for this part of the problem that p=2/3. The MAP rule can choose
in favor of the hypothesis Θ=1 if and only if x≥c1. Find the value of c1.
For this part, assume again that p=2/3. Find the conditional probability of
error for the MAP decision rule, given that the hypothesis Θ=0 is true.
P(Θ̂ ≠ 0 ∣ Θ = 0) = ?
where θ0, θ1, 𝑎𝑛𝑑 θ2 are some parameters and t stands for time. At certain
times t1,…,tn, we make noisy observations Y1,…,Yn, respectively, of the
height of the object. Based on these observations, we would like to
estimate the object's vertical trajectory.
We consider the special case where there is only one unknown parameter.
We assume that θ0 (the height of the object at time zero) is a known
constant. We also assume that θ2 (which is related to the acceleration of
the object) is known. We view θ1 as the realized value of a continuous
random variable Θ1. The observed height at time
tᵢ is Yᵢ = θ₀ + Θ₁tᵢ + θ₂tᵢ² + Wᵢ, i = 1, …, n, where Wᵢ models the observation noise. We assume that Θ₁ ∼ N(0,1), W₁, …, Wₙ ∼ N(0, σ²), and that all these random variables are independent.
Carry out this minimization and choose the correct formula for the MAP estimate, θ̂₁, from the options below.
a) θ̂₁ = ∑_{i=1}^{n} tᵢ(yᵢ − θ₀ − θ₂tᵢ²) / σ²
b) θ̂₁ = ∑_{i=1}^{n} tᵢ(yᵢ − θ₀ − θ₂tᵢ²) / (σ² + ∑_{i=1}^{n} tᵢ²)
c) θ̂₁ = ∑_{i=1}^{n} tᵢ(yᵢ − θ₀ − θ₂tᵢ²) / (σ² + ∑_{i=1}^{n} θ₂tᵢ²)
Let σ = 1 and consider the special case of only two observations (n = 2). Write down a formula for the mean squared error E[(Θ̂₁ − Θ₁)²], as a function of t₁ and t₂.
E[(Θ̂₁ − Θ₁)²] = ?
t₁ = ?
t₂ = ?
b, true, 1/(1 + t₁² + t₂²), 10, 10
|c₁ ∑_{i=1}^{n} tᵢ + c₂ ∑_{i=1}^{n} tᵢ²| < 1
Find the values of c₁ ≥ 0 and c₂ ≥ 0 such that this is true. Express your answer in terms of n, and use "ln" to denote the natural logarithm function, as in "ln(3)".
3/(8·n·ln(2)), 0
a) Find an expression for the conditional mean squared error of the LMS
estimator given that X=x, valid for x∈[0,1]. Express your answer in terms
of x
b) Is it true that the calculation of the mean squared error of the LMS
estimator will always involve only ordinary integrals (integrals with respect
to only one variable)?
Yes, No
a=?, b=?
a) Is it true that the LMS estimator is guaranteed to take values only in the
interval [0,1]?
And
7.12. Problems
Problem 1. Determining the type of a lightbulb
The lifetime of a type-A bulb is exponentially distributed with parameter λ.
The lifetime of a type-B bulb is exponentially distributed with parameter μ,
where μ>λ>0. You have a box full of lightbulbs of the same type, and you
would like to know whether they are of type A or B. Assume an a
priori probability of 1/4 that the box contains type-B lightbulbs.
α=?
Let Tᵢ be the number of tosses until coin i results in Heads for the first time, for i = 1, 2, …, k (Tᵢ includes the toss that results in the first Heads.)
You may find the following integral useful: For any non-negative
integers k and m,
∫₀¹ qᵏ(1−q)ᵐ dq = k!·m! / (k+m+1)!
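A quick numerical check of this integral identity with a midpoint rule (step count chosen for accuracy, not speed):

```python
from math import factorial

def beta_integral(k, m, steps=200_000):
    h = 1.0 / steps
    # midpoint rule for the integral of q^k (1-q)^m over [0, 1]
    return sum(((i + 0.5) * h) ** k * (1 - (i + 0.5) * h) ** m for i in range(steps)) * h

for k, m in [(2, 3), (4, 1)]:
    print(beta_integral(k, m), factorial(k) * factorial(m) / factorial(k + m + 1))
```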
Find the PMF of T1. (Express your answer in terms of t using standard
notation.)
For t = 1, 2, …
p_{T₁}(t) = ?
Find the least mean squares (LMS) estimate of Q based on the observed
value, t, of T1. (Express your answer in terms of t using standard notation.)
E[Q ∣ T₁ = t] = ?
We flip each of the k coins until they result in Heads for the first time.
Compute the maximum a posteriori (MAP) estimate q̂ of Q given the number of tosses needed, T₁ = t₁, …, Tₖ = tₖ, for each coin. Choose the correct expression for q̂.
a) q̂ = (k−1) / ∑_{i=1}^{k} tᵢ
b) q̂ = k / ∑_{i=1}^{k} tᵢ
c) q̂ = (k+1) / ∑_{i=1}^{k} tᵢ
The LLMS estimator of U based on X is of the form Û = aX + b.
Find a and b. Express your answers in terms of m, v, and h using standard
notation.
E[AB]=?
E[NA]=?
Let N̂ = c₁A + c₂ be the LLMS estimator of N given A. Find c₁ and c₂ in terms of m and v.
c₁ = ?
c₂ = ?
m² + v, m² + v, v/(m+v), m²/(m+v)
For 0 ≤ x ≤ 1 and x/2 ≤ θ ≤ x:
f_{Θ∣X}(θ∣x) = ?
For 0 ≤ x ≤ 1:
θ̂_MAP(x) = ?
For 0 ≤ x ≤ 1:
θ̂_LMS(x) = ?
Find the linear LMS estimate θ̂_LLMS of Θ based on the observation X = x. Specifically, θ̂_LLMS is of the form c₁ + c₂x. Find c₁ and c₂.
1/(θ·ln(2)), x/2, x/(2·ln(2)), 0.065, 0.581
Exam
Problem 1(a)
Suppose that X, Y, and Z are independent, with E[X]=E[Y]=E[Z]=2,
and E[X2]=E[Y2]=E[Z2]=5.
Find cov(XY,XZ).
Compute Cov(X,Y).
Are X and Y independent?
Problem 2
We are given a stick that extends from 0 to x. Its length, x, is the realization
of an exponential random variable X, with mean 1. We break that stick at a
point Y that is uniformly distributed over the interval [0,x].
var(E[Y∣X]) = ?
We do not observe the value of X, but are told that Y=2.2. Find the MAP
estimate of X based on Y=2.2.
e^{−x}/x, 1/4, 2.2
Problem 3
Let X be uniform on [0, 1/2]. Find the PDF f_Y(y) of Y = X/(1−X).
2/(1+y)²
Problem 4(a)
A random variable X is generated as follows. We flip a coin. With
probability p, the result is Heads, and then X is generated according to a
PDF fX|H which is uniform on [0,1]. With probability 1−p the result is Tails, and
then X is generated according to a PDF fX|T of the form
For 0≤x≤1:
𝑓𝑋(𝑥) =?
Calculate E[X]
Problem 4(b)
We now wish to estimate the result of the coin toss, based on the value
of X.
Find P(Tails∣X=1/4).
The MAP rule decides in favor of Heads if X<a and in favor of Tails if X>a.
What is a?
2x + p(1 − 2x), (4 − p)/6, (1−p)/(1+p), p/(2(1−p))
Problem 5
Let X and Y be independent positive random variables. Let Z=X/Y. In what
follows, all occurrences of x, y, z are assumed to be positive numbers.
p_{Z∣Y}(z∣y) = p_X(?)
f_{Z∣Y}(z∣y) = A·f_X(B)
e) f_Z(z) = ∫₀^∞ f_Y(y) f_X(yz) dy
f) f_Z(z) = ∫₀^∞ y·f_Y(y) f_X(yz) dy
z·y, y, z·y, a d f
Chapter VIII
8. Limit theorems and classical statistics
8.1. Markov Inequality
If a random variable X can only take nonnegative values, then
P(X ≥ a) ≤ E[X]/a, for all a > 0
P(|X − μ| ≥ c) ≤ σ²/c², for all c > 0
𝑃(|𝑍|≥4)≤?
¼
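Both inequalities are typically loose; the following sketch compares them with the true tail of an exponential(1) random variable (mean 1, variance 1):

```python
import random

random.seed(3)
xs = [random.expovariate(1.0) for _ in range(10**6)]

a = 5.0
print(sum(x >= a for x in xs) / len(xs), 1.0 / a)  # true ~e^-5 = 0.0067 vs Markov 0.2

mu, var, c = 1.0, 1.0, 4.0
print(sum(abs(x - mu) >= c for x in xs) / len(xs), var / c**2)  # vs Chebyshev 0.0625
```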
a) Is it true that the Chebyshev inequality is stronger (i.e., the upper bound
is smaller) than the Markov inequality, when a is very large?
b) Is it true that the Chebyshev inequality is always stronger (i.e., the upper
bound is smaller) than the Markov inequality?
Yes, No
P(|Mₙ − μ| ≥ ε) ≤ σ²/(nε²)
n should
Exercise: Polling
We saw that if we want to have a probability of at least 95% that the poll
results are within 1 percentage point of the truth, Chebyshev's inequality
recommends a sample size of n=50,000n=50,000. This is very large
compared to what is done in practice. Newspaper polls use smaller sample
sizes for various reasons. For each of the following, decide whether it is a
valid reason.
c) the people sampled are all different, so their answers are not identically
distributed.
lim_{n→∞} P(|Yₙ − a| ≥ ε) = 0
P(|Mₙ − μ| ≥ ε) = P(|(X₁ + ⋯ + Xₙ)/n − μ| ≥ ε) → 0, as n → ∞
8.6. The Central Limit Theorem
Let X₁, X₂, … be a sequence of independent identically distributed random variables with common mean μ and variance σ², and define
Zₙ = (X₁ + ⋯ + Xₙ − nμ) / (σ√n)
Exercise: CLT
Let Xₙ be i.i.d. random variables with mean zero and variance σ². Let Sₙ = X₁ + ⋯ + Xₙ. Let Φ stand for the standard normal CDF. According to the central limit theorem, and as n → ∞, P(Sₙ ≤ 2σ√n) converges to Φ(a), where:
a = ?
2
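A simulation sketch of this limit, using ±1 steps (mean 0, variance 1) so that σ = 1; the empirical frequency should be near Φ(2) ≈ 0.9772 for large n:

```python
import random
from math import erf, sqrt

random.seed(4)
n, trials, sigma = 400, 20_000, 1.0
count = 0
for _ in range(trials):
    s = sum(random.choice((-1.0, 1.0)) for _ in range(n))
    count += s <= 2 * sigma * sqrt(n)
print(count / trials, 0.5 * (1 + erf(2 / sqrt(2))))  # ~0.98 vs 0.9772
```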
3. Consider the class average if the class is split into two equal-size
sections. One section gets an easy exam and the other section gets a hard
exam.
Use the CLT to find good approximations to the following quantities. You
may want to refer to the normal table. In parts (a) and (b), give answers with
4 decimal digits.
a) P(S₁₀₀ ≤ 245) ≈ ?
(a) Maximum likelihood (ML) estimation: Select the parameter that makes
the observed data “most likely,” i.e., maximizes the probability of obtaining
the data at hand.
(b) Linear regression: Find the linear relation that matches best a set of
data pairs, in the sense that it minimizes the sum of the squares of the
discrepancies between the model and the data.
(c) Likelihood ratio test: Given two hypotheses, select one based on the
ratio of their “likelihoods,” so that certain error probabilities are suitably
small.
• The bias of the estimator, denoted by b_θ(Θ̂ₙ), is the expected value of the estimation error:
b_θ(Θ̂ₙ) = E_θ[Θ̂ₙ] − θ
• The expected value, the variance, and the bias of Θ̂ₙ depend on θ, while the estimation error depends in addition on the observations X₁, …, Xₙ.
• We call Θ̂ₙ unbiased if E_θ[Θ̂ₙ] = θ, for every possible value of θ.
• We call Θ̂ₙ asymptotically unbiased if lim_{n→∞} E_θ[Θ̂ₙ] = θ, for every possible value of θ.
• We call Θ̂ₙ consistent if the sequence Θ̂ₙ converges to the true value of the parameter θ, in probability, for every possible value of θ.
• The ML estimate of a one-to-one function h(θ) of θ is h(θ̂ₙ), where θ̂ₙ is the ML estimate of θ (the invariance principle).
• When the random variables Xᵢ are i.i.d., and under some mild additional assumptions, each component of the ML estimator is consistent and asymptotically normal.
• The estimator Sₙ² coincides with the ML estimator if the Xᵢ are normal. It is biased but asymptotically unbiased. The estimator Ŝₙ² is unbiased. For large n, the two variance estimators essentially coincide.
• Θ̂ₙ⁻ and Θ̂ₙ⁺ are random variables that depend on the observations X₁, …, Xₙ.
P_θ(Θ̂ₙ⁻ ≤ θ ≤ Θ̂ₙ⁺) ≥ 1 − α
Over the next 100 days, I expect that the unknown parameter will be inside
the confidence interval about 70 times.
Exercise: A simple CI
Let θ be an unknown parameter, and let X be uniform on the
interval [θ−0.5,θ+0.5].
a =?
b =?
If you do not have any prior knowledge about the value of E[Xai], can you
estimate it based on the available data?
3, 4, Yes
Exercise: ML estimation
Let K be a Poisson random variable with parameter λ: its PMF is
p_K(k; λ) = λᵏe^{−λ}/k!, for k = 0, 1, 2, …
Problems
Problem 1. Convergence in probability
For each of the following sequences, determine whether it converges in
probability to a constant. If it does, enter the value of the limit. If it does
not, enter the number “999".
Sᵢ = (I₁ + I₂ + ⋯ + Iᵢ)/i
Let Zᵢ = (1/3)Xᵢ + (2/3)Xᵢ₊₁ for i = 1, 2, …, and let Mₙ = (1/n) ∑_{i=1}^{n} Zᵢ for n = 1, 2, …
lim_{n→∞} P(n/3 − 10 ≤ Sₙ ≤ n/3 + 10) = ?
lim_{n→∞} P(n/3 − n/6 ≤ Sₙ ≤ n/3 + n/6) = ?
lim_{n→∞} P(n/3 − √(2n)/5 ≤ Sₙ ≤ n/3 + √(2n)/5) = ?
0, 1, 0.45
E[H]=?
Var(H)≤?
Calculate the smallest possible value of n such that the standard deviation
of H is guaranteed to be at most 0.01.
a) [H − 1.96·3/n, H + 1.96·3/n]
b) [H − 1.96/(3n), H + 1.96/(3n)]
c) [H − 1.96·√3/√n, H + 1.96·√3/√n]
d) [H − 1.96·3/√n, H + 1.96·3/√n]
If yes, enter the mean and variance of N. If not, enter 0 in both of the
corresponding answer boxes.
mean: ?
variance: ?
Let N be the number of Heads in 300 tosses. At each toss, one of the three
coins is selected at random (either choice is equally likely), and
independently from everything else.
mean: ?
variance: ?
Let N be the number of Heads in 100 tosses of the red coin, followed by
100 tosses of the green coin, followed by 100 tosses of the yellow coin (for
a total of 300 tosses).
mean: ?
variance: ?
We select one of the three coins at random: each coin is equally likely to be
selected. We then toss the selected coin 300 times, independently, and
let N be the number of Heads.
mean: ?
variance: ?
120, 72; 150, 75; 150, 73; 0, 0
Chapter IX
9. Bernoulli and Poisson processes
9.1. BERNOULLI PROCESS
Some Random Variables Associated with the Bernoulli Process and their
Properties
E[T] = 1/p,  var(T) = (1−p)/p²
• Let n be a given time and let T be the time of the first success after time
n. Then, T − n has a geometric distribution with parameter p, and is
independent of the random variables X1, . . . ,Xn.
𝑌𝑘 = 𝑇1 + 𝑇2 + ⋯ + 𝑇𝑘
and the latter are independent geometric random variables with common
parameter p.
E[Yₖ] = E[T₁] + ⋯ + E[Tₖ] = k/p
Exercise: Splitting
For each exam, Ariadne studies with probability 1/2 and does not study
with probability 1/2, independently of any other exams. On any exam for
which she has not studied, she still has a 0.20 probability of passing,
independently of whatever happens on other exams. What is the expected
number of total exams taken until she has had 3 exams for which she did
not study but which she still passed?
30
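In the split process, an exam counts as a success with probability (1/2)·0.2 = 0.1, so the expected number of exams until 3 such successes is 3/0.1 = 30. A simulation sketch:

```python
import random

random.seed(5)

def exams_until(k=3, p=0.1):
    count = successes = 0
    while successes < k:
        count += 1
        if random.random() < p:  # did not study AND passed
            successes += 1
    return count

print(sum(exams_until() for _ in range(100_000)) / 100_000)  # ~30
```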
E[Z] = λ,  var(Z) = λ
(a) (Time-homogeneity) The probability P(k, τ) of k arrivals is the same for all
intervals of the same length τ.
lim_{τ→0} o(τ)/τ = 0,  lim_{τ→0} oₖ(τ)/τ = 0
Suppose that you have recorded this process in a movie and that you play
this movie at twice the speed. The process that you will be seeing in the
sped-up movie satisfies the following (pick one of the answers):
Poisson models
For each one of the following situations, state whether a Poisson model is a
plausible model over the specified time frame.
Poisson practice
Consider a Poisson arrival process with rate λ per hour. To simplify notation, we let a = P(0,1), b = P(1,1), and c = P(2,1), where P(k,1) is the
probability of exactly k arrivals over an hour-long time interval.
What is the probability that we will have “at most one arrival between
10:00 and 11:00 and exactly two arrivals between 10:00 and 12:00"? Your
answer should be an algebraic function of a, b, and c
a·c + b²
• The exponential with parameter λ. This is the time T until the first arrival.
Its PDF, mean, and variance are
f_T(t) = λe^{−λt}, t ≥ 0,  E[T] = 1/λ,  var(T) = 1/λ²
• Let t be a given time and let 𝑇 be the time of the first arrival after time t.
Then, 𝑇 −t has an exponential distribution with parameter λ, and is
independent of the history of the process until time t.
𝑌𝑘 = 𝑇1 + 𝑇2 + ⋯ + 𝑇𝑘
and the latter are independent exponential random variables with common
parameter λ.
E[Yₖ] = E[T₁] + ⋯ + E[Tₖ] = k/λ,  var(Yₖ) = var(T₁) + ⋯ + var(Tₖ) = k/λ²
No, Yes
9.3. Problems
Problem 1. Marie gives away children toys
Marie distributes toys for toddlers. She makes visits to households and
gives away one toy only on visits for which the door is answered and a
toddler is in residence. On any visit, the probability of the door being
answered is 3/4, and the probability that there is a toddler in residence is
1/3. Assume that the events “Door answered" and “Toddler in residence"
are independent and also that events related to different households are
independent.
a) What is the probability that she has not distributed any toys by the
end of her second visit?
b) What is the probability that she gives away the first toy on her
fourth visit?
c) Given that she has given away her second toy on her fifth visit, what
is the conditional probability that she will give away her third toy on
her eighth visit?
d) What is the probability that she will give away the second toy on her
fourth visit?
e) Given that she has not given away her second toy by her third visit,
what is the conditional probability that she will give away her
second toy on her fifth visit?
f) We will say that Marie “needs a new supply” immediately after the
visit on which she gives away her last toy. If she starts out with
three toys, what is the probability that she completes at least five
visits before she needs a new supply?
g) If she starts out with exactly six toys, what is the expected value of
the number of houses with toddlers that Marie visits without
leaving any toys (because the door was not answered) before she
needs a new supply?
0.5625, 0.105469, 0.140625, 0.105469, 0.1250, 0.95, 2
For x,t>0,
c) Is X independent of T1?
d) Let Y = T₃ − T₂. Find the PDF f_{Y∣T₂}(y∣t₂).
For y,t>0,
e) Is Y independent of T2?
Problem 3. Shuttles
In parts 1, 3, 4, and 5 below, your answers will be algebraic expressions. Enter "lambda" for λ and "mu" for μ. Follow standard notation.
Shuttles bound for Boston depart from New York every hour on the hour
(e.g., at exactly one o'clock, two o'clock, etc.). Passengers arrive at the
departure gate in New York according to a Poisson process with rate λ per
hour. What is the expected number of passengers on any given shuttle?
(Assume that everyone who arrives between two successive shuttle
departures boards the shuttle immediately following his/her arrival. That is,
shuttles are big enough to accommodate all arriving passengers.)
For this and for the remaining parts of this problem, suppose that the
shuttles are not operating on a deterministic schedule. Rather, their
interdeparture times are independent and exponentially distributed with
common parameter μ per hour. Furthermore, shuttle departures are
independent of the process of passenger arrivals. Is the sequence of
shuttle departures a Poisson process?
Problem 4. Ships
All ships travel at the same speed through a wide canal. Each ship
takes t days to traverse the length of the canal. Eastbound ships (i.e., ships
For Parts 1 and 2, suppose that the pointer is currently pointing west.
What is the probability that the next ship to pass will be westbound?
Determine the PDF, fX(x), of the remaining time, X, until the pointer changes
direction.
For x≥0, fX(x)=?
What is the probability that an eastbound ship does not pass by any
westbound ships during its eastward journey through the canal?
fV(v)=?
What is the probability that the next ship to arrive causes a change in the
direction of the pointer?
Suppose that:
f_X(x) = { μe^{−μx}, x ≥ 0; 0, x < 0 }
E[R] = ?
f_X(x) = { μ⁴x³e^{−μx}/6, x ≥ 0; 0, x < 0 }
E[R] = ?
2/μ, 5/μ
If you pick a family at random (each family in the village being equally likely
to be picked), what is the expected number of children in that family?
If you pick a child at random (each child in the village being equally likely to
be picked), what is the expected number of children in that child's family
(including the picked child)?
Find a, b.
0, 5