LECTURE 1 Probability models and axioms
• Probabilistic models
– sample space
– probability law
• Axioms of probability
• Simple examples
Sample space: Discrete example — two rolls of a 4-sided die [figure: 4 × 4 grid of outcomes, X = first roll]
Sample space: Continuous example — a point in the unit square [figure]
Discrete uniform law
• Let all outcomes be equally likely
• Then,
  P(A) = (number of elements of A) / (total number of sample points)
• Computing probabilities ≡ counting
• Defines fair coins, fair dice, well-shuffled decks

Continuous uniform law
• Two “random” numbers in [0, 1] [figure: the unit square]
• Uniform law: Probability = Area
  – P(X + Y ≤ 1/2) = ?
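For illustration, a small Python sketch of the continuous uniform law, assuming X and Y are independent and uniform on [0, 1]: it estimates P(X + Y ≤ 1/2) by simulation and compares it with the area of the triangle below the line x + y = 1/2, which is 1/8.

```python
import random

def estimate(num_samples=200_000):
    """Monte Carlo estimate of P(X + Y <= 1/2) for X, Y uniform on [0, 1]."""
    hits = sum(1 for _ in range(num_samples)
               if random.random() + random.random() <= 0.5)
    return hits / num_samples

print("simulation:", estimate())      # close to 0.125
print("exact area:", 0.5 * 0.5 / 2)   # triangle with legs 1/2 -> 1/8
```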
• Turn in recitation/tutorial scheduling form now
P({2, 4, 6, . . .}) = P(2) + P(4) + · · · = 1/2² + 1/2⁴ + 1/2⁶ + · · · = 1/3
LECTURE 2 Review of probability models
• Problem solving:
– Specify sample space
– Define probability law
– Identify event of interest
– Calculate...
[Figure: 4 × 4 grid of outcomes for two rolls of a 4-sided die (X = first roll, Y = second roll), with events A and B marked]

• P(A | B) = probability of A, given that B occurred
  – B is our new universe
• P(M = 2 | B) =
Models based on conditional probabilities

Multiplication rule:
  P(A ∩ B ∩ C) = P(A) · P(B | A) · P(C | A ∩ B)

[Figure: tree diagram with branches A / Aᶜ, then B / Bᶜ, then C / Cᶜ; branch probabilities P(A), P(B | A), P(Bᶜ | A), P(C | A ∩ B), . . ., leading to leaves A ∩ B ∩ C, A ∩ Bᶜ ∩ C, A ∩ Bᶜ ∩ Cᶜ, . . .]

Radar example:
• Event A: Airplane is flying above
• Event B: Something registers on radar screen
• P(A) = 0.05, P(Aᶜ) = 0.95
• P(B | A) = 0.99, P(Bᶜ | A) = 0.01
• P(B | Aᶜ) = 0.10

P(A ∩ B) =
P(B) =
P(A | B) =
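As an illustration, a short Python sketch that fills in the three blanks of the radar example using the multiplication rule, the total probability theorem, and Bayes' rule:

```python
# Radar example: A = plane flying above, B = radar registers something.
p_A = 0.05           # P(A)
p_B_given_A = 0.99   # P(B | A)
p_B_given_Ac = 0.10  # P(B | A^c)

p_A_and_B = p_A * p_B_given_A                  # multiplication rule
p_B = p_A_and_B + (1 - p_A) * p_B_given_Ac     # total probability theorem
p_A_given_B = p_A_and_B / p_B                  # Bayes' rule

print(f"P(A and B) = {p_A_and_B:.4f}")    # 0.0495
print(f"P(B)       = {p_B:.4f}")          # 0.1445
print(f"P(A | B)   = {p_A_given_B:.4f}")  # ~0.3426
```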
LECTURE 3 Models based on conditional probabilities

• Readings: Section 1.5
• Review
• Independence of two events

• P(A | B) = P(A ∩ B) / P(B), assuming P(B) > 0
• Multiplication rule:
  P(A ∩ B) = P(B) · P(A | B) = P(A) · P(B | A)

• 3 tosses of a biased coin: P(H) = p, P(T) = 1 − p
[Figure: tree diagram for the three tosses, with branch probabilities p and 1 − p and leaves HHH, HTT, THH, THT, TTH, TTT, . . .]
Conditioning may affect independence
[Figure: the four equally likely outcomes HH, HT, TH, TT of two coin tosses]

Independence of a collection of events
LECTURE 4

Lecture outline
• Principles of counting
• Many examples
  – permutations
  – k-permutations
  – combinations
  – partitions
• Binomial probabilities

Discrete uniform law
• Then,
  P(A) = (number of elements of A) / (total number of sample points) = |A| / |Ω|
• Just count. . .
• Number of elements in the sample space:
• . . . if repetition is prohibited =
Combinations
• (n choose k): number of k-element subsets of a given n-element set
• Two ways of constructing an ordered sequence of k distinct items:
  – Choose the k items one at a time:
    n(n − 1) · · · (n − k + 1) = n! / (n − k)!  choices
  – Choose k items, then order them (k! possible orders)
• Hence:
  (n choose k) · k! = n! / (n − k)!
  (n choose k) = n! / (k! (n − k)!)
• Σ_{k=0}^{n} (n choose k) =

Binomial probabilities
• n independent coin tosses
  – P(H) = p
• P(HTTHHH) =
• P(sequence) = p^(# heads) (1 − p)^(# tails)
• P(k heads) = Σ over k-head sequences of P(seq.)
             = (# of k-head seqs.) · p^k (1 − p)^(n−k)
             = (n choose k) p^k (1 − p)^(n−k)
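For illustration, a minimal Python sketch of the binomial probability formula (the parameters n = 6 and p = 0.3 are arbitrary): it evaluates (n choose k) p^k (1 − p)^(n−k), checks that the PMF sums to 1, and checks that the binomial coefficients sum to 2^n.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(k heads in n independent tosses with P(H) = p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 6, 0.3
print(binomial_pmf(2, n, p))                              # P(2 heads)
print(sum(binomial_pmf(k, n, p) for k in range(n + 1)))   # sums to 1
print(sum(comb(n, k) for k in range(n + 1)), 2**n)        # both equal 2**n
```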
LECTURE 5 Random variables

• Notation:
  – random variable X
  – numerical value x
• F = First roll
• Geometric PMF
  pX(2) =
Binomial PMF
• X: number of heads in n independent coin tosses
• P(H) = p
• Let n = 4:
  pX(2) = P(HHTT) + P(HTHT) + P(HTTH) + P(THHT) + P(THTH) + P(TTHH)
• In general:
  pX(k) = (n choose k) p^k (1 − p)^(n−k),  k = 0, 1, . . . , n

Expectation
• Definition: E[X] = Σ_x x pX(x)
• Interpretations:
  – Center of gravity of PMF
  – Average in large number of repetitions of the experiment
    (to be substantiated later in this course)
• Example: uniform PMF with value 1/(n + 1) at each of x = 0, 1, . . . , n
  E[X] = 0 · 1/(n+1) + 1 · 1/(n+1) + · · · + n · 1/(n+1) =

Variance
  var(X) = Σ_x (x − E[X])² pX(x)
• E[α] =
• E[αX] =
• Properties: var(X) ≥ 0
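For illustration, a small Python sketch of E[X] and var(X) as weighted sums over a PMF (given as a dictionary), applied to the uniform PMF on {0, 1, . . . , n} from the slide; the mean comes out to n/2 and the variance to n(n + 2)/12.

```python
def mean(pmf):
    """E[X] = sum_x x * p_X(x), with pmf given as {value: probability}."""
    return sum(x * px for x, px in pmf.items())

def variance(pmf):
    """var(X) = sum_x (x - E[X])**2 * p_X(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * px for x, px in pmf.items())

n = 4
uniform = {k: 1 / (n + 1) for k in range(n + 1)}  # uniform PMF on 0, 1, ..., n
print(mean(uniform))       # n/2 = 2.0
print(variance(uniform))   # n(n+2)/12 = 2.0
```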
LECTURE 6 Review

• Readings: Sections 2.4-2.6
• Random variable X: function from sample space to the real numbers

  var(X) = E[(X − E[X])²]
         = Σ_x (x − E[X])² pX(x)
         = E[X²] − (E[X])²

  Standard deviation: σX = √var(X)

• Traverse a 200 mile distance at constant but random speed V
  pV(v): probability 1/2 at v = 1 and probability 1/2 at v = 200
• σV =
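As an illustration, a Python sketch for the random-speed example: it computes E[V], σV, and the expected travel time E[200/V], and shows that this differs from 200/E[V] (an instance of E[g(X)] ≠ g(E[X])).

```python
# Speed V is 1 mph or 200 mph, each with probability 1/2 (as in the slide).
pmf_V = {1: 0.5, 200: 0.5}

E_V = sum(v * p for v, p in pmf_V.items())
var_V = sum((v - E_V) ** 2 * p for v, p in pmf_V.items())
sigma_V = var_V ** 0.5

E_T = sum((200 / v) * p for v, p in pmf_V.items())  # expected travel time

print(E_V)        # 100.5
print(sigma_V)    # 99.5
print(E_T)        # 100.5 hours
print(200 / E_V)  # ~1.99 hours: E[200/V] != 200/E[V]
```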
Conditional PMF and expectation
• pX|A(x) = P(X = x | A)
• E[X | A] = Σ_x x pX|A(x)

[Figure: a PMF pX(x) on x = 1, 2, 3, 4]

• Let A = {X ≥ 2}
  pX|A(x) =
  E[X | A] =

Geometric PMF
• X: number of independent coin tosses until first head
  pX(k) = (1 − p)^(k−1) p,  k = 1, 2, . . .
  E[X] = Σ_{k=1}^{∞} k pX(k) = Σ_{k=1}^{∞} k (1 − p)^(k−1) p

[Figures: the geometric PMF pX(k), with values p, p(1 − p), p(1 − p)², . . ., and the conditional PMF pX−2|X>2(k)]
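For illustration, a Python sketch of the geometric PMF (with an arbitrary p = 0.3): it checks numerically (by truncating the infinite sums) that E[X] ≈ 1/p, and that the conditional PMF of X − 2 given X > 2 matches the original geometric PMF, which is the memorylessness property the figure above illustrates.

```python
p = 0.3

def geom(k):
    """p_X(k) = (1-p)**(k-1) * p, for k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

# E[X] = sum k p_X(k): truncate the series far enough out.
E_X = sum(k * geom(k) for k in range(1, 200))
print(E_X, 1 / p)   # both ~3.333

# Memorylessness: P(X - 2 = k | X > 2) equals p_X(k).
P_X_gt_2 = sum(geom(k) for k in range(3, 200))
for k in range(1, 5):
    cond = geom(k + 2) / P_X_gt_2
    print(k, cond, geom(k))   # the two columns match
```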
LECTURE 7 Review

• A hat problem

• E[X] = Σ_x x pX(x)
• pX,Y,Z(x, y, z) = pX(x) pY|X(y | x) pZ|X,Y(z | x, y)
• Random variables X, Y, Z are independent if:
  pX,Y,Z(x, y, z) = pX(x) · pY(y) · pZ(z), for all x, y, z
• E[g(X, Y)] = Σ_x Σ_y g(x, y) pX,Y(x, y)
• In general: E[g(X, Y)] ≠ g(E[X], E[Y])
• E[αX + β] = αE[X] + β
• E[X + Y + Z] = E[X] + E[Y] + E[Z]

[Figure: joint PMF table of X and Y, with entries such as 1/20, 2/20, 4/20]
• What if we condition on X ≤ 2 and Y ≥ 3?
Variances
• Examples:
  – If X = Y, var(X + Y) =
  – If X = −Y, var(X + Y) =
  – If X, Y indep., and Z = X − 3Y, var(Z) =

Binomial mean and variance
• E[Xi] =
• E[X] =
• var(Xi) =
• var(X) =

The hat problem
• n people throw their hats in a box and then pick one at random.
  – X: number of people who get their own hat
  – Find E[X]
• Xi = 1, if i selects own hat; 0, otherwise.
• X = X1 + X2 + · · · + Xn
• P(Xi = 1) =
• E[Xi] =
• E[X] =

• var(X) = E[X²] − (E[X])² = E[X²] − 1
  X² = Σ_i Xi² + Σ_{i,j: i≠j} Xi Xj
• E[Xi²] =
• P(X1X2 = 1) = P(X1 = 1) · P(X2 = 1 | X1 = 1) =
• var(X) =
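For illustration, a small Monte Carlo sketch of the hat problem: it simulates the experiment many times and reports the empirical mean and variance of X, both of which come out close to 1 (for any n ≥ 2).

```python
import random

def hat_experiment(n):
    """n people pick hats uniformly at random; return how many get their own."""
    perm = list(range(n))
    random.shuffle(perm)
    return sum(1 for person, hat in enumerate(perm) if person == hat)

n, trials = 10, 100_000
samples = [hat_experiment(n) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(mean, var)   # both close to 1
```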
LECTURE 8 Continuous r.v.’s and pdf’s

Lecture outline
• Probability density functions

[Figure: sample space mapped to a PDF fX(x), with the event {a ≤ X ≤ b} shaded]

  P(a ≤ X ≤ b) = ∫_a^b fX(x) dx
  ∫_{−∞}^{∞} fX(x) dx = 1
  P(x ≤ X ≤ x + δ) = ∫_x^{x+δ} fX(s) ds ≈ fX(x) · δ
  P(X ∈ B) = ∫_B fX(x) dx, for “nice” sets B

Uniform PDF on [a, b]:
• E[X] =
• σX² = ∫_a^b (x − (a + b)/2)² · (1/(b − a)) dx = (b − a)² / 12

[Figures: uniform PDF on [a, b]; a piecewise-constant PDF with values 1/6 and 2/6]
Mixed distributions
[Figure: a distribution on [0, 1] with both a density part and a point mass at 1/2]
• The corresponding CDF: FX(x) = P(X ≤ x)
[Figure: the CDF]

Gaussian (normal) PDF
[Figure: standard normal PDF and CDF]
• Standard normal: E[X] = 0, var(X) = 1
• General normal N(µ, σ²):
  fX(x) = (1 / (σ √2π)) e^(−(x−µ)²/2σ²)
Outline
• PDF review
• From the joint to the marginal

[Figure: PDF fX(x) with the event {a < X < b} shaded]

• From the joint to the marginal:
  fX(x) · δ ≈ P(x ≤ X ≤ x + δ) =
• X and Y are called independent if
  fX,Y(x, y) = fX(x) fY(y), for all x, y

Needle example (needle of length ℓ dropped on a floor with parallel lines d apart):
• The needle intersects a line if X ≤ (ℓ/2) sin Θ
  P(X ≤ (ℓ/2) sin Θ) = ∫∫_{x ≤ (ℓ/2) sin θ} fX(x) fΘ(θ) dx dθ
    = (4/(πd)) ∫_0^{π/2} ∫_0^{(ℓ/2) sin θ} dx dθ
    = (4/(πd)) ∫_0^{π/2} (ℓ/2) sin θ dθ
    = 2ℓ/(πd)
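For illustration, a Monte Carlo sketch of the needle calculation above, assuming ℓ ≤ d, with X the distance from the needle's midpoint to the nearest line and Θ the acute angle to the lines; the estimate should agree with 2ℓ/(πd).

```python
import math
import random

def needle(ell, d, trials=200_000):
    """Estimate the probability that a needle of length ell (<= d) crosses a line,
    with parallel lines a distance d apart."""
    hits = 0
    for _ in range(trials):
        x = random.uniform(0, d / 2)            # distance from midpoint to nearest line
        theta = random.uniform(0, math.pi / 2)  # acute angle with the lines
        if x <= (ell / 2) * math.sin(theta):
            hits += 1
    return hits / trials

ell, d = 1.0, 2.0
print(needle(ell, d))            # simulation
print(2 * ell / (math.pi * d))   # exact: 2*ell/(pi*d) ~ 0.3183
```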
Conditioning
• Recall:
  P(x ≤ X ≤ x + δ | Y ≈ y) ≈ fX|Y(x | y) · δ

Stick-breaking example
• Break a stick of length ℓ twice:
  break at X: uniform in [0, ℓ];
  break again at Y, uniform in [0, X]
  fX,Y(x, y) = 1/(ℓx),  0 ≤ y ≤ x ≤ ℓ

[Figures: fX(x) on [0, ℓ] and fY|X(y | x) on [0, x]]
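For illustration, a short simulation of the stick-breaking experiment; the empirical mean of Y agrees with E[Y] = E[E[Y | X]] = E[X/2] = ℓ/4, the value quoted in Lecture 12 below.

```python
import random

ell = 1.0
trials = 200_000
total = 0.0
for _ in range(trials):
    x = random.uniform(0, ell)   # first break: X uniform on [0, ell]
    y = random.uniform(0, x)     # second break: Y uniform on [0, X]
    total += y

print(total / trials)   # close to ell / 4
print(ell / 4)
```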
LECTURE 10 The Bayes variations

Continuous X, discrete Y:
  fX|Y(x | y) = fX(x) pY|X(y | x) / pY(y)
  pY(y) = ∫ fX(x) pY|X(y | x) dx

Example:
• X: a continuous signal; “prior” fX(x) (e.g., intensity of light beam)
• Y: discrete r.v. affected by X (e.g., photon count)
• pY|X(y | x): model of the discrete r.v.

• Obtaining the PDF for g(X, Y) = Y/X involves deriving a distribution.
  Note: g(X, Y) is a random variable

When not to find them
• Don’t need the PDF of g(X, Y) if we only want to compute an expected value:
  E[g(X, Y)] = ∫∫ g(x, y) fX,Y(x, y) dx dy
How to find them — The continuous case

[Figure: a function y = g(x) mapping x-values to y-values]

Example
• X: uniform on [0, 2]
• Find PDF of Y = X³
• Solution:
  fY(y) = dFY/dy (y) = 1 / (6 y^(2/3))

• Linear case, Y = aX + b:
  fY(y) = (1/|a|) fX((y − b)/a)
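For illustration, a Python check of the CDF-then-differentiate method for the example above: with X uniform on [0, 2], FY(y) = P(X ≤ y^(1/3)) = y^(1/3)/2 on [0, 8], and differentiating gives the 1/(6 y^(2/3)) on the slide. The sketch compares the empirical CDF of simulated values of X³ with that formula.

```python
import random

# X uniform on [0, 2]; Y = X**3 takes values in [0, 8].
samples = [random.uniform(0, 2) ** 3 for _ in range(200_000)]

for y in [0.5, 1.0, 4.0, 8.0]:
    empirical = sum(1 for s in samples if s <= y) / len(samples)
    exact = y ** (1 / 3) / 2   # F_Y(y) = P(X <= y^(1/3)) = y^(1/3) / 2
    print(y, round(empirical, 3), round(exact, 3))

# Differentiating F_Y gives f_Y(y) = 1 / (6 * y**(2/3)) on (0, 8].
```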
LECTURE 11 A general formula

• Find the PDF of Z = g(X, Y) = Y/X
  FZ(z) = , for z ≤ 1
  FZ(z) = , for z ≥ 1

• Hence, with y = g(x):
  δ fX(x) = δ fY(y) · |dg/dx (x)|

• W = X + Y; X, Y independent (discrete case):
  pW(w) = P(X + Y = w) = Σ_x P(X = x) P(Y = w − x) = Σ_x pX(x) pY(w − x)

  [Figure: the points (0, 3), (1, 2), (2, 1), (3, 0) on the line x + y = w]

• Mechanics:
  – Put the PMFs on top of each other
  – Flip the PMF of Y
  – Shift the flipped PMF by w (to the right if w > 0)
  – Cross-multiply and add

• W = X + Y; X, Y independent (continuous case):
  fW|X(w | x) = fY(w − x)
  fW,X(w, x) = fX(x) fW|X(w | x) = fX(x) fY(w − x)
  fW(w) = ∫_{−∞}^{∞} fX(x) fY(w − x) dx

  [Figure: the line x + y = w in the (x, y) plane]
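For illustration, a minimal Python sketch of the discrete convolution formula pW(w) = Σ_x pX(x) pY(w − x); the two fair dice used here are an arbitrary choice, not from the slide.

```python
from collections import defaultdict

def convolve(p_x, p_y):
    """PMF of W = X + Y for independent X, Y, with PMFs given as dicts."""
    p_w = defaultdict(float)
    for x, px in p_x.items():
        for y, py in p_y.items():
            p_w[x + y] += px * py   # p_W(w) = sum_x p_X(x) p_Y(w - x)
    return dict(p_w)

die = {k: 1 / 6 for k in range(1, 7)}   # a fair six-sided die
print(convolve(die, die))               # PMF of the sum of two dice
```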
Two independent normal r.v.’s / The sum of independent normal r.v.’s

[Figure: scatter plots of samples of two independent normal r.v.’s, with standard deviations σX and σY]

• cov(X, Y) = E[XY] − E[X]E[Y]
• independent ⇒ cov(X, Y) = 0 (converse is not true)
• var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} var(Xi) + 2 Σ_{i<j} cov(Xi, Xj)
• |ρ| = 1 ⇔ (X − E[X]) = c(Y − E[Y]) (linearly related)
• independent ⇒ ρ = 0 (converse is not true)
LECTURE 12 Conditional expectations

• In stick example:
  E[Y] = E[E[Y | X]] = E[X/2] = ℓ/4

Proof:
  (d) var(E[X | Y]) = E[(E[X | Y])²] − (E[X])²
  Sum of right-hand sides of (c), (d): E[X²] − (E[X])² = var(X)

Section means example:
• E[X | Y = 1] = 90, E[X | Y = 2] = 60, with weights 1/3 and 2/3, so E[X] = 70
  var(E[X | Y]) = (1/3)(90 − 70)² + (2/3)(60 − 70)² = 600/3 = 200
Section means and variances (ctd.)

  var(X) = E[var(X | Y)] + var(E[X | Y])

• Two sections, with 10 and 20 students:
  (1/10) Σ_{i=1}^{10} (xi − 90)² = 10    (1/20) Σ_{i=11}^{30} (xi − 60)² = 20
  var(X | Y = 1) = 10    var(X | Y = 2) = 20
  var(X | Y) = 10 w.p. 1/3, and 20 w.p. 2/3
  E[var(X | Y)] = (1/3) · 10 + (2/3) · 20 = 50/3

Example
[Figure: piecewise-constant fX(x) with values 1/3 and 2/3, Y = 1 on the first piece, Y = 2 on the second]
  E[X | Y = 1] =     E[X | Y = 2] =
  var(X | Y = 1) =   var(X | Y = 2) =

Sum of a random number of random variables:
  E[Y | N = n] = E[X1 + X2 + · · · + Xn | N = n]
               = E[X1 + X2 + · · · + Xn]
               = E[X1] + E[X2] + · · · + E[Xn]
               = n E[X]
• E[Y | N] = N E[X]
  E[Y] = E[E[Y | N]]
       = E[N E[X]]
       = E[N] E[X]
  var(Y) = E[var(Y | N)] + var(E[Y | N])
         = E[N] var(X) + (E[X])² var(N)
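For illustration, a simulation sketch of the random-sum formulas above, with N and the Xi chosen arbitrarily (N uniform on {1, . . . , 4}, Xi uniform on [0, 1]); the empirical mean and variance of Y should match E[N]E[X] and E[N]var(X) + (E[X])²var(N).

```python
import random

def sample_Y():
    """Y = X_1 + ... + X_N with N uniform on {1,...,4}, X_i uniform on [0,1]."""
    n = random.randint(1, 4)
    return sum(random.random() for _ in range(n))

trials = 200_000
ys = [sample_Y() for _ in range(trials)]
mean = sum(ys) / trials
var = sum((y - mean) ** 2 for y in ys) / trials

E_N, var_N = 2.5, 15 / 12   # uniform on {1, 2, 3, 4}
E_X, var_X = 0.5, 1 / 12    # uniform on [0, 1]
print(mean, E_N * E_X)                       # E[Y] = E[N] E[X]
print(var, E_N * var_X + E_X**2 * var_N)     # law of total variance
```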
LECTURE 13 The Bernoulli process
Interarrival times
• T1: number of trials until first success
  – P(T1 = t) =
• If you buy a lottery ticket every day, what is the distribution of the length of the first string of losing days?
• Given that first arrival was at time t, i.e., T1 = t: additional time, T2, until next arrival

Time of the kth arrival
  – E[Yk] =
  – Var(Yk) =
  – P(Yk = t) =

[Figure 6.4: Merging of independent Bernoulli processes]

Poisson PMF:
  pZ(k) = e^(−λ) λ^k / k!,  k = 0, 1, 2, . . .
Its mean and variance are given by
  E[Z] = λ,  var(Z) = λ.
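For illustration, a quick numeric check (with an arbitrary λ = 3.5) that the Poisson PMF above has mean and variance both equal to λ; the infinite sums are truncated where the terms are negligible.

```python
from math import exp, factorial

lam = 3.5

def pmf(k):
    """Poisson PMF: e^{-lam} lam^k / k!"""
    return exp(-lam) * lam**k / factorial(k)

ks = range(0, 100)   # truncate the infinite sums
mean = sum(k * pmf(k) for k in ks)
var = sum((k - mean) ** 2 * pmf(k) for k in ks)
print(mean, var)     # both ~ lam = 3.5
```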
LECTURE 19 Limit theorems – I

• Readings: Sections 5.1-5.3; start Section 5.4

• X1, . . . , Xn i.i.d.
  Mn = (X1 + · · · + Xn) / n
  What happens as n → ∞?
• Why bother?
• A tool: Chebyshev’s inequality

Chebyshev’s inequality
• Random variable X (with finite mean µ and variance σ²)
  σ² = ∫ (x − µ)² fX(x) dx
     ≥ ∫_{|x−µ| ≥ c} (x − µ)² fX(x) dx
     ≥ c² · P(|X − µ| ≥ c)

  P(|X − µ| ≥ c) ≤ σ² / c²

[Figure: PMF of a random variable Yn, placing probability 1/n at n]
• Does Yn converge?
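For illustration, a small numeric comparison of the Chebyshev bound σ²/c² with the exact tail probability, using an exponential(1) random variable as an arbitrary example (mean and variance both 1); it shows how conservative the bound can be.

```python
import math

# Exponential r.v. with rate 1: mean mu = 1, variance sigma^2 = 1.
mu, sigma2 = 1.0, 1.0

for c in [1.5, 2.0, 3.0]:
    # For c >= 1 the left tail {X <= mu - c} is empty, since X >= 0,
    # so P(|X - mu| >= c) = P(X >= mu + c) = e^{-(mu + c)}.
    exact = math.exp(-(mu + c))
    chebyshev = sigma2 / c**2
    print(c, round(exact, 4), round(chebyshev, 4))   # exact <= bound
```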
Convergence of the sample mean
(Weak law of large numbers)
• X1, X2, . . . i.i.d.; finite mean µ and variance σ²
  Mn = (X1 + · · · + Xn) / n
• E[Mn] =

The pollster’s problem
• f: fraction of population that “. . . ”
• ith (randomly selected) person polled:
  Xi = 1, if yes; 0, if no.
• Mn = (X1 + · · · + Xn)/n: fraction of “yes” in our sample
• If n = 50,000, then P(|Mn − f| ≥ .01) ≤ .05
  (conservative)
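For illustration, a Python sketch that reproduces the n = 50,000 guarantee via Chebyshev's inequality, using the fact that var(Mn) = σ²/n and σ² = f(1 − f) ≤ 1/4 whatever f is.

```python
# Chebyshev: P(|Mn - f| >= eps) <= var(Mn) / eps^2 = sigma^2 / (n * eps^2),
# and sigma^2 = f(1 - f) <= 1/4 for any f.
eps = 0.01
sigma2_max = 0.25

def chebyshev_bound(n):
    return sigma2_max / (n * eps**2)

print(chebyshev_bound(50_000))   # 0.05: the conservative guarantee on the slide
print(chebyshev_bound(10_000))   # 0.25: ten thousand polls guarantee much less
```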
LECTURE 20 THE CENTRAL LIMIT THEOREM

• Readings: Section 5.4
• X1, . . . , Xn i.i.d., finite variance σ²
• “Standardized” Sn = X1 + · · · + Xn:
  Zn = (Sn − E[Sn]) / σ_Sn = (Sn − n E[X]) / (√n σ)
  – E[Zn] = 0, var(Zn) = 1

Usefulness
• universal; only means, variances matter
• accurate computational shortcut
• justification of normal models

What exactly does it say?
• CDF of Zn converges to normal CDF
  – not a statement about convergence of PDFs or PMFs
[Figures: PMFs of Sn for n = 2, 4, 8, 16, 32, approaching the normal bell shape]

The pollster’s problem using the CLT
• f: fraction of population that “. . .”
• Mn = (X1 + · · · + Xn)/n
• Suppose we want:
  P(|Mn − f| ≥ .01) ≤ .05
• The event |X1 + · · · + Xn − nf| / n ≥ .01 is the same as
  |X1 + · · · + Xn − nf| / (√n σ) ≥ .01 √n / σ
• So,
  P(|Mn − f| ≥ .01) ≈ P(|Z| ≥ .01 √n / σ) ≤ P(|Z| ≥ .02 √n)
  (since σ = √(f(1 − f)) ≤ 1/2)
Apply to binomial
• Fix p, where 0 < p < 1
• Xi: Bernoulli(p)
• CDF of (Sn − np) / √(np(1 − p))  −→  standard normal

Example
• n = 36, p = 0.5; find P(Sn ≤ 21)
• Exact answer:
  Σ_{k=0}^{21} (36 choose k) (1/2)³⁶ = 0.8785

The 1/2 correction for binomial approximation
• P(Sn ≤ 21) = P(Sn < 22), because Sn is integer
  [Figure: number line 18, 19, 20, 21, 22, with the midpoint 21.5 marked]
• P(Sn = 19): approximation with the 1/2 correction ≈ 0.124
• Exact answer:
  (36 choose 19) (1/2)³⁶ = 0.1251
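For illustration, a Python check of this example using math.erf for the standard normal CDF: it computes the exact binomial values and the normal approximations with the 1/2 correction.

```python
from math import comb, erf, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 36, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))   # 18 and 3

# Since p = 1/2, every sequence has probability 0.5**36 = p**n.
exact_le_21 = sum(comb(n, k) * p**n for k in range(22))
approx_le_21 = Phi((21.5 - mu) / sigma)    # 1/2 correction: use 21.5
print(exact_le_21, approx_le_21)           # 0.8785 vs ~0.878

exact_eq_19 = comb(n, 19) * p**n
approx_eq_19 = Phi((19.5 - mu) / sigma) - Phi((18.5 - mu) / sigma)
print(exact_eq_19, approx_eq_19)           # 0.1251 vs ~0.125 (0.124 with table rounding)
```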