cumulative distribution function
The cumulative distribution function for a random variable X is
the function F: R → [0,1] defined by
F(a) = P[X ≤ a]
Ex: if X has probability mass function given by:
[figure: plot of the pmf and the corresponding cdf]
NB: for discrete random variables, be careful about “≤” vs “<”
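As a minimal sketch (assuming the fair-die pmf from Ex 1 below), the cdf is just a sum of pmf values, and evaluating it just below an atom shows why “≤” vs “<” matters:

    # Sketch: cdf of a discrete r.v. from its pmf (fair six-sided die assumed).
    pmf = {x: 1/6 for x in range(1, 7)}

    def cdf(a):
        """F(a) = P[X <= a]: sum the pmf over all values x <= a."""
        return sum(p for x, p in pmf.items() if x <= a)

    print(cdf(2))    # P[X <= 2] = 2/6
    print(cdf(1.9))  # P[X < 2] = P[X <= 1.9] = 1/6 -- not the same!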
expectation
For a discrete r.v. X with p.m.f. p(•), the expectation of X, aka expected
value or mean, is
E[X] = ∑x x•p(x)   (the average of the random values, weighted by their respective probabilities)
For the equally-likely outcomes case, this is just the average of the
possible random values of X
For unequally-likely outcomes, it is again the average of the possible
random values of X, weighted by their respective probabilities
Ex 1: Let X = value seen rolling a fair die; p(1) = p(2) = ⋯ = p(6) = 1/6, so E[X] = (1+2+⋯+6)•(1/6) = 7/2
Ex 2: Coin flip; X = +1 if H (win $1), -1 if T (lose $1)
E[X] = (+1)•p(+1) + (-1)•p(-1) = 1•(1/2) +(-1)•(1/2) = 0
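A minimal sketch of the definition applied to both examples (the dictionaries below are just the pmfs of Ex 1 and Ex 2):

    # Expectation as a probability-weighted average: E[X] = sum of x*p(x).
    def expectation(pmf):
        return sum(x * p for x, p in pmf.items())

    die  = {x: 1/6 for x in range(1, 7)}   # Ex 1: fair die
    coin = {+1: 1/2, -1: 1/2}              # Ex 2: win/lose $1 on a coin flip

    print(expectation(die))   # 3.5
    print(expectation(coin))  # 0.0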
properties of expectation
Linearity of expectation, I
For any constants a, b: E[aX + b] = aE[X] + b
Proof: E[aX + b] = ∑x (ax+b)•p(x) = a∑x x•p(x) + b∑x p(x) = aE[X] + b
properties of expectation–example
A & B each bet $1, then flip 2 coins. Let X = A’s net gain: +1, 0, -1, resp.:
HH: A wins $2            → P(X = +1) = 1/4
HT, TH: each takes back $1 → P(X = 0) = 1/2
TT: B wins $2            → P(X = -1) = 1/4
What is E[X]?
E[X] = 1•1/4 + 0•1/2 + (-1)•1/4 = 0
What is E[X²]?
E[X²] = 1²•1/4 + 0²•1/2 + (-1)²•1/4 = 1/2
What is E[2X+1]?
E[2X + 1] = 2E[X] + 1 = 2•0 + 1 = 1
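These three computations can be checked mechanically; a minimal sketch, where E[g(X)] = ∑x g(x)•p(x) is applied to g(x) = x, x², and 2x+1 (the helper E is our own, not from the slides):

    # The two-coin betting example: pmf of A's net gain X.
    pmf = {+1: 1/4, 0: 1/2, -1: 1/4}

    def E(g, pmf):
        """E[g(X)] = sum over x of g(x)*p(x)."""
        return sum(g(x) * p for x, p in pmf.items())

    print(E(lambda x: x, pmf))        # E[X]    = 0.0
    print(E(lambda x: x**2, pmf))     # E[X^2]  = 0.5
    print(E(lambda x: 2*x + 1, pmf))  # E[2X+1] = 1.0 = 2*E[X] + 1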
properties of expectation
Note:
Linearity is special!
It is not true in general that
E[X•Y] = E[X] • E[Y]
E[X²] = E[X]² ← counterexample above; see the numeric check below
E[X/Y] = E[X] / E[Y]
E[asinh(X)] = asinh(E[X])
...
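A quick numeric check (a sketch reusing the betting pmf from the example above) that expectation does not commute with squaring:

    # Counterexample: E[X^2] != (E[X])^2 for the betting example.
    pmf = {+1: 1/4, 0: 1/2, -1: 1/4}
    E = lambda g: sum(g(x) * p for x, p in pmf.items())

    print(E(lambda x: x**2))   # E[X^2]   = 0.5
    print(E(lambda x: x)**2)   # (E[X])^2 = 0.0 -- not equal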
variance
what does variance tell us?
The variance of a random variable X with mean E[X] = μ is
Var[X] = E[(X-μ)²], often denoted σ².
I: The square is always ≥ 0, and exaggerated as X moves away
from μ, so Var[X] emphasizes deviation from the mean.
II: Numbers vary a lot depending on the exact distribution of
X, but it is common that X is
within μ ± σ ~66% of the time, and
within μ ± 2σ ~95% of the time.
(We’ll see the reasons for this soon.)
[figure: the standard normal density function f(x), with μ = 0 and σ = 1]
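A small sketch computing Var[X] both from the definition E[(X-μ)²] and via the identity E[X²] - (E[X])² (stated in the summary at the end); the pmf is assumed to be the betting example from above:

    # Variance of the betting example X, two equivalent ways.
    pmf = {+1: 1/4, 0: 1/2, -1: 1/4}
    E = lambda g: sum(g(x) * p for x, p in pmf.items())

    mu = E(lambda x: x)                # mean: E[X] = 0
    print(E(lambda x: (x - mu)**2))    # Var[X] = E[(X-mu)^2] = 0.5
    print(E(lambda x: x**2) - mu**2)   # same via E[X^2] - (E[X])^2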
properties of variance
NOT linear: Var[aX+b] = a²•Var[X]
(insensitive to location (b), quadratic in scale (a))
Ex:
E[X] = 0, Var[X] = 1
Y = 1000•X
E[Y] = E[1000•X] = 1000•E[X] = 0
Var[Y] = Var[10³•X] = 10⁶•Var[X] = 10⁶
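A simulation sketch of this example; taking X to be standard normal is an assumption (any distribution with E[X] = 0, Var[X] = 1 would do):

    import random

    # X assumed standard normal: E[X] = 0, Var[X] = 1.
    xs = [random.gauss(0, 1) for _ in range(100_000)]
    ys = [1000 * x for x in xs]  # Y = 1000*X

    def var(sample):
        m = sum(sample) / len(sample)
        return sum((v - m)**2 for v in sample) / len(sample)

    print(var(xs))  # ~ 1
    print(var(ys))  # ~ 1e6 = 1000^2 * Var[X]: quadratic in scale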
independence and joint distributions
variance of independent r.v.s is additive
(Bienaymé, 1853)
Theorem: If X & Y are independent (any distribution, not just binomial), then
Var[X+Y] = Var[X]+Var[Y]
Alternate proof: see slide 60.
FYI, the quantity E[XY] - E[X]E[Y] is called the covariance
of X, Y. As shown, it is 0 if X, Y are independent; if not zero,
it is a useful measure of their degree of dependence.
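A simulation sketch checking both Bienaymé’s theorem and the near-zero covariance of independent r.v.s; two independent fair dice are an assumed concrete choice:

    import random

    n = 100_000
    xs = [random.randint(1, 6) for _ in range(n)]  # X: fair die
    ys = [random.randint(1, 6) for _ in range(n)]  # Y: independent fair die

    mean = lambda s: sum(s) / len(s)
    var  = lambda s: mean([v**2 for v in s]) - mean(s)**2

    print(var(xs) + var(ys))                     # ~ 35/12 + 35/12 ≈ 5.83
    print(var([x + y for x, y in zip(xs, ys)]))  # ~ same, by the theorem
    cov = mean([x*y for x, y in zip(xs, ys)]) - mean(xs)*mean(ys)
    print(cov)                                   # ~ 0 for independent X, Y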
random variables – summary
Conditional Expectation:
E[X | A] = ∑x x•P(X=x | A)
Law of Total Expectation:
E[X] = E[X | A]•P(A) + E[X | ¬A]•P(¬A)   (see the sketch at the end of this summary)
Variance:
Var[X] = E[(X-E[X])²] = E[X²] - (E[X])²
Standard deviation: σ = √Var[X]
Var[aX+b] = a²•Var[X]   “Variance is insensitive to location, quadratic in scale”
If X & Y are independent, then
E[X•Y] = E[X]•E[Y]
Var[X+Y] = Var[X]+Var[Y]
(These two equalities hold for independent r.v.s, but not in general.)
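To close, a sketch of conditional expectation and the law of total expectation from this summary; the event A = “roll is even” on a fair die is an assumed example:

    # Law of total expectation on a fair die, with A = {roll is even}.
    pmf = {x: 1/6 for x in range(1, 7)}
    A = {2, 4, 6}  # assumed event

    P_A = sum(p for x, p in pmf.items() if x in A)
    # E[X | A] = sum of x * P(X=x | A), where P(X=x | A) = p(x)/P(A) on A
    E_given_A    = sum(x * p for x, p in pmf.items() if x in A) / P_A          # = 4
    E_given_notA = sum(x * p for x, p in pmf.items() if x not in A) / (1 - P_A)  # = 3

    # E[X] = E[X | A]*P(A) + E[X | not A]*P(not A)
    print(E_given_A * P_A + E_given_notA * (1 - P_A))  # 3.5 = E[X]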