ADVANCED PROBABILITY HOMEWORK 1
GROUP 4
(1120240065)
(2120240153)
(2120240161)
(1120240072)
(2120240160)
1.2.2 Let χ have the standard normal distribution. Use Theorem 1.2.6 to get upper and lower bounds on
P (χ ≥ 4).
Proof. Consider the density function f(·) of the standard normal distribution, according to Example 1.2.5:
f(x) = (2π)^{-1/2} exp(−x²/2).
Then, according to Theorem 1.2.6,
(2π)^{-1/2} (x^{-1} − x^{-3}) exp(−x²/2) ≤ ∫_x^∞ (2π)^{-1/2} exp(−y²/2) dy ≤ (2π)^{-1/2} x^{-1} exp(−x²/2).
Setting x = 4, we obtain
(15/64)(2π)^{-1/2} e^{−8} ≤ ∫_4^∞ (2π)^{-1/2} exp(−y²/2) dy = P(χ ≥ 4) ≤ (1/4)(2π)^{-1/2} e^{−8}.
Hence the upper bound on P(χ ≥ 4) is e^{−8}/(4√(2π)) and the lower bound is 15e^{−8}/(64√(2π)).
□
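Remark (not part of the proof). The following short Python sketch, assuming SciPy is available, compares the two bounds at x = 4 with the exact tail probability; all names are illustrative only.

import math
from scipy.stats import norm

x = 4.0
c = math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)   # (2*pi)^(-1/2) exp(-x^2/2)
lower = (1 / x - 1 / x**3) * c                     # lower bound from Theorem 1.2.6
upper = (1 / x) * c                                # upper bound from Theorem 1.2.6
exact = norm.sf(x)                                 # exact P(chi >= 4)
print(f"lower = {lower:.3e}, exact = {exact:.3e}, upper = {upper:.3e}")
# roughly: lower ~ 3.14e-05 <= exact ~ 3.17e-05 <= upper ~ 3.35e-05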
1.2.3 Show that a distribution function has at most countably many discontinuities.
Proof. Denote the distribution function by F(x); then F(x) is non-decreasing.
Let A = {x : F is discontinuous at x}. For every x_0 ∈ A, let
y_0^+ = lim_{x→x_0^+} F(x), y_0^− = lim_{x→x_0^−} F(x);
then y_0^− < y_0^+, and thus we obtain an open interval (y_0^−, y_0^+).
Next we show that for any x_1 < x_2 in A,
(F(x_1 − 0), F(x_1 + 0)) ∩ (F(x_2 − 0), F(x_2 + 0)) = ∅.
Indeed, F(x_1 + 0) = inf_{x>x_1} F(x) and F(x_2 − 0) = sup_{x<x_2} F(x), so for any x with x_1 < x < x_2 we have F(x_1 + 0) ≤ F(x) ≤ F(x_2 − 0). This proves the claim.
Therefore we obtain a collection of pairwise disjoint open intervals.
Let B = {I_α} be the set of these open intervals. For every I_α ∈ B, since the rational numbers are dense, there exists x_α ∈ I_α ∩ Q.
Define a mapping
f : B → Q, I_α ↦ x_α.
Since the I_α are pairwise disjoint, I_α ≠ I_β implies x_α ≠ x_β; that is, f is an injection.
Let C = f(B) ⊂ Q; then f : B → C is a bijection. Since Q is countable and C ⊂ Q, C is countable, so B is countable. As the discontinuities of F are in one-to-one correspondence with the intervals in B, the set A is countable as well.
□
1.2.7 (i) Suppose X has density function f. Compute the distribution function of X² and then differentiate to find its density function. (ii) Work out the answer when X has a standard normal distribution to find the density of the chi-square distribution.
Proof. (i) Let Y = X². For y ≥ 0 we compute the distribution function F_Y(y):
F_Y(y) = P(Y ≤ y) = P(X² ≤ y) = P(−√y ≤ X ≤ √y).
This can be expressed in terms of the distribution function F of X:
F_Y(y) = F(√y) − F(−√y).
Differentiating F_Y(y) and applying the chain rule, for y > 0,
f_Y(y) = d/dy F_Y(y) = d/dy (F(√y) − F(−√y)) = (1/(2√y)) f(√y) + (1/(2√y)) f(−√y),
where f is the density function of X.
(ii) Suppose X has the standard normal distribution, X ∼ N(0, 1), with density
f(x) = (1/√(2π)) e^{−x²/2}.
Then, for y ≥ 0,
F_Y(y) = P(−√y ≤ X ≤ √y) = Φ(√y) − Φ(−√y) = 2Φ(√y) − 1,
where Φ is the cumulative distribution function of the standard normal distribution.
Differentiating F_Y(y),
f_Y(y) = d/dy (2Φ(√y) − 1) = 2φ(√y) · (1/(2√y)),
where φ(x) = (1/√(2π)) e^{−x²/2} is the standard normal density.
Thus the density function of Y = X² is
f_Y(y) = (1/√(2πy)) e^{−y/2} for y > 0 (and f_Y(y) = 0 for y ≤ 0),
which is the chi-square density with one degree of freedom.
□
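Remark (illustrative check, not part of the proof). A small Python sketch, assuming SciPy is available, compares the derived density with SciPy's chi-square density with one degree of freedom at a few sample points.

import math
from scipy.stats import chi2

def f_Y(y):
    # density of Y = X^2 derived above, valid for y > 0
    return math.exp(-y / 2) / math.sqrt(2 * math.pi * y)

for y in [0.1, 0.5, 1.0, 2.0, 5.0]:
    print(y, f_Y(y), chi2.pdf(y, df=1))   # the last two columns should agree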
1.6.3 Chebyshev’s inequality is and is not sharp.
(i) Show that Theorem 1.6.4 is sharp by showing that if 0 < b ≤ a are fixed there is an X with EX² = b² for which P(|X| ≥ a) = b²/a². (ii) Show that Theorem 1.6.4 is not sharp by showing that if X has 0 < EX² < ∞ then
lim_{a→∞} a² P(|X| ≥ a)/EX² = 0.
Proof. (i) Consider the random variable X defined by
X = a with probability p = b²/a², and X = 0 with probability 1 − p.
Since 0 < b ≤ a, we have p ∈ (0, 1], so this is a valid distribution.
Now, calculate EX²:
EX² = a² · P(X = a) + 0 · P(X = 0) = a² · (b²/a²) = b².
Thus EX² = b², which satisfies the condition.
Next, calculate P(|X| ≥ a):
P(|X| ≥ a) = P(X = a) = b²/a².
Hence we have found a random variable X that satisfies both EX² = b² and P(|X| ≥ a) = b²/a². This shows that Theorem 1.6.4 is sharp.
(ii) Now let a > 0. Pointwise we have
a² I_{|X|≥a} ≤ X² I_{|X|≥a} = X² − X² I_{|X|<a}.
Therefore,
a² P(|X| ≥ a) = ∫ a² I_{|X|≥a} dP ≤ ∫ X² dP − ∫ X² I_{|X|<a} dP = EX² − ∫ X² I_{|X|<a} dP.
Since |X² I_{|X|<a}| ≤ X² and EX² < ∞, the Dominated Convergence Theorem gives
lim_{a→∞} ∫ X² I_{|X|<a} dP = ∫ lim_{a→∞} X² I_{|X|<a} dP = EX².
Therefore,
lim_{a→∞} a² P(|X| ≥ a) ≤ lim_{a→∞} ( EX² − ∫ X² I_{|X|<a} dP ) = EX² − lim_{a→∞} ∫ X² I_{|X|<a} dP = EX² − EX² = 0.
Hence we obtain
a² P(|X| ≥ a)/EX² → 0 as a → ∞.
However, Chebyshev's inequality (Theorem 1.6.4), applied with Q(x) = x² and A = (−∞, −a] ∪ [a, ∞), for which τ_A = inf{Q(x) : x ∈ A} = a², only gives
a² P(|X| ≥ a)/EX² ≤ 1.
Therefore, Chebyshev's inequality is not sharp in this case. □
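Remark (illustrative, not part of the proof). For a concrete X, say standard normal (so EX² = 1), the ratio a² P(|X| ≥ a)/EX² can be tabulated; the Python sketch below, assuming SciPy, shows it dropping far below the Chebyshev ceiling of 1.

from scipy.stats import norm

for a in [1, 2, 4, 8]:
    ratio = a**2 * 2 * norm.sf(a)   # a^2 P(|X| >= a) / EX^2, with EX^2 = 1
    print(a, ratio)                 # decreases towards 0, well below the Chebyshev bound 1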
1.6.4 One-sided Chebyshev bound.
(i) Let a > b > 0, 0 < p < 1, and let X have P(X = a) = p and P(X = −b) = 1 − p. Apply Theorem 1.6.4 to φ(x) = (x + b)² and conclude that if Y is any random variable with EY = EX and var(Y) = var(X), then P(Y ≥ a) ≤ p and equality holds when Y = X.
(ii) Suppose EY = 0, var(Y) = σ², and a > 0. Show that P(Y ≥ a) ≤ σ²/(a² + σ²), and there is a Y for which equality holds.
Proof. (i) Since EY = EX, var(Y) = var(X), and E(X²) = (EX)² + var(X), we get EX = EY and E(X²) = E(Y²). Since ϕ(x) = (x + b)² = x² + 2bx + b², it follows that E(ϕ(X)) = E(ϕ(Y)). Moreover, ϕ(X) equals (a + b)² with probability p and 0 with probability 1 − p, so E(ϕ(X)) = p(a + b)².
Applying Chebyshev's inequality (Theorem 1.6.4) to ϕ, we have
P(Y ≥ a) = P(Y + b ≥ a + b) ≤ E(ϕ(Y))/(a + b)² = E(ϕ(X))/(a + b)² = p.
When equality holds, ϕ(Y) = (a + b)² a.e. on {Y ≥ a} and P(Y ≥ a) = p; since ϕ is strictly increasing on [−b, ∞), this forces Y = a on {Y ≥ a}, hence P(Y = a) = p. On the other hand, E[ϕ(Y); Y < a] = 0, so Y = −b on {Y < a} and P(Y = −b) = 1 − p. In particular, equality holds when Y = X.
(ii) Since EY = 0 and var(Y) = σ², we have E(Y²) = σ². For any b > 0, E(Y + b) = b and E(Y + b)² = σ² + b².
By Chebyshev's inequality,
P(Y ≥ a) = P(Y + b ≥ a + b) ≤ E(Y + b)²/(a + b)² = (σ² + b²)/(a + b)².
Since b > 0 is arbitrary, we may choose b = σ²/a, which minimizes the right-hand side; the bound then becomes P(Y ≥ a) ≤ σ²/(a² + σ²).
By (i), to obtain equality we look for a two-point Y with P(Y = a) = p and P(Y = −b) = 1 − p. The conditions E(Y) = 0 and E(Y²) = σ² read pa − b(1 − p) = 0 and a²p + (1 − p)b² = σ²; solving gives b = σ²/a and p = σ²/(a² + σ²). For this Y we have P(Y ≥ a) = p = σ²/(a² + σ²), so equality holds.
□
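Remark (numerical sanity check, not part of the proof). The Python sketch below, with the illustrative values σ = 1.5 and a = 2, verifies that the two-point Y constructed in (ii) has mean 0, variance σ², and P(Y ≥ a) equal to the bound σ²/(a² + σ²).

sigma, a = 1.5, 2.0                       # illustrative values
p = sigma**2 / (a**2 + sigma**2)          # P(Y = a)
b = sigma**2 / a                          # Y = -b with probability 1 - p

mean = p * a + (1 - p) * (-b)
second_moment = p * a**2 + (1 - p) * b**2
bound = sigma**2 / (a**2 + sigma**2)
print(mean, second_moment - mean**2, sigma**2)   # mean 0, variance sigma^2
print(p, bound)                                  # P(Y >= a) = p equals the bound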
1.6.5 Two nonexistent lower bounds.
Show that: (i) if ϵ > 0, inf{P (|X| > ϵ) : EX = 0, var(X) = 1} = 0.
(ii) if y ≥ 1, σ² ∈ (0, ∞), inf{P(|X| > y) : EX = 1, var(X) = σ²} = 0.
Proof. (i) Let
X_n = −n with probability 1/(2n²), n with probability 1/(2n²), and 0 with probability 1 − 1/n².
Then EX_n = (−n)·1/(2n²) + n·1/(2n²) = 0 and var(X_n) = EX_n² − (EX_n)² = EX_n² = n²·1/(2n²) + n²·1/(2n²) = 1.
If 0 < ϵ < n, then P(|X_n| > ϵ) = 1/n². Since 1/n² can be made arbitrarily small, we obtain inf{P(|X| > ϵ) : EX = 0, var(X) = 1} = 0.
(ii) Let Y_n = σX_n + 1; then EY_n = σEX_n + 1 = 1 and var(Y_n) = σ² var(X_n) = σ². Similarly to (i), since y ≥ 1,
P(|Y_n| > y) = P(|Y_n| − 1 > y − 1) ≤ P(|Y_n − 1| > y − 1) = P(|X_n| > (y − 1)/σ) ≤ P(X_n ≠ 0) = 1/n².
Therefore inf{P(|X| > y) : EX = 1, var(X) = σ²} = 0.
□
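Remark (illustrative, not part of the proof). The Python sketch below tabulates the mean, variance, and tail probability of the X_n constructed in (i), showing P(|X_n| > ϵ) = 1/n² shrinking to 0.

def check(n, eps=0.5):
    # distribution of X_n: +/- n with probability 1/(2n^2) each, 0 otherwise
    values = [-n, 0.0, n]
    probs = [1 / (2 * n**2), 1 - 1 / n**2, 1 / (2 * n**2)]
    mean = sum(p * v for p, v in zip(probs, values))
    var = sum(p * v**2 for p, v in zip(probs, values)) - mean**2
    tail = sum(p for p, v in zip(probs, values) if abs(v) > eps)
    return mean, var, tail

for n in [2, 10, 100]:
    print(n, check(n))   # mean 0, variance 1, tail probability 1/n^2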
2.1.4 Let Ω = (0, 1), F = Borel sets, P = Lebesgue measure. Xn (ω) = sin(2πnω), n = 1, 2, . . ., are
uncorrelated but not independent.
Proof. 1. Showing that the X_n are uncorrelated.
Two random variables X_n and X_m (n ≠ m) are uncorrelated if their covariance is zero:
Cov(X_n, X_m) = E[X_n X_m] − E[X_n]E[X_m] = 0.
First, we calculate E[X_n]:
E[X_n] = ∫_0^1 sin(2πnω) dω = −(1/(2πn)) (cos(2πn · 1) − cos(2πn · 0)) = 0.
Similarly, E[X_m] = 0.
Next, we compute E[X_n X_m]:
E[X_n X_m] = ∫_0^1 sin(2πnω) sin(2πmω) dω.
Using the product-to-sum identity for sine functions,
sin(2πnω) sin(2πmω) = (1/2) [cos(2π(n − m)ω) − cos(2π(n + m)ω)],
we have
E[X_n X_m] = (1/2) ∫_0^1 [cos(2π(n − m)ω) − cos(2π(n + m)ω)] dω.
Since cos(2πkω) integrates to zero over (0, 1) for any nonzero integer k, and n − m ≠ 0, n + m ≠ 0, we get
E[X_n X_m] = 0 for n ≠ m,
and therefore
Cov(X_n, X_m) = E[X_n X_m] − E[X_n]E[X_m] = 0.
Thus X_n and X_m are uncorrelated for n ≠ m.
2. Showing that the X_n are not independent.
Fix n ≠ m. At the points ω = k/(2m), k = 0, 1, . . . , 2m − 1,
X_m(ω) = sin(2πm · k/(2m)) = sin(kπ) = 0.
At these same points, X_n(ω) = sin(2πnω) takes values in the finite set
V_n = {y_0, y_1, . . . , y_{2m−1}}, where y_k = sin(2πn · k/(2m)).
Since V_n is finite, we may choose an interval [a, b] ⊂ [−1, 1] \ V_n with a < b at positive distance from V_n; then P(X_n ∈ [a, b]) > 0.
For every ϵ > 0 we have P(X_m ∈ [0, ϵ]) > 0, and the set {ω : X_m(ω) ∈ [0, ϵ]} is contained in a union of intervals around the points k/(2m) whose lengths shrink to 0 as ϵ → 0. Since X_n is continuous, for ϵ sufficiently small, X_n(ω) stays so close to V_n on this set that it cannot lie in [a, b]. Therefore
P(X_m ∈ [0, ϵ], X_n ∈ [a, b]) = 0 < P(X_m ∈ [0, ϵ]) · P(X_n ∈ [a, b]).
Thus, X_m and X_n are not independent. □
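Remark (Monte Carlo illustration, not part of the proof). Taking n = 1, m = 2, the Python sketch below, assuming NumPy is available, estimates the correlation (close to 0) and compares the joint probability used above with the product of the marginals; the interval [0.3, 0.7] is an illustrative choice away from V_1 = {−1, 0, 1}.

import numpy as np

rng = np.random.default_rng(0)
omega = rng.uniform(0.0, 1.0, size=1_000_000)
X1, X2 = np.sin(2 * np.pi * omega), np.sin(4 * np.pi * omega)

print("sample correlation:", np.corrcoef(X1, X2)[0, 1])   # close to 0

A = (X2 >= 0.0) & (X2 <= 0.01)      # X_2 in [0, eps]: forces omega near k/4
B = (X1 >= 0.3) & (X1 <= 0.7)       # X_1 in [a, b], away from V_1 = {-1, 0, 1}
print("P(A and B)  =", np.mean(A & B))            # essentially 0
print("P(A) * P(B) =", np.mean(A) * np.mean(B))   # strictly positive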
2.1.14 Let X, Y ≥ 0 be independent with distribution functions F and G. Find the distribution function of
XY .
Solution. Fix z ≥ 0. Since X and Y are independent with X, Y ≥ 0, Fubini's theorem gives
P(XY ≤ z) = ∫∫ I_{xy≤z} dF(x) dG(y) = ∫ F(z/y) dG(y),
with the convention F(z/0) = F(∞) = 1 (if y = 0, then xy = 0 ≤ z for every x). For z < 0, P(XY ≤ z) = 0 since XY ≥ 0.
□
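Remark (Monte Carlo sanity check, not part of the solution). With the illustrative choice of X and Y independent Exp(1), so that F(x) = G(x) = 1 − e^{−x} for x ≥ 0, the Python sketch below, assuming NumPy and SciPy are available, compares a direct simulation of P(XY ≤ z) with the formula ∫ F(z/y) dG(y).

import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(1)
X = rng.exponential(size=1_000_000)
Y = rng.exponential(size=1_000_000)

z = 2.0
mc = np.mean(X * Y <= z)                       # direct simulation of P(XY <= z)
formula, _ = quad(lambda y: (1 - np.exp(-z / y)) * np.exp(-y), 0, np.inf)
print(mc, formula)                             # the two estimates should agree closely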
9. Prove the following statements:
(a) Let Ω = R and F = {A ⊂ Ω : A or A^c is countable}. Show that F is a σ-algebra.
(b) Let Ω = Z and A = {A ⊂ Ω : A or A^c is finite}. Then A is an algebra, but not a σ-algebra.
(c) If F_1 ⊂ F_2 ⊂ . . . are σ-algebras, then ∪_{i=1}^∞ F_i is an algebra, but may not be a σ-algebra.
Proof. (a) We verify that F is a σ-algebra by definition.
1. Obviously ∅ ∈ F, since ∅ is countable.
2. Clearly if A ∈ F, then A^c ∈ F, since the defining condition is symmetric in A and A^c.
3. Let {A_i} ⊂ F be a countable sequence of sets. If every A_i is countable, then ∪_i A_i is countable, so ∪_i A_i ∈ F. If some A_{i_0}^c is countable, then (∪_i A_i)^c = ∩_i A_i^c ⊂ A_{i_0}^c is countable, so again ∪_i A_i ∈ F.
(b) We verify that A is an algebra by definition.
1. Clearly ∅ ∈ A.
2. If A ∈ A, then A or A^c is finite, i.e., (A^c)^c or A^c is finite; by definition, A^c ∈ A.
3. For A, B ∈ A: if both are finite, then A ∪ B is finite and A ∪ B ∈ A. Otherwise either A^c or B^c is finite; since (A ∪ B)^c = A^c ∩ B^c is a subset of both A^c and B^c, it must be finite, so again A ∪ B ∈ A.
This shows that A is an algebra.
To show that A is not a σ-algebra, consider A_i = {2i}, i = 1, 2, . . .. Each A_i is finite, so A_i ∈ A, but A := ∪_i A_i (the positive even integers) ∉ A, since neither A nor A^c is finite.
(c) We verify directly that F := ∪_{i=1}^∞ F_i is an algebra.
1. Obviously ∅ ∈ F.
2. If A ∈ F, then A ∈ F_i for some i, so A^c ∈ F_i ⊂ F.
3. If A, B ∈ F, then since {F_i}_{i=1}^∞ is an increasing sequence of σ-algebras, A, B ∈ F_i for some common i, hence A ∪ B ∈ F_i ⊂ F.
This shows that F is an algebra.
To show that F need not be a σ-algebra, let Ω = Z and F_k = σ({1}, {2}, . . . , {k}) for k ≥ 1, so that F_1 ⊂ F_2 ⊂ . . . are increasing σ-algebras. Let A_i = {1, 2, . . . , i} ∈ F_i ⊂ ∪_i F_i. Then ∪_{i=1}^∞ A_i = Z_+, the set of positive integers, which is not in ∪_i F_i: every set in F_k either contains {1, . . . , k}^c or is disjoint from it, and Z_+ does neither. Thus ∪_i F_i is not a σ-algebra.
□
10. Let {ξ_k}_{k=1}^∞ be independent random variables where
P(ξ_k = 1) = 2^{−k}, P(ξ_k = 0) = 1 − 2^{−k}, for k ≥ 1.
Define X_n = ξ_1 + ξ_2 + · · · + ξ_n. Use Chernoff bounds to show that
P(sup_n X_n > 1 + δ) ≤ e^δ/(1 + δ)^{1+δ}, for any δ > 0.
Proof. Since ξ_1, ξ_2, . . . are independent, the moment generating function of X_n = ξ_1 + ξ_2 + · · · + ξ_n factors:
E[e^{tX_n}] = ∏_{k=1}^n E[e^{tξ_k}].
For each ξ_k we have
E[e^{tξ_k}] = (1 − 2^{−k}) + 2^{−k} e^t = 1 + 2^{−k}(e^t − 1).
Thus
E[e^{tX_n}] = ∏_{k=1}^n (1 + 2^{−k}(e^t − 1)).
Taking logarithms,
log E[e^{tX_n}] = Σ_{k=1}^n log(1 + 2^{−k}(e^t − 1)).
For t > 0 we have 2^{−k}(e^t − 1) > 0, so the elementary inequality log(1 + x) ≤ x for x > −1 applies to each term:
log(1 + 2^{−k}(e^t − 1)) ≤ 2^{−k}(e^t − 1).
Summing and using the geometric series Σ_{k=1}^∞ 2^{−k} = 1, we get, for every n,
log E[e^{tX_n}] ≤ (e^t − 1) Σ_{k=1}^n 2^{−k} ≤ e^t − 1.
Exponentiating, for every n and every t > 0,
E[e^{tX_n}] ≤ e^{e^t − 1}.
Using Chernoff's bound for X_n, we get
P(X_n > 1 + δ) ≤ min_{t>0} e^{−t(1+δ)} E[e^{tX_n}] ≤ min_{t>0} e^{−t(1+δ) + (e^t − 1)}.
To minimize the exponent f(t) = −t(1 + δ) + e^t − 1, take the derivative:
f′(t) = −(1 + δ) + e^t.
Setting f′(t) = 0 gives e^t = 1 + δ, so t = log(1 + δ) > 0. Substituting t = log(1 + δ) into the bound,
P(X_n > 1 + δ) ≤ e^{−(1+δ) log(1+δ) + δ} = e^δ/(1 + δ)^{1+δ}.
The right-hand side does not depend on n, so the bound holds uniformly in n. Since the ξ_k are nonnegative, X_n is non-decreasing in n, and the events {X_n > 1 + δ} increase to {sup_n X_n > 1 + δ}; letting n → ∞ we conclude
P(sup_n X_n > 1 + δ) = lim_{n→∞} P(X_n > 1 + δ) ≤ e^δ/(1 + δ)^{1+δ}.
□
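Remark (simulation check, not part of the proof). Since the ξ_k are nonnegative, sup_n X_n = Σ_k ξ_k; truncating the sum at k = 30 changes it only with probability smaller than 2^{−30}. The Python sketch below, assuming NumPy is available, compares the empirical tail with the bound e^δ/(1 + δ)^{1+δ} for a few illustrative values of δ.

import numpy as np

rng = np.random.default_rng(2)
K, N = 30, 200_000
p = 2.0 ** -np.arange(1, K + 1)        # P(xi_k = 1) = 2^{-k}
xi = rng.random((N, K)) < p            # N independent copies of (xi_1, ..., xi_K)
sup_X = xi.sum(axis=1)                 # approximates sup_n X_n

for delta in [0.5, 1.0, 2.0, 4.0]:
    empirical = np.mean(sup_X > 1 + delta)
    bound = np.exp(delta) / (1 + delta) ** (1 + delta)
    print(delta, empirical, bound)     # empirical tail <= Chernoff-type bound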