Lecture Notes 4: Probability and Statistics
So far we have discussed random experiments, sample spaces, σ-fields
and probability functions. We now introduce an interesting and important
concept in probability theory: the random variable.
Suppose we have a random experiment, and based on the random experiment
we have a sample space Ω. A random variable X is a real-valued function
defined on Ω. Note that Ω can have a finite, countably infinite or uncountable
number of elements. Let us look at some examples.
Example: Suppose we throw a die twice and observe the faces which appear
on top. In this case the sample space Ω has 36 points, namely
Ω = {(i, j); i, j = 1, 2, . . . , 6}.
Suppose we define a function X on Ω as follows: X(i, j) = i + j. Clearly, X
is a random variable and it can take the values 2, 3, . . . , 12. If each face is
equally likely to appear on top and the two throws are independent, then
\[
\begin{aligned}
P(X = 2) &= P(X = 12) = \frac{1}{36}, & P(X = 3) &= P(X = 11) = \frac{2}{36}, \\
P(X = 4) &= P(X = 10) = \frac{3}{36}, & P(X = 5) &= P(X = 9) = \frac{4}{36}, \\
P(X = 6) &= P(X = 8) = \frac{5}{36}, & P(X = 7) &= \frac{6}{36}.
\end{aligned}
\]
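The probabilities above can be reproduced by listing the sample space and counting. The following is a minimal Python sketch, assuming equally likely outcomes as in the example; the names omega, X and pmf are illustrative and not part of the notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (i, j) of faces; each pair is assumed
# to have probability 1/36 (fair die, independent throws).
omega = list(product(range(1, 7), repeat=2))


def X(outcome):
    """The random variable X(i, j) = i + j, a plain function on omega."""
    i, j = outcome
    return i + j


# PMF of X: count the outcomes mapped to each value and divide by |omega|.
counts = Counter(X(w) for w in omega)
pmf = {value: Fraction(count, len(omega)) for value, count in sorted(counts.items())}

for value, prob in pmf.items():
    # Fractions are printed in lowest terms, e.g. 6/36 appears as 1/6.
    print(value, prob)
```

Exact fractions are used instead of floating point so that the values can be compared directly with the table above.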
Example: Now let us look at the continuous case, where the random variable
X can take a continuum of values. Suppose we choose a number at random from
[0, 1], and let us denote that number by X. In this case Ω = [0, 1], and X is
a random variable taking values in [0, 1]. Since it is assumed that the
number is chosen at random, for any 0 < a < b < 1,
P (a < X < b) = b − a.
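The relation P(a < X < b) = b − a can be checked empirically. The sketch below is only an illustration, assuming that Python's pseudo-random generator is an acceptable stand-in for "a number chosen at random from [0, 1]"; the function name, the number of draws and the seed are arbitrary choices.

```python
import random


def estimate_interval_probability(a, b, n_draws=200_000, seed=0):
    """Estimate P(a < X < b) by the fraction of uniform draws landing in (a, b)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_draws) if a < rng.random() < b)
    return hits / n_draws


# With a = 0.2 and b = 0.7 the estimate should be close to b - a = 0.5.
estimate = estimate_interval_probability(0.2, 0.7)
print(estimate)
```

The estimate fluctuates around b − a, and the fluctuation shrinks as the number of draws grows.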
A discrete random variable is always characterized by its probability
mass function (PMF). For a continuous random variable X, if there exists a
function f(x) such that f(x) ≥ 0 and, for all −∞ < a < b < ∞,
\[
P(a < X < b) = \int_a^b f(x)\,dx,
\]
then f(x) is called the probability density function (PDF) of X. It is immediate that
\[
\int_{-\infty}^{\infty} f(x)\,dx = P(-\infty < X < \infty) = 1.
\]
From now on it is assumed that if f(x) is a function satisfying
\[
f(x) \ge 0 \ \text{for all } x \quad \text{and} \quad \int_{-\infty}^{\infty} f(x)\,dx = 1,
\]
then there exists a random variable X such that, for all −∞ < a < b < ∞,
\[
P(a < X < b) = \int_a^b f(x)\,dx,
\]
and f(x) is the PDF of X.
Now we will introduce another important function, and it is known as the
cumulative distribution function (CDF) or a distribution function (DF) of a
random variable X. It is defined as
\[
F_X(x) = P(X \le x) = P(-\infty < X \le x), \quad \text{for all } x.
\]
Note that the CDF can be defined for both discrete and continuous
random variables. If X is a discrete random variable with the PMF
\[
P(X = a_i) = p_i; \quad i = 1, 2, 3, \ldots,
\]
where the a_i's are arbitrary real numbers and the p_i's satisfy the following conditions:
\[
p_i \ge 0 \ \text{for all } i, \quad \text{and} \quad \sum_{i=1}^{\infty} p_i = 1,
\]
then the CDF of X becomes
\[
F_X(x) = \sum_{i:\, a_i \le x} p_i.
\]
It is clear that the distribution function of a discrete random variable is a
step function, and it has jumps at a_1, a_2, . . . .
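Example: Suppose X has the following PMF
\[
P(X = -1) = \frac{1}{4}, \quad P(X = 0) = \frac{1}{2}, \quad P(X = 1) = \frac{1}{4}.
\]
Then the CDF of X becomes
\[
F_X(x) =
\begin{cases}
0 & \text{if } x < -1 \\
\frac{1}{4} & \text{if } -1 \le x < 0 \\
\frac{3}{4} & \text{if } 0 \le x < 1 \\
1 & \text{if } x \ge 1
\end{cases}
\]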
It can be seen that FX (x) is a step function, and it has jumps at −1, 0, 1.
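The step-function form of a discrete CDF follows directly from summing the PMF over the support points a_i ≤ x. A minimal Python sketch for the PMF of this example is given below; the dictionary layout and the name cdf are illustrative assumptions.

```python
from fractions import Fraction

# PMF of the example: support point a_i mapped to its probability p_i.
pmf = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}


def cdf(x):
    """F_X(x): the sum of p_i over all support points a_i <= x."""
    return sum((p for a, p in pmf.items() if a <= x), Fraction(0))


# The value only changes when x crosses a support point, which is
# exactly the step-function behaviour with jumps at -1, 0 and 1.
for x in (-2, -1, -0.5, 0, 0.5, 1, 3):
    print(x, cdf(x))
```

Evaluating at points between the support values shows the flat pieces, and evaluating at the support values themselves shows that each jump is included at its own point, in line with the inequality a_i ≤ x.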
If X is a continuous random variable with PDF f(x), then the CDF of X
becomes
\[
F_X(x) = \int_{-\infty}^{x} f(u)\,du.
\]
It is immediate that FX (x) is everywhere continuous.
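Example: Suppose X is a continuous random variable with PDF
\[
f(x) = e^{-2|x|}; \quad -\infty < x < \infty,
\]
then the CDF of X becomes
\[
F_X(x) = \int_{-\infty}^{x} e^{-2|u|}\,du =
\begin{cases}
\int_{-\infty}^{x} e^{2u}\,du & \text{if } x < 0 \\
\frac{1}{2} + \int_{0}^{x} e^{-2u}\,du & \text{if } x \ge 0
\end{cases}
=
\begin{cases}
\frac{1}{2} e^{2x} & \text{if } x < 0 \\
\frac{1}{2} + \frac{1}{2}\left(1 - e^{-2x}\right) & \text{if } x \ge 0
\end{cases}
\]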
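As a sanity check on this example, the closed-form CDF can be compared with a numerical integral of the PDF. The sketch below uses a simple midpoint rule and truncates the lower limit at −20, where the neglected tail mass is negligible; both choices, and all function names, are assumptions made for illustration.

```python
import math


def pdf(x):
    """f(x) = exp(-2|x|), the PDF of the example."""
    return math.exp(-2.0 * abs(x))


def cdf_closed_form(x):
    """The piecewise closed form of F_X derived in the example."""
    if x < 0:
        return 0.5 * math.exp(2.0 * x)
    return 0.5 + 0.5 * (1.0 - math.exp(-2.0 * x))


def cdf_numeric(x, lower=-20.0, n=100_000):
    """Midpoint-rule approximation of the integral of pdf over [lower, x]."""
    h = (x - lower) / n
    return h * sum(pdf(lower + (k + 0.5) * h) for k in range(n))


for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, cdf_closed_form(x), cdf_numeric(x))
```

The two columns should agree to several decimal places, and taking x large shows that the total area under f(x) is 1.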
Now let us look at some of the properties of a CDF. If F(x) is the
distribution function of a random variable, then clearly
1. 0 ≤ F(x) ≤ 1, for all x.
2. lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1.
3. F(x) is non-decreasing: if x ≤ y, then F(x) ≤ F(y), since {X ≤ x} ⊆ {X ≤ y}.
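Now we will prove that F_X(x) is right continuous for all x. Let a_n be a
sequence of real numbers with a_n ↓ a_0, and consider the sets
A_n = {X ∈ (−∞, a_n]}, for n = 1, 2, . . . . These sets are decreasing, and
\[
\bigcap_{n=1}^{\infty} A_n = A_0 = \{X \in (-\infty, a_0]\}.
\]
Further, by the continuity of probability along decreasing sequences of events,
\[
\lim_{n \to \infty} F(a_n) = \lim_{n \to \infty} P(A_n) = P\left(\bigcap_{n=1}^{\infty} A_n\right) = P(A_0) = F(a_0).
\]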
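On the other hand, let b_n be a sequence such that b_n ↑ a_0 with b_n < a_0
for all n. If B_n = {X ∈ (−∞, b_n]}, then
\[
\bigcup_{n=1}^{\infty} B_n = B_0 = \{X \in (-\infty, a_0)\}.
\]
Hence,
\[
\lim_{n \to \infty} F(b_n) = \lim_{n \to \infty} P(B_n) = P\left(\bigcup_{n=1}^{\infty} B_n\right) = P(B_0) = F(a_0-).
\]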
Therefore,
P (X = a0 ) = P (−∞ < X ≤ a0 ) − P (−∞ < X < a0 ) = F (a0 ) − F (a0 −).
From now on, a function F(x) defined on the whole real line and satisfying the
following properties
1. 0 ≤ F(x) ≤ 1, for all x,
2. lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1,
3. F(x) is non-decreasing,
4. F(x) is right continuous,
will be called a distribution function.
It easily follows from the definition of a distribution function:
P (a < X ≤ b) = F (b) − F (a), P (a ≤ X ≤ b) = F (b) − F (a−)
P (a ≤ X < b) = F (b−) − F (a−), P (a < X < b) = F (b−) − F (a).
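The four identities can be verified on a small discrete distribution, where the left limit F(a−) = P(X < a) is available exactly. The sketch below reuses the three-point PMF on −1, 0, 1 as an assumed test case; the helper names F, F_left and prob are illustrative.

```python
from fractions import Fraction

# Assumed test distribution: P(X=-1)=1/4, P(X=0)=1/2, P(X=1)=1/4.
pmf = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}


def F(x):
    """CDF: P(X <= x)."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))


def F_left(x):
    """Left limit F(x-) = P(X < x)."""
    return sum((p for v, p in pmf.items() if v < x), Fraction(0))


def prob(event):
    """Direct probability of an event given as a predicate on the value of X."""
    return sum((p for v, p in pmf.items() if event(v)), Fraction(0))


a, b = -1, 1
# Each pair holds (value from the CDF formula, value by direct summation).
checks = {
    "a < X <= b": (F(b) - F(a), prob(lambda v: a < v <= b)),
    "a <= X <= b": (F(b) - F_left(a), prob(lambda v: a <= v <= b)),
    "a <= X < b": (F_left(b) - F_left(a), prob(lambda v: a <= v < b)),
    "a < X < b": (F_left(b) - F(a), prob(lambda v: a < v < b)),
}
for name, (from_cdf, direct) in checks.items():
    print(name, from_cdf, direct)
```

Choosing a and b at jump points is deliberate: it is exactly there that the four interval types give different answers.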
Example: Now consider the following function F(x):
\[
F(x) =
\begin{cases}
0 & \text{if } x < 0 \\
1 - \frac{1}{2} e^{-x} & \text{if } 0 \le x < 2 \\
1 & \text{if } x \ge 2
\end{cases}
\]
It is immediate that F (x) satisfies all the above properties, hence, it is a
proper CDF. It can be seen that it has jumps at 0 and 2. Here
\[
P(X = 0) = F(0) - F(0-) = \frac{1}{2}, \quad P(X = 2) = F(2) - F(2-) = \frac{1}{2} e^{-2}.
\]
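The jump sizes can be checked numerically by approximating the left limit F(a−) with F(a − ε) for a small ε. This is only a rough sketch: the value of ε and the helper name jump are assumptions, and the approximation is adequate here because F is smooth between its jumps.

```python
import math


def F(x):
    """The mixed CDF of the example."""
    if x < 0:
        return 0.0
    if x < 2:
        return 1.0 - 0.5 * math.exp(-x)
    return 1.0


def jump(cdf, a, eps=1e-9):
    """Approximate P(X = a) = F(a) - F(a-) using F(a - eps) as the left limit."""
    return cdf(a) - cdf(a - eps)


jump_at_0 = jump(F, 0.0)  # compare with 1/2
jump_at_1 = jump(F, 1.0)  # a continuity point, so this should be close to 0
jump_at_2 = jump(F, 2.0)  # compare with 0.5 * exp(-2)
print(jump_at_0, jump_at_1, jump_at_2, 0.5 * math.exp(-2.0))
```

At a point where F is continuous, such as x = 1, the computed jump is essentially zero, which matches P(X = 1) = 0.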
It is clear that F(x) is neither a step function nor a continuous function.
This kind of CDF is called a mixture distribution. We would like to write
F(x) in the following form:
\[
F(x) = \alpha F_c(x) + (1 - \alpha) F_d(x).
\]
Here F_c(x) is a proper continuous CDF, F_d(x) is a proper discrete CDF
and α is the mixing proportion. Let g(x) = d/dx F(x), wherever the derivative
exists. Therefore,
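\[
g(x) =
\begin{cases}
0 & \text{if } x < 0 \\
\frac{1}{2} e^{-x} & \text{if } 0 \le x < 2 \\
0 & \text{if } x \ge 2
\end{cases}
\]
Therefore, αF_c(x) = \int_{-\infty}^{x} g(u)\,du, and
\[
\alpha F_c(x) =
\begin{cases}
0 & \text{if } x < 0 \\
\frac{1}{2}\left(1 - e^{-x}\right) & \text{if } 0 \le x < 2 \\
\frac{1}{2}\left(1 - e^{-2}\right) & \text{if } x \ge 2.
\end{cases}
\]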
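Hence,
\[
\alpha = \frac{1}{2}\left(1 - e^{-2}\right), \quad (1 - \alpha) = \frac{1}{2}\left(1 + e^{-2}\right)
\]
and
\[
F_c(x) =
\begin{cases}
0 & \text{if } x < 0 \\
\dfrac{1 - e^{-x}}{1 - e^{-2}} & \text{if } 0 \le x < 2 \\
1 & \text{if } x \ge 2.
\end{cases}
\]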
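Therefore, (1 − α)F_d(x) = F(x) − αF_c(x), and
\[
(1 - \alpha) F_d(x) =
\begin{cases}
0 & \text{if } x < 0 \\
\frac{1}{2} & \text{if } 0 \le x < 2 \\
\frac{1}{2}\left(1 + e^{-2}\right) & \text{if } x \ge 2,
\end{cases}
\]
and
\[
F_d(x) =
\begin{cases}
0 & \text{if } x < 0 \\
\dfrac{1}{1 + e^{-2}} & \text{if } 0 \le x < 2 \\
1 & \text{if } x \ge 2.
\end{cases}
\]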
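The decomposition can be checked numerically by evaluating both sides of F(x) = αF_c(x) + (1 − α)F_d(x) on a grid. The Python sketch below does this on [−1, 4]; the grid and the names ALPHA, F_c, F_d and max_gap are illustrative assumptions.

```python
import math

# Mixing proportion from the example: alpha = (1 - e^{-2}) / 2.
ALPHA = 0.5 * (1.0 - math.exp(-2.0))


def F(x):
    """The mixed CDF of the example."""
    if x < 0:
        return 0.0
    if x < 2:
        return 1.0 - 0.5 * math.exp(-x)
    return 1.0


def F_c(x):
    """Continuous component: an exponential CDF truncated to [0, 2]."""
    if x < 0:
        return 0.0
    if x < 2:
        return (1.0 - math.exp(-x)) / (1.0 - math.exp(-2.0))
    return 1.0


def F_d(x):
    """Discrete component: point masses at 0 and at 2."""
    if x < 0:
        return 0.0
    if x < 2:
        return 1.0 / (1.0 + math.exp(-2.0))
    return 1.0


# Largest discrepancy between F and the mixture over a grid on [-1, 4].
grid = [k / 100.0 for k in range(-100, 401)]
max_gap = max(abs(F(x) - (ALPHA * F_c(x) + (1.0 - ALPHA) * F_d(x))) for x in grid)
print(ALPHA, max_gap)
```

A discrepancy at the level of floating-point rounding indicates that the two sides agree on the grid. The discrete part puts mass 1/(1 + e^{-2}) at 0 and the remaining mass at 2.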