
LECTURE 3: Random Variables and Expectations

Somabha Mukherjee

National University of Singapore

1 / 17
Outline

1 Random Variables

2 Joint Distribution and Independence

3 Expectation and Variance

What are Random Variables?
The outcomes of a random experiment are often quantified/summarized by
random variables. A random variable assigns values to each outcome of a
random experiment

Mathematically speaking, a random variable is a function

X : Ω → R (the set of all real numbers)

satisfying the condition that the sets {X ≤ x} := {ω ∈ Ω : X (ω) ≤ x} are


events for all x ∈ R.

Example: You roll a die twice; your sample space is:

Ω := {(i, j) : 1 ≤ i ≤ 6, 1 ≤ j ≤ 6} .

Now, define X := sum of the two faces that you observe. This is a random
variable, since
X (i, j) = i + j .
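As a sanity check (not part of the lecture), the distribution of X can be computed by enumerating the 36 outcomes of Ω; a small Python sketch:

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# Sample space: all 36 ordered pairs (i, j) from two die rolls.
omega = list(product(range(1, 7), repeat=2))

# The random variable X assigns to each outcome the sum of the two faces.
X = {outcome: outcome[0] + outcome[1] for outcome in omega}

# Its distribution: P(X = k) for each attainable value k.
counts = Counter(X.values())
pmf = {k: Fraction(c, 36) for k, c in counts.items()}

print(pmf[7])  # P(X = 7) = 6/36 = 1/6
```

Using exact fractions avoids floating-point noise when checking that the probabilities sum to 1.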
Distribution Functions

The distribution function of a random variable X is defined as:

FX (t) = P(X ≤ t) .

For example, if a random variable X takes values 1, 2 and 3 with probabilities 1/2, 1/3 and 1/6 respectively, then its distribution function is given by:

FX(t) = 0    if t < 1,
        1/2  if 1 ≤ t < 2,
        5/6  if 2 ≤ t < 3,
        1    if t ≥ 3.

In general, if a random variable X takes values x1 < x2 < x3 < . . . with


probabilities p1 , p2 , p3 , . . . respectively, then we have:

FX (t) = p1 + p2 + . . . + pk if xk ≤ t < xk+1 .
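The step-function form of FX can be checked numerically for the three-point example above (a small sketch, not part of the lecture):

```python
from fractions import Fraction

# X takes values 1, 2, 3 with probabilities 1/2, 1/3, 1/6.
values = [1, 2, 3]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

def F(t):
    # FX(t) = sum of p_k over all values x_k <= t
    # (a right-continuous, non-decreasing step function).
    return sum(p for x, p in zip(values, probs) if x <= t)

print(F(0), F(1), F(2.5), F(100))  # 0 1/2 5/6 1
```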

Properties of Distribution Functions

The distribution function F of any random variable X satisfies the following


four properties:
1 F is non-decreasing, i.e. F(x) ≤ F(y) whenever x ≤ y.
2 F is right-continuous, i.e. lim_{x→t+} F(x) = F(t).
3 F may not be left-continuous, but lim_{x→t−} F(x) exists for all t ∈ R, and equals P(X < t).
4 lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1.

Conversely, if a function F : R → R satisfies conditions 1,2 and 4 above, then


it can be expressed as the distribution function of some random variable X .

Discrete Distributions
A random variable X is said to have a discrete distribution, if there exists a
countable set C such that P(X ∈ C ) = 1.

Examples of Discrete Distributions:

1 Binomial Distribution: P(X = k) = C(n, k) p^k (1 − p)^{n−k} (0 ≤ k ≤ n)
Number of successes in n independent trials of an experiment, with success probability p.
2 Poisson Distribution: P(X = k) = e^{−λ} λ^k / k! (k ∈ {0, 1, 2, . . .})
Limiting form of the Binomial distribution with p = λ/n.
3 Geometric Distribution: P(X = k) = p(1 − p)^{k−1} (k ∈ {1, 2, . . .})
Number of independent trials needed to get the first success (with success probability p).
4 Negative Binomial Distribution: P(X = k) = C(k−1, r−1) p^r (1 − p)^{k−r} (k ≥ r)
Number of independent trials needed to get the r-th success (with success probability p).
Continuous Distributions

A random variable X is said to have a continuous distribution, if its


distribution function F is continuous.

A special class of continuous distributions satisfies the property that there exists a non-negative function f such that the distribution function can be written as:

F(t) = ∫_{−∞}^{t} f(x) dx.

In this case, the distribution is said to be absolutely continuous, and the


function f is called the density function of the distribution.

A Word of Caution!
There are continuous distributions that are not absolutely continuous. However, such examples are somewhat involved and beyond the scope of this module. That is why some textbooks simply use the name continuous distributions for absolutely continuous distributions, which is technically incorrect.

Examples of Absolutely Continuous Distributions

1 Uniform Distribution: f(t) = 1/(b − a) (a < t < b)
Assigns equal probability to sets of equal size.
2 Normal Distribution: f(t) = (2πσ²)^{−1/2} e^{−(t−µ)²/2σ²} (t ∈ R)
Used to model the weights and heights of individuals in a population.
3 Exponential Distribution: f(t) = λe^{−λt} (t > 0)
Used to model the lifetimes of electric bulbs and radioactive particles.
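For the exponential density, F(t) = ∫_0^t λe^{−λx} dx has the closed form 1 − e^{−λt}, which a numerical integration can confirm (a sketch; the rate λ = 1.5 is an arbitrary choice for illustration):

```python
from math import exp

lam = 1.5  # assumed rate, chosen only for this example

def f(x):
    # Exponential density with rate lam.
    return lam * exp(-lam * x)

def F_numeric(t, steps=100_000):
    # Midpoint Riemann-sum approximation of F(t) = integral of f from 0 to t.
    h = t / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h

t = 2.0
print(F_numeric(t), 1 - exp(-lam * t))  # the two should nearly agree
```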

Outline

1 Random Variables

2 Joint Distribution and Independence

3 Expectation and Variance

Joint Distribution of Random Variables

The joint distribution function of random variables X1 , X2 , . . . , Xn is defined


as:
FX1 ,...,Xn (t1 , . . . , tn ) = P (X1 ≤ t1 , . . . , Xn ≤ tn ).
Random variables X1 , . . . , Xn are said to be independent, if

FX1 ,...,Xn (t1 , . . . , tn ) = FX1 (t1 ) . . . FXn (tn ) .

That is, “the joint distribution function is a product of the marginal


distribution functions”.
Consequence: For independent random variables X1 , . . . , Xn and subsets
A1 , . . . , An of R,

P(X1 ∈ A1 , . . . , Xn ∈ An ) = P(X1 ∈ A1 ) . . . P(Xn ∈ An ) .

EXERCISE!!
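The product rule for independent random variables can be verified by brute force on the two-dice sample space; a small sketch with two hypothetical events (first roll even, second roll at most 2):

```python
from itertools import product
from fractions import Fraction

# Two fair dice: each of the 36 outcomes has probability 1/36.
omega = list(product(range(1, 7), repeat=2))
n = len(omega)

def P(event):
    # Probability of an event, given as a predicate on outcomes.
    return Fraction(sum(1 for w in omega if event(w)), n)

A = lambda w: w[0] % 2 == 0  # first roll is even (X1 in {2, 4, 6})
B = lambda w: w[1] <= 2      # second roll is at most 2 (X2 in {1, 2})

both = P(lambda w: A(w) and B(w))
print(both, P(A) * P(B))  # equal, since the two rolls are independent
```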

Outline

1 Random Variables

2 Joint Distribution and Independence

3 Expectation and Variance

Expected Value of a Random Variable
The expected value of a random variable is a measure of the (weighted)
average of all the values it can take.

We will mathematically define the expected values of a discrete and an


absolutely continuous random variable.

Let X be a discrete random variable taking values in the countable set C. Then,

E(X) := Σ_{k∈C} k P(X = k),

provided the above series is well-defined.

Let X be an absolutely continuous random variable with density function f. Then,

E(X) := ∫_{−∞}^{∞} x f(x) dx,

provided the above integral exists.

Note that E(X ) depends only on the distribution of X and not on the exact
function X .
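The discrete formula can be applied directly to the earlier two-dice example, X = sum of the two faces (a sketch, not part of the lecture):

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# X = sum of two fair dice; E(X) = sum over k of k * P(X = k).
counts = Counter(i + j for i, j in product(range(1, 7), repeat=2))
EX = sum(Fraction(k * c, 36) for k, c in counts.items())
print(EX)  # 7
```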
Some Properties of Expectation

1 E(c) = c for a non-random constant c.

2 If X ≥ Y , then E(X ) ≥ E(Y ).

3 If X ≥ Y and EX = EY , then X = Y with probability 1.

4 If EX < ∞, then X < ∞ with probability 1.

5 Linearity: For constants a, b and random variables X, Y, we have:

E(aX + bY) = aE(X) + bE(Y).

6 If g : R → R is any function, then

E(g(X)) = Σ_j g(j) P(X = j) if X is discrete,
E(g(X)) = ∫_{−∞}^{∞} g(t) f(t) dt if X is absolutely continuous with density f.

7 If X and Y are independent random variables, then E(XY ) = E(X ) × E(Y ).
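Linearity (property 5) and the product rule for independent variables (property 7) can both be checked exactly on the two-dice sample space; a small sketch:

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)  # each outcome equally likely

# Expectation of any function of the outcome, computed by direct summation.
E = lambda g: sum(g(w) * p for w in omega)

X = lambda w: w[0]  # first roll
Y = lambda w: w[1]  # second roll

# Linearity: E(2X + 3Y) = 2 E(X) + 3 E(Y)
print(E(lambda w: 2 * X(w) + 3 * Y(w)), 2 * E(X) + 3 * E(Y))
# Independence: E(XY) = E(X) E(Y)
print(E(lambda w: X(w) * Y(w)), E(X) * E(Y))
```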

Indicator Functions and Examples
For an event A, we define the indicator of the event A as:

1_A(ω) = 1 if ω ∈ A, and 0 if ω ∉ A.

E(1_A) = (1 × P(1_A = 1)) + (0 × P(1_A = 0)) = P(A).

A Funny Example: Two monkeys are independently typing random letters on


two separate typewriters. Suppose that someone stops them when they have
completed typing exactly 100 letters. What is the expected number of places
where their letters match?

For each 1 ≤ i ≤ 100, let A_i denote the event that the letters at the i-th place match. Then, P(A_i) = 26/26² = 1/26.

The expected number of places where their letters match is given by:

E(Σ_{i=1}^{100} 1_{A_i}) = Σ_{i=1}^{100} E(1_{A_i}) = 100/26.
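A quick Monte Carlo simulation agrees with the indicator computation (a sketch, assuming a 26-letter lowercase alphabet):

```python
import random
import string

random.seed(0)  # fixed seed so the simulation is reproducible

def matches():
    # Each monkey independently types 100 uniformly random lowercase letters.
    a = random.choices(string.ascii_lowercase, k=100)
    b = random.choices(string.ascii_lowercase, k=100)
    return sum(x == y for x, y in zip(a, b))

trials = 20_000
avg = sum(matches() for _ in range(trials)) / trials
print(avg, 100 / 26)  # simulated mean vs. exact expectation 100/26 ≈ 3.846
```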

Variance and Higher Moments

Variance, or more precisely its square root (known as the standard deviation), is a measure of the spread of a random variable around its expectation.

Mathematically speaking, the variance of a random variable X is defined as:

Var(X) := E[(X − E(X))²].

An alternative expression of the variance is given by:

Var(X) = E(X²) − [E(X)]² (EXERCISE!!).

Similarly, the r-th raw moment of X is defined as E(X^r), and the r-th central moment is defined as:

E[(X − E(X))^r].
Expectation is the first raw moment and variance is the second central
moment.
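The two expressions for the variance can be checked exactly for a single fair die roll (a sketch, not part of the lecture):

```python
from fractions import Fraction

# X = one fair die roll: values 1..6, each with probability 1/6.
values = range(1, 7)
p = Fraction(1, 6)

EX = sum(k * p for k in values)
EX2 = sum(k * k * p for k in values)

var_direct = sum((k - EX) ** 2 * p for k in values)  # E[(X - EX)^2]
var_short = EX2 - EX**2                              # E(X^2) - [E(X)]^2
print(var_direct, var_short)  # both 35/12
```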

Covariance

Covariance is a measure of the relationship between two random variables X and Y.

Mathematically speaking, the covariance between random variables X and Y


is defined as:

Cov(X , Y ) := E [(X − E(X ))(Y − E(Y ))] .

An alternative expression of the covariance is given by:

Cov(X, Y) = E(XY) − E(X)E(Y) (EXERCISE!!).

Linearity: Covariance is bilinear (EXERCISE!!)

Cov(aX + bY, cW + dZ) = ac Cov(X, W) + ad Cov(X, Z) + bc Cov(Y, W) + bd Cov(Y, Z).

Var(X ) = Cov(X , X ).
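Bilinearity and the identity Var(X) = Cov(X, X) can be verified exactly on the two-dice sample space; a small sketch (the coefficients 2 and 3 and the choice S = X + Y are arbitrary):

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)  # each outcome equally likely

E = lambda g: sum(g(w) * p for w in omega)
Cov = lambda g, h: E(lambda w: g(w) * h(w)) - E(g) * E(h)

X = lambda w: w[0]         # first roll
Y = lambda w: w[1]         # second roll
S = lambda w: w[0] + w[1]  # their sum

# Bilinearity check: Cov(2X + 3Y, S) = 2 Cov(X, S) + 3 Cov(Y, S)
lhs = Cov(lambda w: 2 * X(w) + 3 * Y(w), S)
rhs = 2 * Cov(X, S) + 3 * Cov(Y, S)
print(lhs, rhs)
# Var(X) = Cov(X, X)
print(Cov(X, X))  # 35/12
```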

Some Properties of Variance and Covariance
Var(aX + bY ) = a2 Var(X ) + b 2 Var(Y ) + 2ab Cov(X , Y ).

A Generalization: For n random variables X_1, . . . , X_n,

Var(Σ_{i=1}^{n} a_i X_i) = Σ_{i=1}^{n} a_i² Var(X_i) + 2 Σ_{i<j} a_i a_j Cov(X_i, X_j).

For INDEPENDENT random variables X1 , . . . , Xn , we have:

Var(X1 + . . . + Xn ) = Var(X1 ) + . . . + Var(Xn ).

Remark:
Even for random variables X_1, . . . , X_n which are not independent, but which satisfy Cov(X_i, X_j) = 0 for all i ̸= j, we have:

Var(X1 + . . . + Xn ) = Var(X1 ) + . . . + Var(Xn ).

Can you find examples of such random variables for n = 2? EXERCISE!!
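Once a candidate pair is found, zero covariance is easy to verify by direct summation. The sketch below uses one such candidate (X uniform on {−1, 0, 1} and Y = X², an assumption of this example, not taken from the slides) and only checks the covariance; proving dependence is left with the exercise:

```python
from fractions import Fraction

# Candidate pair: X uniform on {-1, 0, 1} and Y = X^2.
# Y is a function of X (so they are dependent), yet Cov(X, Y) = 0.
support = [(-1, 1), (0, 0), (1, 1)]  # (x, y) pairs, each with probability 1/3
p = Fraction(1, 3)

EX = sum(x * p for x, _ in support)
EY = sum(y * p for _, y in support)
EXY = sum(x * y * p for x, y in support)

print(EXY - EX * EY)  # Cov(X, Y) = 0
```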


