Chapter 2 - Random Variables and Distributions

Chapter 2 of 'Advanced Statistics: Theory and Methods' by Nandini Kannan covers random variables, their distributions, and key concepts such as cumulative distribution functions, probability mass functions, and probability density functions. It explains the expected value and variance for both discrete and continuous random variables, along with moments, skewness, and kurtosis. The chapter also introduces discrete population models, specifically Bernoulli and Binomial distributions, detailing their properties and probability mass functions.


Advanced Statistics : Theory and Methods

Chapter 2 - Random Variables and Distributions

Nandini Kannan

Plaksha University, Spring Semester AY 2024-25

Nandini Kannan (Plaksha University) Advanced Statistics : Theory and Methods January 22, 2025 1 / 31
Random Variable

Let S denote the sample space of the experiment.


Definition: A Random Variable (rv) X is a mapping from the sample
space S to the set of real numbers.


Cumulative Distribution Function of a random variable

Definition: The cumulative distribution function (cdf) F_X(x) of the random variable X is

F_X(x) = P(X ≤ x), −∞ < x < ∞.

The function F_X(x) satisfies the following three conditions:

1. lim_{x→−∞} F_X(x) = 0 and lim_{x→∞} F_X(x) = 1.
2. F_X(x) is a nondecreasing function of x; i.e., for all a, b ∈ R with a < b, we have F_X(a) ≤ F_X(b).
3. F_X(x) is right continuous; i.e., lim_{x↓x₀} F_X(x) = F_X(x₀).


Discrete Random Variable

A random variable is said to be discrete if it can assume only finitely or countably many values.


Probability Mass Function

The probability distribution or probability mass function (pmf) of a discrete random variable is a formula, table, or graph that associates a probability with each value of the random variable. The probability distribution is denoted by p(x):

p(x) = P(X = x).


Properties of the PMF

The properties of the probability mass function are:

1. p(x) ≥ 0 for all x.
2. Σ_x p(x) = 1.

The cumulative distribution function of a discrete random variable is a step function with jumps of p(x) at each point x in the support of X.
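The two properties, and the step-function shape of the discrete cdf, can be checked exactly for a concrete pmf. The sketch below uses a fair six-sided die (our choice of example, not from the slides) with exact rational arithmetic:

```python
from fractions import Fraction

# pmf of a fair six-sided die: p(x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Property 1: p(x) >= 0 for all x
assert all(p >= 0 for p in pmf.values())

# Property 2: the probabilities sum to 1
assert sum(pmf.values()) == 1

# The cdf of a discrete rv is a step function: F(x) jumps
# by p(x) at each support point x
def cdf(x):
    return sum(p for k, p in pmf.items() if k <= x)

assert cdf(0) == 0              # below the support
assert cdf(3) == Fraction(1, 2)
assert cdf(6) == 1              # whole support covered
```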


Continuous Random Variable

A continuous random variable is one that can assume values in an interval of the real line.


Probability Density Function


For continuous random variables, probabilities are computed from a smooth function called the probability density function (pdf), denoted by f(x).
The probability that the random variable takes values in an interval is simply the area under f(x) between the endpoints of the specified interval. Therefore the probability that the random variable X lies in the interval [a, b] is given by

P(a ≤ X ≤ b) = ∫_a^b f(x) dx.

With this definition of probabilities for a continuous random variable, observe that

P(X = k) = ∫_k^k f(x) dx = 0.
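To make the area interpretation concrete, here is a small numerical sketch (our own, using the density f(x) = (1/2)e^{−x/2} that reappears in the mgf example later in the chapter). A midpoint Riemann sum approximates P(a ≤ X ≤ b), and an interval of width zero gives probability 0:

```python
import math

# pdf from the chapter's later mgf example: f(x) = (1/2) e^{-x/2}, x > 0
def f(x):
    return 0.5 * math.exp(-x / 2)

def prob(a, b, n=100_000):
    """Midpoint Riemann approximation of the area under f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# P(1 <= X <= 3) = e^{-1/2} - e^{-3/2} by direct integration
exact = math.exp(-0.5) - math.exp(-1.5)
assert abs(prob(1, 3) - exact) < 1e-8

# P(X = k): an interval of width zero has zero area
assert prob(2, 2) == 0.0
```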

Properties of PDF

The pdf must satisfy two properties:

1. f(x) ≥ 0 for all x.
2. ∫_{−∞}^{∞} f(x) dx = 1.


Cumulative Distribution function

Definition: The cumulative distribution function F(x) of a continuous random variable X with pdf f(x) is

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt, −∞ < x < ∞.

We have from the previous definition that

f(x) = dF(x)/dx = F′(x).


Expected value or mean of a Discrete Random Variable

Definition: Let X be a discrete random variable with probability mass function p(x). The expected value or mean of X is given by

µ = E(X) = Σ_x x p(x),

provided that Σ_x |x| p(x) < ∞. If the sum diverges, the expectation is undefined.
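A minimal check of the definition, again with a fair six-sided die as our example, where E(X) = (1 + 2 + … + 6)/6 = 7/2:

```python
from fractions import Fraction

# E(X) = sum_x x p(x) for a fair six-sided die
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())

assert mu == Fraction(7, 2)   # the familiar 3.5
```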


Expected value or mean of a Continuous random variable

Definition: Let X be a continuous random variable with probability density function f(x). The expected value or mean of X is given by

µ = E(X) = ∫_{−∞}^{∞} x f(x) dx,

provided that ∫ |x| f(x) dx < ∞. If the integral diverges, the expectation is undefined.


Theorem
Let X be a random variable. The expected value of the random variable g(X) is

µ_{g(X)} = E[g(X)] = Σ_x g(x) p(x)

if X is discrete, and

µ_{g(X)} = E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx

if X is continuous.


Variance of a Discrete Random Variable

Definition: Let X be a discrete random variable with probability mass function p(x) and mean µ. The variance σ² of X is

σ² = E(X − µ)² = Σ_x (x − µ)² p(x).


Variance of a Continuous Random Variable

Definition: Let X be a continuous random variable with probability density function f(x) and mean µ. The variance σ² of X is

σ² = E(X − µ)² = ∫_{−∞}^{∞} (x − µ)² f(x) dx.

The positive square root of the variance is called the standard deviation.


Theorem
The variance of a random variable X is

σ² = E(X²) − µ².
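The shortcut can be confirmed against the defining sum for any discrete pmf; a fair die (our example) gives σ² = 91/6 − (7/2)² = 35/12 both ways:

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}   # fair die
mu = sum(x * p for x, p in pmf.items())          # 7/2

# definition: sigma^2 = E[(X - mu)^2]
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())

# theorem: sigma^2 = E(X^2) - mu^2
var_thm = sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2

assert var_def == var_thm == Fraction(35, 12)
```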

Moments

k-th moment

Definition: The k-th (raw) moment of a random variable X is

µ′_k = E(X^k).


k−th central moment

The k-th central moment is

µ_k = E[(X − µ)^k],

where µ = E(X).

The first moment of a random variable is its mean, while the first central moment is 0.
The second central moment is the variance of X.
The third central moment µ₃ measures the asymmetry of the distribution of X about its mean.


Measure of Skewness

The dimension-free measure of skewness is given by

ν₁ = µ₃ / σ³.

The index is zero when the distribution is symmetric about the mean. Negative values are associated with distributions skewed to the left, whereas ν₁ tends to be positive when the distribution of X is skewed to the right.


Measure of Kurtosis

The fourth central moment µ₄ provides an indication of the "peakedness" or "kurtosis" of a distribution. A dimension-free measure of kurtosis is

ν₂ = µ₄ / σ⁴.

The peakedness of a distribution is compared to the Gaussian distribution, which has ν₂ = 3. A distribution is "less peaked" (platykurtic) if ν₂ < 3 and "more peaked" (leptokurtic) if ν₂ > 3.
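Both indices can be computed from the central moments for any distribution. The sketch below uses a Bernoulli(p) pmf with p = 1/4 (our choice of example) and checks ν₁ against the known closed form (1 − 2p)/√(p(1 − p)):

```python
import math

# Bernoulli(p) as a simple asymmetric example; p = 1/4 is our choice
p = 0.25
pmf = {0: 1 - p, 1: p}

mu = sum(x * q for x, q in pmf.items())

def central_moment(k):
    return sum((x - mu) ** k * q for x, q in pmf.items())

sigma = math.sqrt(central_moment(2))

nu1 = central_moment(3) / sigma ** 3   # skewness
nu2 = central_moment(4) / sigma ** 4   # kurtosis

# p < 1/2, so the distribution is skewed to the right: nu1 > 0
assert nu1 > 0
assert abs(nu1 - (1 - 2 * p) / math.sqrt(p * (1 - p))) < 1e-12
```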


Moment Generating Function

Definition: Let X be a random variable with cdf F_X(·). The moment generating function (mgf) of the random variable X, denoted by M_X(t), is defined as

M_X(t) = E(e^{tX}),    (1)

provided the expectation exists in some neighbourhood of 0. We have

M_X(t) = Σ_x e^{tx} p(x), if X is discrete;
M_X(t) = ∫_{−∞}^{∞} e^{tx} f(x) dx, if X is continuous.


Example

Example: Let

f_X(x) = (1/2) e^{−x/2},  x > 0.

We have

M_X(t) = (1/2) ∫_0^∞ e^{tx} e^{−x/2} dx
       = (1/2) ∫_0^∞ e^{(t − 1/2)x} dx
       = 1/(1 − 2t),  if t < 1/2.

If t ≥ 1/2, the integral is infinite.
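The closed form 1/(1 − 2t) can be sanity-checked numerically. This is our own sketch: a truncated midpoint Riemann sum, where the tail beyond the truncation point is negligible for t well below 1/2:

```python
import math

t = 0.1   # any t < 1/2 works; t = 0.1 is our choice

def integrand(x):
    # e^{tx} f(x) with f(x) = (1/2) e^{-x/2}
    return math.exp(t * x) * 0.5 * math.exp(-x / 2)

# midpoint Riemann sum over [0, 60]; the integrand decays like e^{-0.4x}
n, b = 200_000, 60.0
h = b / n
approx = sum(integrand((i + 0.5) * h) for i in range(n)) * h

assert abs(approx - 1 / (1 - 2 * t)) < 1e-6   # M_X(0.1) = 1.25
```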


Theorem

Theorem
If the mgf M_X(t) of X exists in a neighbourhood of 0, the derivatives of all orders exist at t = 0 and may be obtained by differentiating under the integral (or summation), i.e.

M_X^(n)(0) = (d^n/dt^n) M_X(t) |_{t=0} = E(X^n).


Proof: We have

d/dt M_X(t) = d/dt ∫_{−∞}^{∞} e^{tx} f(x) dx
            = ∫_{−∞}^{∞} (d/dt e^{tx}) f(x) dx
            = E(X e^{tX}).

Setting t = 0 gives

d/dt M_X(t) |_{t=0} = E(X).
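The theorem is easy to check numerically: differentiate an mgf at t = 0 with a central difference and compare with the mean. Our example is a fair six-sided die, whose mgf is M_X(t) = (1/6) Σ_{x=1}^{6} e^{tx}:

```python
import math

# mgf of a fair six-sided die (our example)
def mgf(t):
    return sum(math.exp(t * x) for x in range(1, 7)) / 6

# central-difference approximation of M'_X(0)
h = 1e-6
deriv = (mgf(h) - mgf(-h)) / (2 * h)

assert abs(deriv - 3.5) < 1e-6   # E(X) = 7/2 for a fair die
```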


Remark

Remark: Since

M_X(t) = E(e^{tX}) = E(1 + tX + (t²/2!) X² + · · ·)
       = 1 + t E(X) + (t²/2!) E(X²) + · · ·,

E(X^k) is the coefficient of t^k/k! in the above expansion.

Discrete Population Models



Bernoulli and Binomial Distribution

Definition: Consider an experiment that can result in one of two outcomes. We classify these outcomes as Success and Failure. The probability of Success is denoted by p. Such a trial is called a Bernoulli trial.

Examples
1 Toss a fair coin: Heads and Tails
2 Testing a blood sample for Absence or Presence of a particular disease
3 Testing items in a factory: Defective or Nondefective


Definition: For any Bernoulli trial, we define the random variable X as follows: if the trial results in a Success, X = 1; otherwise X = 0. This is called the Bernoulli random variable and its pmf is given by:

p(x) = 1 − p, if x = 0;
       p,     if x = 1;
       0,     otherwise;

or, written compactly,

p(x) = p^x (1 − p)^{1−x}, if x = 0, 1;  0 otherwise.

We have µ = E(X) = p and σ² = Var(X) = p(1 − p). Here p is a parameter of the distribution.


Binomial Experiment

Binomial Experiment: A binomial experiment is an experiment that has the following properties:

1. The experiment consists of n identical trials.
2. Each trial can result in one of two possible outcomes. These outcomes will be classified as Success S and Failure F.
3. The probability of success on a single trial is equal to p and remains constant from trial to trial. The probability of failure is then 1 − p, which is denoted by q.
4. The trials are independent.


We are interested in X, the number of successes in the n trials. The random variable X can take values 0, 1, . . . , n. The random variable X is called a Binomial random variable. We usually write

X ∼ Bin(n, p).

We have

p(k) = P(X = k) = (n choose k) p^k q^{n−k},  k = 0, 1, . . . , n,

where

(n choose k) = n! / (k! (n − k)!).

Theorem
If Y ∼ Bin(n, p), then the mgf of Y is

M_Y(t) = (p e^t + q)^n.    (2)

Proof: We have

M_Y(t) = E(e^{tY}) = Σ_{k=0}^{n} e^{tk} (n choose k) p^k q^{n−k}
       = Σ_{k=0}^{n} (n choose k) (p e^t)^k q^{n−k}
       = (p e^t + q)^n.

The last equality follows from the Binomial theorem.
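The identity M_Y(t) = (pe^t + q)^n can be confirmed by evaluating the defining sum directly (the parameter values below are our choices):

```python
import math
from math import comb

n, p, t = 8, 0.3, 0.7   # arbitrary choices
q = 1 - p

# defining sum E(e^{tY})
lhs = sum(math.exp(t * k) * comb(n, k) * p ** k * q ** (n - k)
          for k in range(n + 1))

# closed form from the theorem
rhs = (p * math.exp(t) + q) ** n

assert abs(lhs - rhs) < 1e-9
```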


Mean and Variance of the Binomial Random Variable

Result:
The mean of the Binomial random variable X ∼ Bin(n, p) is given by

µ = E (X ) = n p

The variance of the binomial random variable is

σ 2 = n p (1 − p)
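Both formulas follow from the pmf and can be verified exactly with rational arithmetic (the values n = 12, p = 1/3 are our choices):

```python
from fractions import Fraction
from math import comb

n, p = 12, Fraction(1, 3)

def pmf(k):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

mu = sum(k * pmf(k) for k in range(n + 1))
var = sum(k ** 2 * pmf(k) for k in range(n + 1)) - mu ** 2

assert mu == n * p                 # E(X)   = 4
assert var == n * p * (1 - p)      # Var(X) = 8/3
```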

