Chapter 3 Probability Distributions
Lecture 6:
Probability Distributions
[Diagram: random variables are classified into discrete random variables and continuous random variables.]
3.1 Random Variables
A random variable is a variable whose value is the numerical outcome of a
random experiment. If characteristics of randomly selected subjects, such as
sex, race, or site of infection, are recorded, they are random variables. In
short, a random variable is the outcome of an experiment expressed as a
numerical value.
For example, toss a coin twice and observe the face values. The
possible outcomes "HH, HT, TH, and TT" are non-numerical in
nature. However, we can define a random variable, say X. If X
represents the number of times heads are obtained in two tosses, X
will assume the value 2 corresponding to the outcome HH, 1
corresponding to the outcomes HT or TH, or 0 corresponding to
the outcome TT.
So, Rx={0,1,2}
◦ Outcomes: HH  HT  TH  TT
◦ X = x:     2   1   1   0
Random Variables Distributions
A fair coin is tossed twice, S={ HH, HT, TH, TT}, X number
of times head is obtained, Range = {2, 1, 0}. The probability
distribution is given by
P(2) = P(X = 2) = P{HH} = 1/4
P(1) = P(X = 1) = P{HT, TH} = 2/4
P(0) = P(X = 0) = P{TT} = 1/4
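A quick sketch of this construction in Python, enumerating the sample space of two fair-coin tosses and counting heads:

```python
from itertools import product
from collections import Counter

# Enumerate the sample space of two fair-coin tosses and count heads in each outcome.
sample_space = list(product("HT", repeat=2))   # ('H','H'), ('H','T'), ('T','H'), ('T','T')
counts = Counter(outcome.count("H") for outcome in sample_space)
pmf = {x: n / len(sample_space) for x, n in counts.items()}
# pmf is {2: 0.25, 1: 0.5, 0: 0.25}, matching P(2), P(1), P(0) above
```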
3.2 Probability Distributions
Probability distributions are classified as discrete or continuous:
◦ Discrete probability distributions: Binomial, Poisson, Hypergeometric
◦ Continuous probability distributions: Normal, Uniform, Exponential
Types of probability distributions
Bernoulli Trials
Each trial has 2 outcomes: B = 1 if success and B = 0 if failure.
Binomial Probabilities Example
Find the probabilities for n = 6, p = 0.2
p(0) = 0.262144
p(1) = 0.393216
p(2) = 0.245760
p(3) = 0.081920
p(4) = 0.015360
p(5) = 0.001536
p(6) = 0.000064
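These values can be reproduced with the binomial pmf, C(n, x) p^x q^(n−x); a minimal Python sketch:

```python
from math import comb

def binom_pmf(x, n, p):
    # P(X = x) = C(n, x) p^x (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

probs = {x: binom_pmf(x, 6, 0.2) for x in range(7)}
# probs[0] ≈ 0.262144, probs[1] ≈ 0.393216, probs[2] ≈ 0.24576, ...
```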
Binomial Distribution
Binomial probability can be applied under the following conditions:
1. Each trial has two possible outcomes (success or failure).
2. The outcome of any trial is independent of the outcomes of any other
trial. Example: in a coin-tossing experiment, the result of the first toss has
no effect on the result of any other toss.
3. The probability of success or failure in any trial is constant.
4. The number of trials should be finite.
5. The events are discrete events.
Applications:
1. To find the likelihood of a Yes-or-No outcome.
2. To find defective or good products manufactured in a factory.
3. To find positive and negative reviews on a product.
Binomial Distribution
The mean of the binomial distribution is the average number of successes that
would be obtained in n trials. The mean of the binomial distribution is also
called the binomial distribution expectation. The formula for the binomial
expectation is given as:
μ = n.p
Here, μ is the mean or expectation, n is the total number of trials, p is the probability
of success in each trial.
Example: If we toss a coin 20 times and getting head is the success then what is the
mean of success?
Solution:
Total Number of Trials n = 20
Probability of getting head in each trial, p = 1/2 = 0.5
Mean = n.p = 20 ⨯ 0.5 = 10
It means on average we would get heads 10 times on tossing a coin 20 times.
Variance of Binomial Distribution tells about the dispersion or spread of the
distribution. It is given by the product of the number of trials (n), probability of
success (p), and probability of failure (q). The Variance is given as follows:
σ2 = n.p.q
Binomial Distribution
Example: If we toss a coin 20 times and getting head is the success
then what is the variance of the distribution?
Solution: Here, n = 20
Probability of Success in each trial (p) = 0.5
Probability of Failure in each trial (q) = 0.5
Variance of the binomial distribution, σ² = n.p.q = (20 ⨯ 0.5 ⨯ 0.5) = 5
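A one-line check of both formulas for the coin example above (n = 20, p = 0.5):

```python
# Mean and variance of a binomial distribution: μ = np, σ² = npq.
n, p = 20, 0.5
q = 1 - p
mean = n * p          # 10.0 successes on average
variance = n * p * q  # 5.0
```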
Probability function of Binomial distribution
n | Binomial expression | Expanded binomial expression
1 | (p + q)^1 | p + q
2 | (p + q)^2 | p^2 + 2pq + q^2
3 | (p + q)^3 | p^3 + 3p^2q + 3pq^2 + q^3
4 | (p + q)^4 | p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4
5 | (p + q)^5 | p^5 + 5p^4q + 10p^3q^2 + 10p^2q^3 + 5pq^4 + q^5
6 | (p + q)^6 | p^6 + 6p^5q + 15p^4q^2 + 20p^3q^3 + 15p^2q^4 + 6pq^5 + q^6
Permutation is used to find the number of ways to pick r things out of n
different things in a specific order without replacement:
nPr = n! / (n − r)!
Permutation and Combination
Permutation is used to find the number of ways to pick r things out of n
different things in a specific order without replacement. Permute 1, 2 and 3
taking 2 at a time: (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), and (3, 2), i.e., in 6 ways.
Here, (1, 2) and (2, 1) are distinct.
nPr = n! / (n − r)!
Permute 1, 2 and 3 taking 3 numerals at a time: (1, 2, 3), (1, 3, 2), (2, 1, 3), (2,
3, 1), (3, 1, 2) and (3, 2, 1), i.e., in 6 ways.
n distinct things can be arranged taking r (r < n) at a time in
n(n − 1)(n − 2)...(n − r + 1) ways.
Combination is used to choose ‘r’ components out of a total number of ‘n’
components.
nCr = nC(n−r) = n! / (r!(n − r)!)
Using three numbers 1, 2, and 3 create sets with two numbers, then
combinations are: (1, 2), (1, 3), and (2, 3). Here, (1, 2) and (2, 1) are identical.
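Python's math module provides both counts directly; a small sketch reproducing the 1, 2, 3 examples above:

```python
from math import perm, comb
from itertools import permutations, combinations

items = [1, 2, 3]
# Ordered selections: (1, 2) and (2, 1) are distinct.
assert perm(3, 2) == len(list(permutations(items, 2))) == 6
# Unordered selections: (1, 2) and (2, 1) are identical.
assert comb(3, 2) == len(list(combinations(items, 2))) == 3
```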
Permutation and Combination
Problem: The prevalence of diabetes in a community is 1 case in 10 persons. If 4
children are born, what will be the probability of occurrence of diabetes in the
following combinations?
i. All the 4 normal
ii.1 diabetic and 3 normal
iii. 2 diabetic and 2 normal
iv. 3 diabetic and 1 normal
v. All the 4 children born diabetic
Solution: Let X be a random variable which represents the number of diabetic cases.
Here, n = 4, p(occurrence of diabetes) = p = 1/10 and q(normal) = 9/10.
The binomial expression is given as
(p + q)^4 = p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4
i. P(all 4 normal) = P(X = 0) = q^4 = (9/10)^4 = 0.6561
ii. P(1 diabetic and 3 normal) = P(X = 1) = 4pq^3 = 4(1/10)(9/10)^3 = 0.2916
iii. P(2 diabetic and 2 normal) = P(X = 2) = 6p^2q^2 = 6(1/10)^2(9/10)^2 = 0.0486
iv. P(3 diabetic and 1 normal) = P(X = 3) = 4p^3q = 4(1/10)^3(9/10) = 0.0036
v. P(all 4 children born diabetic) = P(X = 4) = p^4 = (1/10)^4 = 0.0001
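The five probabilities can be checked with the binomial pmf; a short sketch (n = 4, p = 1/10):

```python
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

# n = 4 children, p = 1/10 chance of diabetes for each.
probs = [binom_pmf(x, 4, 0.1) for x in range(5)]
# ≈ [0.6561, 0.2916, 0.0486, 0.0036, 0.0001]; the five cases sum to 1
```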
Permutation and Combination
Problem: What is the probability of having i) 2 male and (ii) 3 male children in
the family of 4 children?
Solution
By applying the binomial theorem, the probability of having two male and two
female children in a family of four children can be calculated. Here,
p = P(male) = 1/2 and q = P(female) = 1/2.
According to the binomial expansion,
(p + q)^4 = p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4
i. The probability of having two males and two females = 6p^2q^2 =
6(1/2)^2(1/2)^2 = 0.375
ii. The probability of having 3 males and 1 female = P(X = 3) = 4p^3q =
4(1/2)^3(1/2) = 0.25
Permutation and Combination
Alternative method: According to the formula of binomial distribution
P(X = x) = nCx p^x q^(n−x)
Let X be a random variable which represents number of
male birth, so n= 4 and x = 0, 1, 2, 3, 4
i) P(X = 2) = 2 male births = C(4,2) (1/2)^2 (1/2)^2 = 6(1/2)^2(1/2)^2 = 0.375
ii) P(X = 3 male births) = C(4,3) (1/2)^3 (1/2)^(4−3) = 4(1/2)^3(1/2) = 0.25
Example 3: A pair of dice is thrown 6 times and getting a sum of 5 is a success. What is
the probability of getting (i) no successes (ii) two successes (iii) at most two successes?
Solution: Here, n = 6. A sum of 5 can be obtained in 4 ways: (1, 4), (4, 1), (2, 3), (3, 2).
Probability of getting the sum 5 in each trial, p = 4/36 = 1/9
Probability of not getting sum 5 = 1 - 1/9 = 8/9
(i) Probability of getting no successes, P(X = 0) = 6C0 (1/9)^0 (8/9)^6 = (8/9)^6
(ii) Probability of getting two successes, P(X = 2) = 6C2 (1/9)^2 (8/9)^4 = 15(8^4/9^6)
(iii) Probability of getting at most two successes, P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
⇒ P(X ≤ 2) = (8/9)^6 + 6(8^5/9^6) + 15(8^4/9^6)
Permutation and Combination
Example: A die is thrown 6 times and if getting an even number is a success
what is the probability of getting
(i) 4 Successes
(ii) No success
Solution: Here, n = 6, p = 3/6 = 1/2, and q = 1 - 1/2 = 1/2
P(X = r) = nCr p^r q^(n−r)
(i) P(X = 4) = 6C4 (1/2)^4 (1/2)^2 = 15/64
(ii) P(X = 0) = 6C0 (1/2)^0 (1/2)^6 = 1/64
Conditions for Binomial Distribution
1. Fixed Number of Trials: There are a set number of trials or experiments (n), such as
flipping a coin 10 times.
2. Two Possible Outcomes: Each trial has only two possible outcomes, often labeled as
"success" and "failure." For example, getting heads or tails in a coin flip.
3. Independent Trials: The outcome of each trial is independent of the others, meaning
the result of one trial does not affect the result of another.
4. Constant Probability: The probability of success (denoted by p) remains the same
for each trial. For example, if you’re flipping a fair coin, the probability of getting
heads is always 0.5.
Random Variables Distributions
Probability Distribution Function, also called the Cumulative
Distribution Function (CDF): the CDF is given by F_X(x) = P(X ≤ x)
= P{ω : X(ω) ≤ x}, which is the probability that X is in
the semi-open interval (−∞, x]. From the three axioms, we know
that 0 ≤ F_X(x) ≤ 1; a particular type of random variable is defined
by the form or shape of F_X(x) as x is varied on R.
The Poisson Probability Distribution
This distribution was developed by the French mathematician S. D.
Poisson in 1837. The distribution has been used extensively in the medical
and engineering fields.
A random variable follows a Poisson distribution when it counts binary
events (much as a binomial variable does). The Poisson distribution arises
when we count the number of events in an interval of time.
The Poisson distribution is used when an event occurs in a given area of
opportunity.
An area of opportunity is a continuous unit or interval of time, volume,
or such area in which more than one occurrence of an event can occur.
◦ The number of scratches in a car’s paint
◦ The number of mosquito bites on a person
◦ The number of computer crashes in a day
The Poisson Distribution
When there is a large number of trials, but a small probability of success,
binomial calculation becomes impractical:
1. The number of trials, N, is very large
2. The probability of success is very small for each trial
Example: If 4% of the total items made by a factory are defective, find the
probability that fewer than 2 items are defective in a sample of 50 items.
Solution:
Here, n = 50, p = 4/100 = 0.04, q = 1 − p = 0.96, λ = np = 2
Using the Poisson distribution,
P(X = 0) = (2^0 e^(−2))/0! = 1/e^2 = 0.13534
P(X = 1) = (2^1 e^(−2))/1! = 2/e^2 = 0.27067
Hence, the probability that fewer than 2 items are defective in a sample of
50 items is given by:
P(X < 2) = P(X = 0) + P(X = 1) = 0.13534 + 0.27067 = 0.40601
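A sketch verifying the Poisson approximation above (λ = np = 2):

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    # P(X = k) = λ^k e^(−λ) / k!
    return lam**k * exp(-lam) / factorial(k)

lam = 50 * 0.04                                   # λ = np = 2
p_less_than_2 = poisson_pmf(0, lam) + poisson_pmf(1, lam)
# ≈ 0.13534 + 0.27067 = 0.40601
```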
Poisson Distribution Characteristics
Example: If 1% of the total screws made by a factory are defective, find the
probability that fewer than 3 screws are defective in a sample of 100 screws.
Solution:
Here we have, n = 100, p = 0.01, λ = np = 1
X = Number of defective screws
Using Poisson's Distribution
P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)
P(X = 0) = (λ^0 e^(−λ))/0! = (1^0 e^(−1))/0! = 1/e
P(X = 1) = (1^1 e^(−1))/1! = 1/e
P(X = 2) = (1^2 e^(−1))/2! = 1/(2e)
Thus, P(X < 3) = 1/e + 1/e + 1/(2e) = 2.5/e = 0.919699
Poisson Distribution Characteristics
Example: In an industry there is a chance that 5% of the employees will
suffer from corona. What is the probability that in a group of 20 employees,
more than 3 employees will suffer from coronavirus?
Solution:
Here, n = 20, p = 0.05, λ = np = 1
X = Number of employees who will suffer corona
Using Poisson's Distribution
P(X > 3) = 1-[P(X = 0) + P(X = 1) + P(X = 2)+P(X=3)]
P(X = 0) = (λ^0 e^(−λ))/0! = (1^0 e^(−1))/0! = 1/e
P(X = 1) = (1^1 e^(−1))/1! = 1/e
P(X = 2) = (1^2 e^(−1))/2! = 1/(2e)
P(X = 3) = (1^3 e^(−1))/3! = 1/(6e)
Thus, P(X > 3) = 1 − [1/e + 1/e + 1/(2e) + 1/(6e)] = 1 − 8/(3e) = 0.018988
Poisson Distribution Characteristics
Poisson Distribution Table
Mean: μ = λ = np, where n is the number of trials and p is the probability of
success in each trial; E[X] = λ.
Variance: σ² = λ; standard deviation: σ = √λ, where λ is the expected number
of events.
Poisson probabilities (here for λ = 3):
k (number of events)   P(X = k)
0    0.0498
1    0.1494
2    0.2241
3    0.2241
4    0.1681
5    0.1009
6    0.0505
7    0.0214
8    0.0080
9    0.0027
10   0.0008
Characteristic Function (CF): the CF is an alternative way to describe the
distribution and is given by: φ(t) = e^(λ(e^(it) − 1)),
where i = √−1 (the imaginary unit).
Probability Generating Function (PGF): the PGF generates the probabilities of the
distribution and is expressed as: G(z) = e^(λ(z − 1))
• Skewness and Kurtosis: the Poisson distribution is positively skewed (skewness > 0)
and leptokurtic (excess kurtosis > 0), meaning it has a longer tail on the right side and
heavier tails than the normal distribution. However, for large values of λ, it becomes
increasingly symmetric and bell-shaped, resembling a normal distribution.
Example (λ = 0.50): P(X = 2) = (e^(−λ) λ^X)/X! = (e^(−0.50) (0.50)^2)/2! = 0.0758
Graph of Poisson Probabilities
Poisson probabilities for λ = 0.50:
X    P(X)
0    0.6065
1    0.3033
2    0.0758
3    0.0126
4    0.0016
5    0.0002
6    0.0000
7    0.0000
[Figure: bar chart of P(x) against x = 0 to 7 for λ = 0.50, with P(X = 2) = 0.0758 marked.]
Poisson Distribution Shape
The shape of the Poisson distribution depends on the parameter λ:
[Figure: bar charts of P(x) against x for λ = 0.50 (x = 0 to 7, strongly
right-skewed) and λ = 3.00 (x = 1 to 12, more symmetric).]
The Poisson Distribution
The seismicity of a particular region is described by the Gutenberg-Richter
recurrence law: log λ_m = 4 − 0.7M.
a) What is the probability that exactly one earthquake of magnitude
greater than 7 will occur in a 10 year period? In a 50 year period? In a 250
year period?
Solution: We know that log λ_m = 4 − 0.7M; for M = 7, log λ_m = −0.9 ⇒ λ_m = 10^(−0.9) = 0.1258925
P(N = n) = ((λt)^n e^(−λt)) / n!
For a 10-year period, P(N = 1) = (0.12589 × 10)^1 e^(−0.12589×10) / 1! = 35.75%
For a 50-year period, P(N = 1) = (0.12589 × 50)^1 e^(−0.12589×50) / 1! = 1.162%
For a 250-year period, P(N = 1) = (0.12589 × 250)^1 e^(−0.12589×250) / 1! = 6.75 × 10^(−13) ≈ 0
b) What is the probability that at least one EQ of magnitude greater than
7 will occur in a 10 year period, 50 year period and 250 year period?
The probability that at least one earthquake greater than M7 will occur is given by
P(N ≥ 1) = 1 − e^(−λ_m t)
For a 10-year period: P(N ≥ 1) = 1 − e^(−0.12589×10) = 0.716 = 71.6%
For a 50-year period: P(N ≥ 1) = 1 − e^(−0.12589×50) = 0.998 = 99.8%
For a 250-year period: P(N ≥ 1) = 1 − e^(−0.12589×250) ≈ 1 = 100%
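Parts (a) and (b) can be sketched together in a few lines; λ_m below is recomputed from the recurrence law at M = 7:

```python
from math import exp

lam = 10 ** (4 - 0.7 * 7)          # λ_m ≈ 0.12589 events/year above magnitude 7
for t in (10, 50, 250):
    p_exactly_one = (lam * t) * exp(-lam * t)   # P(N = 1), part (a)
    p_at_least_one = 1 - exp(-lam * t)          # P(N ≥ 1), part (b)
    print(t, round(p_exactly_one, 4), round(p_at_least_one, 4))
```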
The Poisson Distribution
The seismicity of a particular region is described by the Gutenberg-Richter
recurrence law: log λ_m = 4 − 0.7M.
c) Determine the earthquake magnitude that would have a 10% probability of
being exceeded at least once in a 50-year period.
Solution: The probability that at least one earthquake greater than M7 will occur is
P(N ≥ 1) = 1 − e^(−λ_m t)
0.1 = 1 − e^(−λ_m × 50) ⇒ e^(−λ_m × 50) = 0.9
⇒ −λ_m × 50 = ln 0.9 = −0.1054 ⇒ λ_m = 0.002107
log λ_m = 4 − 0.7M
log 0.002107 = 4 − 0.7M ⇒ −2.676 = 4 − 0.7M ⇒ M = 9.537
Poisson distribution
On an average 1 house in 1000 in a certain district has a fire during a year.
If there are 2000 houses in that district, what is the probability that exactly 5
houses will have a fire during the year? Take 𝑒 −2 = 0.13534.
Solution: Mean x̄ = np, with n = 2000 and p = 1/1000
x̄ = 2000 × (1/1000) = 2
Thus λ = 2 and the Poisson distribution is:
P(X = x) = (e^(−λ) λ^x)/x! ⇒ P(X = 5) = (e^(−2) 2^5)/5! = (0.13534 × 32)/120 = 0.036
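A quick check of this calculation:

```python
from math import exp, factorial

lam = 2000 * (1 / 1000)                 # λ = np = 2
p5 = lam**5 * exp(-lam) / factorial(5)  # ≈ 0.0361
```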
When we use Binomial or Poisson distribution?
Binomial Distribution:
1. When the number of trials is fixed, e.g., a die is rolled 10 times, so n = 10.
2. When the outcomes can be classified into two groups, i.e., probability of
success (p) and probability of failure (q). For example, a fair six-sided die
is rolled ten times. The probability that the number 3 comes up is p = 1/6,
and the probability that it does not is q = 5/6.
3. When the probability is constant and independent of previous events.
Rolling a fair six-sided die does not affect the probability of the number 3
occurring; it is always p = 1/6.
4. When the event is random and the variable is discrete.
5. Example: for a fair six-sided die rolled 5 times, find the probability that
the number 3 occurs twice.
When we use Binomial or Poisson distribution?
Poisson Distribution:
1. There needs to be a constant rate/mean of an event occurring over a
period of time.
2. The event needs to be random.
For example, cars arrive at a car wash at 5 cars/hour at random. Find the
probability that 2 cars arrive in a 30-minute period.
Under what conditions does the binomial distribution tend to normal
distribution?
Sample size should be very large, i.e., the number of trials n should be
large. There isn't a strict cutoff, but a common rule of thumb is that n
should be at least 30. As sample size increases the probability distribution
curve will tend to be symmetrical and more peaked.
The probability should tend to 0.5. As the distribution curve will tend to
be symmetrical then the probability will tend to be 0.5 at central.
When we use Binomial or Normal distribution?
The probability of success p should not be too close to 0 or 1. Specifically,
both np and n(1−p) should be greater than or equal to 5.
This ensures that there are enough successes and failures for the distribution
to approximate normality.
Central Limit Theorem (CLT): The tendency of the binomial distribution to
approximate a normal distribution is a consequence of the CLT, which states
that the distribution of the sum (or average) of a large number of independent
random variables tends toward a normal distribution, regardless of the
original distribution.
Exponential distribution
It is widely used to model the time or space between events in a Poisson process.
It describes how long one has to wait before something happens, like a bus
arriving or a customer calling a help center. For example, if buses arrive at a bus
stop every 15 minutes on average, the time you wait for the next bus can be
modeled using an exponential distribution.
The probability density function (pdf) of the exponential distribution is:
f(x; λ) = λe^(−λx) for x ≥ 0
= 0 otherwise
Here, λ > 0 is the rate parameter (how often events occur), a positive real constant;
x is the time or distance until the next event
The cumulative distribution function (CDF) gives the probability that the event
occurs within time x:
F(x; λ) = 1 − e^(−λx), x ≥ 0
Mean (expected value): E[X] = 1/λ
Variance: Var(X) = 1/λ²
Exponential distribution
The exponential distribution is memoryless, which means:
P(X > s + t | X > s) = P(X > t)
This means the probability of waiting longer does not depend on how long
you've already waited. This is unique to the exponential distribution.
Example: Suppose calls come into a customer support center at an average rate
of 2 per minute. What is the probability that you wait more than 30 seconds for
the next call?
Solution:
1. Rate (λ): Since 2 calls come in per minute, that means the average rate is:
λ=2 calls per minute.
2. Convert Time: Find the probability of waiting more than 30 seconds. But
since the rate is in minutes, convert 30 seconds to minutes:
30 seconds=0.5 minutes.
3. Use the exponential survival formula: P(X > 0.5) = e^(−λ×0.5) = e^(−1) ≈ 0.3679
So, there is about a 36.79% chance that the next call comes after 30 seconds.
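The three steps above reduce to one line; a minimal sketch:

```python
from math import exp

lam = 2.0   # calls per minute
x = 0.5     # 30 seconds expressed in minutes
p_wait_longer = exp(-lam * x)   # P(X > 0.5) = e^(−1) ≈ 0.3679
```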
Exponential distribution
The Poisson distribution gives the number of occurrences in time t:
P(x) = (e^(−λt) (λt)^x) / x!
Exponential distribution
[Figure: exponential density function f(x) = λe^(−λx) plotted against x.]
Exponential distribution
[Figure: exponential distribution function F(x) = 1 − e^(−λx) plotted against x.]
Uniform Distribution
[Figure: uniform density function, constant P(x) over an interval beginning at a.]
Uniform Distribution
A random variable X that is uniformly distributed between x1 and x2 has
density function:
f(x) = 1/(x2 − x1) for x1 ≤ x ≤ x2, and 0 otherwise
Normal Probability Distribution
It was first developed mathematically in 1733 by De Moivre.
Laplace then used the normal curve in 1783 to describe the
distribution of errors. After that, Gauss used the normal curve to
analyze astronomical data in 1809.
The normal curve is often called the Gaussian curve. It is a
continuous probability distribution. Continuous variables like
body temperature, blood sugar level, height, weight, etc. follow
a frequency distribution that forms the normal, bell-shaped
curve.
In quantitative analysis, the mean and standard deviation play a
vital role, so these two summary measures are the salient features
of the data. The concentration and spread of the data set are
determined by these measures.
Normal Probability Distribution
When the class intervals are very small and the corresponding
frequencies are very large, the frequency polygon tends to a
smooth symmetrical curve; this smooth curve is called the normal
curve. For variables like height, weight, etc., the curve will be
normal. The corresponding frequency distribution is known as the
normal distribution.
The mean and standard deviation are the two parameters of the
normal distribution; they completely determine the location on the
number line and the shape of a normal curve. Thus, many different
normal curves are possible, one for each combination of mean and
standard deviation. The normal distribution is a probability
distribution, because the area under the curve is equal to 1.
Normal Probability Distribution
It is found that most things in nature give rise to a Normal Curve or are
Normally Distributed, e.g., height of people, weight of people, length of leaves,
weight of beans, length around the forehead, and many others. Thus, a
knowledge of the Normal Distribution is very important.
The equation of the curve: for a random variable X that takes any value between
negative and positive infinity (−∞, +∞), the formula for the normal distribution
is:
f(x) = (1/(σ√(2π))) e^(−(1/2)((x−μ)/σ)²)
where μ is the mean and σ is the standard deviation (S.D.).
Note that, due to the exponential, the curve only touches the x-axis at ±∞. The
value of e ≈ 2.71828. The function depends on only two parameters, i.e., the
mean and the standard deviation.
The Characteristics of Normal Distribution
1. It is a smooth, bell-shaped curve, symmetric about the mean of the
distribution. The X-axis represents the random variable and the Y-axis the
probability density function.
2. The mean, the median and the mode are all equal and coincide at the central
part of the curve.
3. The total area under the curve and above the X-axis is one square unit.
It follows that the normal distribution is a probability distribution, and it
is a symmetric distribution, so half the area (50%) is on the left of the mean
and half (50%) is on the right.
4. On the basis of mean 𝜇 and standard deviation 𝜎, the area of the normal
curve is described as:
i. 68.27% of the area of the normal curve (or 68.27% of the observations) lies
within one standard deviation on either side of the mean, i.e., μ ± 1σ
= 68.27%
ii. 95.45% of the area of the normal curve (or 95.45% of the observations) lies
within two standard deviations on either side of the mean, i.e., μ ± 2σ
= 95.45%
The Characteristics of Normal Distribution
iii. 99.73% of the area of the normal curve (or 99.73% of the observations) lies
within three standard deviations on either side of the mean, i.e., μ ± 3σ
= 99.73%
5. The normal distribution is completely determined by the parameters μ and σ.
6. The two tails of the curve never touch the x axis
There are different normal curves with different means and standard deviations,
and each has its own table of probabilities. Practically, it is not possible to
have a different table for every normal curve. That is why a normal curve is
converted into the standardized form called the "standard normal curve". The
standard normal curve has mean (μ) = 0 and standard deviation (σ) = 1.
z = (X − μ)/σ
where X is the value of the random variable with which we are concerned, μ is
the mean of the distribution of the random variable, σ is the standard
deviation of the distribution, and z is the distance from X to the mean in
units of σ.
The Characteristics of Normal Distribution
The different types of normal curve can be plotted according to the value
of mean and standard deviation. The standard normal distribution is a
special member of the normal family that has a mean of 0 and a standard
deviation of 1. The standard normal random variable is denoted by Z and
it is written as
z = (X − μ)/σ
The standard normal distribution is important since the probabilities of
any value can be computed from the standard normal distribution if mean
and standard deviation (SD) are known
P(z) = (1/√(2π)) e^(−z²/2),  −∞ < z < ∞
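Tables aside, the standard normal cdf is available in Python's statistics module; a small sketch (the value z = 1 here is illustrative):

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)             # the standard normal distribution
print(round(Z.cdf(1.0), 5))               # P(Z ≤ 1) ≈ 0.84134
print(round(Z.cdf(1.0) - Z.cdf(0.0), 4))  # area between z = 0 and z = 1 ≈ 0.3413
```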
Gaussian (Normal) Distribution
A random variable X that is normally distributed has density function:
f(x) = (1/(σ√(2π))) e^(−(x−μ)²/(2σ²))
Gaussian (Normal) Distribution
Often we want to compute the probability that a random outcome is
within a specified interval, i.e., P(a ≤ X ≤ b), where a could be
−∞ and b could be +∞.
For continuous random variables, this probability corresponds to
the area bounded by a and b under the curve. The probability that X
equals a specific value, i.e., P(X = x), is 0, since no area lies above
a single point. It follows that:
Probability is measured by the area under the curve.
• The probability of an area between two points on the normal curve is
obtained by using the standard normal curve. The z value is computed as
z = (X − μ)/σ
• This is called the z-score transformation.
The Normal Distribution
2. Area under the Normal Curve
Gauss showed that the area under the normal curve is 1 unit².
Since the curve had been obtained through probability
theory, where the total probability = 1, there is a
connection between probability and the area under the
curve. In fact, it has been shown that the probability of an
event occurring in a normal distribution is equivalent to an
area under the normal curve. We will use this in examples
later.
The Normal Distribution
3. Symmetry of the Normal Curve
For a value of plus or minus x away from the mean
we have equal areas.
Again, the areas to the right and left of the mean value μ are
equal. Since the area under the curve is 1 unit², each of the two
areas is 0.5.
[Figure: symmetric normal curve centred on the mean μ, with equal areas A
at −x and +x and equal halves B on either side of the mean.]
The Normal Distribution
4. Dispersion or Spread of the Normal Curve
The following results are very important
(a) The area under the normal curve contained between the
mean ± 1 standard deviation, i.e., μ ± 1σ, is 68.26% of the area
under the whole curve (just over 2/3).
[Figure: normal curve shaded between −1σ and +1σ; z = (x − μ)/σ.]
Problem
Find the area under the normal curve
a. Between Z = 0 and Z = 1.5
b. Between Z = -2.0 and Z = 0
c. Between Z = - 2.0 and Z =1.5
Solution
Find the area under the normal curve: a) Between Z=0 and Z=
1.5
From the table of normal distribution, the figure corresponding
to Z=1.5 is 0.4332 which is the area between Z=0 and Z=1.5.
The shaded area in the figure represents the area between Z=0
and Z=1.5.
Symbolically,
P(0 ≤ z ≤ 1.5) = 0.4332 (from the normal table: 0.93319 − 0.50)
Solution
b) Find the area under the normal
curve between Z=-2 and Z=0
Solution:
a) By using the standard normal distribution, when X1 = 7,
z1 = (X − μ)/σ = (7 − 11.5)/3 = −1.5
when X2 = 14.5, z2 = (X − μ)/σ = (14.5 − 11.5)/3 = 1
Now,
P(−1.5 < Z < 1) = P(Z < 1) − P(Z < −1.5) = 0.84134 − 0.06681 = 0.77453
b) z = (X − μ)/σ, where mean = 11.5 years and standard deviation = 3 years
P(X > 10) = ?
By using the standard normal distribution,
when X = 10, z = (X − μ)/σ = (10 − 11.5)/3 = −0.5
Now,
P(Z > −0.50) = 1 − P(Z ≤ −0.50) = 1 − 0.3085 = 0.6915
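The answers in parts (a) and (b) can be verified with statistics.NormalDist, assuming the interval in (a) runs from 7 to 14.5 years (z = −1.5 to 1, matching the quoted result):

```python
from statistics import NormalDist

X = NormalDist(mu=11.5, sigma=3)
p_a = X.cdf(14.5) - X.cdf(7)   # P(7 < X < 14.5) ≈ 0.7745
p_b = 1 - X.cdf(10)            # P(X > 10) ≈ 0.6915
```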
Solution:
c) z = (X − μ)/σ, where mean = 11.5 years and standard deviation = 3 years
P(X < 12) = ?
By using the standard normal distribution,
when X = 12, z = (X − μ)/σ = (12 − 11.5)/3 = 0.167
Now,
P(Z < 0.167) ≈ 0.566
Problem
The mean and standard deviation of the heights of college students are 165 cm and
6 cm respectively. The lowest 10% are considered below normal height and the
highest 10% above normal height. Find the heights of students that cut off the
normal range.
a) Solution
Given, mean = 165 cm, standard deviation = 6 cm
P(X < X1) = P(Z < Z1) = 10% = 0.10
Here, the Z value for a lower-tail area of 0.10 is −1.28 (from the normal table)
z1 = (X1 − μ)/σ ⇒ −1.28 = (X1 − 165)/6 ⇒ X1 = 157.32
b) Solution
P(X > X2) = P(Z > Z2) = 10% = 0.10
Here, the Z value for a cumulative area of 0.90 is 1.28 (right area = 10% and
left area = 90%) from the normal table:
z2 = (X2 − μ)/σ ⇒ 1.28 = (X2 − 165)/6 ⇒ X2 = 172.68
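The same cut-offs follow from the inverse normal cdf; a minimal sketch with μ = 165 and σ = 6:

```python
from statistics import NormalDist

H = NormalDist(mu=165, sigma=6)
low = H.inv_cdf(0.10)    # ≈ 157.3 cm: the bottom 10% fall below this height
high = H.inv_cdf(0.90)   # ≈ 172.7 cm: the top 10% fall above this height
```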
Standard normal table: area between z = 0 and z = (x − μ)/σ
z     0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0159 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2518 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4430 0.4441
1.6 0.4452 0.4463 0.4474 0.4485 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4762 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4865 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4980 0.4980 0.4981
2.9 0.4981 0.4982 0.4983 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.49865 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1 0.49903 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
3.2 0.49931 0.4993 0.4994 0.4994 0.4994 0.4994 0.4994 0.4995 0.4995 0.4995
3.3 0.49952 0.4995 0.4995 0.4996 0.4996 0.4996 0.4996 0.4996 0.4996 0.4997
3.4 0.49966 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
Uses of Normal Distribution
1. Most natural characteristics are normally distributed; for such
characteristics, we use the normal distribution.
2. The shape of the normal curve is very useful in practice and
simplifies statistical analysis. It is bell-shaped.
3. It gives the probability of occurrence by chance, i.e., how often
an observation, measured in terms of the mean and standard
deviation, can occur normally.
4. The normal curve is used to find the confidence limits of
population parameters and to fix confidence intervals.
5. The testing of hypotheses (tests of significance) is carried out
on the basis of the normal distribution.
6. It is used to find the standard error.
7. It is used to calculate sample size.
The Normal Distribution
Use of the Normal Curve and Normal Probability Tables
Suppose a variable is normally distributed about a mean value of 10 with
a standard deviation of 0.5.
Suppose we want to find the probability of a variable in the distribution
having a value between 10.7 and 11.2
The Normal Distribution
(a) The probability of a mass between 120 kg and 155 kg is the sum of the
areas to the left and the right of the centre:
Area = 0.4808 + 0.1064
Area = Pr = 0.5872
(b) To find the probability of a mass > 185 kg: z = (185 − 151)/15 = 2.27
Pr = Area = 0.5 − 0.4884 = 0.0116
The number of rods out of 1000 equals
0.0116 × 1000 ≈ 12 rods
Central Limit Theorem (CLT)
The CLT states that for a large number of independent,
identically distributed (iid) random variables (X1, …, Xn) with
finite variance, the average X̄n approximately has a normal
distribution, no matter what the distribution of the Xi is (no
matter what the shape of the population distribution). This fact
holds especially well for sample sizes over 30.
For more samples, especially large ones, the graph of the sample
means will look more like a normal distribution.
The CLT describes the relationship between the sampling
distribution of sample means and the population that the
samples are taken from. CLT states that the sum of a large
number of independent random variables (binomial, Poisson,
etc.) will approximate a normal distribution.
Example: Human height is determined by a large
number of factors, both genetic and environmental, which are
additive in their effects. Thus, it follows a normal distribution.
88
Central Limit Theorem
Normal Populations
Important Fact:
If the population is normally distributed, then the sampling
distribution of x̄ is normally distributed for any sample size n.
The central limit theorem provides a tool to
approximate the probability distribution of the
average or the sum of independent identically
distributed (iid) random variables.
89
Sampling distribution of normally distributed population
When the population from which samples are drawn is normally
distributed with mean μ and standard deviation σ, the mean of the
sample means, μx̄, is equal to the population mean, μ.
90
Sampling distribution of normally distributed population
91
Non-normal distribution of population
Normal Distribution is a distribution that has most of the data in the
center with decreasing amounts evenly distributed to the left and the
right. Non-normal Distributions, Skewed Distribution is distribution
with data clumped up on one side or the other with decreasing
amounts trailing off to the left or the right.
92
Non-normal distribution
What can we say about the shape of the sampling distribution of x̄
when the population from which the sample is selected is not
normal?
93
Central Limit Theorem
• If a random sample of n observations is selected from a
population (any population), then when n is sufficiently large, the
sampling distribution of x̄ will be approximately normal.
• The larger the sample size, the better will be the normal
approximation to the sampling distribution of x̄.
• Suppose that a sample is obtained containing a large number of
observations, each observation being randomly generated in a
way that does not depend on the values of the other observations,
and that the arithmetic average of the observed values is
computed.
• If this procedure is performed many times, the central limit
theorem says that the computed values of the average will be
distributed according to the normal distribution.
94
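The procedure described above can be sketched in a short simulation. The population here is a deliberately skewed one (exponential with mean 1.0); the sample size 30 and 20,000 repetitions are illustrative choices, not from the slides:

```python
import random
import statistics

random.seed(42)

# Population: exponential with mean 1.0 (heavily right-skewed, not normal).
# Repeatedly draw samples of size n and record each sample mean.
n, reps = 30, 20000
sample_means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
                for _ in range(reps)]

# CLT: the sample means cluster around the population mean (1.0)
# with spread close to sigma/sqrt(n) = 1/sqrt(30) ~ 0.183.
print(round(statistics.fmean(sample_means), 2))
print(round(statistics.stdev(sample_means), 2))
```

Plotting a histogram of `sample_means` would show the approximately normal bell shape the theorem predicts, even though the parent population is far from normal.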
Central Limit Theorem
Example:
A population consists of the numbers {1, 2, 3, 4, 5}. We randomly
select two numbers from the population and calculate their
mean.
For example, we might select the numbers 1 and 5 whose mean
would be 3. Suppose we repeated this experiment (with
replacement) many times. We would have a collection of sample
means (millions of them). We could then construct a frequency
distribution of these sample means.
The resulting distribution of sample means is called the sampling
distribution of sample means. Now, having the distribution of
sample means, we proceed to calculate the mean of all sample means
(the grand mean).
95
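For this small population we do not need millions of repetitions: with replacement there are only 25 possible samples of size 2, so the sampling distribution can be enumerated exactly. A sketch in Python:

```python
import itertools
import statistics

population = [1, 2, 3, 4, 5]

# All possible samples of size 2 drawn with replacement (25 of them).
samples = list(itertools.product(population, repeat=2))
sample_means = [statistics.fmean(s) for s in samples]

grand_mean = statistics.fmean(sample_means)
print(grand_mean)                        # 3.0
print(statistics.fmean(population))      # 3.0 -- the grand mean equals it
```

The grand mean of the sample means equals the population mean exactly, as the next slide states in general.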
Central Limit Theorem
The Central Limit Theorem predicts that, regardless of the
distribution of the parent population:
1. The mean of the population of means is always equal to the
mean of the parent population from which the population
samples were drawn.
2. The standard deviation (standard error) of the population of
means is always equal to the standard deviation of the parent
population divided by the square root of the sample size (N):
SD′ = SD/√N
The central limit theorem states that if you have a population with
mean μ and standard deviation σ and take sufficiently large random
samples from the population with replacement, then the distribution
of the sample means will be approximately normally distributed.
96
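The formula SD′ = SD/√N above is simple enough to check directly. A sketch with a hypothetical parent-population SD of 15 (the values 9, 36, 144 are illustrative sample sizes):

```python
import math

# Standard error of the mean: SD' = SD / sqrt(N)
def standard_error(sd, n):
    return sd / math.sqrt(n)

sd = 15.0                                  # hypothetical parent SD
for n in (9, 36, 144):
    print(n, standard_error(sd, n))        # 5.0, 2.5, 1.25
```

Quadrupling the sample size halves the standard error, which is why larger samples give tighter sampling distributions of the mean.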
Central Limit Theorem
In probability theory, the central limit theorem establishes that, in
many situations, when independent random variables are added,
their properly normalized sum tends toward a normal distribution
even if the original variables themselves are not normally
distributed.
The central limit theorem (CLT) is commonly defined as a statistical
theory stating that, given a sufficiently large sample size from a
population with a finite level of variance, the mean of all samples
from the same population will be approximately equal to the mean of
the population.
In other words, the central limit theorem tells us exactly what the
shape of the distribution of means will be when we draw repeated
samples from a given population. Specifically, as the sample sizes
get larger, the distribution of means calculated from repeated
sampling will approach normality.
97
Three quantities used in the central limit theorem
• µ is the population mean
• σ is the population standard deviation
• n is the sample size
Standard deviation (σ)
• The standard deviation is the positive square root of the average
of the squared deviations of the observations from their arithmetic
mean. It is also called the root mean square deviation.
• It is denoted by the Greek symbol σ.
• The standard deviation of a set of n numbers is a measure of how
much a typical number in the set differs from the mean. The greater
the standard deviation, the more the numbers in the set vary from
the mean.
• σ = √( ∑(X − x̄)² / N )
98
Standard deviation (σ)
Problem: From the following monumental temple's brick testing
data, calculate the standard deviation.

S.No.   Test value (X)   X − x̄      (X − x̄)²
1       11.34            −1.0033    1.006611
2       13.34             0.9967    0.993411
3       12.47             0.1267    0.016053
4       11.65            −0.6933    0.480665
5       12.47             0.1267    0.016053
6       12.79             0.4467    0.199541

Total = 74.06,  Mean = 74.06/6 = 12.3433,  ∑(X − x̄)² = 2.71233

σ = √( ∑(X − x̄)² / N ) = √(2.71233 / 6) = 0.672351
99
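The worked example above can be reproduced in a few lines, using the six brick test values from the table:

```python
import math

# Brick test values from the worked example above
values = [11.34, 13.34, 12.47, 11.65, 12.47, 12.79]

n = len(values)
mean = sum(values) / n                          # 74.06 / 6 = 12.3433...
sum_sq = sum((x - mean) ** 2 for x in values)   # ~2.71233
sigma = math.sqrt(sum_sq / n)                   # population SD, ~0.6724
print(round(mean, 4), round(sigma, 4))
```

This uses the population form (dividing by N), matching the formula on the slide.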
Normal Distribution
100
Non-Probability Sampling Techniques
Non-probability sampling is defined as a sampling technique in
which the researcher selects samples based on the subjective
judgment of the researcher rather than random selection. It is a less
stringent method. This sampling method depends heavily on the
expertise of the researchers.
Judgmental Sampling
A judgment sample or expert sample is a type of non-random sample
that is selected based on the opinion of an expert. Results obtained
from a judgment sample are subject to some degree of bias, due to
the frame and population not being identical.
101
Convenient Sampling
Convenience sampling is a type of nonprobability sampling in which people
are sampled simply because they are convenient sources of data for
researchers. In probability sampling each element in the population has a
known nonzero chance of being selected through the use of a random
selection procedure.
Quota Sampling
Quota sampling is defined as a non-probability sampling method in which
researchers create a sample involving individuals that represent a population.
Researchers choose these individuals according to specific traits or qualities.
These samples can be generalized to the entire population.
Snowball sampling
Snowball sampling or chain referral sampling is defined as a non- probability
sampling technique in which the samples have traits that are rare to find.
This is a sampling technique, in which existing subjects provide referrals to
recruit samples required for a research study.
102
Snowball Sampling
103
Sampling Distribution
• A sampling distribution is a probability distribution of a statistic
obtained from a large number of samples drawn from a specific population.
• The sampling distribution of a given population is the distribution of
frequencies of a range of different outcomes that could possibly occur for
a statistic of a population.
• A sampling distribution refers to a probability distribution of a statistic
that comes from choosing random samples of a given population. Also
known as a finite sample distribution, it represents the distribution of
frequencies for how spread apart various outcomes will be for a specific
population.
104
Sampling Distribution
• The sampling distribution depends on multiple factors: the statistic,
sample size, sampling process, and the overall population. It is used to
help calculate statistics such as means, ranges, variances, and standard
deviations for the given sample.
How it Works?
1. Select a random sample of a specific size
from a given population.
2. Calculate a statistic for the sample, such
as the mean, median, or standard
deviation.
3. Develop a frequency distribution of each
sample statistic that you calculated from
the step above.
4. Plot the frequency distribution of each
sample statistic that you developed from
the step above. The resulting graph will
be the sampling distribution.
105
Types of Sampling Distribution
1. Sampling distribution of mean
2. Sampling distribution of proportion
3. T distribution
Sampling Distribution of Mean
• A sampling distribution is a probability distribution of a statistic
obtained from a large number of samples drawn from a specific population.
• It describes a range of possible outcomes of a statistic, such as the
mean or mode of some variable, as it truly exists in a population.
• The overall shape of the distribution is symmetric and approximately
normal. There are no outliers or other important deviations from the
overall pattern.
106
Sampling Distribution of Mean
• The sampling distribution of sample means can be described by its
shape, center, and spread, just like any of the other distributions.
• The shape of the sampling distribution is normal: a bell-shaped curve
with a single peak and two tails extending symmetrically in either direction.
• The center of the sampling distribution of sample means, which is
itself the mean (average) of the means, is the true population mean, μ.
This is sometimes written as μx̄ to denote it as the mean of the
sample means.
• The spread of the sampling distribution is called the standard error,
the quantification of sampling error, denoted σx̄. The formula for the
standard error is:
σx̄ = σ/√n
107
Central Limit Theorem
Objectives
• How to find sampling distributions and verify their properties?
• How to interpret the Central Limit Theorem (CLT)?
• How to apply the CLT to find the probability of a sample mean?
The CLT is one of the most powerful and useful ideas in all of statistics.
The CLT has two alternative forms, one for sample means and one for
sample sums. Both alternatives are concerned with drawing finite samples of size n from
a population with a known mean, μ, and a known standard deviation, σ.
The second alternative says that if we again collect samples of size n that
are "large enough," calculate the sum of each sample and create a
histogram, then the resulting histogram will again tend to have a normal
bell shape.
In either case, it does not matter what the distribution of the original
population is, or whether you even need to know it. The important fact is
that the sample means and the sums tend to follow the normal distribution.
108
Central Limit Theorem
• It is the probability distribution of a statistic for a large number of
samples taken from a population.
• Draw a random sample from a population and calculate a statistic
for the sample, such as the mean.
• Now draw another random sample of the same size, and again calculate
the mean.
• Repeating this process many times, we end up with a large number of
means, one for each sample.
• The distribution of the sample means is an example of a sampling
distribution.
Sampling distribution of the mean will always be normally distributed, as
long as the sample size is large enough. Regardless of whether the
population has a normal, Poisson, binomial, or any other distribution, the
sampling distribution of the mean will be normal. A normal distribution is
a symmetrical, bell-shaped distribution.
109
Central Limit Theorem
110
Conditions of the central limit theorem
The central limit theorem states that the sampling distribution of the mean will
always follow a normal distribution under the following conditions:
1. The sample size is sufficiently large. This condition is usually met if the
sample size is n ≥ 30.
2. The samples are independent and identically distributed (iid) random
variables. This condition is usually met if the sampling is random.
3. The population’s distribution has finite variance. Central limit theorem
doesn’t apply to distributions with infinite variance, such as the Cauchy
distribution. Most distributions have finite variance.
111
Central Limit Theorem
The size of the sample, n, that is required in order to be 'large
enough' depends on the original population from which the
samples are drawn. If the original population is far from
normal then more observations are needed for the sample
means or the sample sums to be normal.
When large samples (usually greater than thirty) are considered,
the distribution of the sample arithmetic mean approaches the
normal distribution, irrespective of whether the random variables
were originally normally distributed.
112
Central Limit Theorem
Consider a random variable X with mean μ and standard deviation σ.
As per the CLT, the sample mean X̄ approximately follows the normal
distribution, X̄ ~ N(μ, σ/√n). The Z-score of the random
variable X̄ is given as:
Z = (X̄ − μ) / (σ/√n)
The mean of the sample means = population mean = μ
The standard deviation of the sample mean (standard error) = σ/√n
113
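The Z-score formula above can be illustrated with hypothetical numbers (population mean 100, SD 15, sample size 36, observed sample mean 105; none of these come from the slides):

```python
import math

mu, sigma, n = 100.0, 15.0, 36
x_bar = 105.0                              # observed sample mean

standard_error = sigma / math.sqrt(n)      # 15 / 6 = 2.5
z = (x_bar - mu) / standard_error          # (105 - 100) / 2.5 = 2.0
print(z)
```

A Z-score of 2.0 means the observed sample mean lies two standard errors above the population mean.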
Lognormal distribution
A lognormal distribution is bounded below by zero and is skewed
to the right.
A lognormal distribution is useful for describing the prices of many
financial assets, and a normal distribution is often a good
approximation for returns.
A lognormal distribution is defined by a mean and variance, which in
turn are derived from the mean and variance of its associated normal
distribution.
When σ increases, the mean of the lognormal increases; the
distribution spreads outwards, but it cannot spread below zero,
so its mean increases.
A normal distribution is a closer fit for quarterly and yearly holding
period returns than it is for daily or weekly returns.
A normal distribution is less suitable for asset prices, since prices
cannot fall below zero.
114
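Both properties above, bounded below by zero and skewed to the right, can be seen by generating a lognormal variable as exp of a normal one. The N(0, 1) parameters and sample count are illustrative choices:

```python
import math
import random
import statistics

random.seed(7)

# A lognormal variable is exp(Z) where Z is normal; here Z ~ N(0, 1).
xs = [math.exp(random.gauss(0.0, 1.0)) for _ in range(20000)]

print(min(xs) > 0)                                    # bounded below by zero
print(statistics.fmean(xs) > statistics.median(xs))   # right skew: mean > median
```

For this choice the median is near 1 while the mean is near e^0.5 ≈ 1.65, the hallmark of right skew.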
Statistical Characterizations
• Second Moment: E[X²] = ∑ x² P(x) for a discrete X, or ∫ x² f(x) dx for a continuous X
115
Statistical Characterizations
• Variance of X: Var(X) = E[(X − μ)²] = E[X²] − (E[X])²
• Standard Deviation of X: σ = √Var(X)
116
Mean Estimation from Samples
Given a set of N samples from a distribution, we can estimate the
mean of the distribution by the sample average:
x̄ = (1/N) ∑ᵢ₌₁ᴺ xᵢ
117
Variance Estimation from Samples
Given the sample mean x̄, an unbiased estimate of the variance is:
s² = (1/(N − 1)) ∑ᵢ₌₁ᴺ (xᵢ − x̄)²
118
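The mean and variance estimators on the slides above can be sketched together; the data values here are purely illustrative:

```python
import math

samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]    # illustrative data
N = len(samples)

mean_hat = sum(samples) / N                            # sample mean = 5.0
# Unbiased sample variance uses N - 1 in the denominator
var_hat = sum((x - mean_hat) ** 2 for x in samples) / (N - 1)
sd_hat = math.sqrt(var_hat)

print(mean_hat)              # 5.0
print(round(var_hat, 4))     # 32/7 ~ 4.5714
```

Dividing by N − 1 rather than N corrects the small-sample bias introduced by estimating the mean from the same data.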
Lindeberg-Feller Central Limit Theorem (CLT)
The Lindeberg-Feller CLT states that sums of independent random variables,
properly standardized, converge in distribution to standard normal if the
Lindeberg condition is satisfied. Since these random variables do not have to
be identically distributed, this result generalizes the CLT for independent
and identically distributed (iid) sequences.
The Lindeberg-Lyapunov Conditions: Suppose that X1, X2, . . .
are independent random variables such that E[Xn] = μn and Var(Xn) = σn² < ∞.
Define:
Yn = Xn − μn,
Tn = ∑ᵢ₌₁ⁿ Yᵢ,
sn² = Var(Tn) = ∑ᵢ₌₁ⁿ σᵢ²
Note that Tn/sn has mean zero and variance 1. We seek sufficient conditions
that ensure Tn/sn →d N(0, 1).
119
Lindeberg-Feller Central Limit Theorem
We provide two separate conditions, the Lindeberg and the Lyapunov conditions.
The Lindeberg condition for sequences states that for every ε > 0,
(1/sn²) ∑ᵢ₌₁ⁿ E[Yᵢ² I(|Yᵢ| ≥ ε sn)] → 0 as n → ∞;
the Lyapunov condition for sequences states that there exists δ > 0 such that
(1/sn^(2+δ)) ∑ᵢ₌₁ⁿ E[|Yᵢ|^(2+δ)] → 0 as n → ∞.
Either condition implies that Tn/sn →d N(0, 1).
Among a series of independent Bernoulli trials with different success
probabilities, the proportion of successes tends to a normal distribution
whenever
sn² = ∑ᵢ₌₁ⁿ pᵢ(1 − pᵢ) → ∞, which will be true as long as pᵢ(1 − pᵢ) does
not tend to 0 too fast.
120
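The Bernoulli case above can be checked by simulation: independent trials with different success probabilities p_i, standardized by s_n. The choice of n = 400, p_i spread over [0.2, 0.8], and 5,000 repetitions is illustrative:

```python
import random
import statistics

random.seed(0)

# Independent (non-identical) Bernoulli trials with varying success
# probabilities p_i; the standardized sum T_n / s_n should be ~ N(0, 1).
n = 400
ps = [0.2 + 0.6 * (i / (n - 1)) for i in range(n)]    # p_i ranges over [0.2, 0.8]
s2 = sum(p * (1 - p) for p in ps)                     # s_n^2 = sum p_i (1 - p_i)

reps = 5000
zs = []
for _ in range(reps):
    t = sum((1 if random.random() < p else 0) - p for p in ps)  # T_n = sum Y_i
    zs.append(t / s2 ** 0.5)

print(round(statistics.fmean(zs), 2))      # close to 0
print(round(statistics.pvariance(zs), 2))  # close to 1
```

Because each p_i(1 − p_i) stays bounded away from zero, s_n² grows without bound and the Lindeberg condition holds, so the standardized sum behaves like N(0, 1).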