Plaksha University: Technology Leaders Program
Dr. Nandini Kannan
Email: nandini.kannan@plaksha.edu.in
Chapter 3
Sampling Distributions
3.1 Introduction
Statistics is the science that deals with
(a) the collection, organization, and summary of information about a particular topic of interest (Descriptive Statistics);
(b) drawing inferences about a population of interest using information obtained from a sample (Inferential Statistics).
Definition: A Parameter is a numerical measure associated with
a population.
Definition: A Statistic is a numerical measure associated with a sample.
Example: Paper Boat would like to determine the sugar content of Nagpur oranges. How would you help Paper Boat answer this question?
A consumer group wants to determine the fuel efficiency of the new Honda SUV. How would you proceed?
In both cases, identify the parameter and the statistic of interest.
Since the value of a statistic varies from sample to sample, statistics are random variables. We would like to know how a statistic changes over different samples.
Definition: The probability distribution of a statistic is called a
sampling distribution.
Example: For the fuel efficiency example, the sampling distribution of the mean could be approximated as follows:
• Draw a sample of 100 from the population of all vehicles manu-
factured in a given time period. Compute the sample mean.
• Repeat the process of drawing samples of size 100 several times.
• Each time a sample is selected, the value of the statistic (mean)
is calculated.
• Draw the relative frequency histogram of these computed statis-
tics.
If the process is repeated a large number of times, the histogram
will provide an approximation of the sampling distribution.
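As an illustration, here is a minimal Python sketch of this resampling procedure, assuming a hypothetical fleet whose fuel efficiency is N(15, 2²) km/l (both numbers invented purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population: fuel efficiency in km/l, assumed N(15, 2^2).
# The parameter values are invented purely for illustration.
pop_mean, pop_sd = 15.0, 2.0

reps, n = 1000, 100   # 1000 repeated samples, each of size 100
sample_means = rng.normal(pop_mean, pop_sd, size=(reps, n)).mean(axis=1)

# A relative frequency histogram of `sample_means` approximates the
# sampling distribution of the mean.
print(sample_means.mean())        # close to 15
print(sample_means.std(ddof=1))   # close to 2 / sqrt(100) = 0.2
```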
Let X1, . . . , Xn be n mutually independent observations of a particular quantitative phenomenon, for example blood pressure. We assume that each observation Xi has the same probability distribution. We say that the n observations X1, . . . , Xn are independent and identically distributed (i.i.d.).
If X is described by a pmf, then
p(x1, . . . , xn) = P (X1 = x1, . . . , Xn = xn) = p(x1) . . . p(xn).
If X is described by a pdf, then
f (x1, . . . , xn) = f (x1) . . . f (xn).
X1, . . . , Xn are said to be a random sample of size n from
the distribution of X.
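For instance, for a random sample from a Bernoulli population with success probability p (the setting of Section 3.3), the joint pmf factors as
$$p(x_1, \ldots, x_n) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = p^{\sum_i x_i}(1-p)^{n - \sum_i x_i}, \qquad x_i \in \{0, 1\}.$$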
3.2 Sampling Distribution of the Mean
Suppose we are interested in estimating the mean µX of a random
variable X based on a random sample X1, . . . , Xn. We can estimate
µX by the sample average
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$
Using properties of expectation, we have
$$\mu_{\bar{X}} = E(\bar{X}) = \mu_X$$
and
$$\sigma^2_{\bar{X}} = \mathrm{Var}(\bar{X}) = \frac{\sigma_X^2}{n}.$$
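As a quick check, both follow from linearity of expectation and, for the variance, independence of the Xi:
$$E(\bar{X}) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{n\mu_X}{n} = \mu_X, \qquad \mathrm{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{n\sigma_X^2}{n^2} = \frac{\sigma_X^2}{n}.$$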
Example: Consider a random sample of n = 12 from U(−1/2, 1/2). Let
$$T = \sum_{i=1}^{n} X_i.$$
Figure 3.1 shows a histogram of 1000 such sums with a superimposed normal pdf. Figure 3.2 shows a histogram of the distribution of X(n), the largest order statistic.
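A short simulation along these lines (a sketch in Python; the plotting step is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 replications of a sample of 12 iid U(-1/2, 1/2) observations.
samples = rng.uniform(-0.5, 0.5, size=(1000, 12))

sums = samples.sum(axis=1)     # T, the statistic of Figure 3.1
maxima = samples.max(axis=1)   # X(n), the largest order statistic (Figure 3.2)

# The sum is already nearly normal; the maximum is not.
print(sums.mean(), sums.std(ddof=1))   # approx 0 and sqrt(12 * 1/12) = 1
print(maxima.mean())                   # approx -0.5 + 12/13 = 0.42
```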
[Figure 3.1: Probability Histogram: Sum of 12 Uniform Random Variables]
[Figure 3.2: Probability Histogram: Max of 12 Uniform Random Variables]
3.3 Central Limit Theorem
Example: Suppose X is a discrete random variable with probability
distribution given by
x      0     1
p(x)   1/3   2/3

i.e., the population consists of 0's and 1's: one third of the population consists of 0's, and two thirds consists of 1's.
We can compute the mean and variance. We have
$$E(X) = \mu = \frac{2}{3}, \qquad \mathrm{Var}(X) = \sigma^2 = \frac{2}{9}.$$
Let X1, X2 be a random sample of size 2 drawn from this pop-
ulation. Both X1 and X2 can take values 0 and 1. There are four
possible samples of size 2:
(0, 0), (0, 1), (1, 0), (1, 1).
We compute the average X̄2 and the total T2 for all possible samples. The table is shown below.
X1   X2   X̄2    T2
0    0    0.0   0
0    1    0.5   1
1    0    0.5   1
1    1    1.0   2
Thus the average and total are both random variables, and we can
compute their probability distributions.
X̄2 can assume the values 0, 0.5, and 1.
$$\begin{aligned}
P(\bar{X}_2 = 0) &= P(X_1 = 0 \text{ and } X_2 = 0)\\
&= P(X_1 = 0)\,P(X_2 = 0) \quad \text{(by independence)}\\
&= \frac{1}{3}\cdot\frac{1}{3} = \frac{1}{9}
\end{aligned}$$

$$\begin{aligned}
P(\bar{X}_2 = 0.5) &= P(X_1 = 0 \text{ and } X_2 = 1) + P(X_1 = 1 \text{ and } X_2 = 0)\\
&= P(X_1 = 0)\,P(X_2 = 1) + P(X_1 = 1)\,P(X_2 = 0)\\
&= \frac{1}{3}\cdot\frac{2}{3} + \frac{2}{3}\cdot\frac{1}{3} = \frac{4}{9}
\end{aligned}$$

$$\begin{aligned}
P(\bar{X}_2 = 1) &= P(X_1 = 1 \text{ and } X_2 = 1)\\
&= P(X_1 = 1)\,P(X_2 = 1)\\
&= \frac{2}{3}\cdot\frac{2}{3} = \frac{4}{9}
\end{aligned}$$
Similarly the total T2 is a random variable taking values 0, 1, and
2. We can compute the probabilities in exactly the same way. The
probability distributions can be summarized as follows:
X̄2     0.0   0.5   1.0
prob   1/9   4/9   4/9

We compute the mean and variance to be
$$E(\bar{X}_2) = \frac{2}{3} = \mu, \qquad \mathrm{Var}(\bar{X}_2) = \frac{1}{9} = \frac{\sigma^2}{2}.$$
The probability distribution of T2 is given below.
T2     0.0   1.0   2.0
prob   1/9   4/9   4/9

We compute the mean and variance to be
$$E(T_2) = \frac{4}{3} = 2\mu, \qquad \mathrm{Var}(T_2) = \frac{4}{9} = 2\sigma^2.$$
We can repeat this process by drawing a sample of size 3 from the population and computing the probability distributions of X̄3 and T3, which are given below.
X̄3     0.0    1/3    2/3     1.0
prob   1/27   6/27   12/27   8/27
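These exact distributions can also be enumerated mechanically. A small Python sketch using exact fractions:

```python
from itertools import product
from fractions import Fraction
from collections import defaultdict

# Population pmf: P(X = 0) = 1/3, P(X = 1) = 2/3.
pmf = {0: Fraction(1, 3), 1: Fraction(2, 3)}

def total_distribution(n):
    """Exact sampling distribution of the total T_n for samples of size n."""
    dist = defaultdict(Fraction)
    for sample in product(pmf, repeat=n):
        prob = Fraction(1)
        for x in sample:
            prob *= pmf[x]        # independence: product of marginal pmfs
        dist[sum(sample)] += prob
    return dict(dist)

print(total_distribution(2))   # T2: probabilities 1/9, 4/9, 4/9
print(total_distribution(3))   # T3: probabilities 1/27, 6/27, 12/27, 8/27
```

Dividing each total by n gives the corresponding distribution of X̄n.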
[Figure 3.3: Probability Histogram for Average: n = 2]
[Figure 3.4: Probability Histogram for Sum: n = 2]
We compute the mean and variance to be
$$E(\bar{X}_3) = \frac{2}{3} = \mu, \qquad \mathrm{Var}(\bar{X}_3) = \frac{2}{27} = \frac{\sigma^2}{3}.$$
The probability distribution of T3 is given below.

T3     0.0    1.0    2.0     3.0
prob   1/27   6/27   12/27   8/27

We compute the mean and variance to be
$$E(T_3) = 2 = 3\mu, \qquad \mathrm{Var}(T_3) = \frac{2}{3} = 3\sigma^2.$$
Even with a sample of size 3, the histograms are beginning to
look symmetric and more like a normal distribution. If we continue
this process, we will observe that the probability histograms start
resembling the Gaussian Distribution.
[Figure 3.5: Probability Histogram for Average: n = 3]
[Figure 3.6: Probability Histogram for Average: n = 20]
Theorem 3.3.1. Central Limit Theorem: Let X1, . . . , Xn be a sequence of iid random variables drawn from a population with finite mean µ and variance σ². Then for large n, the sampling distribution of the sample mean is approximately normal with mean
$$E(\bar{X}) = \mu$$
and variance
$$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}.$$
A similar statement can be written for the total.
Remark. If the population is known to be normal, then the
distribution of the sample mean is exactly normal for any sample
size n.
Remark. By large n, we usually mean a sample of at least 25
measurements.
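A quick numerical check of this remark, using a strongly skewed exponential population with µ = 1 and σ² = 1 (a sketch):

```python
import numpy as np

rng = np.random.default_rng(1)

n, reps = 25, 10_000
means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

print(means.mean())        # close to mu = 1
print(means.var(ddof=1))   # close to sigma^2 / n = 1/25 = 0.04
# A histogram of `means` is already close to N(1, 0.04).
```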
Example: A manufacturer of automobile batteries claims that the
distribution of the lifetimes of its best battery has an average of 54
months, and a standard deviation of 6 months. Suppose a consumer
group decides to check the claim by purchasing a sample of 50 of
these batteries and testing them.
(a) Describe the sampling distribution of the average lifetime of a
sample of 50 batteries.
(b) What is the probability that the sample has an average life of
52 months or fewer?
Solution: (a) Since the sample size is greater than 25, the sampling distribution of the average based on a sample of size 50 is approximately normal with mean
$$E(\bar{X}_{50}) = 54 \text{ months}$$
and variance
$$\mathrm{Var}(\bar{X}_{50}) = \frac{36}{50} = 0.72.$$
(b) We want to find P(X̄50 ≤ 52). The distribution is approximately normal, so we can standardize using the mean and standard deviation from (a); the standard deviation is √0.72 ≈ 0.85:
$$P(\bar{X}_{50} \le 52) = P\left(\frac{\bar{X}_{50} - 54}{0.85} \le \frac{52 - 54}{0.85}\right) = P(Z \le -2.35) = 0.0094.$$
The probability the consumer group will observe a sample average
of 52 or less is 0.0094 if the manufacturer’s claim is true. If the 50
tested batteries do result in an average of 52 or fewer months, the
consumer group will have strong evidence that the manufacturer’s
claim is untrue. Such an event is very unlikely to happen if the claim
is true.
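The probability can be verified directly; a sketch using scipy:

```python
import math
from scipy.stats import norm

mu, sigma, n = 54, 6, 50
se = sigma / math.sqrt(n)   # standard error, approx 0.8485

# P(Xbar_50 <= 52) under the manufacturer's claim.
print(norm.cdf(52, loc=mu, scale=se))   # approx 0.0092
print(norm.cdf(-2.35))                  # approx 0.0094, using the rounded z-score
```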
3.4 Normal Approximation to the Binomial
Let Y ∼ Bin(n, p). Y is the number of successes that are observed
in the n trials and p represents the probability of success in any trial.
Y = X1 + . . . + Xn, where the Xi's are iid Bernoulli(p) random variables.
An estimate of p, denoted by p̂, is the proportion of successes that
are observed in the n trials, i.e.,
$$\hat{p} = \frac{Y}{n}.$$
In order to use the normal approximation to the binomial, we
require
n p ≥ 5; n(1 − p) ≥ 5.
Theorem 3.4.1. The sampling distribution of p̂ is approximately normal with mean
$$E(\hat{p}) = p$$
and variance
$$\mathrm{Var}(\hat{p}) = \frac{p(1-p)}{n}.$$
Example: An airline has determined that the no-show rate for reservations is 10%. Suppose the next flight has 100 parties with advance reservations.
a. Find the probability that the number of no-shows is between
20 and 25.
b. Approximate the probability in (a). Justify the approximation.
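A sketch of both computations in Python, treating the number of no-shows as Bin(100, 0.1) and reading "between 20 and 25" as inclusive (an assumption):

```python
import math
from scipy.stats import binom, norm

n, p = 100, 0.1
mu = n * p                        # 10
sd = math.sqrt(n * p * (1 - p))   # 3

# (a) Exact binomial probability of 20 to 25 no-shows, inclusive.
exact = binom.cdf(25, n, p) - binom.cdf(19, n, p)

# (b) Normal approximation with continuity correction; justified because
#     np = 10 >= 5 and n(1 - p) = 90 >= 5.
approx = norm.cdf(25.5, mu, sd) - norm.cdf(19.5, mu, sd)

print(exact, approx)   # both on the order of 10^-3
```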
3.5 Distributions Derived From the Normal
Linear combinations of normally distributed random variables are
normal.
Theorem 3.5.1. Let X1, . . . , Xn be independent N(µi, σi²) random variables, and let
$$Y = \sum_{i=1}^{n} a_i X_i$$
be a linear combination of the Xi's, with a1, . . . , an constants. Then
$$Y \sim N\left(\sum_{i=1}^{n} a_i \mu_i,\; \sum_{i=1}^{n} a_i^2 \sigma_i^2\right).$$
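For example (numbers chosen purely for illustration), if X1 ∼ N(1, 4) and X2 ∼ N(2, 9) are independent, then taking a1 = 1 and a2 = −1 gives
$$X_1 - X_2 \sim N\big(1 - 2,\; 1^2 \cdot 4 + (-1)^2 \cdot 9\big) = N(-1, 13).$$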
3.5.1 The χ² distribution

Theorem 3.5.2. If Z ∼ N(0, 1), then Z² ∼ χ²(1).
Proof: The pdf of Z is given by
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \qquad -\infty < z < \infty.$$
Let Y = Z². This defines a transformation between Z and Y that is not one-to-one. The inverse solutions of y = z² are z = ±√y. Let z1 = −√y and z2 = √y.
The Jacobians of the two branches are
$$J_1 = \frac{d}{dy}(-\sqrt{y}) = \frac{-1}{2\sqrt{y}}, \qquad J_2 = \frac{d}{dy}(\sqrt{y}) = \frac{1}{2\sqrt{y}}.$$
The pdf of Y is given by
$$g(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y/2}\left|\frac{-1}{2\sqrt{y}}\right| + \frac{1}{\sqrt{2\pi}}\, e^{-y/2}\,\frac{1}{2\sqrt{y}} = \frac{1}{\sqrt{2\pi}}\, y^{1/2-1} e^{-y/2}, \qquad y > 0.$$
Since $\int_0^\infty g(y)\,dy = 1$, we have
$$1 = \frac{1}{\sqrt{2\pi}} \int_0^\infty y^{1/2-1} e^{-y/2}\, dy = \frac{\Gamma(1/2)}{\sqrt{\pi}},$$
since the function inside the integral sign resembles a Gamma pdf with α = 1/2 and β = 2. Recall that the pdf of a Gamma(α, β) random variable is
$$f(x) = \frac{1}{\beta^{\alpha}\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}, \qquad x > 0.$$
Therefore Γ(1/2) = √π, and the pdf of Y is given by
$$g(y) = \begin{cases} \dfrac{1}{\sqrt{2}\,\Gamma(1/2)}\, y^{1/2-1} e^{-y/2}, & y > 0;\\[4pt] 0, & \text{otherwise.} \end{cases}$$
[Figure 3.7: Sampling Distribution of Z² with superimposed Chi-squared distribution]
This is the pdf of a chi-squared random variable with 1 degree of
freedom. ■
We also showed that if X ∼ N(µ, σ²), then (X − µ)/σ ∼ N(0, 1). Therefore [(X − µ)/σ]² ∼ χ²(1).
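A simulation in the spirit of Figure 3.7 (a sketch; plotting omitted):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)

z = rng.standard_normal(100_000)
y = z**2   # squared standard normal draws

# Empirical quantiles of Z^2 agree with chi-squared(1) quantiles.
for q in (0.5, 0.9, 0.99):
    print(q, np.quantile(y, q), chi2.ppf(q, df=1))
```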
Theorem 3.5.3. If U1, . . . , Un are iid chi-squared random variables with 1 degree of freedom, then V = U1 + . . . + Un ∼ χ²(n).
This is the reproductive property of the chi-squared distribution.
[Figure 3.8: Sampling Distribution of Sum of 10 chi-squared Random variables]

Sampling Distribution of S²
Let X1, . . . , Xn be a random sample drawn from a normal population with mean µ and variance σ². The sample variance
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$$
is a random variable. We have
$$\begin{aligned}
\sum_{i=1}^{n}(X_i - \mu)^2 &= \sum_{i=1}^{n}\left[(X_i - \bar{X}) + (\bar{X} - \mu)\right]^2\\
&= \sum_{i=1}^{n}(X_i - \bar{X})^2 + \sum_{i=1}^{n}(\bar{X} - \mu)^2 + 2(\bar{X} - \mu)\sum_{i=1}^{n}(X_i - \bar{X})\\
&= \sum_{i=1}^{n}(X_i - \bar{X})^2 + n(\bar{X} - \mu)^2,
\end{aligned}$$
since $\sum_{i=1}^{n}(X_i - \bar{X}) = 0$.
Dividing each term by σ² and substituting (n − 1)S² for $\sum_{i=1}^{n}(X_i - \bar{X})^2$, we have
$$\frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i - \mu)^2 = \frac{(n-1)S^2}{\sigma^2} + \frac{(\bar{X} - \mu)^2}{\sigma^2/n}.$$
We know that
$$\sum_{i=1}^{n}\frac{(X_i - \mu)^2}{\sigma^2}$$
is a chi-squared random variable with n degrees of freedom. We also know that
$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1),$$
which implies that
$$\frac{(\bar{X} - \mu)^2}{\sigma^2/n}$$
is a chi-squared random variable with 1 degree of freedom. We then have the following result.
Theorem 3.5.4. Consider a random sample of size n from a normal population with mean µ and variance σ². Then
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$$
is a chi-squared random variable with n − 1 degrees of freedom.
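A quick simulation check of Theorem 3.5.4 (a sketch, with illustrative parameter values):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)

mu, sigma, n, reps = 10.0, 2.0, 8, 50_000
x = rng.normal(mu, sigma, size=(reps, n))

s2 = x.var(axis=1, ddof=1)        # sample variances S^2
stat = (n - 1) * s2 / sigma**2    # should behave like chi-squared(n - 1)

print(stat.mean(), chi2.mean(n - 1))                    # both approx 7
print(np.quantile(stat, 0.95), chi2.ppf(0.95, n - 1))   # close
```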
3.5.2 Student’s t-distribution

The Central Limit Theorem allows us to determine the sampling distribution of X̄ when σ is known. In many practical applications, the population variance is unknown and must be estimated from the data. Estimating σ introduces additional variability, and for small samples the resulting distribution deviates noticeably from the standard normal.
Theorem 3.5.5. Let Z ∼ N(0, 1) and V ∼ χ²(ν). If Z and V are independent, then
$$T = \frac{Z}{\sqrt{V/\nu}}$$
has a t-distribution with ν degrees of freedom.
Corollary 3.5.6. Let X1, . . . , Xn be a random sample from a normal population with mean µ and variance σ². Let
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad \text{and} \qquad S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2.$$
Then the random variable
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has a t-distribution with ν = n − 1 degrees of freedom.
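A simulation sketch of the corollary: for samples of size n = 5 from a normal population, the studentized mean matches t(4) quantiles rather than standard normal ones.

```python
import numpy as np
from scipy.stats import t, norm

rng = np.random.default_rng(4)

mu, n, reps = 0.0, 5, 100_000
x = rng.normal(mu, 1.0, size=(reps, n))

T = (x.mean(axis=1) - mu) / (x.std(axis=1, ddof=1) / np.sqrt(n))

q = 0.975
print(np.quantile(T, q))    # approx 2.78
print(t.ppf(q, df=n - 1))   # 2.776: the t(4) quantile matches
print(norm.ppf(q))          # 1.960: the normal quantile does not
```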
• The t-distribution is symmetric about 0.
• It is bell-shaped.
• The t-distribution is more variable than Z (the standard normal).
• The distribution is characterized by a single parameter ν, called the degrees of freedom (df).
• As the df increases, the t-distribution gets closer and closer to the normal curve.
• The result assumes the underlying population is normal; however, if the underlying population is not normal but is "nearly" bell-shaped, the distribution of T will still be approximately t.
• Tables of the percentage points of the t-distribution are available for different degrees of freedom.
[Figure 3.9: The t densities on 3 and 20 df, and the standard normal]
3.5.3 The F-distribution

Theorem 3.5.7. Let U ∼ χ²(ν1) and V ∼ χ²(ν2) be independent random variables. Then
$$F = \frac{U/\nu_1}{V/\nu_2}$$
has the F-distribution with ν1 and ν2 degrees of freedom.
The F-distribution is not symmetric.
Theorem 3.5.8. Let fα(ν1, ν2) be the value from the F-table that cuts off an area of α in the upper tail for ν1 and ν2 degrees of freedom. Then
$$f_{1-\alpha}(\nu_1, \nu_2) = \frac{1}{f_{\alpha}(\nu_2, \nu_1)}.$$
Theorem 3.5.9. Let S1² and S2² be the sample variances corresponding to two independent random samples of sizes n1 and n2 from normal populations with variances σ1² and σ2², respectively. Then
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$
has an F-distribution with ν1 = n1 − 1 and ν2 = n2 − 1 degrees of freedom.
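Both results are easy to check numerically; a sketch using scipy:

```python
import numpy as np
from scipy.stats import f

# Theorem 3.5.8: f_alpha cuts off upper-tail area alpha, so in scipy terms
# f_alpha(v1, v2) = f.ppf(1 - alpha, v1, v2).
v1, v2, alpha = 5, 10, 0.05
print(f.ppf(alpha, v1, v2))           # f_{1-alpha}(v1, v2)
print(1 / f.ppf(1 - alpha, v2, v1))   # 1 / f_alpha(v2, v1): the same value

# Theorem 3.5.9: simulated variance ratio vs. the F(n1-1, n2-1) quantile
# (population standard deviations chosen for illustration).
rng = np.random.default_rng(5)
n1, n2, sigma1, sigma2 = 6, 11, 1.0, 2.0
x = rng.normal(0, sigma1, size=(50_000, n1))
y = rng.normal(0, sigma2, size=(50_000, n2))
ratio = (x.var(axis=1, ddof=1) / sigma1**2) / (y.var(axis=1, ddof=1) / sigma2**2)
print(np.quantile(ratio, 0.95), f.ppf(0.95, n1 - 1, n2 - 1))   # close
```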