
Plaksha University: Technology Leaders Program

Dr. Nandini Kannan

email: nandini.kannan@plaksha.edu.in

Chapter 3

Sampling Distributions

3.1 Introduction

Statistics is the science that deals with

(a) the collection, organization, and summary of information about a particular topic of interest (Descriptive Statistics);

(b) drawing inferences about a population of interest using information obtained from a sample (Inferential Statistics).

Definition: A Parameter is a numerical measure associated with a population.

Definition: A Statistic is a numerical measure associated with a sample.

Example: Paper Boat would like to determine the sugar content of Nagpur oranges. How would you help Paper Boat answer this question?

A consumer group wants to determine the fuel efficiency of the new Honda SUV. How would you proceed?

In both cases, identify the parameter and the statistic of interest.

Because the value of a statistic depends on the particular sample drawn, it varies from sample to sample. Therefore, statistics are random variables.

We would like to know how the statistic changes over different samples.

Definition: The probability distribution of a statistic is called a sampling distribution.

Example: For the fuel efficiency problem, the sampling distribution of the mean could be approximated as follows:
• Draw a sample of 100 from the population of all vehicles manufactured in a given time period. Compute the sample mean.

• Repeat the process of drawing samples of size 100 several times.

• Each time a sample is selected, the value of the statistic (mean) is calculated.

• Draw the relative frequency histogram of these computed statistics.

If the process is repeated a large number of times, the histogram will provide an approximation of the sampling distribution.
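This procedure is easy to simulate. Below is a minimal sketch in Python; the population (normal fuel efficiencies with mean 15 km/l and standard deviation 2 km/l) is an assumed stand-in for illustration, not part of the example.

```python
# Approximate the sampling distribution of the mean by repeated sampling.
# The population parameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(seed=1)

n_samples = 5000     # number of repeated samples
sample_size = 100    # size of each sample

# Each row is one sample of 100 vehicles; each row mean is one draw
# from the sampling distribution of the mean.
samples = rng.normal(loc=15.0, scale=2.0, size=(n_samples, sample_size))
sample_means = samples.mean(axis=1)

print("mean of sample means:", sample_means.mean())  # close to 15
print("sd of sample means:  ", sample_means.std())   # close to 2/sqrt(100) = 0.2
```

The spread of the collected means, roughly σ/√n, previews the variance formula derived in the next section.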
Let X1, . . . , Xn be n mutually independent observations made of a particular quantitative phenomenon, for example, blood pressure. We assume that each observation Xi has the same probability distribution. We say that the n observations X1, . . . , Xn are independent and identically distributed (i.i.d.).

If X is described by a pmf, then

p(x_1, \ldots, x_n) = P(X_1 = x_1, \ldots, X_n = x_n) = p(x_1) \cdots p(x_n).

If X is described by a pdf, then

f(x_1, \ldots, x_n) = f(x_1) \cdots f(x_n).

X1, . . . , Xn are said to be a random sample of size n from the distribution of X.
3.2 Sampling Distribution of the Mean

Suppose we are interested in estimating the mean µX of a random variable X based on a random sample X1, . . . , Xn. We can estimate µX by the sample average

\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i.

Using properties of expectation, we have

\mu_{\bar{X}} = E(\bar{X}) = \mu_X

and

\sigma^2_{\bar{X}} = Var(\bar{X}) = \frac{\sigma^2_X}{n}.

Example: Consider a random sample of n = 12 from U(−1/2, 1/2). Let T = \sum_{i=1}^{n} X_i. Figure 3.1 shows a histogram of 1000 such sums with a superimposed normal pdf. Figure 3.2 shows a histogram of the distribution of X(n), the largest order statistic.
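A minimal sketch of the simulation behind these two figures (arbitrary seed, using numpy) might look as follows.

```python
# 1000 sums and 1000 maxima of n = 12 iid U(-1/2, 1/2) observations,
# as in Figures 3.1 and 3.2.
import numpy as np

rng = np.random.default_rng(seed=2)
x = rng.uniform(-0.5, 0.5, size=(1000, 12))

sums = x.sum(axis=1)    # T: mean 0, variance 12 * (1/12) = 1
maxima = x.max(axis=1)  # X(12): piles up near the upper endpoint 1/2

print("sum: mean %.3f, var %.3f" % (sums.mean(), sums.var()))
print("max: mean %.3f" % maxima.mean())
```

The sum already looks normal (its variance is 12 × 1/12 = 1, matching the superimposed pdf), while the maximum is strongly skewed toward 1/2, illustrating that not every statistic has a normal-looking sampling distribution.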
[Figure 3.1: Probability Histogram: Sum of 12 Uniform Random Variables]
[Figure 3.2: Probability Histogram: Max of 12 Uniform Random Variables]
3.3 Central Limit Theorem

Example: Suppose X is a discrete random variable with probability distribution given by

x      0     1
p(x)   1/3   2/3

i.e., the population consists of 0's and 1's: one-third of the population consists of 0's and two-thirds of the population consists of 1's.

We can compute the mean and variance. We have

E(X) = \mu = \frac{2}{3}, \qquad Var(X) = \sigma^2 = \frac{2}{9}.

Let X1, X2 be a random sample of size 2 drawn from this population. Both X1 and X2 can take values 0 and 1. There are four possible samples of size 2:

(0, 0), (0, 1), (1, 0), (1, 1).

We compute the average X̄2 and total T2 for all possible samples. The table is shown below.


X1   X2   X̄2    T2
0    0    0.0    0
0    1    0.5    1
1    0    0.5    1
1    1    1.0    2

Thus the average and total are both random variables, and we can compute their probability distributions. X̄2 can assume the values 0, 0.5, and 1.

P(X̄2 = 0) = P(X1 = 0 and X2 = 0)
           = P(X1 = 0) P(X2 = 0)   (by independence)
           = (1/3)(1/3)
           = 1/9

P(X̄2 = 0.5) = P(X1 = 0 and X2 = 1) + P(X1 = 1 and X2 = 0)
             = P(X1 = 0) P(X2 = 1) + P(X1 = 1) P(X2 = 0)
             = (1/3)(2/3) + (2/3)(1/3)
             = 4/9

P(X̄2 = 1) = P(X1 = 1 and X2 = 1)
           = P(X1 = 1) P(X2 = 1)
           = (2/3)(2/3)
           = 4/9

Similarly, the total T2 is a random variable taking values 0, 1, and 2. We can compute the probabilities in exactly the same way. The probability distributions can be summarized as follows:

X̄2     0.0   0.5   1.0
prob   1/9   4/9   4/9

We compute the mean and variance to be

E(\bar{X}_2) = \frac{2}{3} = \mu, \qquad Var(\bar{X}_2) = \frac{1}{9} = \frac{\sigma^2}{2}.

The probability distribution of T2 is given below.

T2     0     1     2
prob   1/9   4/9   4/9

We compute the mean and variance to be

E(T_2) = \frac{4}{3} = 2\mu, \qquad Var(T_2) = \frac{4}{9} = 2\sigma^2.

We can repeat this process by drawing a sample of size 3 from the population and computing the probability distributions of X̄3 and T3. The probability distribution of X̄3 is given below.

X̄3     0      1/3    2/3     1
prob   1/27   6/27   12/27   8/27

[Figure 3.3: Probability Histogram for Average: n=2]

[Figure 3.4: Probability Histogram for Sum: n=2]

We compute the mean and variance to be

E(\bar{X}_3) = \frac{2}{3} = \mu, \qquad Var(\bar{X}_3) = \frac{2}{27} = \frac{\sigma^2}{3}.

The probability distribution of T3 is given below.

T3     0      1      2       3
prob   1/27   6/27   12/27   8/27

We compute the mean and variance to be

E(T_3) = 2 = 3\mu, \qquad Var(T_3) = \frac{2}{3} = 3\sigma^2.

Even with a sample of size 3, the histograms are beginning to look symmetric and more like a normal distribution. If we continue this process, we will observe that the probability histograms start resembling the Gaussian distribution.
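Because this population takes only the values 0 and 1, the sampling distribution of the mean can be enumerated exactly rather than simulated. The sketch below reproduces the tables above for n = 2 and n = 3; the function name is ours, for illustration only.

```python
# Exact sampling distribution of the mean for the 0-1 population
# with P(X = 0) = 1/3 and P(X = 1) = 2/3.
from fractions import Fraction
from itertools import product
from collections import defaultdict

p = {0: Fraction(1, 3), 1: Fraction(2, 3)}

def sampling_distribution_of_mean(n):
    dist = defaultdict(Fraction)
    for sample in product(p, repeat=n):   # all 2**n possible samples
        prob = Fraction(1)
        for xi in sample:
            prob *= p[xi]                 # independence: multiply marginals
        dist[Fraction(sum(sample), n)] += prob
    return dict(dist)

for n in (2, 3):
    d = sampling_distribution_of_mean(n)
    mean = sum(v * q for v, q in d.items())
    var = sum(v**2 * q for v, q in d.items()) - mean**2
    print(n, sorted(d.items()), "mean =", mean, "var =", var)
# n = 2: mean = 2/3, var = 1/9;  n = 3: mean = 2/3, var = 2/27
```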


[Figure 3.5: Probability Histogram for Average: n=3]

[Figure 3.6: Probability Histogram for Average: n=20]

Theorem 3.3.1. Central Limit Theorem: Let X1, . . . , Xn be a sequence of iid random variables drawn from a population with finite mean µ and variance σ². Then for large n, the sampling distribution of the sample mean is approximately normal with mean

E(\bar{X}) = \mu

and variance

Var(\bar{X}) = \frac{\sigma^2}{n}.

A similar statement can be written for the total.

Remark. If the population is known to be normal, then the distribution of the sample mean is exactly normal for any sample size n.

Remark. By large n, we usually mean a sample of at least 25 measurements.
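As an illustration, the sketch below applies the theorem to a strongly skewed population; the exponential population and the sample size of 25 are arbitrary choices for the demonstration.

```python
# CLT demonstration: means of samples of size 25 from an exponential
# population are already close to normal.
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(seed=3)
draws = rng.exponential(scale=1.0, size=(20_000, 25))
means = draws.mean(axis=1)

print("skewness of raw draws:    %.2f" % skew(draws.ravel()))  # ~ 2
print("skewness of sample means: %.2f" % skew(means))          # ~ 0.4, much reduced
print("mean %.3f (mu = 1), sd %.3f (sigma/sqrt(25) = 0.2)"
      % (means.mean(), means.std()))
```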

Example: A manufacturer of automobile batteries claims that the distribution of the lifetimes of its best battery has an average of 54 months and a standard deviation of 6 months. Suppose a consumer group decides to check the claim by purchasing a sample of 50 of these batteries and testing them.

(a) Describe the sampling distribution of the average lifetime of a sample of 50 batteries.

(b) What is the probability that the sample has an average life of 52 months or fewer?

Solution: (a) Since the sample size is greater than 25, the sampling distribution of the average based on a sample of size 50 is approximately normal with mean

E(\bar{X}_{50}) = 54 \text{ months}

and variance

Var(\bar{X}_{50}) = \frac{36}{50} = 0.72,

so the standard deviation of \bar{X}_{50} is \sqrt{0.72} \approx 0.85 months.
(b) We want to find P(\bar{X}_{50} \le 52). We know the distribution is approximately normal, so we can standardize using the mean and standard deviation computed in (a):

P(\bar{X}_{50} \le 52) = P\left( \frac{\bar{X}_{50} - 54}{0.85} \le \frac{52 - 54}{0.85} \right) = P(Z \le -2.35) = 0.0094.

The probability that the consumer group will observe a sample average of 52 months or less is 0.0094 if the manufacturer's claim is true. If the 50 tested batteries do result in an average of 52 or fewer months, the consumer group will have strong evidence that the manufacturer's claim is untrue, since such an event is very unlikely to happen if the claim is true.
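As a numerical check, the sketch below recomputes the probability without first rounding the standard error to 0.85; the unrounded answer is about 0.0092, versus 0.0094 from the rounded z-value.

```python
# P(Xbar_50 <= 52) under the claimed mean 54 and sd 6.
from math import sqrt, erf

def normal_cdf(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma, n = 54.0, 6.0, 50
se = sigma / sqrt(n)        # standard error: 6/sqrt(50) = 0.8485...
z = (52.0 - mu) / se        # -2.357...
print("z = %.3f, P = %.4f" % (z, normal_cdf(z)))  # z = -2.357, P = 0.0092
```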
3.4 Normal Approximation to the Binomial

Let Y ∼ Bin(n, p). Y is the number of successes observed in the n trials, and p represents the probability of success in any trial. We can write Y = X1 + . . . + Xn, where the Xi's are iid Bernoulli random variables. An estimate of p, denoted by p̂, is the proportion of successes observed in the n trials, i.e.,

\hat{p} = \frac{Y}{n}.

In order to use the normal approximation to the binomial, we require

np \ge 5 \quad \text{and} \quad n(1 - p) \ge 5.

Theorem 3.4.1. The sampling distribution of p̂ is approximately normal with mean p and variance

\frac{p(1 - p)}{n}.
Example: An airline has determined that the no-show rate for reservations is 10%. Suppose the next flight has 100 parties with advance reservations.

a. Find the probability that the number of no-shows is between 20 and 25.

b. Approximate the probability in (a). Justify the approximation.
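One way to work both parts (reading "between 20 and 25" as inclusive, which is our assumption) is sketched below: the exact binomial probability next to a continuity-corrected normal approximation.

```python
# Exact vs. approximate P(20 <= Y <= 25) for Y ~ Bin(100, 0.1).
from math import sqrt
from scipy.stats import binom, norm

n, p = 100, 0.1
exact = binom.cdf(25, n, p) - binom.cdf(19, n, p)

mu, sd = n * p, sqrt(n * p * (1 - p))   # mean 10, sd 3
approx = norm.cdf((25.5 - mu) / sd) - norm.cdf((19.5 - mu) / sd)

# Justification: np = 10 >= 5 and n(1 - p) = 90 >= 5, so the normal
# approximation is reasonable here.
print("exact  = %.5f" % exact)
print("approx = %.5f" % approx)
```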
3.5 Distributions Derived From the Normal

Linear combinations of normally distributed random variables are normal.

Theorem 3.5.1. Let X1, . . . , Xn be independent N(\mu_i, \sigma_i^2) random variables, and let Y = \sum_{i=1}^{n} a_i X_i be a linear combination of the Xi's with a1, . . . , an constants. Then

Y \sim N\left( \sum_{i=1}^{n} a_i \mu_i, \; \sum_{i=1}^{n} a_i^2 \sigma_i^2 \right).

3.5.1 The χ² distribution

Theorem 3.5.2. If Z ∼ N(0, 1), then Z² ∼ χ²(1).

Proof: The pdf of Z is given by

f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}, \quad -\infty < z < \infty.

Let Y = Z². This defines a transformation between the values of Z and Y that is not one-to-one. The inverse solutions of y = z² are z = ±√y. Let z1 = −√y and z2 = √y.
The Jacobian of the transformation is given by

J_1 = \frac{d}{dy}(-\sqrt{y}) = \frac{-1}{2\sqrt{y}}; \qquad J_2 = \frac{d}{dy}(\sqrt{y}) = \frac{1}{2\sqrt{y}}.

The pdf of Y is given by

g(y) = \frac{1}{\sqrt{2\pi}} e^{-y/2} \left| \frac{-1}{2\sqrt{y}} \right| + \frac{1}{\sqrt{2\pi}} e^{-y/2} \frac{1}{2\sqrt{y}} = \frac{1}{\sqrt{2\pi}} y^{1/2 - 1} e^{-y/2}

for y > 0. Since \int_0^\infty g(y)\,dy = 1, we have

1 = \frac{1}{\sqrt{2\pi}} \int_0^\infty y^{1/2 - 1} e^{-y/2}\,dy = \frac{\sqrt{2}\,\Gamma(1/2)}{\sqrt{2\pi}} = \frac{\Gamma(1/2)}{\sqrt{\pi}},

since the function within the integral sign resembles the pdf of a Gamma random variable with α = 1/2 and β = 2.

Recall the pdf of a Gamma(α, β) rv is

f(x) = \frac{1}{\beta^\alpha \Gamma(\alpha)} x^{\alpha - 1} e^{-x/\beta}, \quad x > 0.

Therefore Γ(1/2) = \sqrt{\pi}, and the pdf of Y is given by

g(y) = \begin{cases} \dfrac{1}{\sqrt{2}\,\Gamma(1/2)}\, y^{1/2 - 1} e^{-y/2}, & y > 0; \\ 0, & \text{otherwise.} \end{cases}
[Figure 3.7: Sampling Distribution of Z² with superimposed Chi-squared distribution]

This is the pdf of a chi-squared random variable with 1 degree of freedom. ■

We also showed that if X ∼ N(µ, σ²), then (X − µ)/σ ∼ N(0, 1). Therefore [(X − µ)/σ]² ∼ χ²(1).
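A quick simulation check of Theorem 3.5.2 (arbitrary seed and sample size) is sketched below.

```python
# Squares of standard normals should follow a chi-squared distribution
# with 1 degree of freedom (mean 1, variance 2).
import numpy as np
from scipy.stats import chi2, kstest

rng = np.random.default_rng(seed=4)
y = rng.normal(size=100_000) ** 2

print("sample mean %.3f (chi2(1) mean = 1)" % y.mean())
print("sample var  %.3f (chi2(1) var  = 2)" % y.var())
print(kstest(y, chi2(df=1).cdf))  # p-value should not be small
```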

Theorem 3.5.3. If U1, . . . , Un are iid chi-squared random variables with 1 degree of freedom, then V = U1 + . . . + Un ∼ χ²(n).

This is the reproductive property of the chi-squared distribution.

[Figure 3.8: Sampling Distribution of Sum of 10 chi-squared Random Variables]

Sampling Distribution of S²

Let X1, . . . , Xn be a random sample drawn from a normal population with mean µ and variance σ². The sample variance

S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2

is a random variable. We have


n
X n
X
2
(Xi − µ) = [(Xi − X̄) + (X̄ − µ)]2
i=1 i=1
Xn n
X n
X
= (Xi − X̄)2 + (X̄ − µ)2 + 2(X̄ − µ) (Xi − X̄)
i=1 i=1 i=1
Xn
= (Xi − X̄)2 + n(X̄ − µ)2.
i=1
2 2
Pn
Dividing each term by σ and substituting (n − 1)S for i=1 (Xi −

26
X̄)2, we have
n
1 X 2 (n − 1)S 2 (X̄ − µ)2
(Xi − µ) = +
σ 2 i=1 σ2 σ 2/n

We know that

\sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2}

is a chi-squared random variable with n degrees of freedom. We also know that

\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1),

which implies

\frac{(\bar{X} - \mu)^2}{\sigma^2/n}

is a chi-squared random variable with 1 degree of freedom. We then have the following result.

Theorem 3.5.4. Consider a random sample of size n from a normal population with mean µ and variance σ². Then

\chi^2 = \frac{(n - 1)S^2}{\sigma^2}

is a chi-squared random variable with n − 1 degrees of freedom.
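A simulation check of Theorem 3.5.4 is sketched below; the population parameters and sample size are arbitrary choices.

```python
# (n - 1) S^2 / sigma^2 for normal samples should be chi-squared with
# n - 1 degrees of freedom (mean n - 1, variance 2(n - 1)).
import numpy as np

rng = np.random.default_rng(seed=5)
mu, sigma, n = 10.0, 2.0, 8

x = rng.normal(mu, sigma, size=(50_000, n))
s2 = x.var(axis=1, ddof=1)          # sample variances S^2
stat = (n - 1) * s2 / sigma**2

print("mean %.3f (df = %d)" % (stat.mean(), n - 1))          # ~ 7
print("var  %.3f (2*df = %d)" % (stat.var(), 2 * (n - 1)))   # ~ 14
```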
3.5.2 Student's t-distribution

The Central Limit Theorem allows us to determine the sampling distribution of X̄ when σ is known. In many practical applications, the population variance is unknown and must be estimated from the data. This introduces additional variability and, particularly for small samples, produces a distribution that deviates noticeably from the standard normal.

Theorem 3.5.5. Let Z ∼ N(0, 1) and V ∼ χ²(ν). If Z and V are independent, then

T = \frac{Z}{\sqrt{V/\nu}}

has a t-distribution with ν degrees of freedom.

Corollary 3.5.6. Let X1, . . . , Xn be a random sample from a normal population with mean µ and variance σ². Let

\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \quad \text{and} \quad S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2.

Then the random variable

T = \frac{\bar{X} - \mu}{S/\sqrt{n}}

has a t-distribution with ν = n − 1 degrees of freedom.
• The t-distribution is symmetric about 0.

• It is bell-shaped.

• The t-distribution is more variable than Z (the standard normal).

• The distribution is characterized by a single parameter ν, called the degrees of freedom (df).

• As the df increases, the t-distribution gets closer and closer to the normal curve.

• It assumes the underlying population is normal; however, if the underlying population is not normal but is "nearly" bell-shaped, the distribution of T will be approximately a t.

• Tables of the percentage points for the t-distribution are available for different degrees of freedom.
[Figure 3.9: The t densities on 1 and 20 df, and the standard normal]
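The heavier tails are easy to see numerically; the sketch below compares upper 2.5% points of several t-distributions with the standard normal value.

```python
# t quantiles approach the normal quantile 1.960 as df grows.
from scipy.stats import t, norm

for df in (1, 3, 20, 100):
    print("df = %3d: t_0.975 = %6.3f" % (df, t.ppf(0.975, df)))
print("normal:   z_0.975 = %6.3f" % norm.ppf(0.975))
```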

3.5.3 The F-distribution

Theorem 3.5.7. Let U ∼ χ²(ν1) and V ∼ χ²(ν2) be independent random variables. Then

F = \frac{U/\nu_1}{V/\nu_2}

has the F-distribution with ν1 and ν2 degrees of freedom.

The F-distribution is not symmetric.

Theorem 3.5.8. Let f_\alpha(\nu_1, \nu_2) be the value from the F-table that cuts off an area of α in the upper tail for ν1 and ν2 degrees of freedom. Then

f_{1-\alpha}(\nu_1, \nu_2) = \frac{1}{f_\alpha(\nu_2, \nu_1)}.

Theorem 3.5.9. Let S_1^2 and S_2^2 be the variances corresponding to two independent random samples of sizes n1 and n2 from normal populations with variances \sigma_1^2 and \sigma_2^2, respectively. Then

F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}

has an F-distribution with ν1 = n1 − 1 and ν2 = n2 − 1 degrees of freedom.
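Both the reciprocal identity of Theorem 3.5.8 and the variance-ratio result of Theorem 3.5.9 can be checked numerically; the sketch below uses arbitrary sample sizes and equal population variances, so the ratio S1²/S2² itself should follow the F-distribution.

```python
import numpy as np
from scipy.stats import f

# Theorem 3.5.8: f_a(v1, v2) in the text is the upper-tail point, ppf(1 - a).
nu1, nu2, alpha = 5, 10, 0.05
lower = f.ppf(alpha, nu1, nu2)            # f_{1-alpha}(nu1, nu2)
recip = 1.0 / f.ppf(1 - alpha, nu2, nu1)  # 1 / f_alpha(nu2, nu1)
print("%.4f == %.4f" % (lower, recip))

# Theorem 3.5.9 with equal variances: S1^2/S2^2 ~ F(n1 - 1, n2 - 1).
rng = np.random.default_rng(seed=6)
n1, n2 = 6, 11
s1 = rng.normal(size=(20_000, n1)).var(axis=1, ddof=1)
s2 = rng.normal(size=(20_000, n2)).var(axis=1, ddof=1)
ratio = s1 / s2
print("simulated mean %.3f vs F(5, 10) mean 10/8 = 1.25" % ratio.mean())
```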
