
Lecture 5 - Sampling Distribution

The document outlines Lecture 6, focusing on the distribution of sample means and the central limit theorem. It includes examples of pizza order completion times with varying sample sizes and discusses the implications of increasing sample size on the distribution of sample means. Key points include that larger samples yield more normal distributions and that the standard error decreases as sample size increases.

Uploaded by

ningning02122005

SAMPLING DISTRIBUTION

Reading materials:
Chap 9 (Keller)

Lecture 6 Outline

• Distribution of sample means
• The central limit theorem

Distribution of Samples: example (1)

• Data were collected on the time taken for a pizza order to be completed in minutes (from order taken to pizza handed over to customer). Below is a histogram of 50 observations and some summary statistics.

[Histogram: Frequency vs. Pizza time, 50 observations]

Variable      N      Mean     Median   StDev
Pizza time    50     17.256   17.041   3.743

Another 50 observations; 1000 observations, on the time to complete a pizza order (2)

[Histograms: Frequency vs. Pizza time, 50 observations and 1000 observations]

Variable      N      Mean     Median   StDev
Pizza time    50     17.585   17.374   3.872
Pizza time    1000   17.934   17.627   4.009

10,000 observations on the time to complete a pizza order (3)

[Histogram: Frequency vs. Pizza time, 10,000 observations]

Variable      N      Mean     Median   StDev
Pizza time    10000  18.046   17.744   4.006

General notice

• When the sample size gets large (approaches infinity), the histogram of the sample approaches the shape of the underlying distribution, which for the pizza times is approximately normal.
Distribution of sample means

• One thousand datasets, each with 10 observations in it (that is, 1 thousand samples of size 10) are generated (simulated data) from this model and for each sample, the average (sample mean), median (sample median) and sample standard deviation are calculated and recorded.

[Histograms: Frequency vs. average; Frequency vs. median]

Variable      N      Mean     Median   StDev
average       1000   18.007   18.020   1.231
median        1000   17.757   17.804   1.433

S.D for the 1000 random samples of size 10

[Histogram: Frequency vs. stdev]

Variable      N      Mean     Median   StDev
stdev         1000   3.8183   3.7282   0.9505

More random numbers

• Another thousand datasets are generated from the same model, but this time each dataset has 25 observations.

[Histograms: Frequency vs. average; Frequency vs. median]

Variable      N      Mean     Median   StDev
average       1000   17.991   17.982   0.814
median        1000   17.711   17.675   1.017

S.D for samples of size 25

[Histogram: Frequency vs. stdev]

Variable      N      Mean     Median   StDev
stdev         1000   3.9637   3.9391   0.6048
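The two simulations described above (one thousand samples of size 10, then of size 25) can be sketched in a few lines of Python. The slides do not say which model the data were simulated from, so this sketch assumes a normal model with mean 18 and standard deviation 4, values chosen only because they are close to the summary statistics of the 10,000 observations. The function names `simulate_sample_statistics` and `summarise` are illustrative, and the numbers it prints will not match the tables exactly.

```python
import random
import statistics


def simulate_sample_statistics(sample_size, n_samples=1000, mu=18.0, sigma=4.0, seed=0):
    """Draw n_samples samples and record mean, median and stdev of each one.

    Assumption: the model is Normal(mu, sigma); the slides do not name it.
    """
    rng = random.Random(seed)
    means, medians, stdevs = [], [], []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
        stdevs.append(statistics.stdev(sample))  # sample standard deviation
    return {"average": means, "median": medians, "stdev": stdevs}


def summarise(values):
    """Return (N, Mean, Median, StDev), the columns used in the slide tables."""
    return (
        len(values),
        statistics.mean(values),
        statistics.median(values),
        statistics.stdev(values),
    )


for n in (10, 25):
    stats = simulate_sample_statistics(n)
    print(f"samples of size {n}")
    for name, values in stats.items():
        count, mean, median, stdev = summarise(values)
        print(f"  {name:8s} {count:5d} {mean:8.3f} {median:8.3f} {stdev:8.3f}")
```

Under this assumed model, the StDev column for "average" should come out noticeably smaller for samples of size 25 than for samples of size 10, which is the pattern the tables show.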

Notice, as we take larger samples….

• The histograms for all three statistics (sample mean, sample median and sample standard deviation) are becoming more and more symmetric and bell-shaped and less variable, particularly those for the sample mean.
• Also notice that the estimated standard deviation of the sample mean is not only decreasing as sample size increases, but is also approximately the same for the same sample sizes.

A general result of great importance

• No matter what model a random sample is taken from, as the sample size (number of random observations) increases, the distribution of the sample mean becomes closer and closer to the normal distribution. And
• No matter what model a random sample is taken from, and for any sample size n, the standard deviation of the sample mean is the model standard deviation, σ (the theoretical standard deviation), divided by √n, that is, σ/√n. This is called the standard error of the mean (SE).
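The σ/√n result can be checked against the simulation tables. The true model σ is not given in the slides, so the sketch below assumes the StDev of the 10,000 observations (4.006) is a reasonable stand-in for it; `standard_error` is an illustrative helper name.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)


# Assumption: 4.006 (StDev of the 10,000 observations) approximates the model sigma.
sigma_hat = 4.006

# StDev of the 1000 simulated sample means, copied from the slide tables.
observed = {10: 1.231, 25: 0.814}

for n, simulated in observed.items():
    theoretical = standard_error(sigma_hat, n)
    print(f"n = {n}: sigma/sqrt(n) = {theoretical:.3f}, simulated = {simulated}")
```

With this stand-in, σ/√n is about 1.267 for n = 10 and about 0.801 for n = 25, close to the simulated values 1.231 and 0.814.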
The Central Limit Theorem (1)

• Whatever the population distribution looks like (normal or not), when the sample size is large enough, the distribution of sample means will be approximately normal, and we can use the Z-statistic to calculate probabilities for the sample mean.

The Central Limit Theorem (2)

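This is the Central Limit Theorem

• If X is a random variable with mean µ and variance σ², and X̄ is the mean of a random sample of n observations of X, then in general,

\[ \bar{X} \approx N\!\left(\mu, \frac{\sigma^2}{n}\right) \]

\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}, \qquad Z \sim N(0, 1) \ \text{as } n \to \infty. \]

So, how large does n need to be?

[Figure: Sampling error]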

So, how large does n need to be?

• It depends on the original distribution of X.
– If X has a normal distribution, then the sample mean has a normal distribution for all sample sizes.
– If X has a distribution that is close to normal, the approximation is good for small sample sizes (e.g. n=20).
– If X has a distribution that is far from normal, the approximation requires larger sample sizes (e.g. n=50).

In general
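One way to see the dependence on the original distribution is to start from a population that is clearly not normal and watch the sample means become more symmetric as n grows. The sketch below assumes an exponential population with mean 18 (a choice made only for illustration; it is not the model used in the slides) and measures asymmetry with the moment-based skewness, which is 0 for a symmetric distribution. The names `skewness` and `sample_means` are illustrative.

```python
import random
import statistics


def skewness(values):
    """Moment-based sample skewness; 0 means perfectly symmetric."""
    m = statistics.fmean(values)
    m2 = statistics.fmean([(v - m) ** 2 for v in values])
    m3 = statistics.fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5


def sample_means(n, reps=4000, mean=18.0, seed=1):
    """Means of `reps` samples of size n from an exponential population.

    Assumption: exponential with the given mean, chosen because it is
    strongly right-skewed.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0 / mean) for _ in range(n)])
        for _ in range(reps)
    ]


for n in (1, 20, 50):
    print(f"n = {n:2d}: skewness of sample means = {skewness(sample_means(n)):.2f}")
```

For n = 1 the "sample means" are just the raw observations, so the skewness is large; it shrinks towards 0 as n increases, in line with the guidance that a far-from-normal X needs a larger n.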
Activity 1

• The average height of Vietnamese women is 1.6m, with a standard deviation of 0.2m. If I choose 25 women at random, what is the probability that their average height is less than 1.53m?
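A possible way to work Activity 1 with the Z-statistic is sketched below. It assumes the mean height of 25 randomly chosen women is approximately normal with mean µ = 1.6 and standard error σ/√n = 0.2/√25; `prob_sample_mean_below` is an illustrative helper, not part of the lecture.

```python
import math
from statistics import NormalDist


def prob_sample_mean_below(x_bar, mu, sigma, n):
    """Return (z, P(sample mean < x_bar)) under the normal approximation."""
    se = sigma / math.sqrt(n)      # standard error of the mean
    z = (x_bar - mu) / se          # Z-statistic for the sample mean
    return z, NormalDist().cdf(z)  # standard normal CDF evaluated at z


z, p = prob_sample_mean_below(1.53, mu=1.6, sigma=0.2, n=25)
print(f"z = {z:.2f}, P(mean < 1.53) = {p:.4f}")
```

Here the standard error is 0.04, so z = (1.53 - 1.6)/0.04 = -1.75 and the probability is about 0.04.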
