Sampling Distribution and Estimation: Topic 3

This document discusses statistical estimation and sampling distributions. It defines key terms like population parameters, statistics, estimates, and estimators. The document explains that estimators like the sample mean and variance are used to calculate point estimates of population values. It also introduces the concept of a sampling distribution, which describes the distribution of all possible values that a statistic like the sample mean could take on from samples of a given size. The document provides examples to illustrate concepts like sampling error versus non-sampling error and how to calculate the sampling distribution of the sample mean from a given population.

Uploaded by

taiiq zhou


Topic 3

Sampling Distribution and Estimation

Section 1 – Estimation
Section 1.1 Point Estimate

Statistical inference enables us to make judgments about a
population on the basis of sample information.

The mean, standard deviation, and proportions of a population
are called population parameters; in other words, they serve to
define the population.

Estimating a population's parameters is essential to statistical
analysis, and sometimes sampling is the best (fastest and most
economical) way to approach the study.


Definitions:

A parameter or population parameter is a characteristic of an
entire population.

A statistic is a summary measure that is computed to describe a
characteristic for only a sample of the population.

An estimate is a specific observed value of a statistic.

Section 1 – Estimation
Section 1.2 Estimator

The rule that specifies how a sample statistic can be obtained
for estimating the population parameter is called an estimator.
It is the random variable, defined by a formula, from which we
obtain all possible estimates.

The point estimate is the single number that is obtained from
the estimator. It is a single value, calculated from only one
sample, used to estimate a population parameter.

Point estimation is a process that generates specific numbers,
each of which is a point estimate.


The symbols we use to represent several important population
parameters and their sample counterparts:

                     Population parameter   Sample statistic
Mean                 $\mu$                  $\bar{X}$
Standard deviation   $\sigma$               $s$
Variance             $\sigma^2$             $s^2$
Proportion           $p$                    $\hat{p}$


Example:

If a professor wants information on central tendency in a list
of test scores, she can calculate a sample mean.

The number for the sample mean is called the estimate, and the
sample mean is the estimator for the population mean.


Example:

Suppose that a professor, whose course has an enrollment of 50
students, wants information on the performance of his class.

He takes a sample of 10 scores:

95, 67, 89, 70, 56, 97, 68, 78, 50, 79


The estimator for the population mean is the sample mean,
$\bar{X}$.

The estimate for the population mean, on the basis of the 10
sample scores, is

$\bar{X} = \frac{95 + 67 + \cdots + 79}{10} = 74.9$

The estimator for the population variance is the sample
variance, $s^2$.

The estimate of the population variance is

$s^2 = \frac{(95^2 + 67^2 + \cdots + 79^2) - 10(74.9)^2}{10 - 1} \approx 247.66$

The professor can use $\bar{X} = 74.9$ and $s^2 \approx 247.66$ to
analyze the performance of the class. The formula for
combinations reveals that there are
$\binom{50}{10} \approx 10{,}272{,}278{,}000$ possible samples of
10 scores, and therefore that many possible estimates each for
the population mean and the population variance.
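
As a quick numerical check of the two point estimates in this example, the following Python sketch recomputes them with the standard library. The variable names are illustrative and are not part of the original notes.

```python
from statistics import mean, variance

# The professor's sample of 10 scores.
scores = [95, 67, 89, 70, 56, 97, 68, 78, 50, 79]

x_bar = mean(scores)      # point estimate of the population mean
s2 = variance(scores)     # point estimate of the population variance (divisor n - 1)

print(f"sample mean     = {x_bar:.2f}")
print(f"sample variance = {s2:.2f}")
```

Note that `statistics.variance` divides by n - 1, which matches the estimator $s^2$ used here; `statistics.pvariance` would divide by n instead.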


Definition:

An interval estimate is constructed around the point estimate,
and it is stated that this interval is likely to contain the
corresponding population parameter. Interval estimates indicate
the precision, or accuracy, of an estimate and are therefore
preferable.

In order to study the interval estimate in depth, we have to
study the sampling distributions of the estimators (i.e.
$\bar{X}$, $s^2$, and $\hat{p}$).

Section 2 – Sampling Distribution
Section 2.1 Sampling Distribution of the Sample Mean

Definition:

The population distribution is the probability distribution of
the population data.

The probability distribution of $\bar{X}$ is called the sampling
distribution of $\bar{X}$. It lists the various values that
$\bar{X}$ can assume and the probability of each value of
$\bar{X}$.


Example:

Suppose there are only five students in an advanced statistics
class and the midterm scores of these five students are:

70  78  80  80  95


Let $X$ denote the score of a student. Using single-valued
classes, the frequency distribution of scores is as follows:

$x$    $f$    $f(x)$
70     1      0.2
78     1      0.2
80     2      0.4
95     1      0.2

The values of the mean and standard deviation calculated for
this probability distribution give the values of the population
parameters $\mu = 80.6$ and $\sigma = 8.0895$.


Example:

Reconsider the population of midterm scores of five students
given in the previous example.

Consider all possible samples of three scores each that can be
selected, without replacement, from that population.

The total number of possible samples is $\binom{5}{3} = 10$.


Suppose we assign letters A, B, C, D, and E to the scores of the
five students so that

A = 70, B = 78, C = 80, D = 80, E = 95.

Then the 10 possible samples of three scores each are

ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE.

These 10 samples and their respective means are listed in the
following table:

Sample   Scores in the sample   $\bar{X}$
ABC      70  78  80             76.00
ABD      70  78  80             76.00
ABE      70  78  95             81.00
ACD      70  80  80             76.67
ACE      70  80  95             81.67
ADE      70  80  95             81.67
BCD      78  80  80             79.33
BCE      78  80  95             84.33
BDE      78  80  95             84.33
CDE      80  80  95             85.00


Using the values of $\bar{X}$, we record the frequency
distribution of $\bar{X}$ as follows:

$\bar{x}$   $f$   $f(\bar{x})$
76.00       2     0.2
76.67       1     0.1
79.33       1     0.1
81.00       1     0.1
81.67       2     0.2
84.33       2     0.2
85.00       1     0.1
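
The sampling distribution of the sample mean for the five midterm scores can also be built by brute-force enumeration. The sketch below is a minimal Python illustration, assuming the letter labels A to E for the five scores; it uses exact fractions so that equal sample means are grouped correctly.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Population of five midterm scores, labelled A to E.
population = {"A": 70, "B": 78, "C": 80, "D": 80, "E": 95}

# All samples of three students drawn without replacement.
samples = list(combinations(population, 3))
means = [Fraction(sum(population[s] for s in sample), 3) for sample in samples]

# Relative frequency of each distinct sample mean.
counts = Counter(means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for m, prob in distribution.items():
    print(f"{float(m):6.2f}  f = {counts[m]}  relative frequency = {float(prob):.1f}")

# Mean of the sampling distribution of the sample mean.
mean_of_means = sum(m * prob for m, prob in distribution.items())
print("mean of the sample means:", float(mean_of_means))
```

The mean of the ten sample means equals the population mean of the five scores, which is the property the later slides state in general.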

Sampling error is the difference between the value of a sample
statistic and the value of the corresponding population
parameter. In the case of the mean,

Sampling error $= \bar{X} - \mu$

assuming that the sample is random and no non-sampling error has
been made. A sampling error occurs because of chance.

Non-sampling errors are errors that occur in the collection,
recording, and tabulation of data. Such errors occur because of
human mistakes and not chance.


Comparison between Sampling and Non-Sampling Errors

Sampling errors
• occur only when a sample survey is conducted
• are impossible to avoid

Non-sampling errors
• occur both in a sample survey and in a census
• can be minimized by preparing the survey questionnaire
  carefully and handling the data cautiously


Example:

Reconsider the population of midterm scores of five students
given in the previous example.

The population mean is

$\mu = \frac{70 + 78 + 80 + 80 + 95}{5} = 80.60$


Now suppose we take a random sample of three scores from this
population. Assume that this sample includes the scores 70, 80,
and 95. The mean for this sample is

$\bar{X} = \frac{70 + 80 + 95}{3} = 81.67$

Consequently,

Sampling error $= \bar{X} - \mu = 81.67 - 80.60 = 1.07$

That is, the mean score estimated from the sample is 1.07 higher
than the mean score of the population. Note that this difference
occurred due to chance, that is, because we used a sample
instead of the population.


Now suppose that, when we select the above-mentioned sample, we
mistakenly record the second score as 82 instead of 80. As a
result, we calculate the sample mean as

$\bar{X} = \frac{70 + 82 + 95}{3} = 82.33$

Consequently, the difference between this sample mean and the
population mean is

$\bar{X} - \mu = 82.33 - 80.60 = 1.73$

However, this difference between the sample mean and the
population mean does not represent the sampling error.

As we calculated earlier, only 1.07 of this difference is due to
the sampling error.

The remaining portion, which is equal to $1.73 - 1.07 = 0.66$,
represents the non-sampling error because it occurred due to the
error we made in recording the second score in the sample.

Therefore, sampling error = 1.07 and non-sampling error = 0.66.


The mean and standard deviation calculated for the sampling
distribution of $\bar{X}$ are called the mean of $\bar{X}$,
denoted $\mu_{\bar{X}}$, and the standard deviation of
$\bar{X}$, denoted $\sigma_{\bar{X}}$.

In other words, the mean and standard deviation of $\bar{X}$
are, respectively, the mean and standard deviation of the means
of all samples of the same size selected from a population.

The standard deviation $\sigma_{\bar{X}}$ is also called the
standard error of $\bar{X}$.


Mean of the Sampling Distribution of $\bar{X}$

The mean of the sampling distribution of $\bar{X}$ is equal to
the mean of the population. Thus,

$\mu_{\bar{X}} = \mu$


Standard Deviation of the Sampling Distribution of $\bar{X}$

The standard deviation of the sampling distribution of
$\bar{X}$ is

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

where $\sigma$ is the standard deviation of the population and
$n$ is the sample size. This formula is used when
$n/N \le 0.05$, where $N$ is the population size.


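If this condition is not satisfied, we use the following formula
to calculate $\sigma_{\bar{X}}$:

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$

where $\sqrt{\frac{N - n}{N - 1}}$ is the finite population
correction factor.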
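
The five-score population from the earlier example (N = 5, n = 3, so n/N = 0.6) is a convenient place to see the finite population correction at work. The following Python sketch, with illustrative variable names, compares the corrected formula with the standard deviation obtained by enumerating every possible sample mean.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

population = [70, 78, 80, 80, 95]
N, n = len(population), 3

sigma = pstdev(population)                 # population standard deviation (divisor N)

uncorrected = sigma / math.sqrt(n)         # sigma / sqrt(n), valid only when n/N <= 0.05
fpc = math.sqrt((N - n) / (N - 1))         # finite population correction factor
corrected = uncorrected * fpc

# Standard deviation of the sample mean obtained directly from all 10 samples.
sample_means = [mean(sample) for sample in combinations(population, n)]
enumerated = pstdev(sample_means)

print(f"sigma                 = {sigma:.4f}")
print(f"sigma / sqrt(n)       = {uncorrected:.4f}")
print(f"with correction       = {corrected:.4f}")
print(f"from enumeration      = {enumerated:.4f}")
```

The corrected value agrees with the enumerated one, while the uncorrected formula overstates the standard error for such a large sampling fraction.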


The Shape of the Sampling Distribution of $\bar{X}$

Case I:
Sampling from a Normally Distributed Population

Case II:
Sampling from a population that is not Normally
Distributed


Case I: Sampling from a Normally Distributed Population

When the population from which samples are drawn is normally
distributed with mean $\mu$ and standard deviation $\sigma$,
then

1. The shape of the sampling distribution of $\bar{X}$ is
   normal, whatever the value of $n$.
2. The mean of $\bar{X}$, $\mu_{\bar{X}}$, is equal to $\mu$.
3. The standard deviation of $\bar{X}$, $\sigma_{\bar{X}}$, is
   equal to $\sigma/\sqrt{n}$ (assuming $n/N \le 0.05$).


Case II: Sampling from a Population that is not Normally
Distributed

Most of the time the population from which the samples are
selected is not normally distributed.

However, if the sample size is at least 30, the shape of the
sampling distribution of $\bar{X}$ is inferred from a very
important theorem called the Central Limit Theorem (CLT).


Central Limit Theorem (CLT)

For a large sample size (usually considered large if
$n \ge 30$):

1. The sampling distribution of the sample mean $\bar{X}$ is
   approximately normal, irrespective of the shape of the
   population distribution.
2. The mean of $\bar{X}$, $\mu_{\bar{X}}$, is equal to $\mu$.
3. The standard deviation of $\bar{X}$, $\sigma_{\bar{X}}$, is
   equal to $\sigma/\sqrt{n}$.

If the population distribution is fairly symmetrical, the
sampling distribution of the sample mean $\bar{X}$ is
approximately normal if the sample size $n \ge 15$.
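
A small simulation can make the theorem concrete. The sketch below is only an illustration under assumptions that are not in the notes: the population is taken to be exponential with rate 1 (strongly skewed, with mean 1 and standard deviation 1), the seed is fixed for repeatability, and 5000 samples of size n = 30 are drawn.

```python
import math
import random
from statistics import mean, pstdev

rng = random.Random(0)          # fixed seed so the run is repeatable
n, reps = 30, 5000
mu, sigma = 1.0, 1.0            # mean and sd of the assumed Exponential(1) population

# Draw many samples of size n and keep each sample mean.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

observed_mean = mean(sample_means)
observed_sd = pstdev(sample_means)
predicted_sd = sigma / math.sqrt(n)

print(f"mean of sample means = {observed_mean:.4f} (theory: {mu})")
print(f"sd of sample means   = {observed_sd:.4f} (theory: {predicted_sd:.4f})")
```

The simulated mean and standard deviation of the sample means should land close to $\mu$ and $\sigma/\sqrt{n}$; plotting a histogram of `sample_means` would show the roughly bell-shaped form despite the skewed population.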

Sampling Distribution of $\bar{X}$

                 Normal population                          Non-normal population
Mean             $\mu_{\bar{X}} = \mu$                      $\mu_{\bar{X}} = \mu$
Standard error   $\sigma_{\bar{X}} = \sigma/\sqrt{n}$       $\sigma_{\bar{X}} = \sigma/\sqrt{n}$
Shape            Normal                                     Approximately normal if $n \ge 30$
Notation         $\bar{X} \sim N(\mu, (\sigma/\sqrt{n})^2)$   $\bar{X} \sim N(\mu, (\sigma/\sqrt{n})^2)$


Example:

A company that manufactures drink dispensing machines sets the
fill level at 198 cc. The standard deviation is 4 cc. Assume
that the fill levels have a normal distribution.

(a) A drink is randomly selected. What is the probability that
    the drink will have less than 195 cc?

(b) What is the probability that a random sample of 50 drinks
    has a mean value greater than 199 cc?
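
Both parts of this example reduce to probabilities under a normal distribution, so they can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative. Answers read from a printed table round z to two decimals, so they may differ from these values in the third or fourth decimal place.

```python
import math
from statistics import NormalDist

mu, sigma, n = 198.0, 4.0, 50

# (a) A single drink: X ~ N(198, 4^2).
p_single_below_195 = NormalDist(mu, sigma).cdf(195)

# (b) Mean of 50 drinks: X-bar ~ N(198, (4 / sqrt(50))^2).
standard_error = sigma / math.sqrt(n)
p_mean_above_199 = 1 - NormalDist(mu, standard_error).cdf(199)

print(f"P(X < 195)     = {p_single_below_195:.4f}")
print(f"P(X-bar > 199) = {p_mean_above_199:.4f}")
```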


Solution:

(a) A drink is randomly selected. What is the probability that
    the drink will have less than 195 cc?

Let $X$ be the fill level and $\mu$ be the mean fill level.

Given $X \sim N(198, 4^2)$,

$P(X < 195) = P\left(\frac{X - \mu}{\sigma} < \frac{195 - 198}{4}\right) = P(Z < -0.75) = 0.2266$


(b) What is the probability that a random sample of 50 drinks
    has a mean value greater than 199 cc?

Let $\bar{X}$ be the sample mean. Since the population is
normally distributed, the shape of the sampling distribution of
$\bar{X}$ is normal. We have

$\mu_{\bar{X}} = \mu = 198; \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{50}} \;\Rightarrow\; \bar{X} \sim N\left(198, \left(\frac{4}{\sqrt{50}}\right)^2\right)$

$P(\bar{X} > 199) = P\left(\frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} > \frac{199 - 198}{4/\sqrt{50}}\right) = P(Z > 1.77) = 0.0384$

Section 2.2 Sampling Distribution of the Sample Proportion

Definition:

The population proportion, denoted by $p$, is obtained by taking
the ratio of the number of elements in a population with a
specific characteristic to the total number of elements in the
population.

The sample proportion, denoted by $\hat{p}$, gives a similar
ratio for a sample.


The population and sample proportions, denoted by $p$ and
$\hat{p}$, respectively, are calculated as

$p = \frac{x}{N}$ and $\hat{p} = \frac{x}{n}$

where
N = total number of elements in the population
n = total number of elements in the sample
x = number of elements in the population or sample that possess
    a specific characteristic


Example:

Suppose a total of 789,654 families live in a city and 563,282
of them own homes.

Then, $N = 789{,}654$ and $x = 563{,}282$.

The proportion of all families in this city who own homes is

$p = \frac{x}{N} = \frac{563{,}282}{789{,}654} = 0.7133$

Now, suppose a sample of 240 families is taken from this city
and 158 of them are homeowners.

Then, $n = 240$ and $x = 158$.

The sample proportion is $\hat{p} = \frac{x}{n} = \frac{158}{240} = 0.6583$

The difference between the sample proportion and the
corresponding population proportion gives the sampling error,
assuming that the sample is random and no non-sampling error has
been made. That is, in the case of the proportion,

Sampling error $= \hat{p} - p = 0.6583 - 0.7133 = -0.055$

The probability distribution of the sample proportion $\hat{p}$
is called the sampling distribution of $\hat{p}$. It gives the
various values that $\hat{p}$ can assume and their
probabilities.

Example:
Boe Consultant Associates has five employees. The following
table gives the names of these five employees and information
concerning their knowledge of statistics.

Name               Ally   John   Susan   Peter   Tom
Knows statistics   Yes    No     No      Yes     Yes


If we define the population proportion $p$ as the proportion of
employees who know statistics, then $p = 3/5 = 0.6$.

Now, suppose we draw all possible samples of three employees
each and compute the proportion of employees, for each sample,
who know statistics. The total number of samples of size three
that can be drawn from the population of five employees is
$\binom{5}{3} = 10$.


The table lists the 10 possible samples and the proportion of
employees who know statistics for each of those samples.

Sample                 Proportion ($\hat{p}$)
Ally, John, Susan      1/3 = 0.33
Ally, John, Peter      2/3 = 0.67
Ally, John, Tom        2/3 = 0.67
Ally, Susan, Peter     2/3 = 0.67
Ally, Susan, Tom       2/3 = 0.67
Ally, Peter, Tom       3/3 = 1.00
John, Susan, Peter     1/3 = 0.33
John, Susan, Tom       1/3 = 0.33
John, Peter, Tom       2/3 = 0.67
Susan, Peter, Tom      2/3 = 0.67


The sampling distribution of $\hat{p}$ is as follows:

$\hat{p}$   $f(\hat{p})$
0.33        0.30
0.67        0.60
1.00        0.10
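
The same distribution can be reproduced by listing every possible sample of three employees. The Python sketch below is a minimal illustration with hypothetical variable names; exact fractions are used so that 1/3 and 2/3 are not blurred by rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Whether each of the five employees knows statistics.
knows_statistics = {
    "Ally": True, "John": False, "Susan": False, "Peter": True, "Tom": True,
}

# Every sample of three employees and its sample proportion.
samples = list(combinations(knows_statistics, 3))
proportions = [
    Fraction(sum(knows_statistics[name] for name in sample), 3)
    for sample in samples
]

# Sampling distribution: relative frequency of each value of the sample proportion.
counts = Counter(proportions)
distribution = {
    p_hat: Fraction(c, len(samples)) for p_hat, c in sorted(counts.items())
}

for p_hat, prob in distribution.items():
    print(f"p-hat = {float(p_hat):.2f}   probability = {float(prob):.2f}")

mean_p_hat = sum(p_hat * prob for p_hat, prob in distribution.items())
print("mean of p-hat:", float(mean_p_hat))
```

The mean of the sample proportions works out to the population proportion 3/5, in line with the general result for the mean of the sampling distribution of the sample proportion.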


Mean of the Sampling Distribution of $\hat{p}$

The mean of the sample proportion $\hat{p}$ is denoted by
$\mu_{\hat{p}}$ and is equal to the population proportion $p$.
Thus, $\mu_{\hat{p}} = p$.

Standard Deviation of the Sampling Distribution of $\hat{p}$

The standard deviation of the sampling distribution of
$\hat{p}$ is

$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$ if $n/N \le 0.05$

$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}}$ if $n/N > 0.05$


The Shape of the Sampling Distribution of $\hat{p}$

Central Limit Theorem: the sampling distribution of $\hat{p}$ is
approximately normal for a sufficiently large sample size.

In the case of a proportion, the sample size $n$ is considered
to be sufficiently large if $np > 5$ and $n(1 - p) > 5$.


Sampling Distribution of $\hat{p}$

Mean             $\mu_{\hat{p}} = p$
Standard error   $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
Shape            Approximately normal if $np > 5$ and $n(1 - p) > 5$
Notation         $\hat{p} \sim N\left(p, \left(\sqrt{\frac{p(1 - p)}{n}}\right)^2\right)$


Example:

The election returns showed that a certain candidate received
46% of the votes.

(a) Determine the probability that a poll of 200 people selected
    at random from the voting population would have shown a
    majority (over 50%) of votes in favor of the candidate.

(b) 95% of the sample proportions will be greater than what
    value?


Solution:

(a) Determine the probability that a poll of 200 people selected
    at random from the voting population would have shown a
    majority (over 50%) of votes in favor of the candidate.

From the given information, $p = 0.46$ and $n = 200$.

This gives:

$\mu_{\hat{p}} = p = 0.46, \quad \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{(0.46)(0.54)}{200}} = 0.0352$


Since $np = 200(0.46) = 92 > 5$ and
$n(1 - p) = 200(0.54) = 108 > 5$, we can infer from the Central
Limit Theorem that the sampling distribution of $\hat{p}$ is
approximately normal. Thus,

$\hat{p} \sim N(0.46, (0.0352)^2)$

Required probability:

$P(\hat{p} > 0.50) = P\left(\frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} > \frac{0.50 - 0.46}{0.0352}\right) = P(Z > 1.14) = 0.1271$


(b) 95% of the sample proportions will be greater than what
    value?

Let $A$ be the required value. We want $P(\hat{p} > A) = 0.95$,
and from the standard normal table, $P(Z > -1.645) = 0.95$.
Hence,

$\frac{A - 0.46}{0.0352} = -1.645 \;\Rightarrow\; A = 0.4021$
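
The poll example can be checked with the normal approximation computed directly rather than from a table. This Python sketch uses `statistics.NormalDist`; the variable names are illustrative, and because no rounding of the standard error or of z takes place, the results may differ slightly from table-based values in the third decimal place.

```python
import math
from statistics import NormalDist

p, n = 0.46, 200

# Standard error of the sample proportion and its approximate normal distribution.
sigma_p_hat = math.sqrt(p * (1 - p) / n)
sampling_dist = NormalDist(p, sigma_p_hat)

# (a) Probability that the poll shows a majority for the candidate.
p_majority = 1 - sampling_dist.cdf(0.50)

# (b) Value that 95% of sample proportions exceed (the 5th percentile).
lower_5_percent_point = sampling_dist.inv_cdf(0.05)

print(f"standard error      = {sigma_p_hat:.4f}")
print(f"P(p-hat > 0.50)     = {p_majority:.4f}")
print(f"95% of p-hat exceed = {lower_5_percent_point:.4f}")
```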

Section 2.3 Sampling Distribution of the Sample Variance

Consider a random sample of $n$ observations drawn from a
population with unknown mean $\mu$ and unknown variance
$\sigma^2$. Denote the sample observations as
$x_1, x_2, \ldots, x_n$.

The population variance is the expectation
$\sigma^2 = E[(X - \mu)^2]$, which suggests that we consider the
mean of $(x_i - \mu)^2$ over the $n$ observations. Since $\mu$
is unknown, the sample mean $\bar{x}$ is used to compute a
sample variance.


Definition:

Let $x_1, x_2, \ldots, x_n$ be a random sample of observations
from a population.

The quantity

$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

is called the sample variance, and its square root, $s$, is
called the sample standard deviation.


Suppose a random sample of $n$ observations with sample variance
$s^2$ is taken from a normally distributed population with
population variance $\sigma^2$.

Then,

$\frac{(n - 1)s^2}{\sigma^2} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \bar{x})^2$

has a chi-square ($\chi^2$) distribution with $n - 1$ degrees of
freedom.


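Verify (the first subscript is the upper-tail area and the
second is the degrees of freedom $\nu$):

a. $\chi^2_{0.005,\,5} = 16.750$
b. $\chi^2_{0.9,\,9} = 4.168$
c. $P(\chi^2 > k) = 0.05$ with $\nu = 10 \;\Rightarrow\; k = \chi^2_{0.05,\,10} = 18.307$
d. $P(\chi^2 < k) = 0.05$ with $\nu = 10 \;\Rightarrow\; k = \chi^2_{0.95,\,10} = 3.940$
e. Given that $\chi^2 \sim \chi^2_{22}$, $P(10.982 < \chi^2 < 36.781) = 0.95$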


Mean of the Sampling Distribution of $s^2$

The mean of the sample variance $s^2$ is equal to the population
variance $\sigma^2$, that is, $E(s^2) = \sigma^2$.

Variance of the Sampling Distribution of $s^2$

For a normally distributed population, the variance of the
sample variance $s^2$ is given by the formula

$\operatorname{Var}(s^2) = \frac{2\sigma^4}{n - 1}$

Example:
The variability of the electrical resistance is critical for
manufacturing a control device. Manufacturing standards specify
a standard deviation of 3.6, and the population distribution of
resistance measures is normal.

The monitoring process requires that a random sample of $n = 6$
observations be obtained from the population of devices and the
sample variance be computed.

Determine an upper limit for the sample variance such that the
probability of exceeding this limit, given a population standard
deviation of 3.6, is less than 0.05.

Solution:
From the given information, $n = 6$ and
$\sigma^2 = 3.6^2 = 12.96$.
Let $K$ be the required upper limit.

We have

$P(s^2 > K) = P\left(\frac{(n - 1)s^2}{12.96} > \chi^2_{0.05,\,5}\right) = 0.05$

where $\chi^2_{0.05,\,5} = 11.07$ is the upper 0.05 critical
value of the chi-square distribution with 5 d.f.


The required upper limit for $s^2$, labelled $K$, can be
obtained from

$\frac{(n - 1)K}{12.96} = \frac{(6 - 1)K}{12.96} = 11.07 \;\Rightarrow\; K = 28.69$

If the sample variance, $s^2$, from a random sample of size
$n = 6$ exceeds 28.69, there is strong evidence to suspect that
the population variance exceeds 12.96, and that the
manufacturing process should be halted and appropriate
adjustments should be performed.
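
One way to sanity-check the limit K = 28.69 is a Monte Carlo simulation of the monitoring process. The sketch below is illustrative only: it assumes a population mean of 0 (the mean does not affect the sample variance), a fixed random seed, and 20,000 simulated samples, none of which come from the notes.

```python
import random

rng = random.Random(1)            # fixed seed so the run is repeatable
n, sigma = 6, 3.6
chi2_upper_05_df5 = 11.07         # table value used in the worked solution
K = chi2_upper_05_df5 * sigma ** 2 / (n - 1)

def sample_variance(values):
    """Unbiased sample variance with divisor n - 1."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

reps = 20000
exceed = sum(
    sample_variance([rng.gauss(0.0, sigma) for _ in range(n)]) > K
    for _ in range(reps)
)
fraction_exceeding = exceed / reps

print(f"K = {K:.2f}")
print(f"simulated P(s^2 > K) = {fraction_exceeding:.3f}")
```

When the process really has standard deviation 3.6, the simulated fraction of samples whose variance exceeds K should come out close to 0.05.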

Example:
A quality assurance manager at a food company wants to ensure
that the variation of package weights is small, so that the
company does not produce a large proportion of packages that are
under the stated package weight. The manager wants to obtain
upper and lower limits for the ratio of the sample variance to
the population variance for a random sample of $n = 20$
observations.

The limits are such that the probability that the ratio is below
the lower limit is 0.025 and the probability that the ratio is
above the upper limit is 0.025. Thus, 95% of the ratios will be
between these limits. The population distribution can be assumed
to be normal.


Solution:

We want values $K_L$ and $K_U$ such that

$P\left(\frac{s^2}{\sigma^2} < K_L\right) = 0.025$ and $P\left(\frac{s^2}{\sigma^2} > K_U\right) = 0.025$

given that $n = 20$ is used to compute the sample variance.


For the lower limit:

$0.025 = P\left(\frac{(n - 1)s^2}{\sigma^2} < (n - 1)K_L\right) = P(\chi^2_{19} < (n - 1)K_L)$

$8.91 = 19K_L \;\Rightarrow\; K_L = 0.4689$

For the upper limit:

$0.975 = P\left(\frac{(n - 1)s^2}{\sigma^2} < (n - 1)K_U\right) = P(\chi^2_{19} < (n - 1)K_U)$

$32.85 = 19K_U \;\Rightarrow\; K_U = 1.7289$

Therefore, the 95% acceptance interval for the ratio
$s^2/\sigma^2$ is

$0.4689 < s^2/\sigma^2 < 1.7289$
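
The arithmetic behind the two limits is simply a chi-square critical value divided by the degrees of freedom. The helper below is a hypothetical function written for illustration; it takes the two table values as inputs rather than computing chi-square quantiles itself.

```python
def variance_ratio_limits(chi2_lower, chi2_upper, n):
    """Limits for s^2 / sigma^2 given chi-square critical values with n - 1 d.f.

    chi2_lower and chi2_upper are read from a chi-square table
    (here: 8.91 and 32.85 for 19 degrees of freedom).
    """
    df = n - 1
    return chi2_lower / df, chi2_upper / df

k_lower, k_upper = variance_ratio_limits(8.91, 32.85, n=20)
print(f"K_L = {k_lower:.4f}, K_U = {k_upper:.4f}")
```

Because both limits scale as 1/(n - 1) times a critical value, a larger sample pulls them closer to 1, which is the sense in which a larger sample pins down the variance more tightly.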

Section 2.4 Properties of Estimators

A number of different estimators are possible for the same
population parameter, but some estimators are better than
others.

To understand how, we need to look at three important properties
of estimators.

I. Unbiasedness
II. Efficiency
III. Consistency


Unbiasedness

An estimator exhibits unbiasedness when the mean of the sampling
distribution of the estimator $\hat{\theta}$ is equal to the
population parameter $\theta$. That is,
$E(\hat{\theta}) = \theta$.

The sample mean is an unbiased estimator of the population mean
because the mean of the sampling distribution of $\bar{X}$,
$E(\bar{X})$, is equal to the population mean $\mu$.

The sample proportion is an unbiased estimator of the population
proportion: $E(\hat{p}) = p$.


Efficiency

Efficiency refers to the size of the standard error of the
statistic. Among unbiased estimators, the most efficient
estimator is the one with the smallest variance.

Thus, if there are two estimators for $\theta$ with variances
$\operatorname{Var}(\hat{\theta}_1)$ and
$\operatorname{Var}(\hat{\theta}_2)$, and
$E(\hat{\theta}_1) = E(\hat{\theta}_2) = \theta$, then the first
estimator $\hat{\theta}_1$ is said to be more efficient than the
second estimator $\hat{\theta}_2$ if
$\operatorname{Var}(\hat{\theta}_1) < \operatorname{Var}(\hat{\theta}_2)$.


Consistency

Consistency is related to the behavior of estimators as the
sample size gets large. A statistic is a consistent estimator of
a population parameter if, as the sample size increases, it
becomes almost certain that the value of the statistic comes
very close to the value of the population parameter.

It can be shown that an unbiased estimator $\hat{\theta}_n$ for
$\theta$ is a consistent estimator if its variance approaches 0
as $n$ increases.


We can show that the sample mean is a consistent estimator of
the population mean.

The sample mean is unbiased because $E(\bar{X}) = \mu$. The
variance of $\bar{X}$ is $\sigma^2/n$.

As $n \to \infty$, $\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n} \to 0$.

So this estimator is consistent.

Section 3 – Confidence Interval
Definitions:

Each interval is constructed with regard to a given confidence
level and is called a confidence interval. The confidence level
associated with a confidence interval states how much confidence
we have that this interval contains the true population
parameter.

The confidence level is denoted by $(1 - \alpha)100\%$. When
expressed as a probability, it is called the confidence
coefficient and is denoted by $1 - \alpha$.

Although any value of the confidence level can be chosen to
construct a confidence interval, the more common values are 90%,
95% and 99%. The corresponding confidence coefficients are 0.90,
0.95 and 0.99.

Interval Estimation of a Population Mean: Known Variance

Recall that in the case of $\bar{X}$, the sample size is
considered to be large when $n \ge 30$. According to the central
limit theorem, for a large sample the sampling distribution of
the sample mean $\bar{X}$ is (approximately) normal irrespective
of the shape of the population from which the sample is drawn.

Therefore, when $n \ge 30$, use the normal distribution to
construct a confidence interval for $\mu$.

Confidence Interval for population mean μ

The $(1 - \alpha)100\%$ confidence interval for $\mu$ is

$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

where $\bar{X}$ is the sample mean; $\sigma$ is the population
standard deviation; $n$ is the sample size; and $Z_{\alpha/2}$
is read from the standard normal distribution table for the
given confidence level.

Conditions: normal population with known variance, OR non-normal
population and large sample with known variance.

Maximum Error of Estimate for μ

The maximum error of estimate for $\mu$, denoted by $E$, is the
quantity that is subtracted from and added to the value of
$\bar{X}$ to obtain a confidence interval for $\mu$.

Thus, given the $(1 - \alpha)100\%$ confidence interval,

$E = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Example:

A publishing company has just published a new college textbook.
Before the company decides the price at which to sell this
textbook, it wants to know the average price of all such
textbooks in the market.

The research department at the company took a sample of 36 such
textbooks and collected information on their prices. This
information produced a mean price of $48.40 for this sample. It
is known that the standard deviation of the prices of all such
textbooks is $4.50.

Assume that the prices of all such textbooks are normally
distributed.

(a) What is the point estimate of the mean price of all such
    college textbooks?

(b) Construct a 95% confidence interval for the mean price of
    all such college textbooks.

Solution:

From the given information,

$n = 36, \quad \bar{X} = 48.40, \quad \sigma = 4.50$

(a) What is the point estimate of the mean price of all such
    college textbooks?

The point estimate of the mean price of all such college
textbooks is $48.40, that is,

Point estimate of $\mu$ = $\bar{X}$ = $48.40

(b) Construct a 95% confidence interval for the mean price of
    all such college textbooks.

The confidence level is 95% or 0.95, so $\alpha = 0.05$ and
$Z_{\alpha/2} = Z_{0.025} = 1.96$.

The 95% confidence interval for $\mu$ is

$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 48.40 \pm 1.96 \cdot \frac{4.5}{\sqrt{36}} = 48.40 \pm 1.47 = (46.93, 49.87)$

Thus, we are 95% confident that the mean price of all such
college textbooks is between $46.93 and $49.87.
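
The interval in this example can be reproduced with a few lines of Python. The function below is a hypothetical helper written for illustration; it obtains the critical value from `statistics.NormalDist` instead of a printed table and assumes the population standard deviation is known.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when the population sd is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)     # Z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)           # maximum error of estimate E
    return x_bar - margin, x_bar + margin

low, high = z_confidence_interval(x_bar=48.40, sigma=4.50, n=36)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Changing `confidence` to 0.90 or 0.99 shows how the interval narrows or widens with the confidence level.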

Note: We cannot say for sure whether the interval $46.93 to
$49.87 contains the true population mean or not.

Since $\mu$ is a constant, we cannot say that the probability is
0.95 that this interval contains $\mu$, because either it
contains $\mu$ or it does not. Consequently, the probability is
either 1 or 0 that this interval contains $\mu$.

All we can say is that we are 95% confident that the mean price
of all such college textbooks is between $46.93 and $49.87.

Interpretation of confidence interval:

How do we interpret a 95% confidence level? In the previous example, if we take all possible samples of 36 such college textbooks each and construct a 95% confidence interval for μ around each sample mean, we can expect that 95% of these intervals will include μ and 5% will not.
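
This repeated-sampling interpretation can be illustrated by simulation. The sketch below draws many random samples instead of enumerating all possible ones, and it assumes a hypothetical true mean of 48.0, which is not given in the example and is used only to generate data. The function name `coverage_rate` is also illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage_rate(mu, sigma, n, confidence=0.95, trials=10_000, seed=0):
    """Fraction of simulated intervals X-bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from a normal population and average it.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


# mu = 48.0 is a made-up "true" mean used only for the simulation.
rate = coverage_rate(mu=48.0, sigma=4.50, n=36)
print(f"Share of intervals containing mu: {rate:.3f}")
```

The printed share should be close to 0.95. Any single interval either contains μ or does not, which is the point made in the note above.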

Illustration: [Figure: 95% confidence intervals #1, #2, ..., #n, each centred on its own sample mean \( \bar{X}_i \) and running from \( \bar{X}_i - K_i \) to \( \bar{X}_i + K_i \), drawn against μ.]


The Width of a Confidence Interval

The width of a confidence interval depends on the size of the maximum error \( Z\sigma_{\bar{X}} \), which depends on the values of Z, σ, and n because \( \sigma_{\bar{X}} = \sigma/\sqrt{n} \).

However, the value of σ is not within the control of the investigator. Hence, the width of a confidence interval depends on

( i ) The value of Z
( ii ) The sample size n


The value of Z, which depends on the confidence level

The value of Z increases as the confidence level increases, and it decreases as the confidence level decreases. Therefore, the width of a confidence interval increases or decreases with the confidence level.

The sample size n

For the same value of σ, an increase in n decreases the value of \( \sigma_{\bar{X}} \), which in turn decreases the size of the maximum error when the confidence level remains unchanged. Therefore, an increase in the sample size decreases the width of the confidence interval.


Thus, if we want to decrease the width of a confidence interval, we have two choices:

• Lower the confidence level - not a good choice because a lower confidence level may give less reliable results.

• Increase the sample size - preferred way to decrease the width of a confidence interval.
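
The two choices can be compared numerically. The sketch below assumes the width is twice the maximum error, that is 2 Z σ/√n, and reuses σ = 4.50 and n = 36 from the textbook example. The helper name `ci_width` and the sample size n = 144 (four times the original) are illustrative assumptions, not values from the notes.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, confidence):
    """Width of the z-based interval: 2 * Z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return 2.0 * z * sigma / sqrt(n)


sigma = 4.50  # population standard deviation from the textbook example

# Choice 1: lower the confidence level while keeping n = 36.
width_95 = ci_width(sigma, 36, 0.95)
width_90 = ci_width(sigma, 36, 0.90)

# Choice 2: increase the sample size while keeping the level at 95%.
width_n144 = ci_width(sigma, 144, 0.95)  # hypothetical: 4 times the sample size

print(f"95%, n = 36 : width {width_95:.2f}")
print(f"90%, n = 36 : width {width_90:.2f}")
print(f"95%, n = 144: width {width_n144:.2f}")
```

Because n enters through √n, quadrupling the sample size halves the width while the confidence level stays at 95%.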


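Example (revisit):

A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market.

The research department at the company took a sample of 36 such textbooks and collected information on their prices. This information produced a mean price of $48.40 for this sample. It is known that the standard deviation of the prices of all such textbooks is $4.50.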


Assume that the prices of all such textbooks are normally distributed. Construct a 90% confidence interval for the mean price of all such college textbooks.

Solution:

\[ \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 48.40 \pm 1.65 \times \frac{4.5}{\sqrt{36}} = (47.16, 49.64) \]

Comparing this to the 95% confidence interval obtained previously, (46.93, 49.87), it is observed that the 95% confidence interval is wider than the 90% confidence interval.
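
The critical values used in these two solutions (1.65 and 1.96) are rounded table values. As a rough check, they can be recomputed with the standard-library `statistics.NormalDist`; the helper name `z_critical` is illustrative.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for the given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95):
    print(f"{level:.0%} confidence level: Z = {z_critical(level):.4f}")
```

For the 90% level the unrounded value is about 1.645, slightly below the table value 1.65, so an interval computed with it can differ from (47.16, 49.64) in the second decimal place.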


Example (revisit):
Consider the previous example again. Now suppose the information given in that example is based on a sample size of 160. Further assume that all other information given in that example remains the same. Construct the 95% confidence interval.

Solution:

\[ \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 48.40 \pm 1.96 \times \frac{4.5}{\sqrt{160}} = (47.70, 49.10) \]

Comparing this to the 95% confidence interval obtained previously, (46.93, 49.87), it is observed that the 95% confidence interval for n = 160 is narrower than the one for n = 36.


Interval Estimation of a Population Mean: Unknown Variances

If the sample size is small, the normal distribution can still be used to construct a confidence interval for μ if

1. the population from which the sample is drawn is normally distributed, and
2. the population standard deviation σ is known.


The t distribution is used to make a confidence interval about μ if
1. the population from which the sample is selected is (approximately) normally distributed, and
2. the population standard deviation σ is not known.


Verify:
a. \( t_{4,0.05} = 2.132 \) and \( t_{4,0.95} = -2.132 \)
b. \( t_{6,0.005} = 3.707 \) and \( t_{6,0.995} = -3.707 \)
c. \( P(T > t) = 0.10 \) with \( \nu = 22 \Rightarrow t_{22,0.1} = 1.321 \)
d. \( P(T < t) = 0.05 \) with \( \nu = 16 \Rightarrow t_{16,0.95} = -1.746 \)
e. Given that \( T \sim t_5 \), \( P(T < 3.365) = 0.99 \)
f. Given that \( T \sim t_8 \), \( P(-2.306 < T < 2.306) = 0.95 \)
g. Given that \( T \sim t_{26} \), \( P(T < 3.435) = 0.999 \)
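
These table values can be checked numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names `t_pdf`, `t_cdf` and `t_upper` are made up for this illustration, and the second subscript is read as the upper-tail area, as in items a to d. In practice a statistics library routine (for example `scipy.stats.t.ppf`) would normally be used instead.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area


def t_upper(df, tail_area):
    """Value t with P(T > t) = tail_area, found by bisection."""
    lo, hi = -100.0, 100.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if 1.0 - t_cdf(mid, df) > tail_area:
            lo = mid  # upper-tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


print(f"a. {t_upper(4, 0.05):.3f} and {t_upper(4, 0.95):.3f}")
print(f"c. {t_upper(22, 0.10):.3f}")
print(f"f. {t_cdf(2.306, 8) - t_cdf(-2.306, 8):.3f}")
```

The printed values should agree with items a, c and f to the three decimal places shown in the table.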


Confidence Interval for population mean μ
using t distribution

The (1 − α)100% confidence interval for μ is

\[ \bar{X} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \]

where \( \bar{X} \) is the sample mean; s is the sample standard deviation; n is the sample size; and \( t_{\alpha/2,\,n-1} \) is obtained from the t distribution table for n − 1 d.f. and the (1 − α)100% confidence level.

Conditions:
• Population is approximately normally distributed
• σ is not known


Example:

Dr. Moore wanted to estimate the mean cholesterol level for all adult males living in London. He took a sample of 25 adult males from London and found that the mean cholesterol level for this sample is 186 with a standard deviation of 12.

Assume that the cholesterol levels for all adult males in London are (approximately) normally distributed. Construct a 95% confidence interval for the population mean.


Solution:

From the given information,

\( n = 25,\ \bar{X} = 186,\ s = 12 \)

The confidence level is 95% or 0.95, so α = 0.05.

Degrees of freedom: 25 − 1 = 24

Area in each tail: 0.05 / 2 = 0.025

From the t distribution table, the value for t is \( t_{0.025,24} = 2.064 \)


The 95% confidence interval for μ is

\[ \bar{X} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} = 186 \pm 2.064 \times \frac{12}{\sqrt{25}} = (181.0464, 190.9536) \]

Thus, we can state with 95% confidence that the mean cholesterol level for all adult males living in London lies between 181.05 and 190.95.

Note that \( \bar{X} = 186 \) is a point estimate of μ in this example.
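
The arithmetic of this interval can be reproduced as follows. This is a minimal sketch in which the critical value is passed in as read from the t table; the function name `t_confidence_interval` is illustrative.

```python
from math import sqrt


def t_confidence_interval(sample_mean, s, n, t_critical):
    """Return (lower, upper) using a t critical value read from a table."""
    margin = t_critical * s / sqrt(n)  # t * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# t_{0.025,24} = 2.064 is the table value quoted in the solution above.
lower, upper = t_confidence_interval(186, 12, 25, t_critical=2.064)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # prints 95% CI: (181.05, 190.95)
```

Passing the critical value as an argument keeps the sketch independent of any particular statistics library; only the table lookup changes when the confidence level or sample size changes.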

AMA 1006
Lecture Notes

~ END ~
