Slides CH 14

ACMS 20340
Statistics for Life Sciences
Chapter 14: Introduction to Inference
Sampling Distributions
For a population distributed as N(µ, σ), the statistic x̄ calculated from a sample of size n has the distribution N(µ, σ/√n).
We would like to use x̄ to estimate µ.
Unfortunately, while x̄ is likely to be close to µ, they are unlikely to be exactly equal.
We will make things easier and only guess an interval which contains µ instead of its exact value.
Inference Assumptions
We will make the following (possibly unrealistic) assumptions:

• The population is normally distributed N(µ, σ).
• We do not know µ, but we do know σ.
• We have a random sample of size n.
Later we will see how to handle the common case where we do not
know σ.
To what extent can we determine µ?
Since the population is distributed as N(µ, σ), we know x̄ has the distribution N(µ, σ/√n).
For example, heights of 8-year-old boys are normally distributed with σ = 10. The population also has a mean µ, but we do not know it. The population distribution is N(µ, 10).
The mean x̄ of a sample of size 217 is distributed as N(µ, 0.7). Why?
σ/√n = 10/√217 ≈ 10/14.73 ≈ 0.6788 ≈ 0.7.
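As a quick check of the arithmetic, here is a minimal Python sketch (Python is not part of the original slides, and the variable names are illustrative) that computes σ/√n for the height example.

```python
import math

sigma = 10.0  # population standard deviation (from the slide)
n = 217       # sample size (from the slide)

# Standard deviation of the sample mean: sigma / sqrt(n)
sd_of_mean = sigma / math.sqrt(n)
print(f"sigma / sqrt(n) = {sd_of_mean:.4f}")
```

The printed value should agree with the hand calculation before it is rounded to 0.7.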
Using the normal tables, we can calculate the probability that x̄ is within 1.4 of µ.
\[
\begin{aligned}
P(\mu - 1.4 < \bar{x} < \mu + 1.4) &= P\left( \frac{\mu - 1.4 - \mu}{0.7} < Z < \frac{\mu + 1.4 - \mu}{0.7} \right) \\
&= P(-2 < Z < 2) \\
&= 0.954
\end{aligned}
\]
Thus, the probability that x̄ is within 1.4 of µ is about 0.95.
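The table look-up can be mirrored in software. The sketch below is an illustration only: it uses `statistics.NormalDist` from the Python standard library in place of the printed normal table, and the variable names are assumptions of this example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # Z ~ N(0, 1)
sd_of_mean = 0.7   # rounded value of sigma / sqrt(n) from the slide
distance = 1.4     # how far the sample mean may be from mu

# Standardize the distance, then take the area between -z and z.
z = distance / sd_of_mean
prob_within = standard_normal.cdf(z) - standard_normal.cdf(-z)
print(f"P(-{z:.0f} < Z < {z:.0f}) = {prob_within:.4f}")
```

The result should match the table value of roughly 0.954 up to rounding.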
In other words, for 95% of all samples, x̄ lies within 1.4 of µ.
So if we estimate that µ lies in the interval [x̄ − 1.4, x̄ + 1.4], we will be right for 95% of the samples we could take.
Confidence Intervals
We say the interval [x̄ − 1.4, x̄ + 1.4] is a 95% confidence interval
for µ, because 95% of the time, the interval we construct contains
µ.
The 95% is the confidence level.
In general we write the interval as
x̄ ± 1.4
Of course, we could ask for different confidence levels. Other common choices are 90%, 97%, 98%, and 99%.
A 100% confidence interval would be the range (−∞, ∞), which is not useful at all. So we must allow the possibility of being wrong.
The interval x̄ ± 1.4 is not 100% reliable.
The exact interval we will get depends on the sample we chose.
All the intervals will have length 2.8, but their centers will vary.
Saying we are 95% confident means that the method we used produces an interval containing µ for 95% of all samples, and an interval that misses µ for the other 5%.
For any given sample we construct one interval. We only know the long-run probability that a sample gives a good interval.
We do not know, without further information, whether the interval from our particular sample is one of the 95% that contain µ or one of the 5% that do not.
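The long-run meaning of "95% confident" can be illustrated with a small simulation. This is a sketch under stated assumptions: the true mean `TRUE_MU = 128.0` is a hypothetical value invented only so that samples can be generated (the slides never give µ), while σ = 10, n = 217, and the margin 1.4 come from the height example.

```python
import random
from statistics import fmean

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MU = 128.0  # hypothetical "unknown" mean, chosen only for this simulation
SIGMA = 10.0     # known population standard deviation (from the slides)
N = 217          # sample size (from the slides)
MARGIN = 1.4     # half-width of the interval: sample mean +/- 1.4
REPEATS = 2000   # number of simulated samples

hits = 0
for _ in range(REPEATS):
    # Draw one random sample and build its interval.
    sample = [random.gauss(TRUE_MU, SIGMA) for _ in range(N)]
    xbar = fmean(sample)
    # Count the interval as a hit if it contains the true mean.
    if xbar - MARGIN <= TRUE_MU <= xbar + MARGIN:
        hits += 1

coverage = hits / REPEATS
print(f"Fraction of intervals containing mu: {coverage:.3f}")
```

The fraction should land near 95% (slightly above, since 1.4 was built from the rounded values 2 and 0.7). No single interval tells us whether it is a hit; only the long-run fraction is known.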
Summing Up The Main Idea
The sampling distribution of x̄ tells us how close to µ the sample mean x̄ is likely to be.
A confidence interval turns that information around to say how close to x̄ the unknown population mean µ is likely to be.
General Method to Construct a Confidence Interval
We estimate parameter µ of a normal population N(µ, σ) using x̄
by constructing a level C confidence interval.
The interval will look like
\[ \bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}, \]
where the quantity z* σ/√n is the margin of error.
z* is called the critical value and depends only on C.
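To make the margin of error concrete, the following sketch evaluates z* σ/√n for the height example (σ = 10, n = 217) with the 95% critical value 1.960. The function name `margin_of_error` is a label chosen for this illustration.

```python
import math

def margin_of_error(z_star, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a z confidence interval."""
    return z_star * sigma / math.sqrt(n)

# Height example: sigma = 10, n = 217, 95% critical value 1.960
m = margin_of_error(1.960, 10.0, 217)
print(f"margin of error = {m:.3f}")
```

The margin comes out a little smaller than the 1.4 used earlier, because 1.4 was obtained from the rounded values z = 2 and σ/√n ≈ 0.7.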

Confidence Levels
Common z* values are
Confidence level C    z*
90%                   1.645
95%                   1.960
99%                   2.576
For any confidence level C, the critical value z* is the number for which
\[ P(Z < -z^*) = \frac{1 - C}{2}. \]
We can find this using a table look-up.
Critical Value in Tables
Alternatively, common values of z* are listed in Table C in the textbook.
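Instead of a table, the critical value can be obtained from the inverse of the standard Normal cumulative distribution. The sketch below uses `NormalDist.inv_cdf` from the Python standard library; the function name `critical_value` is an assumption of this example.

```python
from statistics import NormalDist

def critical_value(confidence_level):
    """Return z* such that P(Z < -z*) = (1 - C) / 2."""
    tail_area = (1 - confidence_level) / 2
    # inv_cdf gives the point with tail_area to its left, which is -z*.
    return -NormalDist().inv_cdf(tail_area)

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.2f}  z* = {critical_value(c):.3f}")
```

The three values should reproduce the common critical values 1.645, 1.960, and 2.576.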
Assumptions
Remember the assumptions we made at the beginning:
• The population is normal with distribution N(µ, σ).
• We know the value of σ, but do not know µ.
• We have a simple random sample (SRS).
How much can we relax these assumptions?

• We always need an SRS; otherwise x̄ is not a random variable.
• This method requires us to know σ. (There are technical problems with estimating σ by s.)
• We only needed the population to be normal to ensure the sampling distribution was normal. In practice we can fudge this, especially if the sample size is large enough. Then the central limit theorem says the sampling distribution is approximately normal.
A Story About Basketball
Charlie claims that he makes free throws at an 80% clip.
To test his claim, we ask Charlie to take 20 shots.
Unfortunately, Charlie only makes 8 out of 20.
We respond, “Someone who makes 80% of his shots would almost never make only 8 out of 20!”
The basis for our response: If Charlie’s claim were true and we
repeated the sample of 20 shots many times, then he would almost
never make just 8 out of 20 shots.
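How rare is "almost never"? The sketch below puts a number on it under an added assumption that the slides do not state: the 20 shots are treated as independent, each with success probability 0.8, so the number made follows a binomial distribution. The function name `binomial_cdf` is a label for this illustration.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Probability that an 80% shooter makes 8 or fewer of 20 shots
# (assumes independent shots with a constant success probability).
prob_8_or_fewer = binomial_cdf(8, 20, 0.8)
print(f"P(X <= 8) = {prob_8_or_fewer:.6f}")
```

Under these assumptions the probability is on the order of 1 in 10,000, which is why the observed 8 out of 20 counts as strong evidence against the claim.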
The basic idea of significance tests
An outcome that would rarely happen if a claim were true is good evidence that the claim is NOT true.
As with confidence intervals, we ask what would happen if we repeated the sample or experiment many times.
For now, we will assume that we have a perfect SRS from an exactly Normal population with standard deviation σ known to us.
Phosphorus in the blood
Levels of inorganic phosphorus in the blood of adults are Normally distributed with mean µ = 1.2 mmol/L and standard deviation σ = 0.1 mmol/L.
Does inorganic phosphorus blood level decrease with age?
A retrospective chart review of 12 men and women between the ages of 75 and 79 yields (in mmol/L):
1.26 1.00 1.19 1.39 1.10 1.29
1.00 0.87 1.03 1.00 1.23 1.18
The sample mean is x̄ = 1.128 mmol/L.
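The sample mean can be reproduced directly from the 12 listed values. This short sketch is an illustration added here; the list name `levels` is an assumption of the example.

```python
from statistics import fmean

# Inorganic phosphorus levels (mmol/L) for the 12 individuals aged 75 to 79
levels = [1.26, 1.00, 1.19, 1.39, 1.10, 1.29,
          1.00, 0.87, 1.03, 1.00, 1.23, 1.18]

xbar = fmean(levels)
print(f"n = {len(levels)}, sample mean = {xbar:.3f} mmol/L")
```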

The Question
Do these data provide good evidence that, on average, inorganic phosphorus levels among adults of ages 75 to 79 are lower than in the whole adult population?
To answer this question, here’s how we proceed:
• We want evidence that the mean blood level of inorganic phosphorus in adults of ages 75 to 79 is less than 1.2 mmol/L.
• Thus the claim we test is that the mean for people ages 75 to 79 is 1.2 mmol/L.
Answering the Question (I)
If the claim that the population mean µ for adults aged 75 to 79 is 1.2 mmol/L were true, then the sampling distribution of x̄ from 12 individuals ages 75 to 79 would be Normal with mean µ_x̄ = 1.2 and standard deviation
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.1}{\sqrt{12}} = 0.0289. \]
Answering the Question (II)
There are two general outcomes to consider:
1. A sample mean is close to the population mean.
   This outcome could easily occur by chance when the population mean is µ = 1.2.
2. A sample mean is far from the population mean.
   It is somewhat unlikely for this outcome to occur by chance when the population mean is µ = 1.2.
Answering the Question (III)
In our case, the sample mean x̄ = 1.128 mmol/L is very far from
the population mean µ = 1.2.
An observed value this small would rarely occur just by chance if the true µ were equal to 1.2 mmol/L.
Null and Alternative Hypotheses
The claim tested by a statistical test is called the null hypothesis.
• The test is designed to determine the strength of the evidence against the null hypothesis.
• Usually the null hypothesis is a statement of “no effect” or “no difference.”
The claim about the population that we are trying to find evidence for is called the alternative hypothesis.
One-sided vs. two-sided alternative hypotheses
The alternative hypothesis is one-sided if it states that a parameter is larger than or that it is smaller than the null hypothesis value.
The alternative hypothesis is two-sided if it states that the parameter is merely different from the null value.
Hypothesis Notation
Null hypothesis: H0
Alternative hypothesis: Ha
Remember that these are always hypotheses about some population parameter, not some particular outcome.
Back to the phosphorus example
Null hypothesis: “No difference from adult mean of 1.2 mmol/L.”   H0: µ = 1.2
Alternative hypothesis: “Their mean is lower than 1.2 mmol/L.”   Ha: µ < 1.2 (one-sided)
Aspirin labels
On an aspirin label, we find the following: “Active Ingredient: Aspirin 325 mg”
There will be slight variation in the amount of aspirin, but this is fine as long as the production has mean µ = 325 mg.
Let’s test the accuracy of the statement on the label:
H0: µ = 325 mg
Ha: µ ≠ 325 mg
Note that this is a two-sided alternative hypothesis.
Why do we use a two-sided Ha rather than a one-sided Ha?

One last point on hypotheses
Hypotheses should express the expectations or suspicions we have prior to our seeing the data.
We shouldn’t first look at the data and then frame hypotheses to fit what the data show.
The P-value of a test
Starting with a null hypothesis, we consider the strength of the evidence against this hypothesis.
The number that measures the strength of the evidence against a null hypothesis is called a P-value.
How statistical tests work
A test statistic calculated from the sample data measures how far the data diverge from the null hypothesis H0.
Large values of the statistic show that the data are far from what we would expect if H0 were true.
The probability, assuming that H0 is true, that the test statistic would take a value as extreme as or more extreme than the observed value is called the P-value of the test.
The smaller the P-value, the stronger the evidence provided by the data against H0.
Interpreting P-values
Small P-values ⇒ Evidence against H0
Why? Small P-values say the observed result would be unlikely to occur if H0 were true.
Large P-values ⇒ Fail to provide evidence against H0
One-sided P-value
In the inorganic phosphorus levels example, we tested the hypotheses
H0: µ = 1.2
Ha: µ < 1.2.
Values of x̄ less than 1.2 favor Ha over H0.
The 12 individuals of ages 75 to 79 had mean inorganic phosphorus level x̄ = 1.128.
Thus the P-value is the probability of getting an x̄ as small as 1.128 or smaller when the null hypothesis is really true.
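This probability is an area in the lower tail of the sampling distribution of x̄ under H0. The sketch below is one way to evaluate it in software, assuming the sampling distribution N(1.2, 0.1/√12) described for this example; it uses `statistics.NormalDist` rather than the applet or a table.

```python
import math
from statistics import NormalDist

mu0 = 1.2      # mean claimed by H0 (mmol/L)
sigma = 0.1    # known population standard deviation (mmol/L)
n = 12         # sample size
xbar = 1.128   # observed sample mean (mmol/L)

# Sampling distribution of the sample mean when H0 is true
sampling_dist = NormalDist(mu=mu0, sigma=sigma / math.sqrt(n))

# One-sided P-value: probability of a sample mean of 1.128 or smaller
p_value = sampling_dist.cdf(xbar)
print(f"one-sided P-value = {p_value:.4f}")
```

The result should be close to the 0.0064 quoted for this example; small differences come from rounding z before a table look-up.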
Computing P-values, or Not
We can compute P-values by means of the applet P-Value of a Test of Significance.
Our focus for now: Understanding what a P-value means.
Next time we’ll talk about how to compute P-values.

Aspirin revisited
Suppose the aspirin content of aspirin tablets from the previous example follows a Normal distribution with σ = 5 mg.
H0: µ = 325 mg
Ha: µ ≠ 325 mg
Data from a random sample of 10 aspirin tablets yield x̄ = 326.9 mg.
The alternative hypothesis is two-sided, so the P-value is the probability of getting a sample whose mean x̄ is at least as far from µ = 325 mg in either direction as the observed x̄ = 326.9 mg.
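For a two-sided alternative, both tails count. The following sketch adds the two tail areas that lie at least as far from 325 mg as the observed mean, assuming the sampling distribution N(325, 5/√10) under H0; variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0 = 325.0    # mean claimed by H0 (mg)
sigma = 5.0    # known population standard deviation (mg)
n = 10         # sample size
xbar = 326.9   # observed sample mean (mg)

# Sampling distribution of the sample mean when H0 is true
sampling_dist = NormalDist(mu=mu0, sigma=sigma / math.sqrt(n))

# Two-sided P-value: area at least as far from mu0 in either direction
distance = abs(xbar - mu0)
lower_tail = sampling_dist.cdf(mu0 - distance)
upper_tail = 1 - sampling_dist.cdf(mu0 + distance)
p_value = lower_tail + upper_tail
print(f"two-sided P-value = {p_value:.4f}")
```

The value comes out near 0.23, in line with the 0.2302 quoted for this example (which rounds z to 1.20 first).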
Conclusion about aspirin?
We failed to find evidence against H0 .
This just means that the data are consistent with H0 .
This does not mean that we have clear evidence that H0 is true.
How small should P-values be?
In the phosphorus level example, a P-value of 0.0064 was strong evidence against the null hypothesis.
In the aspirin example, a P-value of 0.2302 did not give convincing evidence.
How small should a P-value be for us to reject the null hypothesis?
Unfortunately, there is no general rule as this ultimately depends on the specific circumstances.
Statistical Significance
However, there are fixed cutoffs for P-values that are commonly used as standards of evidence against a null hypothesis.
The most common values are 0.05 and 0.01.
If P ≤ 0.05, then there is no more than a 1 in 20 chance that a sample would give evidence this strong just by chance when the null hypothesis is true.
If P ≤ 0.01, it is no more than a 1 in 100 chance.
These fixed standards for P-values are called significance levels.

Significance levels
If the P-value is less than or equal to a chosen significance level α, we say the data are statistically significant at level α.
“significant” ≠ “important”
“significant” = “not likely to happen just by chance due to random variations from sample to sample”
Test for a Population Mean
For significance tests of a population mean, we compare the sample mean x̄ with the claimed population mean stated in the null hypothesis H0.
The P-value shows how likely (or unlikely) an x̄ is if H0 is true.
So how do we calculate the P-value (without help from an applet)?

z-test for a Population Mean
Draw an SRS of size n from a Normal population that has unknown mean µ and known standard deviation σ.
To test the null hypothesis that µ has a specified value
H0: µ = µ0,
calculate the one-sample z test statistic
\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}. \]
z-scores and P-values
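A minimal sketch of how a z-score is commonly turned into a P-value for each kind of alternative hypothesis. The function names and the labels "less", "greater", and "two-sided" are choices made for this illustration, not notation from the slides; the demonstration reuses the phosphorus numbers (x̄ = 1.128, µ0 = 1.2, σ = 0.1, n = 12).

```python
import math
from statistics import NormalDist

def z_statistic(xbar, mu0, sigma, n):
    """One-sample z test statistic (xbar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

def p_value_from_z(z, alternative):
    """P-value for Ha: mu < mu0 ('less'), mu > mu0 ('greater'),
    or mu != mu0 ('two-sided')."""
    standard_normal = NormalDist()
    if alternative == "less":
        return standard_normal.cdf(z)            # lower tail
    if alternative == "greater":
        return 1 - standard_normal.cdf(z)        # upper tail
    if alternative == "two-sided":
        return 2 * standard_normal.cdf(-abs(z))  # both tails
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

# Phosphorus example with the one-sided alternative Ha: mu < 1.2
z = z_statistic(1.128, 1.2, 0.1, 12)
p = p_value_from_z(z, "less")
print(f"z = {z:.2f}, P-value = {p:.4f}")
```

Standardizing first and then using the standard Normal distribution is equivalent to working with the sampling distribution of x̄ directly; it is the form that matches a printed table.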
Example: Body Temperature
The normal healthy body temperature is 98.6 degrees Fahrenheit (37.0 degrees Celsius).
This widely quoted value is based on a paper published in 1868 by German physician Carl Wunderlich, based on over a million body-temperature readings.
Suppose we claim that this value is not correct.

Example: State Hypotheses
The null hypothesis is “no difference” from the accepted mean µ0 = 98.6°F.
H0: µ = 98.6
The alternative hypothesis is two-sided because we have no particular direction in mind prior to examining the data.
Ha: µ ≠ 98.6
Example: z Test Statistic
Suppose we know that individual body temperatures follow a Normal distribution with standard deviation σ = 0.6°F.
We take a sample of 130 adults and the mean oral temperature is x̄ = 98.25°F.
The one-sample z test statistic is
\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{98.25 - 98.6}{0.6 / \sqrt{130}} = -6.65 \]
Example: Finding the P-value
The z-score is off the chart on Table B, so P(Z ≤ −6.65) is essentially zero.
Example: Conclusion
We are doing a two-sided test, so the probability that we compare with the significance level is 2P(Z ≤ −6.65) ≈ 0.
Using α = 0.01, we will reject H0.
There is strong evidence that the true mean body temperature of healthy adults is not 98.6°F.
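The whole body temperature test can be traced in a few lines. This is a sketch using the numbers from the example (µ0 = 98.6, σ = 0.6, n = 130, x̄ = 98.25, α = 0.01), with `statistics.NormalDist` standing in for Table B; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0 = 98.6     # mean claimed by H0 (degrees F)
sigma = 0.6    # known population standard deviation (degrees F)
n = 130        # sample size
xbar = 98.25   # observed sample mean (degrees F)
alpha = 0.01   # significance level

# One-sample z test statistic
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Two-sided P-value: twice the area beyond |z| in one tail
p_value = 2 * NormalDist().cdf(-abs(z))

reject_h0 = p_value <= alpha
print(f"z = {z:.2f}, two-sided P-value = {p_value:.2e}, reject H0: {reject_h0}")
```

Software can report the tiny tail area that the printed table cannot show; either way the P-value is far below α = 0.01.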
Hypothesis Tests from Confidence Intervals
Confidence intervals and tests of significance have similarities.
• Both start with a sample mean x̄.
• Both rely on Normal probabilities.
In fact, a two-sided test at significance level α can be carried out from a confidence interval with confidence level C = 1 − α.
Hypothesis Tests from Confidence Intervals
A level α two-sided significance test rejects a hypothesis
H0: µ = µ0
when the value µ0 falls outside a level 1 − α confidence interval for µ.
Let’s look back at our body temperature example.
Recall we had a sample of 130 adults with a mean body temperature of x̄ = 98.25°F.
Also recall that µ0 = 98.6°F, and the population standard deviation σ = 0.6°F.
Now we will construct a 99% confidence interval.
The confidence interval is
\[ \bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}. \]
Plugging in x̄, z*, σ, and n yields
\[ 98.25 \pm (2.576) \frac{0.6}{\sqrt{130}}, \]
that is,
98.25 ± 0.136.
Thus our interval is
[98.11, 98.39].
The hypothesized parameter value µ0 = 98.6°F falls outside the confidence interval [98.11, 98.39].
Thus we reject the null hypothesis at a significance level of α = 0.01.
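The link between the interval and the test can be sketched as follows, using the body temperature numbers. The function name `z_interval` is a label for this illustration, and the critical value is computed with `NormalDist.inv_cdf` instead of being read from a table.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence_level):
    """Level-C confidence interval: xbar +/- z* * sigma / sqrt(n)."""
    z_star = -NormalDist().inv_cdf((1 - confidence_level) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

mu0 = 98.6     # value claimed by H0
alpha = 0.01   # significance level of the two-sided test

# 99% interval, since C = 1 - alpha
low, high = z_interval(98.25, 0.6, 130, 1 - alpha)

# Reject H0 exactly when mu0 falls outside the interval
reject_h0 = not (low <= mu0 <= high)
print(f"99% CI = [{low:.2f}, {high:.2f}], reject H0: {reject_h0}")
```

The decision agrees with the one reached from the two-sided P-value, as the duality between level α tests and level 1 − α intervals says it must.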
