Lecture note of business statistics II for RMC students

CHAPTER THREE
3. Hypothesis Testing:
3.1 Basic concepts
- Statistical hypothesis testing is one way of making inferences about a population
parameter when the investigator has a prior notion about the value of that parameter.
Definitions:
- Statistical hypothesis: an assertion or statement about the population whose plausibility is
to be evaluated on the basis of the sample data.
- Test statistic: a statistic whose value determines whether to reject or not reject the
hypothesis being tested. It is a random variable.
- Statistical test: a procedure used to evaluate a statistical hypothesis; its outcome
depends on the sample data.
There are two types of hypotheses:
Null hypothesis:
- It is the hypothesis to be tested.
- It is the hypothesis of equality or the hypothesis of no difference.
- Usually denoted by H0.
Alternative hypothesis:
- It is the hypothesis adopted when the null hypothesis is rejected.
- It is the hypothesis of difference.
- Usually denoted by H1 or Ha.

Prepared by: Worku Shanko (MSc in Biostatistics) 1


3.2 Steps in Hypothesis testing
1. Evaluate data
2. Review assumptions
3. State hypothesis
4. Select test statistic
5. Determine distribution of test statistic
6. State decision rule
7. Calculate test statistic
8. Make statistical decision
9. Conclusion
Evaluate Data:
- Understand the nature of the data: whether it is categorical or continuous

Assess the statistical assumptions:
- The distribution is approximately normal
- The variance is known or unknown
- The samples are independent

State the appropriate hypotheses explicitly and clearly:
- Specify H0 and Ha. Suppose the assumed or hypothesized value of μ is denoted by μ0.

Example:
1. Can we conclude that a certain population mean is:
- not 50?
  H0: μ = 50 vs. Ha: μ ≠ 50
- greater than 50?
  H0: μ = 50 vs. Ha: μ > 50
- less than 50?
  H0: μ = 50 vs. Ha: μ < 50
2. Among the 225 students who ate the sandwiches, 109 became ill, while among the 38 students
who did not eat the sandwiches, 4 became ill. Can we conclude that there is a significant
difference between the two groups?
H0: p1 = p2 vs. Ha: p1 ≠ p2

Decide on the appropriate test statistic:



Specify the desired level of significance and determine the critical value:
- The level of significance α is selected by the researcher; the most commonly used values are 0.01, 0.05 and 0.10.

State decision rule:
- The decision rule specifies the critical value(s) against which the computed test statistic is compared to declare significance

Obtain sample evidence and compute the test statistic:


Reach a decision and draw the conclusion:
- If H0 is rejected, we conclude that Ha is true (Ha is accepted)
- If H0 is not rejected, we conclude that H0 may be true
Rejection and Non-Rejection Regions
- The values of the test statistic are points on the horizontal axis of the normal
distribution and are divided into two groups:
  - Rejection region, and
  - Non-rejection region
- The values of the test statistic forming the rejection region are less likely to occur if H0 is
true
- The values making up the non-rejection (acceptance) region are more likely to occur if H0 is
true
Example: Two-sided test at α = 5%
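The figure for this example is not reproduced in the text. As a rough sketch only, the two-sided rejection region at α = 5% can be computed with Python's standard-library `statistics.NormalDist`. The helper name `two_sided_region` and the trial value -2.12 are illustrative assumptions, not part of the notes.

```python
from statistics import NormalDist

def two_sided_region(z, alpha=0.05):
    """Classify a computed Z value for a two-sided test (illustrative helper)."""
    # Each tail holds alpha/2, so the upper critical value is the
    # (1 - alpha/2) quantile of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    in_rejection = abs(z) >= z_crit
    return z_crit, ("rejection" if in_rejection else "non-rejection")

z_crit, region = two_sided_region(-2.12)
print(round(z_crit, 2), region)   # 1.96 rejection
```

With α = 0.05 the region boundaries are about -1.96 and 1.96, so any computed Z beyond either boundary falls in the rejection region.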



Statistical Decision:
- Reject H0 if the value of the test statistic computed from the sample falls in the rejection region
- Don't reject H0 if the computed value of the test statistic falls in the non-rejection region

3.3. Type I and Type II errors (concepts)

Types and size of errors:
- Hypothesis testing is based on sample data, which may involve sampling and non-sampling
errors.
- The following table gives a summary of the possible results of any hypothesis test:

                     Decision
Truth        Reject H0          Don't reject H0
H0 true      Type I error       Right decision
H1 true      Right decision     Type II error

- Type I error: Rejecting the null hypothesis when it is true.


- Type II error: Failing to reject the null hypothesis when it is false.

Type I Error
- The error committed when a true H0 is rejected
- The probability of a Type I error is α
- α is called the level of significance of the test
Type II Error
- The error committed when a false H0 is not rejected
- The probability of a Type II error is β
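As a hedged illustration of the two error probabilities, the sketch below evaluates α and β for a two-sided Z test with rejection rule |Z| ≥ 1.96. The scenario is hypothetical and chosen only for illustration: H0: µ = 30, a true mean of 27, σ² = 20 and n = 10. The function names are likewise assumptions.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

def type_i_probability(z_crit):
    """P(reject H0 | H0 true) for the two-sided rule |Z| >= z_crit."""
    return 2 * (1 - std_normal.cdf(z_crit))

def type_ii_probability(mu0, mu_true, sigma2, n, z_crit):
    """P(do not reject H0 | true mean is mu_true) for the same rule."""
    # Under the true mean, Z = (xbar - mu0) / (sigma / sqrt(n)) is normal
    # with mean `shift` and standard deviation 1.
    shift = (mu_true - mu0) / sqrt(sigma2 / n)
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

alpha = type_i_probability(1.96)
# Hypothetical scenario: H0 says mu = 30, but the true mean is 27.
beta = type_ii_probability(mu0=30, mu_true=27, sigma2=20, n=10, z_crit=1.96)
print(round(alpha, 3), round(beta, 3))   # alpha is about 0.05
```

In this scenario β comes out at roughly 0.44. Unlike α, β is not fixed by the researcher; it depends on how far the true parameter lies from the hypothesized value and on the sample size.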
3.4. One-tailed and two-tailed hypothesis tests

- In a one-tailed test, the rejection region is at one end of the distribution or the other
- In a two-tailed test, the rejection region is split between the two tails
- Which one is used depends on the way the alternative hypothesis (Ha) is stated



Example:
- For the average survival time after cancer diagnosis, Ha may state that it is less than 3 years
(lower-tailed test), greater than 3 years (upper-tailed test) or different from 3 years (two-tailed test)
3.5. Hypothesis testing of population mean and proportion:
3.5.1. Single population mean and proportion
i. Hypothesis testing of a single mean

1. Known Variance:

Example: two-tailed test:
- A simple random sample of 10 people from a certain population has a mean age of 27. Can
we conclude that the mean age of the population is not 30? Let σ² = 20, α = 0.05.

Solution:
A. Data: n = 10, sample mean = 27, σ² = 20, α = 0.05
B. Assumptions
- Simple random sample
- Normally distributed population
C. Hypotheses:
H0: µ = 30
vs.
Ha: µ ≠ 30



D. Test statistic:
- As the population variance is known, we use the Z test statistic:
$$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

E. Decision Rule:
- Reject H0 if the Z value falls in the rejection region, that is, if Z ≤ -1.96 or Z ≥ 1.96
OR
- Don't reject H0 if the Z value falls in the non-rejection region, that is, if -1.96 < Z < 1.96

F. Calculate test statistic:
$$Z = \frac{27 - 30}{\sqrt{20/10}} = \frac{-3}{1.414} = -2.12$$

G. Statistical decision:
- We reject H0 because Z = -2.12 is in the rejection region. The result is significant at the 5% level.
H. Conclusion
- We can conclude that the mean age of the population is not equal to 30.
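The known-variance example above (n = 10, sample mean 27, σ² = 20, µ0 = 30) can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `z_test_mean` and its `tail` argument are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(xbar, mu0, sigma2, n, alpha=0.05, tail="two"):
    """One-sample Z test with known population variance (illustrative helper)."""
    z = (xbar - mu0) / sqrt(sigma2 / n)
    quantile = NormalDist().inv_cdf
    if tail == "two":
        reject = abs(z) >= quantile(1 - alpha / 2)   # alpha/2 in each tail
    elif tail == "lower":
        reject = z <= quantile(alpha)                # all of alpha in the lower tail
    else:
        reject = z >= quantile(1 - alpha)            # all of alpha in the upper tail
    return z, reject

z, reject = z_test_mean(xbar=27, mu0=30, sigma2=20, n=10, tail="two")
print(round(z, 2), reject)   # -2.12 True
```

Calling the same helper with `tail="lower"` corresponds to a one-tailed test of Ha: µ < 30, where the critical value is about -1.645.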

One-tailed and two-tailed critical Z values for commonly used confidence levels:

Confidence level    One-tailed    Two-tailed
90%                 1.28          1.645
95%                 1.645         1.96
99%                 2.33          2.575
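Critical values like these come from the quantiles of the standard normal distribution. The sketch below recomputes them with `statistics.NormalDist`; the function name `critical_values` is an assumption. For the 99% level the one-tailed quantile is about 2.33, and the two-tailed quantile is about 2.576.

```python
from statistics import NormalDist

def critical_values(level):
    """Return (one-tailed, two-tailed) standard normal critical values."""
    alpha = 1 - level
    inv = NormalDist().inv_cdf
    one_tailed = inv(1 - alpha)        # all of alpha in a single tail
    two_tailed = inv(1 - alpha / 2)    # alpha split equally between both tails
    return one_tailed, two_tailed

for level in (0.90, 0.95, 0.99):
    one, two = critical_values(level)
    print(f"{level:.0%}  one-tailed {one:.3f}  two-tailed {two:.3f}")
```

For a lower-tailed test the same magnitude is used with a negative sign, for example -1.645 at α = 0.05.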



Example: one-tailed test: A simple random sample of 10 people from a certain population has a
mean age of 27. Can we conclude that the mean age of the population is less than 30? The variance
is known to be 20. Let α = 0.05.
Solution:
a. Data
n = 10, sample mean = 27, σ² = 20, α = 0.05
b. Hypotheses
H0: µ = 30 vs. Ha: µ < 30
c. Decision rule: since α = 0.05 and this is a one-tailed (lower-tail) test,
- the critical value is Z = -1.645
- reject H0 if Z < -1.645
d. Calculate test statistic:
$$Z = \frac{27 - 30}{\sqrt{20/10}} = -2.12$$
e. Statistical decision:
- We reject H0 because -2.12 < -1.645
f. Conclusion:
- We can conclude that the mean age of the population is less than 30.
2. Unknown Variance:
- In most practical applications the standard deviation of the underlying population is not
known
- In this case, σ can be estimated by the sample standard deviation (s)



Example: two-tailed test: A simple random sample of 14 people from a certain population gives
a sample mean BMI of 30 and an SD of 10.64. Can we conclude that the mean BMI is not 35 at α = 5%?
Solution:
H0: µ = 35 vs. Ha: µ ≠ 35

Test statistic:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
- If the assumptions are correct and H0 is true, the test statistic follows Student's t distribution
with n − 1 = 13 degrees of freedom

Decision rule:
- We have a two-tailed test. With α = 0.05, each tail has an area of 0.025
- The critical t values with 13 degrees of freedom are -2.1604 and 2.1604
- We reject H0 if t ≤ -2.1604 or t ≥ 2.1604

Calculate test statistic:
$$t = \frac{30 - 35}{10.64 / \sqrt{14}} = \frac{-5}{2.844} = -1.76$$

Statistical decision:
- Do not reject H0 because -1.76 is not in the rejection region

Conclusion:
- Based on the sample data, it is possible that µ = 35.
- With a large sample, even if the population is not normally distributed, we can use Z as
the test statistic, calculated using the sample standard deviation
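The BMI example can be checked with a short sketch. Python's standard library has no Student's t distribution, so the critical value 2.1604 (13 degrees of freedom, α = 0.05, two-tailed) is copied from the notes as a constant rather than computed. The helper name `t_statistic` is an assumption. With the summary figures as stated (mean 30, SD 10.64, n = 14), the statistic evaluates to about -1.76, which lies inside the non-rejection region.

```python
from math import sqrt

def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic when sigma is estimated by the sample SD s."""
    return (xbar - mu0) / (s / sqrt(n))

# Critical value for 13 degrees of freedom at alpha = 0.05 (two-tailed),
# taken from the notes rather than computed here.
T_CRIT_13_DF = 2.1604

t = t_statistic(xbar=30, mu0=35, s=10.64, n=14)
reject = abs(t) >= T_CRIT_13_DF
print(round(t, 2), reject)   # -1.76 False
```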

ii. Hypothesis tests for proportions: Hypothesis testing about a single population proportion
o Involves categorical values
o Two possible outcomes
• “Success” (possesses a certain characteristic)
• “Failure” (does not possess it)
o The fraction or proportion of the population in the “success” category is denoted by p



Example one: In a survey of injection drug users in a large city, researchers found that 18 out of
423 were HIV positive. We wish to know if we can conclude that fewer than 5% of the injection
drug users in the sampled population are HIV positive. Let α = 5%.
Solution:
Data: data are obtained from 423 individuals, of whom 18 were HIV positive, so the sample proportion is p̂ = 18/423 = 0.0426
Assumption: the sampling distribution of p̂ is approximately normal according to the central
limit theorem
State Hypothesis: H0: p = 0.05 vs. Ha: p < 0.05

Test statistic:
$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$$
Decision rule: since α = 5% and the test is lower-tailed,
• the critical value of z is -1.645
• reject H0 if the computed z is < -1.645
Calculation of test statistic:
$$Z = \frac{0.0426 - 0.05}{\sqrt{0.05 (1 - 0.05) / 423}} = \frac{0.0426 - 0.05}{\sqrt{0.05 (0.95) / 423}} = -0.70$$

Statistical decision:
- Don't reject H0 since -0.70 > -1.645
Conclusion:
- We conclude that the population proportion who are HIV positive may be 0.05
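A minimal sketch of the single-proportion calculation above, using only the standard library. The helper name `z_test_proportion` is an assumption. The standard error uses the hypothesized value p0 = 0.05, as in the worked calculation.

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion(successes, n, p0):
    """Z statistic for H0: p = p0, using p0 in the standard error."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

z = z_test_proportion(successes=18, n=423, p0=0.05)
z_crit = NormalDist().inv_cdf(0.05)   # lower-tail critical value, about -1.645
print(round(z, 2), z < z_crit)        # -0.7 False
```

Because the computed Z is not below the critical value, the sketch reaches the same decision as the example: H0 is not rejected.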
3.5.2. The difference between two means and two proportions
i. Hypothesis testing about the difference between two population means
- Two-tailed test:
  H0: μ1 = μ2 vs. Ha: μ1 ≠ μ2, or
  H0: μ1 − μ2 = 0 vs. Ha: μ1 − μ2 ≠ 0
- Upper (right) tail test:
  H0: μ1 = μ2 vs. Ha: μ1 > μ2, or
  H0: μ1 − μ2 = 0 vs. Ha: μ1 − μ2 > 0
- Lower (left) tail test:
  H0: μ1 = μ2 vs. Ha: μ1 < μ2, or
  H0: μ1 − μ2 = 0 vs. Ha: μ1 − μ2 < 0

Example: Simple random samples of 12 and 14 people from two populations have mean
ages of 30 and 32, respectively. Can we conclude that the difference between the two population mean ages is not zero? Let
σ1² = σ2² = 9, α = 0.05.
Solution:
State the hypothesis: H0: μ1 − μ2 = 0 vs. Ha: μ1 − μ2 ≠ 0
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}} = \frac{30 - 32}{\sqrt{9/12 + 9/14}} = \frac{-2}{1.18} = -1.69$$
Since -1.69 lies between the critical values -1.96 and 1.96, it is in the non-rejection region, so we don't reject H0.
At the 5% level of significance, we cannot conclude that the two population mean ages differ; the data are consistent with a difference of zero.
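The two-sample calculation can be sketched as follows, assuming known population variances as in the example. The helper name `z_test_two_means` is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_means(xbar1, xbar2, var1, var2, n1, n2):
    """Z statistic for H0: mu1 - mu2 = 0 with known population variances."""
    return (xbar1 - xbar2) / sqrt(var1 / n1 + var2 / n2)

z = z_test_two_means(xbar1=30, xbar2=32, var1=9, var2=9, n1=12, n2=14)
z_crit = NormalDist().inv_cdf(0.975)   # two-tailed critical value at alpha = 0.05
print(round(z, 2), abs(z) >= z_crit)   # -1.69 False
```

The second printed value is False because |Z| is smaller than about 1.96, so H0 of no difference is not rejected.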
ii. Hypothesis tests about the difference between two population proportions
- Two-tailed test:
  H0: p1 = p2 vs. Ha: p1 ≠ p2, or
  H0: p1 − p2 = 0 vs. Ha: p1 − p2 ≠ 0
- Upper (right) tail test:
  H0: p1 = p2 vs. Ha: p1 > p2, or
  H0: p1 − p2 = 0 vs. Ha: p1 − p2 > 0
- Lower (left) tail test:
  H0: p1 = p2 vs. Ha: p1 < p2, or
  H0: p1 − p2 = 0 vs. Ha: p1 − p2 < 0

Test statistic (using the pooled sample proportion):
$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad \hat{p}_1 = \frac{X_1}{n_1}, \quad \hat{p}_2 = \frac{X_2}{n_2}, \quad \bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$$
- Where X1 = the observed number of events in the first sample and X2 = the observed
number of events in the second sample.

Example: A study of nutrition care in nursing homes found that among 55 patients with
hypertension, 24 were on a Na-restricted diet. Of 149 patients without hypertension, 36 were on a
Na-restricted diet. May we conclude that the proportion on a Na-restricted diet is higher among patients
with hypertension than among patients without hypertension at α = 5%?
Solution:
Data: the data consist of the sodium status of the diets of nursing home patients with and without
hypertension
Assumption: the patients in the study constitute independent random samples from the two populations

State Hypothesis: H0: p1 = p2 vs. Ha: p1 > p2
Where: p1 = proportion on a Na-restricted diet in the population with hypertension
p2 = proportion on a Na-restricted diet in the population without hypertension

Decision rule: since α = 5% and the test is upper-tailed, reject H0 if the computed z is > 1.645

Calculate test statistic: with p̂1 = 24/55 = 0.4364, p̂2 = 36/149 = 0.2416 and pooled proportion p̄ = (24 + 36)/(55 + 149) = 0.2941,
$$Z = \frac{0.4364 - 0.2416}{\sqrt{0.2941 (0.7059) \left(\frac{1}{55} + \frac{1}{149}\right)}} = \frac{0.1948}{0.0719} = 2.71$$

Statistical decision:
- Reject H0, since 2.71 > 1.645
Conclusion:
- The proportion of patients on Na-restricted diets is higher among hypertensive patients than
among patients without hypertension
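The value Z = 2.71 is consistent with a two-proportion Z test that uses the pooled sample proportion in the standard error. The sketch below assumes that pooled form, which reproduces the value reported in the example. The helper name `z_test_two_proportions` is illustrative.

```python
from math import sqrt

def z_test_two_proportions(x1, n1, x2, n2):
    """Z statistic for H0: p1 = p2 using the pooled proportion (assumed form)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)           # common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Nursing home example: 24 of 55 with hypertension, 36 of 149 without.
z = z_test_two_proportions(x1=24, n1=55, x2=36, n2=149)
print(round(z, 2), z > 1.645)   # 2.71 True
```

The same helper applies to any pair of independent samples summarized as counts of events and sample sizes; only the comparison against the critical value changes with the tail and α.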

Example two: Among the 225 students who ate the sandwiches, 109 became ill, while among
the 38 students who did not eat the sandwiches, 4 became ill. Is there a significant difference
between the two groups at α = 1%?
Solution:
Data: the data consist of 225 students who ate the sandwiches and 38 who did not
Assumption: the sampling distributions of the two sample proportions are approximately normal according to the
central limit theorem
Hypothesis: we wish to test H0: p1 = p2 vs. Ha: p1 ≠ p2
Test statistic, where p̄ = (X1 + X2)/(n1 + n2) is the pooled sample proportion:
$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Decision rule: since α = 0.01 and the test is two-tailed,
- the critical values of z are -2.58 and 2.58
- reject H0 if the computed z is < -2.58 or > 2.58
Calculate test statistic: with p̂1 = 109/225 = 0.4844, p̂2 = 4/38 = 0.1053 and p̄ = (109 + 4)/(225 + 38) = 0.4297,
$$Z = \frac{0.4844 - 0.1053}{\sqrt{0.4297 (0.5703) \left(\frac{1}{225} + \frac{1}{38}\right)}} = \frac{0.3791}{0.0868} = 4.37$$
Statistical decision:
- Reject H0, since 4.37 > 2.58
Conclusion:
- The proportion of students who became ill differs between the two groups; those who ate the prepared
sandwiches were more likely to become ill.
