
L2 Hypothesis Testing


HYPOTHESIS TESTING
Example 1
• State the hypotheses – Identify the null and alternative hypotheses for the
following claims:
1. The average number of defective items is equal to 10.
2. The average number of defective items is greater than 10.
3. The average number of defective items is less than 10.
• Solution:
o $H_0: \mu = k$ (claim), $H_1: \mu \neq k$
o $H_0: \mu = k$, $H_1: \mu > k$ (claim)
o $H_0: \mu = k$, $H_1: \mu < k$ (claim)
o Where $k$ is a given value (here $k = 10$).

| Two-tailed test | One-tailed test (right-tailed) | One-tailed test (left-tailed) |
| --- | --- | --- |
| $H_0: \mu = k$ | $H_0: \mu = k$ | $H_0: \mu = k$ |
| $H_1: \mu \neq k$ | $H_1: \mu > k$ | $H_1: \mu < k$ |

Table 1: Mathematical symbols for one-tailed and two-tailed tests for one sample

NOTE

• For the null hypothesis ($H_0$), we always use the equality sign.
• For the alternative hypothesis ($H_1$), we always use either <, >, or ≠.
Example 2

• Finding the critical and noncritical regions – Identify the critical
and noncritical regions for each of the alternative hypotheses.
Use a significance level of $\alpha = 0.05$. Assume that the
distribution is normal.
1. 𝐻 : 𝜇 5 2.  𝐻 : 𝜇 5
3.  𝐻 : 𝜇 5
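As a rough illustration of how the boundaries of these regions can be found, the sketch below uses Python's standard-library `statistics.NormalDist` to compute the critical values of a standard normal distribution at α = 0.05 for the three possible forms of the alternative hypothesis. The function name `critical_region` and the `tail` labels are illustrative choices, not part of these notes.

```python
from statistics import NormalDist

def critical_region(alpha: float, tail: str) -> str:
    """Describe the critical (rejection) region of a Z-test.

    tail is "right", "left", or "two" (hypothetical labels for this sketch).
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "right":   # H1: mu > k
        return f"Z > {z.inv_cdf(1 - alpha):.3f}"
    if tail == "left":    # H1: mu < k
        return f"Z < {z.inv_cdf(alpha):.3f}"
    if tail == "two":     # H1: mu != k, alpha is split between both tails
        c = z.inv_cdf(1 - alpha / 2)
        return f"Z < {-c:.3f} or Z > {c:.3f}"
    raise ValueError("tail must be 'right', 'left', or 'two'")

for tail in ("right", "left", "two"):
    print(tail, "->", critical_region(0.05, tail))
```

Everything outside the printed critical region is the noncritical (nonrejection) region.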
ERRORS IN HYPOTHESIS TESTING
• In hypothesis testing, we must decide whether or not to reject the null
hypothesis. The decision may be correct or wrong; a wrong decision is either
a type I error or a type II error. The two types of errors are defined as
follows:
1. A type I error is the event of rejecting the null hypothesis when the
null hypothesis is true. The probability of a type I error is called the
significance level, α.
2. A type II error is the event of failing to reject the null hypothesis
when the null hypothesis is false. The probability of a type II error is β. The two
types of errors are summarized in the following table:
| | $H_0$ true | $H_0$ false |
| --- | --- | --- |
| Reject $H_0$ | Type I error | Correct decision |
| Fail to reject $H_0$ | Correct decision | Type II error |
NOTE

• For a fixed sample size, a large α results in a small β, and a small α
results in a large β.
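The trade-off in the note can be made concrete with a small calculation. The sketch below assumes a right-tailed Z-test with known σ and a specific true mean under the alternative; the function name `type_ii_error` and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """beta for a right-tailed Z-test of H0: mu = mu0 when the true mean is mu_true."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)                 # boundary of the critical region
    shift = (mu_true - mu0) / (sigma / sqrt(n))   # true mean in standard-error units
    return z.cdf(z_crit - shift)                  # P(fail to reject H0 | H0 is false)

# Hypothetical values: H0 mean 100, true mean 105, sigma 15, n 36.
for alpha in (0.01, 0.05, 0.10):
    beta = type_ii_error(alpha, mu0=100, mu_true=105, sigma=15, n=36)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}")
```

With the sample size held fixed, the printed β values shrink as α grows, which is the behaviour the note describes.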
PROCEDURE FOR CONDUCTING HYPOTHESIS TESTING

• Before giving the general procedure, note that there are three methods for
conducting hypothesis testing:
1. Traditional method
2. P-value method
3. Confidence interval method

• In this chapter, the traditional and P-value methods will be studied. The third method,
the confidence interval method, will be given in the next chapter.
TRADITIONAL METHOD

• Having presented the terms used in hypothesis testing, we can now give the procedure for
conducting hypothesis testing using the traditional method. The steps are as follows:
1. State the null and alternative hypotheses.
2. Choose the significance level, α.
3. Select the appropriate test statistic and determine its value using the sample data.
4. Decide whether or not to reject the null hypothesis by comparing the test statistic value
calculated from the sample with the critical value. The null hypothesis is rejected if the
test statistic value falls within the critical region and is not rejected otherwise.
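The four steps can be sketched as a small function. This is only an illustration under stated assumptions: it uses a Z statistic with known population standard deviation, the name `traditional_z_test` and the `tail` labels are invented for this sketch, and the data in the usage line are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def traditional_z_test(x_bar, mu0, sigma, n, alpha, tail):
    """Return (test value, reject H0?) for a one-sample Z-test."""
    # Step 1: the hypotheses are encoded by mu0 (H0: mu = mu0) and by
    #         tail: "right" (H1: mu > mu0), "left" (H1: mu < mu0), "two" (H1: mu != mu0).
    # Step 2: the significance level alpha is supplied by the caller.
    # Step 3: compute the test statistic from the sample data.
    z_value = (x_bar - mu0) / (sigma / sqrt(n))
    # Step 4: compare the test value with the critical value(s).
    std = NormalDist()
    if tail == "right":
        reject = z_value > std.inv_cdf(1 - alpha)
    elif tail == "left":
        reject = z_value < std.inv_cdf(alpha)
    elif tail == "two":
        reject = abs(z_value) > std.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return z_value, reject

# Hypothetical data, not taken from the examples in these notes.
z_value, reject = traditional_z_test(x_bar=52.0, mu0=50.0, sigma=6.0, n=36,
                                     alpha=0.05, tail="two")
print(f"Z = {z_value:.2f}, reject H0: {reject}")
```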
One-Sample Hypothesis Test About a Mean

• In this section, we study the procedure for testing a claim about a mean
for large and small samples, and for the cases where the population standard
deviation ($\sigma$) is known or unknown. Two tests will be presented in this
section: the Z-test and the t-test.
Z-TEST FOR TESTING CLAIMS ABOUT A MEAN

• The Z-test is a statistical test used to test the mean when the population is normally
distributed and $\sigma$ is known. Furthermore, the Z-test is used when the sample size is
large, $n \geq 30$. The formula for the Z-test is
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
• Where
$\bar{X}$ = the sample mean,
$\mu$ = the hypothesized mean (claimed value),
$\sigma$ = the population standard deviation, and
$n$ = the sample size.
Example 6.3
• Confectionery company – A manager of a confectionery company claims that
the average number of cakes sold daily is more than 1750. A random sample
of 36 days was selected to test the manager’s claim. The sample data showed
that the average is 1765 cakes. The standard deviation of the population is 100
cakes. Is there enough evidence to support the claim? Use α = 0.05. Assume
that the population is normally distributed.

• Solution.
First, we state the null and alternative hypotheses:
$H_0: \mu = 1750$, $H_1: \mu > 1750$ (claim)
The critical value for a right-tailed test at $\alpha = 0.05$ is 1.645.
Apply the Z-test formula to find the test value:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{1765 - 1750}{100 / \sqrt{36}} = 0.9$$
We conclude that there is not enough evidence to support the manager’s claim that the
average number of cakes sold daily is more than 1750. Can you state the reason?
Reason
Since the test value (0.9) is less than the critical value (1.645), it is in the
nonrejection (noncritical) region, so we do not reject the null hypothesis.
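The arithmetic in Example 6.3 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative, and the critical value is computed from the standard normal distribution (about 1.645 for a right-tailed test at α = 0.05) rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 6.3
x_bar, mu0, sigma, n, alpha = 1765, 1750, 100, 36, 0.05

z_value = (x_bar - mu0) / (sigma / sqrt(n))   # test value
z_crit = NormalDist().inv_cdf(1 - alpha)      # right-tailed critical value
reject_h0 = z_value > z_crit                  # critical region: Z > z_crit

print(f"Z = {z_value:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```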
Example 6.4
• Quality control – A quality control engineer in a company that produces bolts
claims that an automatic machine produces bolts with a mean diameter of 14 mm.
Bolts that vary too much in either direction from the mean diameter are not suitable
for use. The standard deviation from past experiments is 0.37 mm. A sample of 36
bolts was tested, and the mean was 13.8766 mm. Is there enough evidence to support the
claim? Use α = 0.01. Assume that the population is normally distributed.
• Solution.
First, we state the hypotheses:
$H_0: \mu = 14$ (claim), $H_1: \mu \neq 14$
The critical values for a two-tailed test at α = 0.01 are -2.58 and 2.58.
Compute the test value by applying the following formula:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{13.8766 - 14}{0.37 / \sqrt{36}} = -2.00$$
Decision: Since the test value (-2.00) lies between -2.58 and 2.58, it is in the
nonrejection (noncritical) region, so we do not reject the null hypothesis. We conclude
that there is not enough evidence to reject the engineer’s claim that the average diameter
of bolts produced by the automatic machine is 14 mm.
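A short check of the two-tailed calculation in Example 6.4, again as a sketch with the standard library only. The variable names are illustrative; the computed critical value (about 2.576) corresponds to the tabulated 2.58.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 6.4
x_bar, mu0, sigma, n, alpha = 13.8766, 14.0, 0.37, 36, 0.01

z_value = (x_bar - mu0) / (sigma / sqrt(n))     # test value
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed: alpha/2 in each tail
reject_h0 = abs(z_value) > z_crit               # critical region: |Z| > z_crit

print(f"Z = {z_value:.2f}, critical values = ±{z_crit:.3f}, reject H0: {reject_h0}")
```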
T-TEST FOR TESTING CLAIMS ABOUT A MEAN
• In many situations, the population standard deviation is
not known and the sample size is small. In this case, the
Z-test is not valid for testing a claim about the mean, and
we should use another test, called the t-test.

• Before defining the t-test, we present a new distribution,
called the t distribution.
• The t distribution is a family of curves based on the concept of degrees of
freedom ($df$) (the number of values that are free to vary after a sample
statistic has been computed), which is related to the sample size. Some
important properties of the t distribution are as follows:
1. It is bell-shaped like the standard normal distribution, but wider,
reflecting greater variability.
2. The mean of the t distribution is equal to 0.
3. The standard deviation is greater than 1.
4. As the sample size increases, the t distribution gets closer to the
standard normal distribution.
5. The t distribution is symmetric about the mean.
1. t-TEST

o The t-test is a statistical test used to test the mean when the
population is normally or approximately normally distributed
and $\sigma$ is unknown. Furthermore, the t-test is used when the sample
size is small, $n < 30$. The formula for the t-test is
$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$
o with $df = n - 1$ degrees of freedom, where $S$ is the sample standard deviation.
2. How to find the t critical value
o To get the critical value from the t distribution table, we need the degrees of freedom and the
significance level. For example:
o Find the t critical value for α = 0.05 with $df = 15$ for a two-tailed t-test.
o The t table has two margins: the rows (vertical) give the degrees of freedom and the columns (horizontal)
give the significance level, α. Locate the row for the degrees of freedom and the column for the
significance level; the entry at their intersection is the t critical value, as shown in Table
6.2. The critical value is 2.131.
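| One-tailed α | 0.25 | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 |
| --- | --- | --- | --- | --- | --- | --- |
| Two-tailed α | 0.50 | 0.20 | 0.10 | 0.05 | 0.02 | 0.01 |
| $df$ = 1 | | | | | | |
| 2 | | | | | | |
| 3 | | | | | | |
| ⋮ | | | | | | |
| 12 | | | | | | |
| 13 | | | | | | |
| 14 | | | | | | |
| 15 | | | | 2.131 | | |
| 16 | | | | | | |
| ⋮ | | | | | | |

Table 6.2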
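When a printed t table is not at hand, the critical value can be approximated numerically. The sketch below is one possible approach using only the standard library: it integrates the t density with Simpson's rule and then finds the quantile by bisection. The function names (`t_pdf`, `t_cdf`, `t_critical`), the step count, and the search interval are assumptions made for this illustration; a statistics library would normally be used instead.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(alpha: float, df: int, two_tailed: bool = True) -> float:
    """Positive critical value: upper-tail area alpha/2 (two-tailed) or alpha (one-tailed)."""
    target = 1 - (alpha / 2 if two_tailed else alpha)
    lo, hi = 0.0, 100.0                # assumed search interval for the quantile
    for _ in range(60):                # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(0.05, 15), 3))  # compare with the tabulated 2.131
```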
Example 6.5

• Weight of dried fruits – The label on packs of dried fruit shows a
weight of 275 g. A sample of 10 packs was selected and checked.
The mean and standard deviation were 277.25 g and 2.725 g,
respectively. Does it appear that the mean weight is 275 g? Assume
that the population is normally distributed. Use α = 0.05.
Solution.
State the null and alternative hypotheses. The null hypothesis states that the mean weight is
275 g, while the alternative hypothesis is that it is either more or less than 275 g. Thus, the
alternative hypothesis is that the mean weight is not equal to 275 g.
$H_0: \mu = 275$ (claim), $H_1: \mu \neq 275$
The critical values for a two-tailed test at α = 0.05 and $df = 9$ are 2.262 and -2.262.
Apply the t formula to compute the test value:
$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{277.25 - 275}{2.725 / \sqrt{10}} = 2.611$$
Make a decision: Since the test value (2.611) is greater than 2.262, it is in the rejection
(critical) region, so we reject the null hypothesis.

There is enough evidence to reject the null hypothesis and conclude that the mean weight is
not equal to 275 g.
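The test value in Example 6.5 can be reproduced as follows. This is a minimal sketch: the critical value 2.262 is taken from the t table as quoted in the example rather than computed, and the variable names are illustrative.

```python
from math import sqrt

# Data from Example 6.5
x_bar, mu0, s, n = 277.25, 275.0, 2.725, 10
t_crit = 2.262  # t table, two-tailed alpha = 0.05, df = n - 1 = 9

t_value = (x_bar - mu0) / (s / sqrt(n))   # test value
reject_h0 = abs(t_value) > t_crit         # critical region: |t| > 2.262

print(f"t = {t_value:.3f}, reject H0: {reject_h0}")
```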
Exercise:
10 minutes exercise
• Zinc concentration in water – An environmentalist claimed that
the average concentration of zinc (Zn) in the Juru River is less
than 0.15 mg/L. A sample of 8 sampling points was selected and
showed that the average is 0.092 mg/L and the standard deviation is
0.014 mg/L. Use α = 0.10 to test the environmentalist’s claim,
assuming the population is normally distributed.
• Solution.
State the hypotheses and identify the claim:
$H_0: \mu = 0.15$, $H_1: \mu < 0.15$ (claim)
The critical value for a left-tailed test at α = 0.10 and $df = 7$ is -1.415.
Apply the t formula to compute the test value:
$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{0.092 - 0.15}{0.014 / \sqrt{8}} = -11.718$$
Make a decision: Since the test value (-11.718) is less than -1.415, it is in the
rejection (critical) region, so we reject the null hypothesis.

There is enough evidence to support the claim that the
concentration of Zn in the Juru River is less than 0.15 mg/L.
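For a left-tailed test the comparison runs in the other direction: the null hypothesis is rejected when the test value falls below the negative critical value. The sketch below recomputes the test value from the data stated in the exercise; the helper name `t_statistic` is an illustrative choice, and the critical value is the tabulated one quoted in the solution.

```python
from math import sqrt

def t_statistic(x_bar: float, mu0: float, s: float, n: int) -> float:
    """One-sample t statistic with df = n - 1."""
    return (x_bar - mu0) / (s / sqrt(n))

# Data from the zinc exercise
t_value = t_statistic(x_bar=0.092, mu0=0.15, s=0.014, n=8)
t_crit = -1.415  # t table, one-tailed alpha = 0.10, df = 7 (left tail, so negative)
reject_h0 = t_value < t_crit   # critical region: t < -1.415

print(f"t = {t_value:.3f}, reject H0: {reject_h0}")
```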
More Exercise
Readings of the flow discharge of the Perak River (measured in m³/s) were taken at
random. 20 readings were collected, and the mean flow discharge was found
to be 3.85 m³/s with a standard deviation of 0.5 m³/s.

(a) Test the hypothesis that the mean flow discharge at the Perak River is not equal
to 4 m³/s. Use α = 0.05.

(b) Use the P-value approach to test the null hypothesis.

(c) Construct a 95% two-sided CI on the mean flow discharge. What is your
conclusion?

October 19 24
Solution: (a)
1. Problem: To test about the mean, variance unknown.
2. Hypotheses: $H_0: \mu = 4$ vs $H_1: \mu \neq 4$
3. Test statistic: $T = \dfrac{\bar{X} - \mu_0}{s / \sqrt{n}}$
4. Critical value: $t_{\alpha/2,\, n-1} = t_{0.025,\, 19} = 2.093$
5. Rejection region: Reject $H_0$ if $T > 2.093$ or $T < -2.093$
6. Calculation: $T = \dfrac{3.85 - 4}{0.5 / \sqrt{20}} = -1.34$
7. Conclusion:
Since -1.34 > -2.093, we fail to reject the null hypothesis and conclude that the true
mean flow discharge is not significantly different from 4 m³/s at α = 0.05.
Solution: (b)
From a t-distribution table with 19 degrees of freedom, |T| = 1.34
falls between two values: 1.328, for which the one-tailed α = 0.10,
and 1.729, for which the one-tailed α = 0.05. The one-tailed area
therefore lies between 0.05 and 0.10, and the two-tailed P-value is:

$2(0.05) < P < 2(0.10)$, that is, $0.10 < P < 0.20$

Since P > 0.05, we fail to reject $H_0$ and conclude
that the mean flow discharge is not significantly
different from 4 m³/s. Same result as in (a).

Solution: (c)
A 95% two-sided CI on the mean flow discharge is obtained from

$\bar{x} = 3.85$, $s = 0.5$, $n = 20$, $t_{\alpha/2,\, n-1} = t_{0.025,\, 19} = 2.093$

$$\bar{x} - t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}} \leq \mu \leq \bar{x} + t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$$
$$3.85 - (2.093) \frac{0.5}{\sqrt{20}} \leq \mu \leq 3.85 + (2.093) \frac{0.5}{\sqrt{20}}$$
so the 95% two-sided CI is $3.616 \leq \mu \leq 4.084$.

Since $\mu = 4$ falls inside the CI, we fail to reject the null
hypothesis and conclude that the true mean flow discharge is not
significantly different from 4 m³/s at α = 0.05.
Same results as in (a) and (b).
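Parts (a) and (c) of this small-sample exercise can be checked together, since both rely on the same standard error and the same tabulated value $t_{0.025,\,19} = 2.093$. The sketch below is illustrative only; the variable names are not from the notes, and the critical value is taken from the table rather than computed.

```python
from math import sqrt

# Data from the exercise (n = 20 readings)
x_bar, s, n, mu0 = 3.85, 0.5, 20, 4.0
t_crit = 2.093  # t table, two-tailed alpha = 0.05, df = 19

se = s / sqrt(n)                       # estimated standard error of the mean
t_value = (x_bar - mu0) / se           # part (a): test value
reject_h0 = abs(t_value) > t_crit      # part (a): decision

lower = x_bar - t_crit * se            # part (c): 95% confidence limits
upper = x_bar + t_crit * se
mu0_inside = lower <= mu0 <= upper     # H0 value inside the CI means fail to reject

print(f"T = {t_value:.2f}, reject H0: {reject_h0}")
print(f"95% CI: ({lower:.3f}, {upper:.3f}), contains 4: {mu0_inside}")
```

The test and the confidence interval lead to the same decision because both compare the distance between the sample mean and 4 with the same multiple of the standard error.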
Example (Large Sample Size)
Readings of the flow discharge of the Perak River (measured in m³/s) were taken at
random. 100 readings were collected, and the mean flow discharge was
found to be 3.85 m³/s with a standard deviation of 0.5 m³/s.

(a) Test the hypothesis that the mean flow discharge at the Perak River is not equal
to 4 m³/s. Use α = 0.05.

(b) Use the P-value approach to test the null hypothesis.

(c) Construct a 95% two-sided CI on the mean flow discharge. What is your
conclusion?

Solution: (a)
1. Problem: To test about the mean, variance unknown (large sample).
2. Hypotheses: $H_0: \mu = 4$ vs $H_1: \mu \neq 4$
3. Test statistic: $Z = \dfrac{\bar{X} - \mu_0}{s / \sqrt{n}}$
4. Critical value: for α = 0.05 (two-tailed), $Z_{0.025} = 1.96$
5. Rejection region: Reject $H_0$ if $Z > 1.96$ or $Z < -1.96$
6. Calculation: $Z = \dfrac{3.85 - 4}{0.5 / \sqrt{100}} = -3.0$
7. Conclusion:
Since -3.0 < -1.96, we reject the null hypothesis and conclude that the
true mean flow discharge is significantly different from 4 m³/s at
α = 0.05.
Solution: (b)

The P-value is:

$P = 2(1 - \Phi(3.0)) = 2(1 - 0.9987) = 0.0026$

Since P < 0.05, we reject $H_0$ at α = 0.05 and
conclude that the mean flow discharge is significantly
different from 4 m³/s. Same result as in (a).

Solution: (c)
A 95% two-sided CI on the mean flow discharge is obtained from

$\bar{x} = 3.85$, $s = 0.5$, $n = 100$, $Z_{\alpha/2} = 1.96$

$$\bar{x} - Z_{\alpha/2} \frac{s}{\sqrt{n}} \leq \mu \leq \bar{x} + Z_{\alpha/2} \frac{s}{\sqrt{n}}$$
$$3.85 - (1.96) \frac{0.5}{\sqrt{100}} \leq \mu \leq 3.85 + (1.96) \frac{0.5}{\sqrt{100}}$$
so the 95% two-sided CI is $3.752 \leq \mu \leq 3.948$.

Since $\mu = 4$ falls outside the CI, we reject the null
hypothesis and conclude that the true mean flow discharge is
significantly different from 4 m³/s at α = 0.05.
Same results as in (a) and (b).
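All three parts of the large-sample example can be reproduced with the standard normal distribution from Python's standard library. This is a sketch under the same large-sample assumption as the solution (the sample standard deviation stands in for σ); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from the large-sample example (n = 100 readings)
x_bar, s, n, mu0, alpha = 3.85, 0.5, 100, 4.0, 0.05
std = NormalDist()  # standard normal distribution

se = s / sqrt(n)
z_value = (x_bar - mu0) / se                   # part (a): test value
z_crit = std.inv_cdf(1 - alpha / 2)            # part (a): two-tailed critical value
p_value = 2 * (1 - std.cdf(abs(z_value)))      # part (b): two-tailed P-value
lower = x_bar - z_crit * se                    # part (c): 95% confidence limits
upper = x_bar + z_crit * se

print(f"Z = {z_value:.2f}, critical values = ±{z_crit:.2f}")
print(f"P-value = {p_value:.4f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The three approaches agree: the test value lies beyond the critical value, the P-value is below α, and 4 lies outside the interval.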
Exercise
1. An executive engineer in a company that produces juice claims that each
can is filled with an average of 250 mL of juice. If the mean is significantly
less than 250 mL, customers are likely to complain, prompting undesirable
publicity. The physical size of the can does not allow a mean volume
significantly above 250 mL. A sample of 45 cans shows an average of 244.2
mL. Production records show that the standard deviation is 0.4 mL. Use a
significance level of 0.05 to test the engineer’s claim.
2. A bakery owner claims that the average number of cakes sold daily is
1650. In a sample of 36 days, the mean is 1770 cakes and the standard
deviation of the population is 95. Use a significance level of 0.05 to test the
owner’s claim.
3. An environmentalist claims that the average concentration of zinc (Zn) in
the Juru River is more than 0.088 mg/L. A sample of 8 days was tested for Zn; the
average is 0.092 mg/L and the standard deviation is 0.014 mg/L. Is
there enough evidence to support the claim? Use α = 0.01.
Answer
• 1. Z = -97.27, Z critical = -1.96, 1.96. Reject H0.
• 2. Z = 7.579, Z critical = -1.96, 1.96. Reject H0.
• 3. t = 0.808, t critical = 2.998. Do not reject H0 (H0: μ ≤ 0.088, H1: μ > 0.088).
YOUR TASK

STUDY LINEAR REGRESSION AND ANOVA
