Chapter 4
Hypothesis Testing
4.1 Introduction
A hypothesis is an educated guess about something in the world around you. It should be testable, either by
experiment or observation.
Example. For instance:
   •   A new medicine you think might work.
   •   A way of teaching you think might be better.
   •   A possible location of a new species.
   •   A fairer way to administer standardized tests.
A hypothesis can be anything, as long as it is possible to put it to the test.
Definition 10. A statistical hypothesis is an assumption or a statement, which may or may not be true, con-
cerning one or more populations.
Example. Suppose that in a political poll, candidate A claims that he will gain at least 50% of the votes in the
election. However, someone else, say B, may claim that A is not favoured by more than 50% of the voters. How
do we examine which one is correct?
The above claims can be written as
                                           A : p ≥ 0.5 and B : p < 0.5
and these are two conflicting hypotheses.
    The purpose of hypothesis testing is to choose between two conflicting hypotheses about the value of a
population parameter. The two hypotheses are referred to as the null hypothesis, denoted by H0 , and the
alternative hypothesis, denoted by H1 . They are mutually exclusive, so that when one is true the other one is
false. It is customary to take the hypothesis that we want to reject as the null hypothesis H0 , or equivalently,
to write our claim under H1 . Also, the equality sign should always appear under H0 . In the above example
                                         H0 : p ≥ 0.5 and H1 : p < 0.5
Null hypothesis: A statistical hypothesis that is to be tested.
Alternative hypothesis: The alternative to the null hypothesis.
                                             Dr. L.S. Nawarathna, Department of Statistics & Computer Science, UoP
4.1.1    Simple hypothesis
When a hypothesis specifies only one value for the population parameter, it is said to be a simple hypothesis.
4.1.2    Composite hypothesis
When a hypothesis specifies many possible values of the parameter, it is said to be a composite hypothesis.
Example.    H0 : μ = μ0 vs. H1 : μ ≠ μ0 =⇒ H0 is simple, H1 is composite.
         H0 : μ ≥ μ0 vs. H1 : μ < μ0 =⇒ Both H0 and H1 are composite.
         H0 : μ ≤ μ0 vs. H1 : μ > μ0 =⇒ Both H0 and H1 are composite.
4.2 Types of Errors
A test of a statistical hypothesis is a two-action decision problem: the actions are the acceptance or rejection
of the hypothesis under consideration. When making decisions, two types of errors are possible.
  1.   Type I error: Rejecting H0 when it is true
  2.   Type II error: Not rejecting H0 when it is false
Probabilities of making these errors are denoted by
                                         α = P(Reject H0 | H0 is true )
                                    β = P(Does not reject H0 | H0 is false )
Therefore (1 − α) and (1 − β) indicate the probabilities of making correct decisions. Both α and β are conditional
probabilities. α is also known as the level of significance. For a given sample size, β will normally increase
when α decreases. Therefore, in usual practice, α is controlled at a predetermined low level.
   Usually α is selected as 10%, 5% or 1%. Then the probability of making a correct decision when H0 is true
will be 90%, 95% and 99% respectively. The only way in which we can reduce both α and β is to increase the
sample size.
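The trade-off between α and β, and the effect of sample size, can be illustrated numerically. The following is a minimal sketch for a hypothetical one-sided test of a normal mean (the values μ0 = 10, μ1 = 11, σ = 2 and the cut-offs are invented purely for illustration):

```python
from statistics import NormalDist

def error_probs(mu0, mu1, sigma, n, c):
    """Type I and Type II error probabilities for the test that rejects
    H0: mu = mu0 in favour of H1: mu = mu1 (> mu0) when the sample
    mean exceeds the cut-off c."""
    se = sigma / n ** 0.5                     # standard error of the sample mean
    alpha = 1 - NormalDist(mu0, se).cdf(c)    # P(reject H0 | H0 true)
    beta = NormalDist(mu1, se).cdf(c)         # P(do not reject H0 | H1 true)
    return alpha, beta

# Raising the cut-off lowers alpha but raises beta ...
a1, b1 = error_probs(10, 11, sigma=2, n=25, c=10.5)
a2, b2 = error_probs(10, 11, sigma=2, n=25, c=10.8)
# ... while a larger sample shrinks both errors at a fixed cut-off.
a3, b3 = error_probs(10, 11, sigma=2, n=100, c=10.5)
```

With n = 25 and c = 10.5 both error probabilities are about 0.106; moving c to 10.8 trades a smaller α for a larger β, while n = 100 reduces both at once.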
4.3 Critical Region (Rejection Region)
This is the region where H0 is rejected. The region is based on α, and therefore α is also called the size of the
critical region. Based on the critical region, the test is called a two-sided or a one-sided test.
Example.    Suppose we want to test the population mean μ, and we select the size of the critical region (level
of significance) α = 5%.
4.3.1   Two sided test (Two tailed test)
Suppose H0 : μ = μ0 vs. H1 : μ ≠ μ0 .
We reject H0 when the data suggest μ < μ0 or μ > μ0 . To test the hypothesis we need two rejection regions:
one on the positive side and the other on the negative side.
   Since α = 0.05, α/2 = 0.025.
If the sampling distribution is normal the critical regions can be shown as follows.
                                          Figure 4.3.1: Two tailed test
Since there are two rejection regions in two tails the test is called the two sided or two tailed test.
4.3.2   One sided test (Right tailed test)
Suppose H0 : μ ≤ μ0 vs. H1 : μ > μ0 .
We reject H0 when the data suggest μ > μ0 , and the rejection region is on the positive (right) side. If the
sampling distribution is normal, the critical region can be shown as follows.
                                         Figure 4.3.2: Right tailed test
Since the rejection region is in the right tail the test is called the one sided or right tailed test.
4.3.3   One sided test (Left tailed test)
Suppose H0 : μ ≥ μ0 vs. H1 : μ < μ0 .
We reject H0 when the data suggest μ < μ0 , and the rejection region is on the left side. If the sampling
distribution is normal, the critical region can be shown as follows.
                                          Figure 4.3.3: Left tailed test
   Since the rejection region is in the left tail the test is called the one sided or left tailed test.
4.4 Power function Π(θ)
Power function is the probability of rejecting H0 when it is false.
   Π(θ) = P(Reject H0 | H0 is false) = P (x ∈ C | H1 ) = 1 − β(θ)
Example 44. Suppose a single value is taken from X ∼ N (μ, 0.16). The rejection region C = { X < 14.3 or
X > 15.9 } is used to test H0 : μ = 15.1 vs. H1 : μ ≠ 15.1. Find the significance level α and the power
function.
                                          Figure 4.4.1: Power Function
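Example 44 can be checked numerically. A sketch, reading N (μ, 0.16) as mean μ and variance 0.16, so σ = 0.4:

```python
from statistics import NormalDist

SIGMA = 0.16 ** 0.5   # variance 0.16, so sigma = 0.4

def pi(mu):
    """Power function Pi(mu) = P(X < 14.3 or X > 15.9) when X ~ N(mu, 0.16)."""
    d = NormalDist(mu, SIGMA)
    return d.cdf(14.3) + (1 - d.cdf(15.9))

alpha = pi(15.1)   # the size: value of the power function at the H0 value
```

This gives α = 2(1 − Φ(2)) ≈ 0.0455, and Π(μ) increases towards 1 as μ moves away from 15.1 in either direction, which is the shape shown in the power-function figure.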
The value of the power function at a parameter point is called the power of the test at that point. In usual
practice we control α at a predetermined low level, and subject to this level we choose a test which minimizes
β or, equivalently, maximizes the power function 1 − β(θ). β(θ) is known as the operating characteristic function, and
when β(θ) is plotted for various values of θ under H1 , an operating characteristic (OC) curve is obtained.
Example 45.   Suppose that we can tolerate a size of Type I error up to 0.06 when testing H0 : μ = 10 vs. H1 :
μ > 10. Assume that the observations come from a normal distribution with standard deviation σ = 1.4, and that
we take a random sample of size 25. Compare the following tests if the sample mean X̄ is the test statistic.
         Test A: Reject H0 if X̄ > 10.65
         Test B: Reject H0 if X̄ > 10.45
         Test C: Reject H0 if X̄ > 10.25
                               Figure 4.4.2: Operating Characteristic (OC) Curve
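Example 45 can be worked as follows: each test has size α = P(X̄ > c | μ = 10), with standard error σ/√n = 1.4/5 = 0.28, and only tests whose size is at most 0.06 are admissible. A sketch (the alternative μ1 = 10.8 used to compare power is an arbitrary choice for illustration):

```python
from statistics import NormalDist

SE = 1.4 / 25 ** 0.5   # standard error of X-bar: sigma / sqrt(n) = 0.28

def size_and_power(c, mu1):
    """Size and power at mu = mu1 of the test rejecting H0 when X-bar > c."""
    alpha = 1 - NormalDist(10, SE).cdf(c)     # P(X-bar > c | mu = 10)
    power = 1 - NormalDist(mu1, SE).cdf(c)    # P(X-bar > c | mu = mu1)
    return alpha, power

results = {name: size_and_power(c, mu1=10.8)
           for name, c in [("A", 10.65), ("B", 10.45), ("C", 10.25)]}
```

Test C's size (about 0.19) exceeds the tolerated 0.06, so it is ruled out; of the two admissible tests, B has the larger power, so B is preferred.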
Example 46. Suppose you wish to test the hypothesis H0 : θ = 5 vs. H1 : θ = 8 by means of a single observed
value of a random variable with probability density function
                                          f (x; θ) = (1/θ) exp(−x/θ) ; x > 0.
If the maximum size of Type I error that can be tolerated is 0.15, which of the following tests is best for choosing
between the hypotheses?
    (i) Test A: Reject H0 if X ≥ 9.
    (ii) Test B: Reject H0 if X ≥ 10.
    (iii) Test C: Reject H0 if X ≥ 11.
Note that e−x > 0.15 when x < 1.9.
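For Example 46, P(X ≥ c | θ) = e−c/θ for this density, so the sizes and powers of the three tests can be computed directly. A sketch:

```python
import math

def tail(c, theta):
    """P(X >= c | theta) for the density (1/theta)exp(-x/theta), x > 0."""
    return math.exp(-c / theta)

sizes = {c: tail(c, 5.0) for c in (9, 10, 11)}    # size under H0: theta = 5
powers = {c: tail(c, 8.0) for c in (10, 11)}      # power under H1: theta = 8
```

Test A's size e^{−9/5} ≈ 0.165 exceeds the tolerated 0.15 (consistent with the note above, since 9/5 < 1.9), while Tests B and C are admissible; of those, B (power e^{−10/8} ≈ 0.287) beats C, so Test B is best.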
4.5 Most powerful critical region
The critical region C is called the best critical region (most powerful critical region) of size α for testing
H0 : θ = θ0 vs. H1 : θ = θ1 if
                                                           P (x ∈ C | H0 ) = α                                                 (4.5.1)
and
                                                  P (x ∈ C | H1 ) ≥ P (x ∈ C1 | H1 ),
i.e. 1 − β ≥ 1 − β1 , for every other critical region C1 satisfying (4.5.1), where x = (x1 , x2 , . . . , xn ) represents
the sample observations.
4.6 Most powerful tests
A critical region for testing a simple H0 : θ = θ0 vs. a simple H1 : θ = θ1 is said to be best or most powerful
if the power of the test at θ = θ1 is a maximum.
    To obtain the most powerful critical region the following lemma can be used.
Lemma 2 (Neyman–Pearson Lemma). If C is a critical region of size α and k is a constant such that
                                          L0 /L1 < k inside C , and L0 /L1 ≥ k outside C ,
where L0 and L1 are the likelihood functions under H0 and H1 respectively, then C is a most powerful critical
region of size α for testing H0 : θ = θ0 vs. H1 : θ = θ1 .
Example 47.     Let Xi ∼ N (θ, 2) for i = 1, 2, ..., n. Suppose 25 observations are selected and H0 : θ = 0 vs. H1 :
θ = 1 is to be tested.
    (i) Obtain the best critical region if α = 0.05.
    (ii) Find the power of the test.
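For Example 47, the Neyman–Pearson lemma leads to rejecting H0 for large values of X̄. A numerical sketch, reading N (θ, 2) as variance 2 (an assumption) with n = 25:

```python
from statistics import NormalDist

SE = (2 / 25) ** 0.5                    # sqrt(variance / n) for the sample mean

# (i) cut-off c with size alpha = 0.05 under H0: theta = 0
c = NormalDist(0, SE).inv_cdf(0.95)     # reject H0 when X-bar > c

# (ii) power of the test at the alternative theta = 1
power = 1 - NormalDist(1, SE).cdf(c)    # P(X-bar > c | theta = 1)
```

This gives c ≈ 0.465 and power ≈ 0.97: the two hypotheses are far apart relative to the standard error, so the test is very powerful.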
Example 48.    Let X1 , X2 , ..., Xn be a random sample of size n = 5 from a Poisson distribution with mean λ. Use
the Neyman–Pearson lemma to find an α = 0.05 level most powerful test for H0 : λ = 1 vs. H1 : λ = 2. Find
the power of the test.
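For Example 48, the Neyman–Pearson ratio L0 /L1 is decreasing in Σxi , so the most powerful test rejects for large S = ΣXi , which is Poisson(5) under H0 and Poisson(10) under H1 . Because S is discrete, an exact size of 0.05 is not attainable; a sketch of the (non-randomized) test:

```python
import math

def poisson_tail(c, lam):
    """P(S >= c) for S ~ Poisson(lam)."""
    return 1 - sum(math.exp(-lam) * lam ** k / math.factorial(k)
                   for k in range(c))

# smallest integer cut-off whose size does not exceed 0.05
c = next(c for c in range(1, 50) if poisson_tail(c, 5.0) <= 0.05)
size = poisson_tail(c, 5.0)      # attained size, below 0.05 since S is discrete
power = poisson_tail(c, 10.0)    # P(S >= c) when lambda = 2, so S ~ Poisson(10)
```

The test rejects when S ≥ 10, with attained size about 0.032 and power about 0.54.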
In real situations we are seldom presented with the problem of testing two simple hypotheses. Typically one or
both hypotheses are composite, and the Neyman–Pearson lemma does not apply. However, in some situations
the lemma can be extended to cover composite one-sided hypotheses.
    For example, if H0 : μ = μ0 vs. H1 : μ > μ0 , select a particular value μ1 > μ0 . Then H1 can be written as
H1 : μ = μ1 , and the lemma can be applied to obtain a most powerful test of H0 : μ = μ0 vs. H1 : μ = μ1 > μ0 for
any single value μ1 where μ1 > μ0 . For this pair of hypotheses, the most powerful test rejects H0 when
X̄ > C, where C depends on μ0 , n and σ 2 but not on μ1 . Since this test is most powerful and is the same for every simple
alternative in H1 , it is said to be uniformly most powerful.
Example 49.    Suppose Xi ∼ N (μ, σ 2 ) for i = 1, 2, ..., n. Find the most powerful test of size α for testing
H0 : μ = μ0 vs. H1 : μ > μ0 .
4.6.1    Likelihood Ratio Tests
Since the Neyman–Pearson lemma cannot always be applied to composite hypotheses, this general method is
introduced for constructing critical regions for tests of composite hypotheses. Likelihood ratio tests, however, are
not necessarily uniformly most powerful.
   We consider the one-parameter, continuous case, but all results can be extended to the multi-parameter
and discrete cases.
Definition 11.     If ω and ω ′ are complementary subsets of the parameter space Ω, and if
                                          λ = max L0 / max L,
where max L0 and max L are the maximum values of the likelihood function for all values of θ in ω and Ω
respectively, then the critical region λ ≤ k, where 0 < k < 1, defines a likelihood ratio test of H0 : θ ∈ ω vs.
H1 : θ ∈ ω ′ . Here ω is the parameter space under H0 , and Ω is the total parameter space.
Note
  1. If H0 is a simple hypothesis, k is chosen so that the size of the critical region equals α.
  2. If H0 is composite, k is chosen so that
         • P(Type I error) ≤ α for all θ ∈ ω , and
         • P(Type I error) = α, if possible, for at least one value of θ ∈ ω .
Example 50.     Suppose Xi ∼ N (μ, σ 2 ) for i = 1, 2, ..., n, with σ 2 known. Find the critical region of the
likelihood ratio test of size α for testing H0 : μ = μ0 vs. H1 : μ ≠ μ0 .
Example 51.     Suppose Xi ∼ N (μ, σ 2 ) for i = 1, 2, ..., n, with unknown μ and σ 2 . Find the critical region of
the likelihood ratio test of size α for testing H0 : μ = μ0 vs. H1 : μ > μ0 .
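The likelihood ratio test of Example 51 can be shown to reduce to the familiar one-sample t-test: reject H0 when √n(X̄ − μ0 )/S exceeds the upper-α point of the t distribution with n − 1 degrees of freedom. A sketch with invented data (the critical value t9,0.05 = 1.833 is taken from standard t tables, since Python's standard library has no t distribution):

```python
import math

def t_statistic(xs, mu0):
    """One-sample t statistic sqrt(n)(x-bar - mu0)/s, to which the likelihood
    ratio test of H0: mu = mu0 vs. H1: mu > mu0 is equivalent."""
    n = len(xs)
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)   # sample variance
    return math.sqrt(n) * (xbar - mu0) / math.sqrt(s2)

# hypothetical sample of size 10; critical value t_{9, 0.05} = 1.833 from tables
sample = [10.6, 11.2, 9.8, 10.9, 11.5, 10.4, 11.1, 10.7, 11.3, 10.5]
reject = t_statistic(sample, mu0=10) > 1.833
```

Here x̄ = 10.8 and t ≈ 5.0, so H0 : μ = 10 is rejected in favour of μ > 10.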
Theorem 6. Let X1 , X2 , ..., Xn have the joint likelihood function L(θ). Let r0 denote the number of free
parameters under H0 : θ ∈ ω, and let r denote the number of free parameters in the parameter space Ω. Then,
for large n,
                                               −2 ln λ ≈ χ2 with r − r0 degrees of freedom,
where λ = max L0 / max L is the likelihood ratio.
Example 52. Suppose an engineer wishes to compare the number of complaints per week filed by workers
for two different shifts at a manufacturing plant. One hundred independent observations on the number of
complaints gave means x̄ = 20 for shift 1 and ȳ = 22 for shift 2. Assume that the number of complaints per
week on the ith shift has a Poisson distribution with mean θi ; i = 1, 2. Test H0 : θ1 = θ2 vs. H1 : θ1 ≠ θ2 by
the likelihood ratio method with α = 0.05.
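For Example 52, the linear terms in the Poisson log-likelihoods cancel, and the statistic simplifies to −2 ln λ = 2[n x̄ ln(x̄/θ̂) + m ȳ ln(ȳ/θ̂)], where θ̂ is the pooled mean under H0 . A sketch, reading "one hundred independent observations" as 100 per shift (an assumption):

```python
import math

def lrt_two_poisson(xbar, ybar, n, m):
    """-2 ln(lambda) for H0: theta1 = theta2 with Poisson samples of sizes n, m."""
    pooled = (n * xbar + m * ybar) / (n + m)   # MLE of the common mean under H0
    # only the log terms survive; the linear terms cancel in the ratio
    return 2 * (n * xbar * math.log(xbar / pooled)
                + m * ybar * math.log(ybar / pooled))

stat = lrt_two_poisson(20, 22, 100, 100)
# by Theorem 6, stat is approximately chi-square with r - r0 = 2 - 1 = 1 df;
# the upper 5% point of chi-square(1) is 3.841
reject = stat > 3.841
```

The statistic is about 9.5, well above 3.841, so H0 : θ1 = θ2 is rejected at α = 0.05.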
Example 53.     Let Xi ∼ exp (θ1 ); i = 1, 2, ..., m and Yj ∼ exp (θ2 ); j = 1, 2, ..., n . Test H0 : θ1 = θ2 vs. H1 :
θ1 ≠ θ2 by the likelihood ratio method with α = 0.05.