Estimation and Hypothesis Testing

ST. LIDETA HEALTH SCIENCE COLLEGE


DEPARTMENT OF PHARMACY

Biostatistics for pharmacy students


Abrham Tesfaye
abrhamtesfaye95@gmail.com
(BSc in Mid, MSc in RH and MSc in Epi & Bio)

February 2024, Addis Ababa, Ethiopia

11/02/2024 Biostatistics By: Abrham T. 1


Objectives
At the end of this chapter, students will be able to:
• Understand the concepts of sample statistics and population parameters
• Understand the principles of estimation and differentiate between point and interval estimation
• Compute appropriate confidence intervals for population means and proportions and interpret the findings
• Describe methods of sample size calculation for cross-sectional studies

Contents
 Chapter 1. Introduction to Biostatistics
 Chapter 2. Methods of Data Collection and Presentation
 Chapter 3. Measures of Central Tendency
 Chapter 4. Measures of variation (Dispersion)
 Chapter 5. Demography and vital statistics
 Chapter 6. Elementary Probability and Distributions
 Chapter 7. Sampling and Sampling Distributions
 Chapter 8. Estimation and Hypothesis Testing
 Chapter 9. Linear Regression and Correlation

Inferential Statistics

• Inferential statistics are statistical methods used for drawing conclusions about a population based on the information obtained from a sample of observations drawn from that population.


Inferential Statistics … Cont’d
• Involves
  o Estimation
  o Hypothesis testing
• Purpose
  o Make decisions about population characteristics


Inferential Statistics … Cont’d
Inferential statistics
• Estimation
  o Point estimation
  o Interval estimation
• Hypothesis testing
  o One sample
  o Two samples

Inferential process



Statistical Estimation

• Estimation is the process of determining a likely value of a population parameter, based on information collected from the sample.
• Estimation is the use of sample statistics to estimate the corresponding population parameters.
• The objective of estimation is to determine the approximate value of an unknown population parameter on the basis of a sample statistic.

Statistical Estimation … Cont’d
• Definition: A parameter is a numerical descriptive measure of a population ($\mu$ is an example of a parameter).
• A statistic is a numerical descriptive measure of a sample ($\bar{X}$ is an example of a statistic).
• To each sample statistic there corresponds a population parameter.
• We use $\bar{X}$, $S^2$, $S$, $p$, etc. to estimate $\mu$, $\sigma^2$, $\sigma$, $P$ (or $\pi$), etc.


Sample statistic & their corresponding population parameter

Statistic (from sample)              Parameter (from entire population)
Mean: $\bar{X}$            estimates   $\mu$
Variance: $s^2$            estimates   $\sigma^2$
Standard deviation: $s$    estimates   $\sigma$
Proportion: $p$            estimates   $\pi$

Estimation
• Point estimation
• Interval estimation


Estimation: (Two types)
1. Point estimation
• A single numerical value used to estimate the corresponding population parameter.
• Point estimation does not account for sampling error.

2. Interval estimation
• Uses a sample statistic to estimate a population parameter while making allowance for sampling variation (error).
• It is a statement that a population parameter has a value lying between two specified limits, made with a certain level of confidence.

Estimation Process
[Figure: a random sample is drawn from a population whose mean $\mu$ is unknown. Point estimate: sample mean $\bar{X} = 50$. Interval estimate: "I am 95% confident that $\mu$ is between 40 and 60."]


1) Point estimation

• A single numerical value used to estimate the corresponding population parameter.
• Gives little information about how close the value is to the unknown population parameter.
• Example: the sample mean $\bar{X} = 3$ is a point estimate of the unknown population mean $\mu$.


2) Interval estimation
• A single-valued estimate conveys little information about the actual value of the population parameter.
• It is not reasonable to assume that a sample statistic is exactly equal to the corresponding population parameter.
• What is needed is an interval estimate, which locates the population parameter within an interval with a stated level of confidence.

Marginal Error
• A value calculated from the sample is the best guess when estimating the corresponding population value.
• The estimate is still uncertain because of sampling error.
• The marginal error (margin of error) is a measure of this uncertainty.
• Using the marginal error you can state:
Confidence interval = Estimate ± Marginal error


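Confidence interval for a single mean
Using a formula: when the population standard deviation $\sigma$ is known, a 100(1 − α)% C.I. for $\mu$ is
$\bar{X} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$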


Example (1)
The mean reading speed of a random sample of 81 adults is 325 words per minute. Find a 90% C.I. for the mean reading speed of all adults (μ) if it is known that the standard deviation for all adults is 45 words per minute.
Answer = (316.8, 333.2)

Interpretation: We are 90% confident that the true mean reading speed in the population lies between 316.8 and 333.2 words per minute.

Example (2)
• A physical therapist wished to estimate, with 99% confidence, the mean maximal strength of a particular muscle in a certain group of individuals. He assumes that strength scores are approximately normally distributed with a variance of 144. A sample of 15 subjects who participated in the experiment yielded a mean of 84.3. Find the 99% C.I.
Answer: (76.3, 92.3)
Interpretation: We are 99% confident that the mean maximal strength of this muscle in the population is between 76.3 and 92.3.
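The two examples above can be checked with a short script. This is a minimal sketch, assuming the known-σ interval $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$; the function name `mean_ci_known_sigma` is illustrative and not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_known_sigma(xbar, sigma, n, confidence):
    """Two-sided confidence interval for a mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    margin = z * sigma / sqrt(n)             # marginal error
    return xbar - margin, xbar + margin


# Example (1): n = 81, mean = 325, sigma = 45, 90% confidence
ci_reading = mean_ci_known_sigma(325, 45, 81, 0.90)
print([round(limit, 1) for limit in ci_reading])

# Example (2): n = 15, mean = 84.3, variance = 144 (sigma = 12), 99% confidence
ci_strength = mean_ci_known_sigma(84.3, 12, 15, 0.99)
print([round(limit, 1) for limit in ci_strength])
```

Both intervals agree with the answers on the slides after rounding to one decimal place.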


Reading Assignment!!
C.I. for the Difference Between Two Population Means

Known variance (2 independent samples)

A 100(1 − α)% C.I. for μ1 − μ2 is
$(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}\,\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$


C.I. for a Single Population Proportion

• For large sample size
• A 100(1 − α)% C.I. for π is:
$p \pm z_{\alpha/2}\,\sqrt{\dfrac{p(1-p)}{n}}$
  o Where p = proportion of successes in the sample, 1 − p = proportion of failures and n = sample size

Example (1)
• An epidemiologist is worried about the ever-increasing trend of malaria in a certain locality and wants to estimate the proportion of persons infected in the peak malaria transmission period. He takes a random sample of 150 persons in that locality during the peak transmission period and finds that 60 of them are positive for malaria. Find the 99% confidence interval for the proportion of all people in that locality who are infected during the peak malaria transmission period.
Answer: (0.297, 0.503)


Example (2)

A study on dental health practice: of 300 adults interviewed, 123 said that they regularly had a dental checkup twice a year. What is the 95% C.I. for π?

Ans: p = 123/300 = 0.41, so the interval is 0.41 ± 1.96 × 0.0284 = (0.35, 0.47)
Interpretation: We are 95% confident that the proportion of the population who have a regular dental checkup twice a year is between 0.35 and 0.47.
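The large-sample interval for a proportion can be sketched the same way. This is an illustrative helper, assuming the normal-approximation interval $p \pm z_{\alpha/2}\sqrt{p(1-p)/n}$; the name `proportion_ci` is not from the slides.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Large-sample (normal approximation) CI for a population proportion."""
    p = successes / n                                   # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    margin = z * sqrt(p * (1 - p) / n)                  # z times standard error
    return p - margin, p + margin


# Example (1): 60 of 150 positive for malaria, 99% confidence
ci_malaria = proportion_ci(60, 150, 0.99)
print([round(limit, 3) for limit in ci_malaria])

# Example (2): 123 of 300 adults with regular dental checkups, 95% confidence
ci_dental = proportion_ci(123, 300, 0.95)
print([round(limit, 3) for limit in ci_dental])
```

The malaria interval rounds to (0.297, 0.503); the dental interval is approximately 0.41 ± 0.056.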


Reading Assignment!!

C.I. for the difference between two population proportions
• For large sample sizes (n ≥ 30)
• A 100(1 − α)% C.I. for π1 − π2 (or P1 − P2) is:
$(p_1 - p_2) \pm z_{\alpha/2}\,\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$

 An essential part of planning any study is to
decide:
 How many people need to be studied and
 How to choose them.



Sample Size (= n)

• Sample size: the number of study subjects selected to represent a given study population.
• It is important for making inferences based on the findings from the sample.
• It should be sufficient to represent the characteristics of interest of the study population.

• Common question: "How many subjects should I study?"
• Too small a sample = waste of time and resources
                     = results have no practical use
• Too large a sample = waste of resources
                     = data quality compromised
                     = any small difference can be statistically significant


When deciding on sample size:

• Balance PRECISION against COST: a larger sample size gives higher precision, but also higher cost.
• Precision is related to the confidence level and the width of the CI.


Sample size determination depends on:

• Objective of the study
• Design of the study
  o Descriptive/Comparative
• Degree of precision or accuracy: the allowed deviation from the true population parameter (can be within 1% or 5%, etc.)
• Plan for statistical analysis
• Degree of confidence required, usually specified as 95%


Sample size for single sample includes:

1) Sample size for estimating a single population mean
2) Sample size to estimate a single population proportion


A) Sample size for estimating a single population mean
A 100(1 − α)% C.I. for μ is:
$\bar{X} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

• We use the sample mean as a point estimate of μ.
• α is chosen by the researcher; the most common values of α are 0.05, 0.01 and 0.1.


A) Sample size for estimating a single population mean … Cont’d

• AIM: Estimate µ
• WANT: Estimate $\bar{X}$ ± d units
  where d = Margin of error
          = Absolute precision
          = Half of the width (w) of the CI
Steps:
1. Specify d (or w = 2d)
2. Use the known σ², or estimate it using s²


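3. Solve for n. The margin of error d is z times the standard error of the estimator of the parameter of interest:
$d = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \quad\Rightarrow\quad n = \dfrac{z_{\alpha/2}^{2}\,\sigma^{2}}{d^{2}}$

Where d = e in some textbooks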


Example (1)
• Find the minimum sample size (n) needed to estimate the drop in heart rate (µ) for a new study using a higher dose of propranolol than the standard one. We require that the two-sided 95% CI for µ be no wider than 5 beats per minute (so d = 2.5), and the sample SD for change in heart rate equals 10 beats per minute.
n = (1.96)² (10)² / (2.5)² = 61.5, rounded up to 62 patients
• What would the sample size be if we increased the confidence level?
To change the confidence level, replace the multiplier (1.96)² as follows: 90% CL = (1.645)², 99% CL = (2.58)²


Example (2)
• Suppose that for a certain group of cancer patients, we are interested in estimating the mean age at diagnosis. We would like a 95% CI that is 5 years wide. If the population SD is 12 years, how large should our sample be?


What would the sample size be if we decreased the margin of error?

• Suppose d = 1
• Then the required sample size increases, because n is proportional to 1/d²
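The sample-size calculation for a mean can be sketched as below, assuming $n = z_{\alpha/2}^2\sigma^2/d^2$ rounded up to the next whole subject. The helper name `sample_size_mean` is illustrative, and applying d = 1 to the cancer-patient example is an assumption made only to show the effect of a smaller margin of error.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(sigma, d, confidence=0.95):
    """Minimum n so that the CI for a mean has margin of error at most d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return ceil((z * sigma / d) ** 2)                   # always round up


# Example (1): SD = 10 beats/min, CI width 5, so d = 2.5
n_heart_rate = sample_size_mean(sigma=10, d=2.5)

# Example (2): SD = 12 years, CI width 5 years, so d = 2.5
n_age = sample_size_mean(sigma=12, d=2.5)

# Assumed illustration: same SD as Example (2) but a tighter margin, d = 1
n_age_tight = sample_size_mean(sigma=12, d=1)

# Raising the confidence level also raises n
n_heart_rate_99 = sample_size_mean(sigma=10, d=2.5, confidence=0.99)

print(n_heart_rate, n_age, n_age_tight, n_heart_rate_99)
```

Rounding up rather than to the nearest integer guarantees that the achieved margin of error does not exceed d.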


Exercise!!
• A hospital administrator wishes to estimate the mean weight of babies born in the hospital. How large a sample of birth records should be taken if she/he wants a 95% CI that is 0.5 wide? Assume that a reasonable estimate of σ is 2.


B) Sample size to estimate a single population proportion
• Aim: Estimate p
• Want: Estimate p ± d units, where d = Z × SE (95% CI of width = 2d)
Steps:
1. Specify d (or w = 2d)
2. Use an estimated p (use p = 0.5 if no information is available)


3. Solve for n
$d = z_{\alpha/2}\,\sqrt{\dfrac{p(1-p)}{n}} \quad\Rightarrow\quad n = \dfrac{z_{\alpha/2}^{2}\,p(1-p)}{d^{2}}$


Example (1)
• Suppose that you are interested in knowing the proportion of infants who are breastfed beyond 18 months of age in a rural area. Suppose that in a similar area, the proportion (p) of breastfed infants was found to be 0.20. What sample size is required to estimate the true proportion within ±3 percentage points with 95% confidence? Let p = 0.20, d = 0.03, α = 5%.

• Suppose there is no prior information about the proportion (p) who breastfeed
• Assume p = q = 0.5 (most conservative)
• Then the required sample size increases


Example (2)
• Suppose that we wish to estimate the prevalence of asthma in an adult population with a 95% confidence interval of width 0.10, that is, an accuracy of ±0.05. A prior estimate of the prevalence of asthma is 0.10.
Ans: n = (1.96)² (0.10)(0.90) / (0.05)² = 138.3, rounded up to 139


Example (3)
• P = 0.26, d = 0.03, Z = 1.96 (i.e., for a 95% C.I.)
n = (1.96)² (0.26)(0.74) / (0.03)² = 821.25
Thus, the study should include at least 822 subjects.
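The three proportion examples follow the same pattern, sketched here under the assumption that $n = z_{\alpha/2}^2\,p(1-p)/d^2$ is rounded up. The helper name `sample_size_proportion` is illustrative only.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(p, d, confidence=0.95):
    """Minimum n to estimate a proportion within +/- d at the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return ceil(z ** 2 * p * (1 - p) / d ** 2)          # always round up


n_breastfeeding = sample_size_proportion(p=0.20, d=0.03)   # Example (1)
n_no_prior_info = sample_size_proportion(p=0.50, d=0.03)   # p = q = 0.5
n_asthma = sample_size_proportion(p=0.10, d=0.05)          # Example (2)
n_example_3 = sample_size_proportion(p=0.26, d=0.03)       # Example (3)

print(n_breastfeeding, n_no_prior_info, n_asthma, n_example_3)
```

Because p(1 − p) is largest at p = 0.5, that choice gives the most conservative (largest) sample size for a given d.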


Points for Consideration
1. Sample size estimates might need to be adjusted to compensate for non-response, patient dropout or loss to follow-up, lack of compliance, etc.
2. If sampling is from a finite population of size N, then:
$n = \dfrac{n_0}{1 + \dfrac{n_0}{N}}$
where $n_0$ is the sample size for an infinite population. When N is large in comparison to n (i.e., n/N ≤ 0.05), the finite population correction may be ignored.
3. Design effect for complex cluster sampling. Common values: multiply n by 2, 3, …, 5.


Example:
• If the above sample (Example 3) is to be taken from a relatively small population (say N = 3000), the required minimum sample is obtained from the above estimate by applying the finite population correction (if the population is less than 10,000, a smaller sample size may be sufficient).
$n_{\text{final}} = \dfrac{821.25}{1 + \dfrac{821.25}{3000}} = 644.7 \approx 645$
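The finite population correction is a one-line adjustment. The sketch below assumes the formula $n = n_0 / (1 + n_0/N)$ from the preceding slide; the function name is illustrative.

```python
from math import ceil


def finite_population_correction(n0, population_size):
    """Adjust an infinite-population sample size n0 for a population of size N."""
    return n0 / (1 + n0 / population_size)


# Example (3) gave n0 = 821.25; the population has N = 3000
n_adjusted = finite_population_correction(821.25, 3000)
n_final = ceil(n_adjusted)  # round up to whole subjects
print(round(n_adjusted, 1), n_final)

# For a very large population the correction changes almost nothing
n_large_population = finite_population_correction(821.25, 1_000_000)
print(round(n_large_population, 1))
```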


Using a Census for Small Populations

♠ Use the entire population as the sample.
♠ Attractive for small populations (200 or less).
♠ Eliminates sampling error and provides data on all the individuals in the population.
♠ Financial considerations make this impossible for large populations.

Definition

• A statistical hypothesis is an assumption or a statement, which may or may not be true, concerning one or more populations.
• A hypothesis test is also called a significance test.
• A significance test enables us to measure the strength of the evidence which the data supply concerning some proposition of interest.


Hypothesis testing

The purpose of hypothesis testing is to determine whether enough statistical evidence exists to enable us to conclude that a belief or hypothesis about a parameter is reasonable.


Hypothesis Testing Process

[Figure: flow of the hypothesis testing process]
1. Identify the population and assume that the population mean age is 50 (H0: μ = 50).
2. Take a sample; the sample mean is $\bar{X} = 20$.
3. Is $\bar{X} = 20$ likely if μ = 50? No, not likely!
4. REJECT H0.

Types of statistical hypotheses

• There are two hypotheses:
  o Null hypothesis
  o Alternative hypothesis


1) Null hypothesis

• The null hypothesis is called the hypothesis of no difference, no association or no effect.
• It states that "there is no difference" between the hypothesized value and the population parameter value.
• It is always about a population parameter, not about a sample.

Null Hypothesis: H0

• The null hypothesis (denoted by H0) is a statement that the value of a population parameter (such as a proportion, mean, or standard deviation) is equal to some claimed value.
• It always contains the "=", "≤" or "≥" sign.
• We test the null hypothesis directly.
• We either reject H0 or fail to reject H0.

2) Alternative hypothesis

• The alternative to the null hypothesis.
• It says "there is a difference" between the hypothesized value and the population parameter value.
• It is what we are trying to find evidence for, i.e. the reason for the research question.


Alternative Hypothesis: H1 or HA

• The alternative hypothesis (denoted by H1 or HA) is the statement that the parameter has a value that somehow differs from the null hypothesis.
• The symbolic form of the alternative hypothesis must use one of these symbols: ≠, < or >.
• It may or may not be supported by the data.


Hypothesis
Example: Consider a population mean

H0: μ = μ0      H0: μ ≤ μ0      H0: μ ≥ μ0
HA: μ ≠ μ0      HA: μ > μ0      HA: μ < μ0
Two-tailed      One-tailed      One-tailed


Exercises
• State H0 and HA for each of the following:
1) Is the average height of health science students 1.65 m or is it more?
2) Is the average height of MPH students 1.65 m or is it less?
3) Is the average height of MPH students 1.65 m or is it something different?
4) Are men and women infected with malaria in equal proportions, or does a higher proportion of men get malaria in Ethiopia?


Steps in hypothesis testing

1) State the statistical hypotheses
• There are two hypotheses:
  o Null hypothesis
  o Alternative hypothesis


2) Select the level of significance and determine the critical value
Level of significance (α): defines the rejection region for H0 in the sampling distribution of the test statistic
• Is designated by α
• Typical values are 0.01, 0.05 or 0.10
• Is selected by the researcher at the beginning
• Provides the critical value(s) of the test


Determine the critical value
• Find the critical (tabulated) value in the distribution table for the chosen test statistic, using the α value and the degrees of freedom (where applicable).
• The critical value separates the acceptance zone from the rejection zone of H0.
• One-tailed test: the area of rejection is in either the lower or the upper tail of the distribution.
• Two-tailed test: two areas of rejection, one in each tail of the distribution.


Rejection Regions (two-tailed test)
H0: μ = μ0
HA: μ ≠ μ0
[Figure: sampling distribution with a rejection region in each tail]

Rejection Region (one-tailed test: left-tailed)
H0: μ = μ0
H1: μ < μ0
[Figure: sampling distribution with the rejection region in the left tail]

Rejection Region (one-tailed test: right-tailed)
H0: μ = μ0
H1: μ > μ0
[Figure: sampling distribution with the rejection region in the right tail]


3) Decide on the appropriate test statistic for the hypothesis (Z, t, χ², F, etc.) and compute the test statistic value

• Apply the formula of the test statistic to get the calculated value.
• Compare the calculated value with the tabulated (critical) value.
• Check which zone the calculated value falls into.


4) Make a decision as per the decision rule

There are two possible decisions:
I) Reject the null hypothesis
  o Conclude that there is enough evidence to support the alternative hypothesis
II) Do not reject the null hypothesis
  o Conclude that there is not enough evidence to support the alternative hypothesis
• If the numerical value of the test statistic falls in the rejection region, we reject the null hypothesis.
• If the test statistic does not fall in the rejection region, we do not reject H0.

Decision rule
• Form of HA: μ ≠ μ0 (2-tail hypothesis, area α/2 in each tail)
  If $|z| > z_{\alpha/2}$, then reject the null hypothesis.
• Form of HA: μ < μ0 (1-tail hypothesis, area α in the left tail)
  If $z < -z_{\alpha}$, then reject the null hypothesis.
• Form of HA: μ > μ0 (1-tail hypothesis, area α in the right tail)
  If $z > z_{\alpha}$, then reject the null hypothesis.


P-value
• P-value: the probability of obtaining a test statistic at least as extreme (≤ or ≥) as the observed sample value, given that H0 is true
• Also called the observed level of significance
• It is the smallest value of α for which H0 can be rejected
• Decision: compare the p-value with α
  o If p-value ≥ α, do not reject H0
  o If p-value < α, reject H0
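The critical-value rule and the p-value rule can be compared side by side. The sketch below assumes a Z test statistic; the function names and the `tail` labels ("two", "left", "right") are illustrative choices, not notation from the slides.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and SD 1


def critical_value(alpha, tail):
    """Positive critical value: z_{alpha/2} for 'two', z_alpha for 'left'/'right'."""
    if tail == "two":
        return STD_NORMAL.inv_cdf(1 - alpha / 2)
    if tail in ("left", "right"):
        return STD_NORMAL.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left' or 'right'")


def p_value(z, tail):
    """Probability of a statistic at least as extreme as z when H0 is true."""
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    if tail == "left":
        return STD_NORMAL.cdf(z)
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z)
    raise ValueError("tail must be 'two', 'left' or 'right'")


def reject_h0(z, alpha, tail):
    """Decision rule: reject H0 when the p-value is below alpha."""
    return p_value(z, tail) < alpha


print(round(critical_value(0.05, "two"), 2))    # two-tailed critical value
print(round(critical_value(0.05, "right"), 3))  # one-tailed critical value
print(reject_h0(2.5, 0.05, "two"), reject_h0(1.0, 0.05, "two"))
```

Both rules lead to the same decision: a test statistic beyond the critical value always has a p-value below α.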


Steps in hypothesis testing … Cont’d

5) Draw a conclusion


Hypothesis testing on a single population mean

$Z_{cal} = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$


Example (1)
• A researcher finds that the mean IQ of a sample of 64 patients is 110, while the expected value for the whole population is 100 with a standard deviation of 10. Test the hypothesis that the population mean is different from 100 (α = 5%).
Solution:
1) H0: µ = 100 vs HA: µ ≠ 100
2) α = 0.05; the two-tailed critical value (Z-tab) at 0.025 in each tail is 1.96.
3) Z-cal = (110 − 100) / (10/√64) = 10/1.25 = 8
4) Since 8 > 1.96, the decision is to reject the null hypothesis.
5) We have sufficient evidence that the mean IQ of the patients is different from 100.

Example (2)
• Researchers are interested in the mean level of some enzyme in a certain population. They are asking: can we conclude that the mean enzyme level in this population is different from 25? They collect a sample of size 10 from a normally distributed population with a known variance, σ² = 45. The calculated sample mean is $\bar{x}$ = 22 (α = 5%).
Step 1: H0: μ = 25
        HA: μ ≠ 25
Step 2: α = 0.05; the two-tailed z-critical value at 0.025 in each tail is 1.96.
Step 3: We are testing a hypothesis about a population mean. The population is normally distributed and the population variance is known, so the test statistic is
$Z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Step 4: Z = (22 − 25) / √(45/10) = −3/2.12 = −1.41
Step 5: Since −1.41 falls in the acceptance region (between −1.96 and 1.96), we fail to reject the null hypothesis.
• There is not enough evidence to conclude that the mean enzyme level in the population is different from 25.
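Both worked examples can be reproduced with a few lines. This is a sketch assuming the one-sample Z statistic $(\bar{X} - \mu_0)/(\sigma/\sqrt{n})$ with known σ and a two-tailed alternative; the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic_mean(xbar, mu0, sigma, n):
    """One-sample Z statistic for a mean when sigma is known."""
    return (xbar - mu0) / (sigma / sqrt(n))


def two_tailed_p_value(z):
    """P(|Z| >= |z|) under the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Example (1): IQ, sample mean 110, mu0 = 100, sigma = 10, n = 64
z_iq = z_statistic_mean(110, 100, 10, 64)

# Example (2): enzyme level, sample mean 22, mu0 = 25, variance 45, n = 10
z_enzyme = z_statistic_mean(22, 25, sqrt(45), 10)

alpha = 0.05
for label, z in [("IQ", z_iq), ("enzyme", z_enzyme)]:
    p = two_tailed_p_value(z)
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(label, round(z, 2), round(p, 4), decision)
```

The p-value decisions match the critical-value decisions in the two solutions above.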
Reading Assignment!!

Hypothesis testing about differences between population means (normally distributed)


Hypothesis testing about a single population proportion

$Z = \dfrac{p - \pi_0}{\sqrt{\pi_0 (1 - \pi_0) / n}}$
where p is the sample proportion and π0 is the hypothesized population proportion


Example:

• A survey was conducted to study the dental health practices and attitudes of a certain urban adult population. Of 300 adults interviewed, 123 said that they regularly had a dental checkup twice a year. Can we conclude that the population proportion π = 0.5? (α = 5%)

Solution
• H0: π = 0.5 vs HA: π ≠ 0.5
• α = 0.05; the two-tailed critical values are ±1.96.
• p = 123/300 = 0.41, so Z = (0.41 − 0.5) / √(0.5 × 0.5 / 300) = −0.09/0.0289 = −3.12
• Since −3.12 < −1.96, we reject H0. We conclude that the proportion of the population who regularly have a dental checkup twice a year is not 50%.
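The dental checkup test can be verified numerically. The sketch assumes the large-sample Z statistic with the standard error computed under H0, $\sqrt{\pi_0(1-\pi_0)/n}$; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic_proportion(successes, n, pi0):
    """One-sample Z statistic for a proportion, standard error taken under H0."""
    p = successes / n                      # sample proportion
    standard_error = sqrt(pi0 * (1 - pi0) / n)
    return (p - pi0) / standard_error


# 123 of 300 adults report a dental checkup twice a year; H0: pi = 0.5
z_dental = z_statistic_proportion(123, 300, 0.5)
p_dental = 2 * (1 - NormalDist().cdf(abs(z_dental)))  # two-tailed p-value

print(round(z_dental, 2), round(p_dental, 4))
print("reject H0" if p_dental < 0.05 else "fail to reject H0")
```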


Reading Assignment!!
Hypothesis testing for differences between two population proportions
♦ The standard error of the difference is given by:

♦ The test statistic becomes Z =


Errors in Making a Decision
1. Type I Error
• Rejecting the null hypothesis when it is true
• Equivalently, accepting a false alternative hypothesis
• The probability of a Type I error is α (alpha)
  o α is called the level of significance (1 − α is the confidence level)

2. Type II Error
• Failing to reject (accepting) the null hypothesis when it is false
• Equivalently, rejecting a true alternative hypothesis
• The probability of a Type II error is β (beta)

Power of a statistical test

• Power: the probability of rejecting the null hypothesis when it is false.
• Power = 1 − β.
• Clearly, if the test maximizes power, it minimizes the probability of a Type II error, β.


Factors Affecting Type II Error

• Significance level, α
  – β increases when α decreases
• Population standard deviation, σ
  – β increases when σ increases
• Sample size, n
  – β increases when n decreases

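These three relationships can be illustrated numerically. The sketch below assumes a two-tailed one-sample Z test in which the true mean differs from μ0 by `delta`; every number used (delta = 5, σ = 10, n = 25) is hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type_ii_error(delta, sigma, n, alpha):
    """Beta for a two-tailed one-sample Z test when the true mean is mu0 + delta."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma  # mean of the Z statistic under the true mean
    # Power = P(Z > z_crit) + P(Z < -z_crit) when Z ~ N(shift, 1)
    power = (1 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)
    return 1 - power


beta_baseline = type_ii_error(delta=5, sigma=10, n=25, alpha=0.05)
beta_smaller_alpha = type_ii_error(delta=5, sigma=10, n=25, alpha=0.01)
beta_larger_sigma = type_ii_error(delta=5, sigma=20, n=25, alpha=0.05)
beta_smaller_n = type_ii_error(delta=5, sigma=10, n=10, alpha=0.05)

print(round(beta_baseline, 3), round(beta_smaller_alpha, 3),
      round(beta_larger_sigma, 3), round(beta_smaller_n, 3))
```

In each variation β is larger than at the baseline, so the power 1 − β is smaller.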
Decision Results
H0: Innocent

Jury Trial
                 Actual Situation
Verdict          Innocent       Guilty
Innocent         Correct        Error
Guilty           Error          Correct

H0 Test
                 Actual Situation
Decision         H0 True                H0 False
Accept H0        Correct (1 − α)        Type II Error (β)
Reject H0        Type I Error (α)       Power (1 − β)

Exercises for:

Estimation
Hypothesis testing
Sample size determination



Exercise!!

• The mean serum cholesterol level in a certain population of normal healthy men is 245 mg/dl and the standard deviation is 45 mg/dl. A clinical researcher, interested in comparing cholesterol levels in this healthy population with those in men with coronary artery disease, measured the serum cholesterol level in a random sample of 100 men who had undergone coronary bypass surgery during the preceding two-year period. The mean serum cholesterol level in the sample was 267 mg/dl. Can the researcher conclude that the mean serum cholesterol of men undergoing coronary bypass surgery differs from that of healthy men? (α = 5%)


Exercise!!
• In a survey of school children to determine the proportion immunized against polio, an investigator set the maximum discrepancy between the sample and population proportions of immunized children at 0.04, at a confidence level of 99%. From previous knowledge, the prevalence of immunization among children in a similar community is 90%, and the total population of school children is 800. Determine the required sample size.
• A health officer wishes to estimate the mean serum cholesterol in a population of men. From previous similar studies, a standard deviation of 40 mg/100 ml was reported. If he is willing to tolerate a marginal error of up to 5 mg/100 ml in his estimate, how many subjects should be included in his study? (α = 5%, two-sided)

Exercise!!
• A hospital administrator wishes to know what proportion of discharged patients are unhappy with the care received during hospitalization. If a 95% confidence interval is desired to estimate the proportion within 5%, how large a sample should be drawn?
• A population of cancer patients has a survival standard deviation of 43.4 months. If one wants to conduct a study on this population, how large a sample is needed so that 95% of the sample means from samples of this size will be within ±6 months of the population mean? The population size is 480 patients.


Exercise!!
• A random sample of 50 non-smokers has a mean lifetime of 76 years with a standard deviation of 8 years, and a random sample of 65 smokers has a mean lifetime of 68 years with a standard deviation of 9 years.
A. Find a 95% C.I. for the difference in mean lifetime between non-smokers and smokers.
B. Interpretation?


Exercise!!

• An anthropologist who wanted to study the heights of adult men and women took a random sample of 128 adult men and 100 adult women and found the following summary results: n1 = 128; n2 = 100; mean1 = 170 cm; mean2 = 164 cm; SD (men) = 8; SD (women) = 6. Find the point and interval estimates of the difference in mean height and interpret the result.


Exercise!!
Two hundred patients suffering from a certain disease were randomly divided into two equal groups. Of the first group, who received the standard treatment, 78 recovered within three days. Of the other 100, who were treated by a new method, 90 recovered within three days. The physician wished to estimate the true difference in the proportions who would recover within three days. Find the 95% C.I. for this difference.

Exercise!!
1) The mean weight of 100 children who are 5 years old in a certain locality is found to be 14 kg. A clinician wants to estimate the mean weight of all the children in that locality with a 95% confidence interval. If it is known that the SD for all children is 4 kg, find the 95% C.I.
2) A survey was conducted on a representative sample of 900 newborn babies in Addis Ababa, and their average weight at birth was found to be 3.5 kg with an SD of 0.5 kg. Estimate the mean weight of newborn babies in Addis Ababa at the 95% level of confidence.
3) A random sample of 100 people shows that 25 are left-handed. Find a 95% CI for the true proportion of left-handers.
• Calculate the CI and interpret each finding.

Exercise!!

• A study was conducted to investigate the oral status of a group of patients diagnosed with thalassemia major (TM). One of the outcome measures was the decayed, missing, and filled teeth index (DMFT). In a sample of 18 patients, the mean DMFT index value was 11.3 with a standard deviation of 6.3. Is this sufficient evidence to allow us to conclude that the mean DMFT index is greater than 9.0 in a population of similar subjects? Let α = 5%.


Exercise!!

A health officer is trying to study the malaria situation of Ethiopia. From the records of seasonal blood survey (SBS) results, he came to understand that the proportion of people having malaria in Ethiopia was 3.8% in 1978 (Ethiopian calendar). The size of the sample considered was 15,000. He also realised that during the year that followed (1979), blood samples were taken from 10,000 randomly selected persons. The result of the 1979 seasonal blood survey showed that 200 persons were positive for malaria. Help the health officer test the hypothesis that the malaria situation of 1979 did not show any significant difference from that of 1978 (take the level of significance α = 0.01).


Exercise!!
• In a large hospital for the treatment of people with intellectual disabilities, a sample of 12 individuals with Down syndrome yielded a mean serum uric acid value of $\bar{x}_1$ = 4.5 mg/100 ml. In a general hospital, a sample of 15 normal individuals of the same age and sex was found to have a mean value of $\bar{x}_2$ = 3.4 mg/100 ml. If it is reasonable to assume that the two populations of values are normally distributed with variances equal to 1, do these data provide sufficient evidence to indicate a difference in mean serum uric acid levels between normal individuals and individuals with Down syndrome? (α = 5%)


Exercise!!
• Two hundred patients suffering from a certain disease were randomly divided into two equal groups. Of the first group, who received the standard treatment, 78 recovered within three days. Of the other 100, who were treated by a new method, 90 recovered within three days.
• The physician wished to know whether the data provide sufficient evidence to indicate that the new treatment is more effective than the standard one, assuming that under H0 the two groups have a common population proportion. (α = 5%)


Reading assignment!!

• CI for a population mean: large sample size and σ unknown
• CI for a population mean: small sample size (n < 30) and σ unknown
• Sample size for estimating the difference between two population means
• Sample size for estimating the difference between two population proportions


Class end!!
