Statistical Methods
By
Prof. Dr/ Monir Hussein Bahgat
Professor of Internal Medicine, Mansoura University.
Trainer of SPSS and statistical analysis skills, Mansoura University.
Quality manager, Specialized Medical Hospital, Mansoura University.
Acknowledgments
This two-year curriculum was developed through a participatory and collaborative approach
between the academic faculty staff affiliated with Egyptian universities (Alexandria University, Ain
Shams University, Cairo University, Mansoura University, Al-Azhar University, Tanta University, Beni
Souef University, Port Said University, Suez Canal University and MTI University) and the Ministry of
Health and Population (General Directorate of Technical Health Education, THE). The design of this
course draws on rich discussions through workshops. The outcome of the workshops was a course
specification with intended learning outcomes and the course contents, which served as a guide to
the initial design.
We would like to thank Prof. Sabah Al-Sharkawi, the General Coordinator of the General Directorate
of Technical Health Education; Dr. Azza Dosoky, the Head of the Central Administration of HR
Development; Dr. Seada Farghly, the General Director of THE; and all those working at the
General Administration of THE for their time and critical feedback during the development of this
course.
Special thanks to the Minister of Health and Population Dr. Hala Zayed and Former Minister
of Health Dr. Ahmed Emad Edin Rady for their decision to recognize and professionalize health
education by issuing a decree to develop and strengthen the technical health education curriculum
for pre-service training within the technical health
Contents
Acknowledgments
Preface
Appendix A
Appendix B
Appendix C
Appendix D
Course Description
This course is prepared for undergraduate students. It aims to introduce the student
both to statistical reasoning and to the most commonly used statistical techniques. It
should also enable students who master this material to select,
implement, and interpret the most common types of analyses as they undertake
research in their own disciplines.
Core Knowledge
1- Remedial lectures.
2- Remedial practicing lessons.
3- Simplified sketches for different statistical methods.
4- Training to answer model question exercises.
Core Skills
Course data
Course name: Statistical Methods. Year/Level: Second. Division: Medical Records. Course code:
Student assessment
Timing
Final theoretical written exam: 3-hours.
Course Overview
Week | Theory | Practice
1st week | Introduction: Define and classify statistical methods. Outline the importance of studying statistical methods. | Plot a flow chart for statistical methods.
2nd week | Essential terminology to understand common statistical terms. | The student is asked to create an explanatory example for each statistical term.
5th week | Descriptive statistics: Measures of central tendency. | Problem solving for mean, median and mode.
7th week | Descriptive statistics: Measures of frequencies. | Problem solving for rate, ratio, proportion, and percentage.
10th week | 5-step procedure for hypothesis testing. | The student is asked to perform the procedure working on an explanatory example.
13th week | Revision for part III | Revision for part III
Preface
The goal is that students who master this material will be able to select,
implement, and interpret the most common types of analyses as they undertake
research in their own disciplines.
Moreover, they should be able to read research articles and in most cases
understand the descriptions of the statistical results and how the authors used
them to reach their conclusions.
They should understand the pitfalls of collecting statistical data, and the roles
played by the various mathematical assumptions.
On one hand, students can learn by repetition how to plug numbers into formulas,
or more often now, into a computer program, and draw a number with a neat
circle around it as the answer. This limited approach is frustrating, and
rarely leads to the kind of understanding that allows students to critically select
methods and interpret results.
On the other hand, there are numerous textbooks that provide introductions to
the elegant mathematical backgrounds of the methods. Although this is a much
deeper understanding than the first approach, its prerequisite mathematical
understanding closes it to practitioners from many other disciplines.
This book takes a middle way by presenting enough of the formulas to
motivate the techniques, and by illustrating their numerical application in a small
example.
However, the focus of the discussion is on the selection of the technique, the
interpretation of the results, and a critique of the validity of the analysis.
Statistical methods:
- Descriptive statistics: measures of central tendency and measures of dispersion.
- Inferential statistics: hypothesis testing and estimation statistics.
Definition:
A population is a data set representing the entire entity of interest while a sample
is a data set consisting of a portion of a population. Normally a sample is obtained
in such a way as to be representative of the population.
Explanatory example:
- Variables:
Definition:
Explanatory example:
When a study involves the two types of sex (coded as 1 if male and 2 if female),
sex in this study is considered a variable. In another study performed on males
only, sex is constant and is not considered as a variable.
Variable types:
Quantitative variables are of two types: Continuous (data are continuous points on
the scale and therefore include fractions) and Discrete (data are discrete points
on the scale and therefore always come in integer form).
definition of zero. When the variable equals 0.0, there is none of that variable).
When working with ratio variables, but not interval variables, you can look at the
ratio of two measurements.
Explanatory examples:
Number of children for each female in a study is a discrete variable because it has
a measurement unit (children number; e.g., 2 children, 3 children, and so on) and
fractions are not accepted (a female can't have 3.5 children).
- Data set:
Definition:
Explanatory example:
You asked 11 patients about their age (in years), sex (coded as 1 if male and 2 if
female) and education level (coded as 1 if illiterate or just read & write, 2 if
educated to a level below university and 3 if educated at a university level or
more). You tabulated the results as follows:
Patient no. | Age (years) | Sex | Education level
1 | 21 | 1 | 1
2 | 24 | 1 | 2
3 | 23 | 2 | 2
4 | 25 | 2 | 3
5 | 26 | 1 | 1
6 | 30 | 2 | 2
7 | 31 | 1 | 3
8 | 32 | 1 | 3
9 | 20 | 2 | 1
10 | 21 | 2 | 1
11 | 24 | 2 | 2
Definition:
Descriptive statistics intend to describe a data set with summary charts and
tables, but do not attempt to draw conclusions about the population from which
the sample was taken. You are simply summarizing the data you have like telling
someone the key points of a book (executive summary) as opposed to just handing
them a thick book (raw data).
Conversely, with inferential statistics, you are testing a hypothesis and drawing
conclusions about a population, based on your sample.
Explanatory example:
Descriptive statistics:
Let's say you've tested 50 university students for hepatitis C antibody. You have a
bunch of data plugged into your spreadsheet and now it is time to share the
results with someone. You could hand over the spreadsheet and say “here's what
I learned” (not very informative), or you could summarize the data with some
charts and graphs that describe the data and communicate some conclusions (e.g.
10% of the tested students have positive anti-HCV antibody). This would certainly be
easier for someone to interpret than a big spreadsheet. There are hundreds of
ways to visualize data, including data tables, pie charts, line charts, etc. That is
the idea of descriptive statistics. Note that the analysis is limited to your data
and that you are not extrapolating any conclusions to the full population (all
students of the university).
Inferential statistics:
Let's continue with the HCV example. Let's say you wanted to know the prevalence
in Mansoura University students. Well, there are >110,000 students and it would
be impractical to test every single student for anti-HCV. Instead, you would try to
test a representative sample of students and then extrapolate your sample results
to the entire population. While this process is not perfect and it is very difficult
to avoid errors, it allows researchers to make well-reasoned inferences about the
population in question. This is the idea behind inferential statistics. Getting a
representative sample is really important. There are many sampling
strategies, including random sampling. A true random sample means that
everyone in the target population has an equal chance of being selected for the
sample. Another key component of proper sampling is the sample size. In general,
the larger the sample size, the better, but there are trade-offs in time and money
when it comes to obtaining a large sample.
There are online calculators as well as software (free and commercial) that help
determine appropriate sample sizes.
https://www.surveysystem.com/sscalc.htm
https://www.calculator.net/sample-size-calculator.html
http://clincalc.com/stats/samplesize.aspx
https://www.cdc.gov/epiinfo/index.html (free)
http://www.gpower.hhu.de/en.html (free)
https://www.ncss.com/software/pass/ (commercial)
When it comes to inferential statistics, there are generally two forms: estimation
statistics and hypothesis testing.
Estimation Statistics:
“Estimation statistics” is a way of saying that you are estimating population values
based on your sample data. Let's think back to our sample HCV data. First, let's
assume that we had a true random sample of 50 students from this university and
that our full target population is >110,000 students. Let's say that 10% of
students in our sample had positive anti-HCV. Can we safely extrapolate that 10%
of all university students also will have positive anti-HCV? Is that the true value for
the university? Well, we can't say with 100% confidence, but, using inferential
Hypothesis Testing
With hypothesis testing, one uses a test such as the t-test, Chi-Square test, or ANOVA to
test whether a hypothesis about a population parameter (such as a mean) is supported
by the data. Again, the point is that this is an inferential statistical method to reach
conclusions about a population, based on a sample set of data.
Practical exercises
Answers:
1. D.
2. A.
3. B.
4. A.
5. B.
6. C.
7. C.
8. A.
Statistical methods are mathematical formulas, models, and techniques that are
used in statistical analysis of raw research data. The application of statistical
methods extracts information from research data and provides different ways to
assess the strength of research outputs.
To properly select which statistical method to use, we need to know our aim
(goal) and the type of data (or variable) to be tested. The goal may be just
descriptive or it might be performing a comparison or looking for a relationship.
Variables may be nominal (including the dichotomous variable), ordinal, or
quantitative. For quantitative variables, we need to know if this variable is
normally distributed (Gaussian distribution) or not (non-Gaussian distribution).
This can be done by many statistical methods including the Kolmogorov-Smirnov test
(for large data sets) and the Shapiro-Wilk test (for small data sets, <50), where data
will be considered normally distributed if the test result is not significant (p value >
0.050) and will be considered non-normally distributed (skewed) if the test result
is significant (p value ≤ 0.050).
The normal distribution is bell shaped and symmetric about the mean.
The following table simplifies the choice of the proper statistical method:
Aim (goal) | Quantitative (normal distribution) | Quantitative (skewed distribution) or ordinal | Nominal
Compare two groups | Independent-samples t-test | Mann-Whitney test | Chi-Square test
Compare > two groups | One-Way ANOVA test | Kruskal Wallis H test | Chi-Square test
Association | Pearson's correlation | Spearman's correlation | Contingency coefficient
Prediction | Linear regression | e.g., ordinal regression | Logistic regression
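As a minimal sketch, the selection table above can be written as a lookup from the aim and the variable type to the suggested test. The key strings and the function name suggest_test are hypothetical labels chosen for this illustration; only the test names come from the table.

```python
# Sketch of the selection table as a lookup: (aim, variable type) -> test name.
# The key strings are hypothetical labels invented for this example.
SELECTION_TABLE = {
    ("compare two groups", "quantitative, normal"): "Independent-samples t-test",
    ("compare two groups", "quantitative skewed or ordinal"): "Mann-Whitney test",
    ("compare two groups", "nominal"): "Chi-Square test",
    ("compare > two groups", "quantitative, normal"): "One-Way ANOVA test",
    ("compare > two groups", "quantitative skewed or ordinal"): "Kruskal Wallis H test",
    ("compare > two groups", "nominal"): "Chi-Square test",
    ("association", "quantitative, normal"): "Pearson's correlation",
    ("association", "quantitative skewed or ordinal"): "Spearman's correlation",
    ("association", "nominal"): "Contingency coefficient",
    ("prediction", "quantitative, normal"): "Linear regression",
    ("prediction", "quantitative skewed or ordinal"): "Ordinal regression (for example)",
    ("prediction", "nominal"): "Logistic regression",
}


def suggest_test(aim, variable_type):
    """Return the test listed in the table for this aim and variable type."""
    return SELECTION_TABLE[(aim, variable_type)]


print(suggest_test("compare two groups", "quantitative skewed or ordinal"))
```

The lookup only encodes the table; checking normality and deciding the aim remain the analyst's job.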
The two most important aspects of describing a data set are the location (central
tendency) and the dispersion (spread) of the data.
In other words, we need to find a number that indicates where the observations
are on the measurement scale and another to indicate how widely the
observations vary.
Practical exercises
1. You collected the age in years for a sample of 40 patients. To test for
normality, you will use:
a. Kolmogorov-Smirnov test.
b. Shapiro-Wilk test.
c. Mann-Whitney test.
d. ANOVA test.
2. You tested the normality of age in years of a sample of 300 patients using
the Kolmogorov-Smirnov test and you found that p value = 0.475. This means
that the age distribution is:
a. Normal.
b. Non-Gaussian.
c. Positively-skewed.
d. Negatively-skewed.
5. You measured the sleeping hours for 50 patients first after taking a placebo
and then after taking a sleeping pill that is claimed to be better than
placebo. Sleeping hours were normally distributed after both placebo and
sleeping pill. To test this hypothesis, you will use:
a. One-sample t-test.
b. Paired-samples t-test.
c. Independent-samples t-test.
d. ANOVA test.
6. You measured the sleeping hours for a group of 50 patients after taking a
placebo and for another group of 50 patients after taking a sleeping pill that
is claimed to be better than placebo. The two groups are of similar age and
sex (matched groups). Sleeping hours were normally distributed in both
groups. To test this hypothesis, you will use:
a. One-sample t-test.
b. Paired-samples t-test.
c. Independent-samples t-test.
d. ANOVA test.
7. You measured the sleeping hours after taking one of three sleeping pills to
compare their effect. Each pill was tested in a separate group of 50
patients; and the three groups were of similar age and sex (matched
groups). Sleeping hours were normally distributed in each of the three
groups. To test this hypothesis, you will use:
a. One-sample t-test.
b. Paired-samples t-test.
c. Independent-samples t-test.
d. ANOVA test.
9. You want to test for a possible linear association between the heart rate
(measured in beats / minute) of 100 healthy women and their age
(measured in years). You assumed that as age goes up, heart rate will go
down (negative correlation). You found that both heart rate and age were
normally distributed. To test this hypothesis, you will use:
a. Pearson's correlation.
b. Spearman's correlation.
c. Contingency coefficient.
d. Chi-square test.
10. You want to test if it is possible to predict the heart rate of a healthy
woman (measured in beats / minute) by knowing her age (measured in
years). To test this hypothesis, you will use:
a. Pearson's correlation.
b. Spearman's correlation.
c. Linear regression.
d. Logistic regression.
Answers:
1. B.
2. A.
3. C.
4. C.
5. B.
6. C.
7. D.
8. D.
9. A.
10. C.
11. D.
- Mean
- Median
- Mode
Mean:
Definition:
The mean is the sum of all the observed values divided by the number of
values.
Formula:
$\bar{y} = \dfrac{\sum_{i=1}^{n} y_i}{n}$
$y_i$ = the i-th observed value.
n = Sample size
Explanatory example:
For the example in table (1) for the 15 patients, their mean age will be
26.07 years:
Mean = (21+24+23+25+26+30+31+32+20+21+24+26+28+29+31)/15 = 391/15 ≈ 26.07
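The same calculation can be checked in a few lines of Python. This is only an illustrative sketch using the standard library; the variable names are chosen for this example.

```python
from statistics import mean

# Ages (in years) of the 15 patients listed in the worked example.
ages = [21, 24, 23, 25, 26, 30, 31, 32, 20, 21, 24, 26, 28, 29, 31]

total = sum(ages)        # sum of all observed values
n = len(ages)            # number of values
mean_age = mean(ages)    # same as total / n

print(total, n, round(mean_age, 2))
```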
Median:
Definition:
Explanatory example:
This phenomenon can be explained by the fact that the mean can be
interpreted as the center of gravity of the distribution. That is, if the
observations are viewed as weights placed on a plane, then the mean is the
position at which the weights on each side balance. It is a well-known fact
of physics that weights placed further from the center of gravity exert a
larger degree of influence (also called leverage); hence the mean must shift
toward those weights in order to achieve balance. However, the median
assigns equal weights to all observations regardless of their actual values;
hence, the extreme values have no special leverage.
Explanatory example:
For the following two datasets X and Y we'll calculate mean and median for
each dataset:
X: 1, 2, 3, 3, 4, 5
Y: 1, 1, 1, 2, 5, 8
Mean values:
For X = (1 + 2 + 3 + 3 + 4 + 5) / 6 = 18 / 6 = 3
For Y = (1 + 1 + 1 + 2 + 5 + 8) / 6 = 18 / 6 = 3
So, the mean value for these two different datasets is equal.
Median values:
For X = (3 + 3) / 2 = 3
For Y = (1 + 2) / 2 = 1.5
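A short sketch with Python's standard library reproduces these values and shows that the two datasets share a mean but not a median. The variable names are chosen for this illustration.

```python
from statistics import mean, median

x = [1, 2, 3, 3, 4, 5]
y = [1, 1, 1, 2, 5, 8]

# For an even number of values, median() averages the two middle values.
summary = {
    "X": (mean(x), median(x)),
    "Y": (mean(y), median(y)),
}

for name, (m, med) in summary.items():
    print(name, "mean =", m, "median =", med)
```

Dataset Y is skewed by the values 5 and 8, so its median (1.5) sits well below its mean (3).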
Pearl:
“Use the mean as the single measure of location unless the distribution of
the variable is skewed.”
Table (1)
Mean Median
The mean is calculated using the value of each observation, so all the
information available from the data is utilized. This is not so for the
median. For the median, we only need to know where the “middle” of the
data is. Therefore, the mean is the more useful measure and, in most cases,
the mean will give a better measure of the location of the data. However,
as we have seen, the value of the mean is heavily influenced by extreme
values and tends to become a distorted measure of location for a highly
skewed distribution. In this case, the median may be more appropriate.
Mode:
Definition:
This measure may not be unique in that two (or more) values may occur
with the same greatest frequency.
Also, the mode may not be defined if all values occur only once, which
usually happens with continuous numeric variables.
Explanatory example:
1, 1, 2, 1, 3, 2, 4, 3, 1, 3, 4, 2, 3, 1, 1, 1, 1, 3, 4, 5, 6, 3, 6, 2, 5, 1, 1
The most frequently occurring value is '1', which occurs 10 times. So the
mode is '1'.
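As an illustrative sketch, the count can be verified by tallying how often each value occurs. It uses collections.Counter from the standard library; the variable names are chosen for this example.

```python
from collections import Counter

values = [1, 1, 2, 1, 3, 2, 4, 3, 1, 3, 4, 2, 3, 1,
          1, 1, 1, 3, 4, 5, 6, 3, 6, 2, 5, 1, 1]

counts = Counter(values)                            # value -> absolute frequency
mode_value, mode_count = counts.most_common(1)[0]   # most frequent value and its count

print("mode =", mode_value, "occurs", mode_count, "times")
```

If two values tied for the highest count, most_common(1) would report only one of them, so the full counts should be inspected when the mode may not be unique.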
Practical exercises
2. For a dataset of 25, 18, 5, 12, 24, and 16, the mean is
a. 12
b. 16.7
c. 20.5
d. 18
3. For a dataset of 25, 18, 5, 12, 24, and 16, the median is
a. 16
b. 17
c. 18
d. 20
4. If 5, 3, 4 and 2 of your sample cases have age of 20, 23, 30, and 25
years, respectively, the age mode is:
a. 20 years.
b. 23 years.
c. 25 years.
d. 30 years.
5. The median is a better measure of central tendency than the mean if:
a. The variable is discrete.
b. The distribution is skewed.
c. The variable is continuous.
d. The distribution is symmetric.
Answers:
1. D.
2. B.
3. B.
4. A.
5. B.
6. B.
7. C.
For normally distributed data, using mean and standard deviation is the
most appropriate while for non-normally distributed data, using median and
interquartile range is more appropriate.
- Range.
- Variance.
- Standard deviation.
Range:
Definition:
Explanatory example:
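Interquartile range:
Definition:
The interquartile range is the length of the interval between the 25th
and 75th percentiles and describes the range of the middle half of the
distribution.
Formula:
$\text{IQR} = Q_3 - Q_1$, where $Q_1$ is the 25th percentile and $Q_3$ is the 75th percentile.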
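The definition above can be sketched in Python as the distance between the 25th and 75th percentiles. The dataset is hypothetical, and the "inclusive" percentile convention of statistics.quantiles is an assumption made here; other textbooks and software use slightly different conventions and may give slightly different quartiles.

```python
from statistics import quantiles

# Hypothetical dataset, used only to illustrate the calculation.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1   # length of the interval holding the middle half of the data

print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr)
```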
Steps to calculate:
Explanatory example:
Variance:
Definition:
Formula:
$s^2 = \dfrac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$
$y_i$ = the i-th observed value.
n = Sample size.
$\bar{y}$ = mean value.
Explanatory example:
For a dataset of 1, 2, 3, 4, 5
- Calculate variance:
o Variance:
= Sum of squared deviations / (n - 1)
= 10 / (5 - 1)
= 10 / 4
= 2.5
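The steps of this example can be sketched in Python: compute the mean, the deviations from it, their squares, and divide the sum by (n - 1). The variable names are chosen for this illustration; statistics.variance is the standard-library function for the sample variance.

```python
from statistics import mean, variance

data = [1, 2, 3, 4, 5]

y_bar = mean(data)                                    # mean value
deviations = [y - y_bar for y in data]                # these always sum to zero
sum_of_squares = sum(d ** 2 for d in deviations)      # sum of squared deviations
sample_variance = sum_of_squares / (len(data) - 1)    # divide by n - 1

print(sum_of_squares, sample_variance, variance(data))
```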
Notes:
- Recall that we have already noted that the sum of deviations, $\sum (y_i - \bar{y})$,
equals 0; hence, if we know any (n - 1) of these deviations, the
last one must have the value that causes the sum of all deviations to
be zero.
Standard deviation:
Definition:
Formula:
$s = \sqrt{\text{variance}}$
Explanatory example:
$s = \sqrt{2.5}$
s = 1.58
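As a brief check, the value 1.58 is the square root of a variance of 2.5, which is the variance of the dataset 1, 2, 3, 4, 5 used in the variance example. The sketch below assumes that dataset.

```python
from math import sqrt
from statistics import stdev

data = [1, 2, 3, 4, 5]    # dataset assumed from the variance example

s_from_variance = sqrt(2.5)   # square root of the sample variance
s_direct = stdev(data)        # sample standard deviation computed directly

print(round(s_from_variance, 2), round(s_direct, 2))
```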
Although the mean and standard deviation (or variance) are only two
descriptive measures, together the two actually provide a great deal of
information about the distribution of an observed set of values.
Note that for each of these intervals the mean is used to describe the
location and the standard deviation is used to describe the dispersion of a
given portion of the data.
Explanatory example:
Thus, we can roughly estimate the standard deviation by taking the range
divided by 4.
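This rule of thumb can be sketched as a one-line function. The function name is hypothetical; the values 152 and 176 are the minimum and maximum used in the worked answer to exercise 8 of this chapter.

```python
def estimate_sd_from_range(minimum, maximum):
    """Rough estimate of the standard deviation: range divided by 4."""
    return (maximum - minimum) / 4


# Minimum 152 and maximum 176 give a range of 24.
rough_sd = estimate_sd_from_range(152, 176)
print(rough_sd)
```

This is only a rough estimate and should not replace the standard deviation computed from the full data when the data are available.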
Explanatory example:
Practical exercises
d. 8
Answers:
1. C.
2. C.
3. B.
4. A.
5. A.
6. D.
7. B.
8. C (Range = 176 – 152 = 24, standard deviation = Range / 4 = 24 / 4 = 6).
Measures of frequency:
These measures show how often a value occurs:
- Frequency.
- Ratio.
- Rate.
- Proportion.
- Percentage.
Definition:
Absolute frequency: The number of times a certain value occurs in the data.
Relative frequency: The number of times a certain value occurs in the data
(absolute frequency) relative to the total number of values for that
variable.
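As an illustration, absolute and relative frequencies can be tallied for the sex codes (1 = male, 2 = female) of the 11 patients in the data set example from the terminology chapter. The variable names are chosen for this sketch.

```python
from collections import Counter

# Sex codes of the 11 patients in the data set example (1 = male, 2 = female).
sex_codes = [1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 2]

absolute = Counter(sex_codes)              # number of times each value occurs
total = len(sex_codes)                     # total number of values
relative = {value: count / total for value, count in absolute.items()}

for value in sorted(absolute):
    print(value, absolute[value], round(relative[value], 3))
```

The relative frequencies always add up to 1, which is a quick check on the tally.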
Ratios:
Definition:
Ratios compare the frequency of one value for a variable with another value
for the same variable.
Explanatory examples:
Example (1):
Example (2):
Rate:
Definition:
Rate is the measurement of one value for a variable in relation to the entire
sample of values within a given period.
Explanatory example:
Proportion:
Definition:
Explanatory example:
Percentage:
Definition:
Explanatory example:
Example (1):
Example (2):
Graphical presentation:
The above measures of frequency are often expressed visually in the form of
tables, histograms (for quantitative variables), or pie or bar graphs (for
qualitative variables) to make the information more easily interpretable.
Practical exercises
Answers:
1. A.
2. B.
3. C.
4. D.
Hypothesis testing
Explanatory example:
A test of the effect of a diet pill on weight loss would be based on observed
weight losses of a sample of healthy adults. If the test concludes the pill is
effective, the manufacturer can safely advertise to that effect.
The Hypotheses:
The two statements are mutually exclusive and exhaustive, which means that one
or the other statement must be true, but they cannot both be true.
The first statement is called the null hypothesis and is denoted by H0, and
the second is called the alternative hypothesis and is denoted by H1.
Definitions:
Explanatory example:
Suppose that you want to conduct a study to test whether smoking causes
lung cancer. The current status at that time is that there is no evidence yet
to say so, and this is actually what motivated you to conduct this study. So,
the null hypothesis will be that smoking is NOT causing lung cancer
(statement of “no effect”) while the alternative hypothesis will be that
smoking causes lung cancer. You will not be able to reject the null
hypothesis (and accept the alternative hypothesis) until you have finished your
research and found sufficient evidence that the null hypothesis is false.
The rejection region (also called the critical region) is the range of values of
a sample statistic that will lead to rejection of the null hypothesis.
Definition:
Type I error:
Type II error:
A type II error occurs when we incorrectly fail to reject H0, that is, when H0
is actually false, and our inference procedure fails to detect this fact.
Table: Decision (rows) versus H0 in the population (columns: True, False).
Definition:
Importance:
Step 1:
This value is based on the seriousness or cost of making a type I error in the
problem being considered.
On the other hand, the alternative hypothesis describes conditions for which
something will be done. It is the action or research hypothesis. In an
experimental or research setting, the alternative hypothesis is that an
established (status quo) hypothesis is to be replaced with a new one. Thus,
the research hypothesis is the one we actually want to support, which is
accomplished by rejecting the null hypothesis with a sufficiently low level of
α such that it is unlikely that the new hypothesis will be erroneously
pronounced as true. The significance level represents a standard of
evidence. The smaller the value of α, the stronger the evidence needed to
establish H1.
Step 2:
Define a sample-based test statistic & rejection region for the specified H0.
The rejection region comprises the values of the test statistic for which
- The probability when the null hypothesis is true is less than or equal to
the specified α and
- The probabilities when H1 is true are greater than they are under H0.
Step 3:
Step 4:
Step 5:
Interpret the results in the language of the problem in such a way that the
results be usable by the practitioner.
The p value:
However, just because that error is the more serious one, we cannot
completely ignore the type II error. There are many reasons for ascertaining
the probability of that error, for example:
- The probability of making a type II error may be so large that the test
may not be useful.
- Because of the trade-off between α and β, we may find that we may
need to increase α in order to have a reasonable value for β.
Power of a test:
Definition:
Two-tailed test:
If you are using a significance level of 0.05, a two-tailed test allots half of
your alpha to testing the statistical significance in one direction and half of
your alpha to testing statistical significance in the other direction.
This means that .025 is in each tail of the distribution of your test statistic.
One-tailed test:
If you are using a significance level of .05, a one-tailed test allots all of your
alpha to testing the statistical significance in the one direction of interest.
This means that .05 is in one tail of the distribution of your test statistic.
When using a one-tailed test, you are testing for the possibility of the
relationship in one direction and completely disregarding the possibility of a
relationship in the other direction.
Explanatory example:
We may wish to compare the mean of a sample to a given value 'x' using a
t-test.
A two-tailed test will test both if the mean is significantly greater than x and
if the mean is significantly less than x.
A one-tailed test will test either if the mean is significantly greater than x or
if the mean is significantly less than x, but not both.
Then, depending on the chosen tail, the mean is significantly greater than or
less than x if the test statistic is in the top 5% of its probability distribution or
bottom 5% of its probability distribution, resulting in a p-value less than 0.05.
The one-tailed test provides more power to detect an effect in one direction
by not testing the effect in the other direction.
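The gain in power can be illustrated with a small sketch. As a simplifying assumption, it uses a standard normal (z) test statistic instead of the t distribution discussed above, and a hypothetical observed statistic of 1.8; the function name is also chosen for this example.

```python
from statistics import NormalDist


def tail_p_values(z):
    """Return one-tailed (upper tail) and two-tailed p values for a z statistic."""
    upper_tail = 1 - NormalDist().cdf(z)                # area above z
    two_tailed = 2 * min(upper_tail, 1 - upper_tail)    # both tails combined
    return upper_tail, two_tailed


# Hypothetical test statistic in the expected (upper) direction.
one_tailed_p, two_tailed_p = tail_p_values(1.8)
print(round(one_tailed_p, 4), round(two_tailed_p, 4))
```

With this statistic the one-tailed p value falls below 0.05 while the two-tailed p value does not, which is the extra power the one-tailed test buys at the cost of ignoring an effect in the other direction.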
Definition:
Explanatory example:
Suppose you expect the proportion of male births to be 50 percent, but the observed
proportion of male births is 53 percent in a sample of 1000 births. Your aim
will be to test if this is significantly different from the hypothesized
population parameter.
Step 1:
Step 2:
Step 3:
Step 4:
Step 5:
$z = \dfrac{P_1 - P_0}{\sqrt{P_0 (1 - P_0) / n}}$
P1 = Sample proportion.
P0 = Hypothesized population proportion.
n = sample size.
Step 6:
Step 7:
Step 8:
Explanatory example:
Step 1
Research question:
Step 2:
To test this claim, 1000 deliveries were surveyed using 'simple random
sampling'. In this sample, 530 deliveries were of male babies.
Step 3:
H0: p = 0.5
Step 4:
Step 5:
√ [ ]
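The calculation for this step is not legible in the source, so the following is only a sketch under a stated assumption: that the test is the usual one-sample z-test for a proportion, with the standard error computed from the hypothesized proportion under H0. The variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.5             # hypothesized proportion of male births under H0
n = 1000             # number of deliveries surveyed
male_births = 530    # male births observed in the sample

p1 = male_births / n                        # sample proportion
standard_error = sqrt(p0 * (1 - p0) / n)    # assumes H0 is true
z = (p1 - p0) / standard_error              # test statistic

# Two-tailed p value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(z, 3), round(p_value, 4))
```

Under these assumptions the two-tailed p value is slightly above 0.05, so at a significance level of 0.05 the null hypothesis p = 0.5 would not be rejected.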
Step 6:
Figure 6.3:
Step 7:
Step 8:
Practical exercises
7. You want to test if the mean serum bilirubin is higher in a group of viral
hepatitis patients as compared to a control group. This is:
a. One-tailed test.
b. Two-tailed test.
8. You want to test if the mean age is higher or lower in a group of viral
hepatitis patients as compared to a control group. This is:
a. One-tailed test.
b. Two-tailed test.
Answers:
1. A.
2. D.
3. A.
4. B.
5. B.
6. D.
7. A.
8. B.
Estimation statistics
Definition:
Formula:
Margin of error:
Definition:
Step 1:
Step 2:
Step 3:
Step 4:
Choose your desired confidence level. The probability used to construct the
interval is called the level of confidence or confidence coefficient.
Step 5:
Step 6:
Explanatory example:
Step 1:
Say that the mean body weight of a male student in Mansoura University is 70
kg and the standard deviation is 10 kg (70 ± 10 kg). You'll be testing how
accurately you will be able to predict the weight of male students in Mansoura
University within a given confidence interval.
Step 2:
Step 3:
The sample mean was 70 kg and sample standard deviation was 10 kg.
Step 4:
The most commonly used confidence levels are 90%, 95% and 99%.
Step 5:
Margin of error = $z_{\alpha/2} \cdot \sigma / \sqrt{n}$
Check out the z-table (see appendix) to find the corresponding value that
goes with 0.475.
You will see that the closest value is 1.96, at the intersection of row 1.9 and
the column of .06.
Standard error of the mean (SEM) = σ/√(n) where σ = Standard deviation and n
= sample size.
SEM = 10/20
= 0.5
Therefore,
Margin of error = 1.96 × 0.5
= 0.98
Step 6:
So, the confidence limits (lower and upper boundary values of the interval)
are
Lower limit = 70 − 0.98 = 69.02
Upper limit = 70 + 0.98 = 70.98
This means that we are 95% confident that the population mean lies in the
interval 69.02–70.98 kg.
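The worked example can be reproduced with a short sketch. The sample size of 400 is an assumption inferred from the step SEM = 10/20 (since √400 = 20); it is not stated in the legible part of the example. The critical value is taken from the standard normal distribution instead of the z-table.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 70.0    # kg
sd = 10.0             # kg
n = 400               # assumed: inferred from SEM = 10/20 in the example
confidence = 0.95

# Critical value for a two-sided interval, about 1.96 for 95% confidence.
z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

sem = sd / sqrt(n)                   # standard error of the mean
margin_of_error = z_critical * sem   # critical value times SEM
lower = sample_mean - margin_of_error
upper = sample_mean + margin_of_error

print(round(margin_of_error, 2), round(lower, 2), round(upper, 2))
```

Changing `confidence` to 0.90 or 0.99 gives the other commonly used confidence levels, with a narrower or wider interval respectively.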
Practical exercises
2. If the confidence level chosen is 95%, the critical value is the value in z-
table that goes with:
a. 0.05
b. 0.95
c. 0.475
d. 0.095
c. 5 to 15
d. 6 to 16
5. To develop interval estimate of any parameter of population, value
which is added or subtracted from point estimate is classified as
a. margin of efficiency
b. margin of consistency
c. margin of biasedness
d. margin of error
6. For a chosen confidence level of 95%, the critical value is:
a. 1.63
b. 1.74
c. 1.85
d. 1.96
7. In a study involving 100 participants, the mean height was 160 cm and
the standard deviation was 10 cm. Considering 95% confidence interval
(with a critical value of 1.96), the margin of error is:
a. 0.98
b. 1.96
c. 3.92
d. 5.88
8. In a study involving 100 participants, the mean height was 160 cm and
the standard deviation was 10 cm. Considering 95% confidence interval
(with a critical value of 1.96), the confidence limits are:
a. 159.02 and 160.98
b. 158.04 and 161.96
c. 156.08 and 163.92
Answers:
1. C.
2. C.
3. D.
4. A.
5. D.
6. D.
7. B.
8. B.
Working for questions 7 and 8:
SEM = 10 / √100
= 10 / 10
= 1
Margin of error = 1.96 * 1
= 1.96
Confidence limits = 160 ± 1.96
Appendix A
Appendix B
Appendix C
Appendix D
Reference
9- https://onlinecourses.science.psu.edu/stat500/sites/onlinecourses.science.psu.edu.stat500/files/lesson14/summary_table/index.pdf.
Accessed online on 07-10-2018.
10- Bhaskar, S. B., & Manjuladevi, M. (2016). Methodology for research
II. Indian Journal of Anaesthesia, 60(9), 646–651.
http://doi.org/10.4103/0019-5049.190620
11- Ali, Z., & Bhaskar, S. B. (2016). Basic statistical tools in research and data
analysis. Indian Journal of Anaesthesia, 60(9), 662–669.