INFERENTIAL STATISTICS
Dr. Dalia El-Shafei
Assist.Prof., Community Medicine Department, Zagazig
University
Definition of statistics:
Branch of mathematics concerned with:
Collection, Summarization, Presentation, Analysis,
and Interpretation of data.
TYPES OF STATISTICS
Descriptive
• Describe or summarize the data of a target population.
• Describe the data which is already known.
• Organize, analyze & present data in a meaningful manner.
• Final results are shown in the form of tables and graphs.
• Tools: measures of central tendency & dispersion.
Inferential
• Use data to make inferences or generalizations about the population.
• Make conclusions for a population that is beyond the available data.
• Compare, test and predict future outcomes.
• Final results are probability scores.
• Tools: hypothesis tests.
INFERENCE
Inference involves making a generalization about a larger group of
individuals on the basis of a subset or sample.
Inferential statistics
• Hypothesis testing:
  - Hypothesis formulation: null hypothesis "H0", alternative hypothesis "H1"
  - Set level of significance: α error, p-value
  - Choosing the test: quantitative data or qualitative data
  - Decision approach: critical value
  - Decision: accept H0 or reject H0
• Estimation:
  - Point estimate
  - Interval estimate "confidence interval"
CONFIDENCE LEVEL & INTERVAL "INTERVAL ESTIMATE"
Confidence interval "interval estimate": the range of values that is used to estimate the true value of the population parameter.
Confidence level: the probability that the confidence interval does, in fact, contain the true population parameter, assuming that the estimation process is repeated many times (1 − α).
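As an illustration, a 95% confidence interval for a population mean can be computed as mean ± z(α/2) × SE. A minimal sketch (the blood-pressure readings are hypothetical):

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of systolic blood pressures (mmHg)
sample = [118, 125, 130, 121, 127, 119, 133, 124, 129, 122]
m, sd, n = mean(sample), stdev(sample), len(sample)

# 95% CI for the population mean: 1 - alpha = 0.95, so z_{alpha/2} = 1.96
# (z approximation as in the lecture; strictly, a t value suits small samples)
se = sd / sqrt(n)
ci = (m - 1.96 * se, m + 1.96 * se)
```

If the sampling were repeated many times, about 95% of intervals built this way would contain the true population mean.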
HYPOTHESIS TESTING
To find out whether an observed difference between groups is explained by sampling variation (chance) or is really a difference between the groups.
The method of assessing hypotheses is known as a "significance test".
Significance testing is a method for assessing whether a result is likely to be due to chance or due to a real effect.
NULL & ALTERNATIVE HYPOTHESES:
In hypothesis testing, a specific hypothesis is formulated & data are collected to accept or to reject it.
The null hypothesis H0: x1 = x2 means that there is no difference between x1 & x2.
If we reject the null hypothesis, i.e. there is a difference between the 2 readings, the alternative is either H1: x1 < x2 or H1: x1 > x2 (one-sided), or H1: x1 ≠ x2 (two-sided).
The null hypothesis is rejected because x1 is different from x2.
Example: a trial compared the smoking cessation rates for smokers randomly assigned to use a nicotine patch versus a placebo patch.
Null hypothesis: Smoking cessation rate in nicotine patch group =
smoking cessation rate in placebo patch group.
Alternative hypothesis: Smoking cessation rate in nicotine patch
group ≠ smoking cessation rate in placebo patch group (2 tailed) OR
smoking cessation rate in nicotine patch group is higher than
smoking cessation rate in placebo patch group (1 tailed).
DECISION ERRORS
Type I error “α” = False +ve = Rejection of true H0
Type II error “β” = False –ve = Accepting false H0
In statistics, there are 2 ways to determine whether the evidence is likely or
unlikely given the initial assumption:
Critical value approach (favored in many of the older textbooks).
P-value approach (what is used most often in research, journal articles, and
statistical software).
If the data are not consistent with the null hypothesis, the difference is said to be "statistically significant".
If the data are consistent with the null hypothesis, we accept it, i.e. the difference is statistically insignificant.
In medicine, we usually consider that differences are significant if the probability is <0.05.
This means that if the null hypothesis is true, we shall make a wrong decision fewer than 5 times in 100.
CRITICAL VALUE
A point on the test distribution that is compared to the test statistic to
determine whether to reject the null hypothesis.
If the absolute value of your test statistic is greater than the
critical value, you can declare statistical significance and reject
the null hypothesis.
Critical values correspond to α, so their values become fixed when
you choose the test's α.
The critical value is the z-score that separates sample statistics likely to occur from those unlikely to occur. The number Zα/2 is the z-score that separates a region of α/2 from the rest of the standard normal curve.
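For example, the two-tailed critical value at α = 0.05 can be obtained from the inverse CDF of the standard normal distribution (here via Python's standard-library statistics.NormalDist):

```python
from statistics import NormalDist

# Critical value Z(alpha/2): the z-score cutting off alpha/2 = 0.025
# in the upper tail of the standard normal curve
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # ≈ 1.96
```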
TESTS OF SIGNIFICANCE
Quantitative variables:
• 1 Mean: one-sample Z-test or one-sample t-test
• 2 Means: large sample ">30": Z-test; small sample "<30": t-test; paired data: paired t-test
• >2 Means: ANOVA
Qualitative variables:
• Proportion Z-test
• X² test
ANALYSIS OF
QUANTITATIVE VARIABLES
Z TEST OR SND "STANDARD NORMAL DEVIATE"
Used for comparing 2 means of large samples (>60) using the normal distribution.
STUDENT'S T-TEST
Used for comparing two means of small samples (<60) by the t distribution instead of the normal distribution.
UNPAIRED T-TEST
X̄1 = mean of the 1st sample; X̄2 = mean of the 2nd sample
n1 = sample size of the 1st sample; n2 = sample size of the 2nd sample
SD1 = SD of the 1st sample; SD2 = SD of the 2nd sample
Degree of freedom (df) = (n1 + n2) − 2
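The slide's formula image did not survive extraction; a standard pooled-variance form of the unpaired t statistic, consistent with df = (n1 + n2) − 2, is:

```latex
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2\left(\tfrac{1}{n_1}+\tfrac{1}{n_2}\right)}},
\qquad
S_p^2 = \frac{(n_1-1)\,SD_1^2 + (n_2-1)\,SD_2^2}{n_1+n_2-2}
```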
STUDENT'S T-TEST
The calculated value of t is compared with the values in the t-distribution table at the corresponding degrees of freedom.
If the calculated value of t is less than that in the table, then the difference
between samples is insignificant.
If the calculated t value is larger than that in the table so the difference is
significant i.e. the null hypothesis is rejected.
A big t-value gives a small p-value, i.e. statistical significance.
Suppose that you calculate t = 1.75 and df = 3.
Calculated t (1.75) < tabulated t (3.182), so the difference between samples is insignificant, i.e. the null hypothesis is accepted.
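The comparison can be scripted. A sketch of the pooled-variance unpaired t statistic (whose degrees of freedom match the lecture's (n1 + n2) − 2); the glucose readings are hypothetical:

```python
from math import sqrt
from statistics import mean, stdev

def unpaired_t(x, y):
    """Pooled-variance unpaired t statistic, df = n1 + n2 - 2."""
    n1, n2 = len(x), len(y)
    s1, s2 = stdev(x), stdev(y)
    # Pooled variance combines the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical fasting glucose readings in two small groups
t, df = unpaired_t([92, 88, 95, 90, 91], [97, 94, 99, 96, 93])
# |t| is then compared with the tabulated t at df = 8
```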
PAIRED T-TEST
Compares repeated observations in the same individual, or the difference between paired data.
The analysis is carried out using the mean & SD of the difference
between each pair.
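The paired analysis above can be sketched directly: compute each pair's difference, then base t on the mean & SD of those differences (the before/after readings are hypothetical):

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after systolic BP in the same 6 patients
before = [150, 142, 138, 155, 160, 147]
after  = [144, 140, 135, 148, 152, 146]

# Paired t test works on the per-subject differences, df = n - 1
d = [b - a for b, a in zip(before, after)]
n = len(d)
t = mean(d) / (stdev(d) / sqrt(n))
df = n - 1
```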
ANALYSIS OF VARIANCE
(ANOVA)
Used for Comparing several means.
To compare >2 means, several t-tests could be used, but this consumes more time & leads to spurious significant results (an inflated Type I error). So, we must use analysis of variance (ANOVA).
ANALYSIS OF VARIANCE (ANOVA)
There are two main types:
One-way ANOVA
• When the subgroups to be compared are defined by just one factor
• Comparison between means of blood glucose levels among 3 groups of
diabetic patients (1st group was on insulin, 2nd group was on oral
hypoglycemic drugs, & 3rd group was on lifestyle modification)
Two-way ANOVA
• When the subdivision is based upon more than one factor.
• E.g. in the above-mentioned example, if each group were further divided into males & females.
The main idea in ANOVA is that we have to take into account the variability within the groups and between the groups; the value of F equals the ratio between the between-groups mean sum of squares and the within-groups mean sum of squares.
F = between-groups MS / within-groups MS.
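The F ratio can be computed directly from its definition. A minimal one-way ANOVA sketch (the blood glucose values for the 3 treatment groups are hypothetical):

```python
from statistics import mean

def one_way_anova_F(groups):
    """F = between-groups mean square / within-groups mean square."""
    k = len(groups)                         # number of groups
    N = sum(len(g) for g in groups)         # total observations
    grand = mean([x for g in groups for x in g])
    # Between-groups sum of squares: group means vs. grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-groups sum of squares: observations vs. their group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (N - k)
    return ms_between / ms_within

# Hypothetical blood glucose in 3 diabetic treatment groups
F = one_way_anova_F([[90, 95, 100], [110, 115, 120], [140, 150, 145]])
```

F is then compared with the tabulated F value at (k − 1, N − k) degrees of freedom.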
ANALYSIS OF
QUALITATIVE VARIABLES
CHI-SQUARE TEST
Tests the relationship between categorical variables.
Qualitative data are arranged in a table formed by rows & columns.
Variables Obese Non-Obese Total
Diabetic 62 63 125
Non-diabetic 51 44 105
Total 113 107 220
O = observed value in the table
E = expected value
Expected (E) = (Row total × Column total) / Grand total
Degree of freedom = (rows − 1) × (columns − 1)
EXAMPLE HYPOTHETICAL STUDY
Two groups of patients are treated using different spinal
manipulation techniques
Gonstead vs. Diversified
The presence or absence of pain after treatment is the outcome
measure.
Two categorical variables:
• Technique used
• Pain after treatment
GONSTEAD VS. DIVERSIFIED EXAMPLE - RESULTS
Pain after treatment
Technique Yes No Row Total
Gonstead 9 21 30
Diversified 11 29 40
Column Total 20 50
Grand Total 70
9 out of 30 (30%) still had pain after Gonstead treatment
and 11 out of 40 (27.5%) still had pain after Diversified,
but is this difference statistically significant?
FIRST FIND THE EXPECTED VALUES FOR EACH CELL
Expected (E) = (Row total × Column total) / Grand total
To find E for a cell (and similarly for the rest): multiply the cell's row total by its column total, then divide by the grand total.
Find E for all cells
Pain after treatment
Technique Yes No Row Total
Gonstead 9 (E = 30×20/70 = 8.6) 21 (E = 30×50/70 = 21.4) 30
Diversified 11 (E = 40×20/70 = 11.4) 29 (E = 40×50/70 = 28.6) 40
Column Total 20 50
Grand Total 70
Use the Χ² formula with each cell and then add them together:
(9 − 8.6)²/8.6 = 0.0186
(21 − 21.4)²/21.4 = 0.0075
(11 − 11.4)²/11.4 = 0.0140
(29 − 28.6)²/28.6 = 0.0056
Χ² = 0.0186 + 0.0075 + 0.0140 + 0.0056 ≈ 0.046
Calculated Χ² value (0.046) < tabulated value (3.841) at df = 1.
Therefore, Χ² is not statistically significant.
So, we will accept the null hypothesis.
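The whole procedure can be reproduced in a few lines. A sketch using the lecture's 2×2 table; note that keeping the expected values unrounded gives Χ² ≈ 0.0525, slightly different from a hand calculation with one-decimal E values:

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 table [[a, b], [c, d]],
    using E = row total * column total / grand total per cell."""
    observed = [[a, b], [c, d]]
    row = [a + b, c + d]
    col = [a + c, b + d]
    n = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            e = row[i] * col[j] / n          # expected count
            chi2 += (observed[i][j] - e) ** 2 / e
    return chi2

# Gonstead vs. Diversified table from the lecture
chi2 = chi_square_2x2(9, 21, 11, 29)
```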
Z TEST FOR COMPARING 2 PERCENTAGES "PROPORTION Z-TEST"
Z = (p1 − p2) / √(p1q1/n1 + p2q2/n2)
p1 = % in the 1st group; p2 = % in the 2nd group
q1 = 100 − p1; q2 = 100 − p2
n1 = sample size of the 1st group; n2 = sample size of the 2nd group
The Z test is significant (at the 0.05 level) if the result is >2 (strictly, >1.96).
EXAMPLE
The number of anemic patients in group 1, which includes 50 patients, is 5, and the number of anemic patients in group 2, which contains 60 patients, is 20. To test whether groups 1 & 2 differ statistically in the prevalence of anemia, we calculate the Z test.
p1 = 5/50 = 10%; p2 = 20/60 = 33%; q1 = 100 − 10 = 90; q2 = 100 − 33 = 67
Z = |10 − 33| / √(10×90/50 + 33×67/60)
Z = 23 / √(18 + 36.85) = 23/7.4 = 3.1
So, there is statistically significant difference between percentages of
anemia in the studied groups (because Z>2).
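The slide's (unpooled, percentage-based) formula translates directly to code. A sketch using the anemia example's numbers:

```python
from math import sqrt

def z_two_proportions(x1, n1, x2, n2):
    """Two-proportion Z test in the lecture's form: percentages,
    unpooled variance, q = 100 - p."""
    p1, p2 = 100 * x1 / n1, 100 * x2 / n2
    q1, q2 = 100 - p1, 100 - p2
    return abs(p1 - p2) / sqrt(p1 * q1 / n1 + p2 * q2 / n2)

# 5/50 anemic in group 1 vs. 20/60 anemic in group 2
z = z_two_proportions(5, 50, 20, 60)   # significant, since z > 1.96
```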
CORRELATION & REGRESSION
Correlation measures the closeness of the association between 2
continuous variables, while Linear regression gives the equation of
the straight line that best describes & enables the prediction of one
variable from the other.
CORRELATION IS NOT CAUSATION!!!
LINEAR REGRESSION
Same as correlation:
• Determines the relation & prediction of the change in a variable due to changes in another variable.
• t-test is also used for the assessment of the level of significance.
Different from correlation:
• The independent variable has to be specified from the dependent variable.
• The dependent variable in linear regression must be a continuous one.
• Allows the prediction of the dependent variable for a particular independent variable ("but should not be used outside the range of the original data").
CORRELATION
Measured by the correlation coefficient, r. The value of r ranges between +1 and −1.
"1" means perfect correlation while "0" means no correlation.
An r value near zero means weak correlation, while a value near one means strong correlation. The − and + signs denote the direction of the correlation.
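Computing r from its definition is straightforward. A minimal sketch (the data are hypothetical; they lie exactly on a line, so r = 1):

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient r, always in [-1, +1]."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Perfectly (positively) correlated data gives r = +1
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```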
REGRESSION
LINEAR REGRESSION
Used to determine the relation & prediction of the change in a
variable due to changes in another variable.
For linear regression, the independent factor (x) must be specified
from the dependent variable (y).
Also allows the prediction of dependent variable for a particular
independent variable
SCATTERPLOTS
An X-Y graph with symbols that represent the values of 2 variables, with the regression line drawn through the points.
LINEAR REGRESSION
However, regression for prediction should not be used outside the range of the original data.
The t-test is also used for the assessment of the level of significance.
The dependent variable in linear regression must be a continuous one.
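The least-squares line can be fitted from the same sums used for correlation. A sketch with hypothetical data; note the prediction is made within the observed x range, as advised above:

```python
from statistics import mean

def linear_regression(x, y):
    """Least-squares line y = a + b*x (a = intercept, b = slope)."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a, b

# Hypothetical data; predict y for a new x *within* the observed range
a, b = linear_regression([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 8.0, 9.9])
y_pred = a + b * 3.5
```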
MULTIPLE LINEAR REGRESSION
The dependency of a dependent variable on several independent
variables, not just one.
The test of significance used is ANOVA (the F test).
EXAMPLE
Suppose neonatal birth weight depends on these factors: gestational age, length of the baby, and head circumference. Each factor correlates significantly with baby birth weight (i.e. has a +ve linear correlation).
We can do multiple regression analysis to obtain a mathematical equation by
which we can predict the birth weight of any neonate if we know the values of
these factors.
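A sketch of that idea: the data below are synthetic (the coefficients, ranges, and noise level are made up for illustration), but the fitting step is ordinary least squares, the core of multiple linear regression:

```python
import numpy as np

# Synthetic "neonates": birth weight (g) follows a made-up linear rule
rng = np.random.default_rng(0)
n = 200
gest_age = rng.uniform(36, 42, n)     # gestational age, weeks
length = rng.uniform(45, 55, n)       # length of baby, cm
head_circ = rng.uniform(32, 37, n)    # head circumference, cm

# Assumed (hypothetical) relationship plus random noise
weight = 100 * gest_age + 40 * length + 30 * head_circ - 3000 \
         + rng.normal(0, 50, n)

# Design matrix with an intercept column; recover the coefficients
X = np.column_stack([np.ones(n), gest_age, length, head_circ])
coef, *_ = np.linalg.lstsq(X, weight, rcond=None)

# The fitted equation predicts birth weight for any new neonate
predicted = X @ coef
```

The recovered coefficients approximate the true values used to generate the data, which is exactly the "mathematical equation" the lecture describes for prediction.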