Presentation of Data Lecture 2


Slide 1

Section 1-3
Critical Thinking

Course/Subject: Biostatistics and Epidemiology


Lecture/09.09.24 /BSMLS 3-A&B
Presented by: Enoch Taclan BS Biol, MSc Biol
Success in Statistics Slide 2

• Success in the introductory statistics course typically requires more common sense than mathematical expertise.

• This section is designed to illustrate how common sense is used when we think critically about data and statistics.
Definitions Slide 3

Sample
• Is a smaller set or subset of the population.

• A representative sample is a subset that provides an accurate picture of the whole population.

• The data obtained from a sample of subjects may not be exactly the same as data collected on the whole population. However, when the sample is representative, the sample data should be very similar to what would be found in the whole population.
Definitions Slide 4

Biased sample
• Only certain members of the population are chosen, so that the sample systematically misrepresents the population.
Definitions Slide 5

Voluntary response sample
(or self-selected survey)

One in which the respondents themselves decide whether to be included.

In this case, valid conclusions can be made only about the specific group of people who agree to participate.
Definitions Slide 6

Convenience sample
(investigator sample)

Involves using respondents who are “convenient” to the researcher.
Definitions Slide 7

Random sample
is a subset of a statistical population in which each member of the population has an equal probability of being chosen.
Definitions Slide 8

3 Primary types of Random sampling

• Simple
• Stratified
• Systematic

Sampling frame
• A random sample requires a list of the population (the sampling frame).
• The list is ordered, and every subject in the sampling frame is assigned a unique number.
• With unique numbers/identifiers, a random number generator can be used to randomly select the sample.
Definitions Slide 9

Simple Random Sample

• Each subject in the population has the same chance of being selected.
• Selected numbers are then matched with the subjects to select the sample.
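The procedure above (number every subject in the sampling frame, draw random numbers, match them back to subjects) can be sketched in Python. This is only an illustration: the frame of 20 patient IDs, the sample size of 5, and the seed are assumed values, and `simple_random_sample` is a hypothetical helper name.

```python
import random

def simple_random_sample(frame, k, seed=None):
    """Draw k subjects from the sampling frame without replacement."""
    rng = random.Random(seed)
    # Assign each subject a unique number (its position in the frame).
    numbered = dict(enumerate(frame, start=1))
    # Select k distinct numbers; every number is equally likely to be chosen.
    chosen_numbers = rng.sample(sorted(numbered), k)
    # Match the selected numbers back to the subjects.
    return [numbered[i] for i in chosen_numbers]

# Hypothetical sampling frame of 20 patient IDs (assumed for illustration).
frame = [f"P{i:02d}" for i in range(1, 21)]
sample = simple_random_sample(frame, k=5, seed=1)
print(sample)
```

Fixing the seed only makes the draw repeatable for checking; in practice the seed would be left unset.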
Definitions Slide 10

Stratified random sampling

• Sampling technique in which the total population is divided into homogeneous groups (strata) to complete the sampling process.
• The total sample will be representative as long as the number of subjects sampled in each stratum (for example, each region) is proportional to the population size of that stratum.
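A minimal sketch of proportional allocation, assuming three hypothetical regions of 500, 300 and 200 subjects and a total sample of 100. The function name `stratified_sample` and all data are invented for illustration.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Proportional stratified sample; strata maps stratum name -> list of subjects."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Allocate the sample in proportion to the stratum's share of the population.
        n_stratum = round(total_n * len(members) / population_size)
        # Simple random sample within the stratum.
        sample[name] = rng.sample(members, n_stratum)
    return sample

# Hypothetical strata (regions) with assumed population sizes.
strata = {
    "North": [f"N{i}" for i in range(500)],
    "South": [f"S{i}" for i in range(300)],
    "East": [f"E{i}" for i in range(200)],
}
sample = stratified_sample(strata, total_n=100, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
# {'North': 50, 'South': 30, 'East': 20}
```

When the proportions do not divide evenly, rounding can make the stratum sizes add up to slightly more or less than the requested total.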
Definitions Slide 11

Systematic random sampling

• Random sampling method that requires selecting samples based on a system of intervals in a numbered population.
• A number “s” is selected so that every s-th subject is selected to be in the sample.
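The interval rule can be sketched as follows. The frame of 100 numbered subjects and the sample size of 10 are assumptions for illustration, as is choosing the starting point at random within the first interval (the slide does not say how the start is chosen).

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every s-th subject from a numbered frame, starting at a random point."""
    rng = random.Random(seed)
    s = len(frame) // n          # sampling interval "s"
    start = rng.randrange(s)     # assumed: random start within the first interval
    return [frame[start + i * s] for i in range(n)]

# Hypothetical numbered population of 100 subjects.
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=1)
print(sample)
```

With 100 subjects and a sample of 10, the interval is s = 10, so consecutive selected subjects are always 10 positions apart.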
Misuses of Statistics Slide 12

Bad Samples
Biased Sampling
• Non-Representative Samples: If a
sample doesn’t accurately reflect the
population
• Selection Bias: This occurs when
certain members of a population
Prevention: Transparency and Documentation:
Clearly documenting the samples
Misuses of Statistics Slide 13

• Small Samples
Overgeneralization of Results
• Broad Conclusions from Limited Data: Researchers or decision-makers might take results from a small
• Ignoring Sample Limitations: Failing to acknowledge
Misuses of Statistics Slide 14

• Small Samples
• Example: A study with a small sample of 15 patients claims that a new treatment is highly effective. The results are widely publicized, leading to excitement and adoption of the treatment. However, when larger studies are conducted, they fail to replicate the results, revealing that the initial findings were likely due to random chance rather than true effectiveness.
• Prevention: Replication and Validation: Findings from small samples should be replicated in larger studies to validate the results.
Misuses of Statistics Slide 15

• Misleading graphs
• Graphs are a common way that statistics can be misused to distort the truth or create a false impression.
Slide 16

Figure 1-1
Slide 17

To correctly interpret a graph, we should analyze the numerical information given in the graph instead of being misled by its general shape.
Misuses of Statistics Slide 18

• Bad Samples
• Small Samples
• Misleading Graphs
• Pictographs
Misuses of Statistics Slide 19

• Pictographs are a visual representation of data using pictures or symbols to represent quantities. While they can be a helpful tool for understanding data, they can also be misleading if not used correctly.
Misuses of Statistics Slide 20
Misuses of Statistics Slide 21

 Bad Samples
 Small Samples
 Misleading Graphs
 Pictographs
Slide 22

Section 1-4
Design of Experiments

Created by Tom Wegleitner, Centreville, Virginia


Major Points Slide 23

• If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.

• Randomness typically plays a critical role in determining which data to collect.
Slide 24

Your role as an investigator

• Will you just observe?
• Will you intervene?
Slide 25

Will you observe or intervene?

• What people do? Observational
• How to make changes? Interventional
Slide 26

Choosing an appropriate study design

Observational
• Case reports
• Case series
• Cross sectional
• Case-control
• Cohort

Interventional
• Non-randomized controlled trials
• Randomized control trial

CORRELATIONAL STUDY
ANALYTICAL
Definitions Slide 27

• Cross Sectional Study
Data are observed, measured, and collected at one point in time.

• Retrospective (or Case Control) Study
Data are collected from the past by going back in time.

• Prospective (or Longitudinal or Cohort) Study
Data are collected in the future from groups (called cohorts) sharing common factors.
Definitions Slide 28

• Confounding
occurs in an experiment when the experimenter is not able to distinguish between the effects of different factors.

Try to plan the experiment so confounding does not occur!


Controlling Effects of Variables Slide 29

• Blinding
the subject does not know whether he or she is receiving a treatment or a placebo

• Blocks
groups of subjects with similar characteristics

• Completely Randomized Experimental Design
subjects are put into blocks through a process of random selection

• Rigorously Controlled Design
subjects are very carefully chosen
Replication and Sample Size Slide 30

• Replication
repetition of an experiment when there are enough subjects to recognize the differences in different treatments

• Sample Size
use a sample size that is large enough to see the true nature of any effects and obtain that sample using an appropriate method, such as one based on randomness
Random Sampling Slide 31
selection so that each has an
equal chance of being selected
Systematic Sampling Slide 32
Select some starting point and then
select every kth element in the population
Convenience Sampling Slide 33
use results that are easy to get
Stratified Sampling Slide 34
subdivide the population into at
least two different subgroups that share the same
characteristics, then draw a sample from each
subgroup (or stratum)
Cluster Sampling Slide 35
divide the population into sections
(or clusters); randomly select some of those clusters;
choose all members from selected clusters
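A minimal sketch of cluster sampling, assuming the clusters are four hypothetical hospital wards. The ward names, patient IDs and the helper name `cluster_sample` are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every member of each selected cluster."""
    rng = random.Random(seed)
    # Randomly select some of the clusters.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Choose all members from the selected clusters.
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population divided into sections (wards).
clusters = {
    "Ward A": ["A1", "A2", "A3"],
    "Ward B": ["B1", "B2"],
    "Ward C": ["C1", "C2", "C3", "C4"],
    "Ward D": ["D1", "D2", "D3"],
}
sample = cluster_sample(clusters, n_clusters=2, seed=1)
print(sample)
```

Unlike stratified sampling, where a sample is drawn from every subgroup, only some clusters are used here, but each selected cluster is taken in full.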
Methods of Sampling Slide 36

• Random
• Systematic
• Convenience
• Stratified
• Cluster
Definitions Slide 37

• Sampling Error
the difference between a sample result and the true population result; such an error results from chance sample fluctuations

• Nonsampling Error
sample data that are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly)
Recap Slide 38

In this section we have looked at:

• Types of studies and experiments
• Controlling the effects of variables
• Randomization
• Types of sampling
• Sampling Errors
TYPES AND PRESENTATION
OF DATA

Course: Biostatistics and Epidemiology


Lecture/09.09.24 /BSMLS 3-A&B
Presented by: Enoch Taclan BS Biol, MSc Biol
Data
A set of values recorded on one or more observational
units, e.g., an object or a person.
Types of data:
(A) Qualitative / Quantitative data
(B) Discrete / Continuous data
(C) Primary / Secondary data
(D) Nominal / Ordinal data
TYPES OF DATA

Qualitative data: Nominal, Ordinal
Quantitative data: Discrete, Continuous; Interval, Ratio

• Qualitative data:
  – Also called enumeration data.
  – Represents a particular quality or attribute.
  – There is no notion of magnitude or size of the characteristic, as it cannot be measured.
  – Expressed as numbers without units of measurement. E.g.: religion, sex, blood group.
• Quantitative data:
  – Also called measurement data.
  – These data have a magnitude.
  – Can be expressed as a number with or without a unit of measurement. E.g.: height in cm, Hb in gm%, BP in mm of Hg, weight in kg.
Discrete / Continuous data:
Discrete data: Here we always get a whole number. E.g.: number of beds in a hospital, number of malaria cases.
Continuous data: It can take any value that is possible to measure, including fractions. E.g.: Hb level, height, weight.

Quantitative data      Qualitative data
Hb level in gm%        Anemic or non anemic
Ht in cm               Tall or short
BP in mm of Hg         Hypo, normo or hypertensive
IQ scores              Idiot, genius or normal


Primary / Secondary data:
Primary data: Obtained directly from an individual; it gives precise information.
Secondary data: Obtained from an outside source. E.g.: data obtained from hospital records or a census.

Nominal / Ordinal data:

Nominal data: The information or data fits into one of the categories, but the categories cannot be ordered one above another.
Ordinal data: Here the categories can be ordered, but the space or class interval between two categories may not be the same.
COLLECTION OF DATA

• Collect data carefully and thoroughly.
• Units of measurement should be clearly defined.
• Records should be correct, complete, clear, sufficiently concise and arranged in a manner that is easy to comprehend.
• Collected data should be:
  – Accurate (i.e., measures the true value of what is under study)
  – Valid (i.e., measures only what it is supposed to measure)
  – Precise (i.e., gives adequate details of the measurement)
  – Reliable (i.e., should be dependable)
SOURCES FOR COLLECTION OF
DATA

• Census: Defined as “The total process of collecting, compiling and publishing demographic, economic and social data pertaining, at a specific time or times, to all persons in a country or delimited territory.”
• Registration of vital events: Civil Registration System.
Ex: Birth registration: Philippine Statistics Authority (PSA): The PSA is the central authority for compiling and managing vital statistics and provides a National Statistics Office (NSO) certification for birth certificates.
CONTINUED..

• Sample Registration System (SRS): Dual record system, consisting of continuous enumeration of births and deaths by an enumerator and an independent survey every 6 months by an investigator-supervisor.
CONTINUED..

• Notification of diseases: Valuable source of morbidity data such as incidence, prevalence and distribution of certain specified diseases which are notifiable. Internationally notifiable diseases: Cholera, Plague and Yellow fever.
• Hospital Records: Primary and basic source of information about diseases prevalent in the community.
CONTINUED..

• Epidemiological Surveillance: Special surveillance activities are conducted for diseases like Malaria, Leprosy, TB, Filariasis, AIDS, COVID-19, etc.
• Surveys: Population surveys supplement routinely collected statistics.
• Research Findings: Findings of various research or investigations are helpful for planning and implementation of health activities in general.
PRESENTATION OF DATA

Principles of presentation of data:


Data should be arranged in such a way that it will
stimulate interest in reader.
The data should be made sufficiently concise without
losing important details.
The data should be presented in simple form to enable
the reader to form quick impressions and to draw some
conclusion, directly or indirectly.
Should facilitate further statistical analysis .
It should define the problem and suggest its solution.
METHODS OF
PRESENTATION OF DATA
The first step in statistical analysis is to present
data in a way that is easy to understand.
The two basic ways for data presentation are

• Tabulation
• Charts and diagrams
RULES AND GUIDELINES FOR TABULAR
PRESENTATION

1. Table must be numbered


2. Brief and self-explanatory title must be given to each table.
3. The heading of columns and rows must be clear, sufficient,
concise and fully defined.
4. The data must be presented according to size or importance,
chronologically, alphabetically or geographically.
5. If data includes rate or proportion, mention the denominator.
6. Table should not be too large.
7. Figures needing comparison should be placed as close as
possible.
Table of Patient Outcomes by Treatment Group

Table 1: Comparison of Patient Outcomes by Treatment Group for Hypertension

Treatment Group     Number of Patients   Average Systolic Blood Pressure (mmHg)   Percentage Achieving Target BP (%)   Denominator
Group A: Drug X     150                  130                                      80%                                  150 patients
Group B: Drug Y     140                  140                                      60%                                  140 patients
Group C: Placebo    160                  150                                      40%                                  160 patients
CONTINUED..

8. The classes should be fully defined and should not lead to any ambiguity.
9. The classes should be exhaustive, i.e., should include all the given values.
10. The classes should be mutually exclusive and non-overlapping.
11. The classes should be of equal width, i.e., the class interval should be the same.
12. Open-ended classes should be avoided as far as possible.
13. The number of classes should be neither too large nor too small. Can be 10-20 classes.
14. Formula for number of classes (K):
K = 1 + 3.322 log10 N, where N is the total frequency
SEATWORK: FREQUENCY
DISTRIBUTION

Suppose you have data on the systolic blood pressure (BP) of 200
patients, and you want to create a frequency distribution table.
Data Range: 90 mmHg to 180 mmHg
1. Calculate the Number of Classes =
Using the formula 𝐾=1+3.322log10 N
Total frequency 𝑁=200
Round to the nearest whole number =
K=
2. Determine Class Width:
Range of data = 180 mmHg - 90 mmHg =
Number of classes K=
Class width = 90/9 =
FREQUENCY DISTRIBUTION

3. Define Classes:
• Class 1: 90-99 mmHg
• Class 2: 100-109 mmHg
• Class 3: 110-119 mmHg
• Class 4: 120-129 mmHg
• Class 5: 130-139 mmHg
• Class 6: 140-149 mmHg
• Class 7: 150-159 mmHg
• Class 8: 160-169 mmHg
• Class 9: 170-179 mmHg
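The seatwork steps (number of classes from K = 1 + 3.322 log10 N, class width from the range, then the class limits) can be sketched in Python. The function names `number_of_classes` and `class_limits` are hypothetical, and rounding the width up to a whole number is an assumption made here so that the limits stay integers.

```python
import math

def number_of_classes(n):
    """Number of classes K = 1 + 3.322 * log10(N), rounded to the nearest whole number."""
    return round(1 + 3.322 * math.log10(n))

def class_limits(lowest, highest, k):
    """Inclusive integer class limits of equal width starting at the lowest value."""
    width = math.ceil((highest - lowest) / k)   # assumed: round the width up
    return [(lowest + i * width, lowest + (i + 1) * width - 1) for i in range(k)]

k = number_of_classes(200)          # 1 + 3.322 * 2.301 is about 8.64, which rounds to 9
limits = class_limits(90, 180, k)   # range 90 divided by 9 classes gives width 10
for low, high in limits:
    print(f"{low}-{high} mmHg")
```

With these limits the last class ends at 179 mmHg, so a reading of exactly 180 mmHg would need the last class to be extended or a tenth class to be added for the classes to stay exhaustive.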
FREQUENCY DISTRIBUTION

Systolic BP (mmHg)   Number of Patients
90-99                12
100-109              20
110-119              25
120-129              30
130-139              35
140-149              28
150-159              18
160-169              18
170-179              14
TABULATION
• Can be simple or complex, depending upon the number of measurements of a single set or multiple sets of items.
• Simple table:
Title: Numbers of cases of various diseases in AMCM

Disease         Cases
Malaria         1100
Acute GE        248
Leptospirosis   60
Dengue          100
Total           1508
FREQUENCY DISTRIBUTION TABLE WITH
QUALITATIVE DATA:

• Title: Cases of malaria in adults and children in the months of June and July 2010 in AMCM Hospital.

Type of malaria   Jun-10 Adult   Jun-10 Child   Jul-10 Adult   Jul-10 Child   Total
P. vivax          54             9              136            23             222
P. falciparum     11             0              80             13             104
Mixed malaria     11             4              36             12             63
Total             76             13             252            48             389


FREQUENCY DISTRIBUTION TABLE WITH
QUANTITATIVE DATA:

• Fasting blood glucose level in diabetics at the time of diagnosis

                        No of diabetics
Fasting glucose level   Male   Female   Total
120-129                 8      4        12
130-139                 4      4        8
140-149                 6      4        10
150-159                 5      5        10
160-169                 9      6        15
170-179                 9      9        18
180-189                 3      2        5
Total                   44     34       78
CHART AND DIAGRAM

Graphic presentations used to illustrate and clarify information.

• Tables are essential in the presentation of scientific data, and diagrams are complementary: they summarize these tables in an easy, attractive and simple way.
The diagram should:
• Be simple
• Be easy to understand
• Save a lot of words
• Be self-explanatory
• Have a clear title indicating its content
• Be fully labeled
• Usually use the y axis (vertical) for frequency
VARIOUS CHARTS AND
DIAGRAMS
• Bar Diagram
• Histogram
• Frequency polygon
• Cumulative frequency curve
• Scatter diagram
• Line diagram
• Pie diagram
BAR DIAGRAM

• Widely used, easy to prepare tool for comparing


categories of mutually exclusive discrete data.
• 3 types of bar diagram:
Simple
Multiple or compound
Component or proportional
SIMPLE BAR DIAGRAM:

Malaria cases in AMCM Hospital in July 2010


[Figure: simple bar diagram; x-axis: P. vivax, P. falciparum, mixed malaria; y-axis: total number of cases (0 to 120); legend: Male]
Multiple bar Diagram/Chart

• Each observation has more than one value, represented by a group of bars.
Examples:
• Percentage of males and females in different countries
• Percentage of deaths from heart diseases in old and young age
• Mode of delivery (cesarean or vaginal) in different female age groups
MULTIPLE OR COMPOUND
DIAGRAM
Distribution of malaria cases in AMCM Hospital in
July 2010
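[Figure: multiple bar diagram; x-axis: P. vivax, P. falciparum, mixed malaria; y-axis: number of cases (0 to 120); legend: Male, Female; bar labels: P. vivax 102 and 57, P. falciparum 62 and 31, mixed malaria 29 and 19]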
Component bar chart

• subdivision of a single bar to indicate the

composition of the total divided into sections

according to their relative proportion.


COMPONENT OR PROPORTIONAL BAR
DIAGRAM
Proportion of energy intake obtained from various food
stuff by poor and rich community
[Figure: component bar diagram; x-axis: Poor Community, Rich Community; y-axis: % of energy obtained (0% to 100%); legend: Fats, Protein, Carbohydrate]
HISTOGRAM:

• It is very similar to the bar chart, with the difference that the rectangles or bars are adjacent (without gaps).
• It is used for presenting a class frequency table (continuous data).
• Each bar represents a class; its height represents the frequency (number of cases) and its width represents the class interval.
HISTOGRAM

Distribution of studied group according to their height

[Figure: histogram; x-axis: height in cm (classes 100-, 110-, 120-, 130-, 140-, 150-); y-axis: number of individuals (0 to 30)]
FREQUENCY POLYGON

• Derived from a histogram by connecting the midpoints of the tops of the rectangles in the histogram.
• The line connecting the centers of the histogram rectangles is called the frequency polygon.
• We can draw the polygon without the rectangles, so we get a simpler form of line graph.
• A special type of frequency polygon is the Normal Distribution Curve.
Frequency polygon
Fasting blood glucose level in diabetics at the time of
diagnosis

[Figure: frequency polygon; x-axis: fasting blood glucose classes 120-129 to 180-189; y-axis: No of diabetics (0 to 20)]
SCATTER/ DOT DIAGRAM

• Also called a correlation diagram, it is useful to represent the relationship between two numeric measurements, each observation being represented by a point corresponding to its value on each axis.
• In negative correlation, the points will be scattered in a downward direction, meaning that the relation between the two studied measurements is inverse, i.e., if one measure increases the other decreases.
• In positive correlation, the points will be scattered in an upward direction.
Dengue cases During monsoon in
AMCM Hospital: Year 2010

[Figure: dot diagram of dengue cases by month; y-axis: cases (0 to 500); plotted values: May, 30; June, 89; July, 304; August, 450]
LINE DIAGRAM:

It is a diagram showing the relationship between two numeric variables (as in the scatter diagram), but the points are joined together to form a line (either a broken line or a smooth curve). It is used to show the trend of events with the passage of time.
Changes in body temperature of a patient after use of antibiotic

[Figure: line diagram; x-axis: time in hours (1 to 7); y-axis: temperature (36 to 39.5)]
PIE DIAGRAM:

• Consists of a circle whose area represents the total frequency (100%), which is divided into segments.
• Each segment represents a proportional composition of the total frequency.
PIE DIAGRAM:
Distribution of malaria cases in AMCM Hospital in July 2010

[Figure: pie diagram; P. vivax 53%, P. falciparum 32%, mixed malaria 15%]
CATEGORICAL VARIABLES

• Displays of Categorical Data
– Frequencies
– Bar Graph
– Pie Chart
CATEGORICAL VARIABLES

Variable (Sex)   Frequency   Proportion
Male             609         0.61
Female           391         0.39
Total            1000        1.00

[Figure: bar graph and pie chart of the frequencies for Male and Female; bar graph y-axis: 0 to 700]
BAR GRAPH
NUMERICAL VARIABLES

Numerical summaries:
• Central Tendency
• Spread
MEASURES OF CENTRAL
TENDENCY

• The 3 M's
– Mean
– Median
– Mode
MEASURES OF CENTRAL
TENDENCY
Sample Mean
The sample mean, $\bar{x}$, is the sum of all values in the sample divided by the total number of observations, n, in the sample.

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
EXAMPLE: SAMPLE MEAN

Mean systolic blood pressure

• Scenario 1:

Subject   BP
1         120 (x1)
2         135 (x2)
3         115 (x3)
4         110 (x4)
5         105 (x5)
6         140 (x6)

• Mean = (120 + 135 + 115 + 110 + 105 + 140)/6 = 725/6 ≈ 121
SAMPLE MEAN

• The mean is affected by extreme observations and is not a resistant measure.

Scenario 2:

Subject   BP
1         120 (x1)
2         135 (x2)
3         115 (x3)
4         110 (x4)
5         105 (x5)
6         140 (x6)
7         280 (x7)

Mean = (120 + 135 + 115 + 110 + 105 + 140 + 280)/7 = 1005/7 ≈ 144
MEDIAN

• The sample median, M, is the number such that “half” the values in the sample are smaller and the other “half” are larger.
• Use the following steps to find M.
– Sort the data (arrange in increasing order).
– Is the size of the data set n even or odd?
– If odd: M = value in the exact middle.
– If even: M = the average of the two
middle numbers.
EXAMPLE: SAMPLE MEDIAN

• Median systolic BP:

Scenario 1:
120 : 135 : 115 : 110 : 105 : 140
Sorted: 105 : 110 : 115 : 120 : 135 : 140
Median = (115 + 120)/2 = 117.5

Scenario 2:
120 : 135 : 115 : 110 : 105 : 140 : 280
Sorted: 105 : 110 : 115 : 120 : 135 : 140 : 280
Median = 120

• The median is not affected by extreme observations and is a resistant measure.
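As a quick check on the two scenarios, the sketch below computes the mean and median with Python's standard `statistics` module. The data are the blood pressure readings from the slides; note that the median is taken from the sorted values.

```python
from statistics import mean, median

scenario_1 = [120, 135, 115, 110, 105, 140]
scenario_2 = scenario_1 + [280]   # same readings plus one extreme observation

for label, bp in (("Scenario 1", scenario_1), ("Scenario 2", scenario_2)):
    print(label, "sorted:", sorted(bp))
    print("  mean   =", round(mean(bp), 1))
    print("  median =", median(bp))   # median() sorts the data internally
```

Adding the single reading of 280 moves the mean from about 121 to about 144, while the median only moves from 117.5 to 120, which is what "resistant" means in practice.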
MODE
• The sample mode is the value that occurs
most frequently in the sample (a data set
can have more than one mode).

• This is the only measure of center which


can also be used for categorical data.

• The population mode is the highest point


on the population distribution.
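A short sketch of finding the mode with `statistics.multimode`, which returns every most frequent value. Both data sets are hypothetical and were made up to show a categorical variable and a sample with two modes.

```python
from statistics import multimode

# Hypothetical categorical data: the mode works where the mean and median do not.
blood_groups = ["O", "A", "O", "B", "A", "AB", "O"]
print(multimode(blood_groups))   # ['O']

# Hypothetical numerical data with two modes (bimodal sample).
systolic_bp = [120, 135, 120, 110, 135]
print(multimode(systolic_bp))    # [120, 135]
```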
SYMMETRIC DATA DISTRIBUTION
[Figure: histogram of a symmetric distribution; x-axis: Value (0 to 50); y-axis: Frequency]
RIGHTWARD SKEWNESS OF DATA
[Figure: histogram of a right-skewed distribution with the mode, median and mean marked from left to right; x-axis: Value (0 to 50); y-axis: Frequency]
LEFTWARD SKEWNESS OF DATA
[Figure: histogram of a left-skewed distribution with the mean, median and mode marked from left to right; x-axis: Value (0 to 50); y-axis: Frequency]
NUMERICAL MEASURES OF
SPREAD

• Range
• Sample Variance
• Inter Quartile Range (IQR)
NUMERICAL MEASURES OF
SPREAD
Range: The range of the data set is the
difference between the highest value and
the lowest value.

Range = highest value - lowest value


– Easy to compute BUT ignores a great
deal of information.
– Obviously the range is affected by
extreme observations and is not a
resistant measure.
NUMERICAL MEASURES OF
SPREAD
• Variance: equal to the sum of squared deviations
from the sample mean divided by n - 1, where n is the
number of observations in the sample.
VARIANCE

Consider this data set: 2, 4, 4, 6, 8

• Calculate the sample mean (x̄): sum all the data points and divide by the number of observations (n).
• Subtract the mean from each data point to get the deviation of each data point from the mean.
• Square each deviation to eliminate negative values.
• Sum all the squared deviations.
• Divide the sum of squared deviations by n − 1, where n is the number of observations.
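The five steps can be followed line by line for the data set 2, 4, 4, 6, 8. This is a sketch for checking the arithmetic; the variable names are chosen for this example, and the result is compared with `statistics.variance` from the standard library.

```python
from statistics import variance

data = [2, 4, 4, 6, 8]
n = len(data)

x_bar = sum(data) / n                       # step 1: sample mean = 24 / 5 = 4.8
deviations = [x - x_bar for x in data]      # step 2: deviations from the mean
squared = [d ** 2 for d in deviations]      # step 3: squared deviations
sum_of_squares = sum(squared)               # step 4: about 20.8
sample_variance = sum_of_squares / (n - 1)  # step 5: about 20.8 / 4 = 5.2

print(round(sample_variance, 2))            # 5.2
print(variance(data))                       # library result for comparison
```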
NUMERICAL MEASURES OF
SPREAD
• Percentile: The P-th percentile of a distribution
is the value at or below which P percent of the
observations fall.
PERCENTILE

Consider this data set : 15, 20, 35, 40, 50

1. Sort the data set in ascending order.


2. Determine the rank (index) for the percentile using
the formula:

Where P is the desired percentile, and n is the


number of observations in the data set

3. If the rank is an integer, the value at that position is


the percentile.
4. If the rank is not an integer, interpolate between
the closest ranks.
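The rank formula itself is not shown on the slide, so the sketch below assumes one common convention, rank = (P/100) × (n + 1). Other textbooks use different rank rules and can give slightly different percentiles. The function name `percentile` is hypothetical.

```python
def percentile(data, p):
    """P-th percentile using the ASSUMED rank rule rank = p/100 * (n + 1)."""
    values = sorted(data)                 # step 1: sort in ascending order
    n = len(values)
    rank = p / 100 * (n + 1)              # step 2: rank (assumed formula)
    rank = min(max(rank, 1), n)           # keep the rank inside 1..n
    lower = int(rank)                     # whole part of the rank
    fraction = rank - lower
    if fraction == 0:                     # step 3: integer rank, take that value
        return values[lower - 1]
    # step 4: interpolate between the two closest ranks
    return values[lower - 1] + fraction * (values[lower] - values[lower - 1])

data = [15, 20, 35, 40, 50]
print(percentile(data, 50))   # rank 3.0 -> 35
print(percentile(data, 25))   # rank 1.5 -> 17.5
```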
NUMERICAL MEASURES OF
SPREAD
• The most commonly used percentiles are the quartiles.

1st quartile Q1 = 25th percentile.
2nd quartile Q2 = 50th percentile.
3rd quartile Q3 = 75th percentile.
NUMERICAL MEASURES OF
SPREAD
Inter Quartile Range (IQR)
A simple measure of spread, giving the range
covered by the middle half of the data, is the
IQR, defined below.

IQR = Q3 - Q1

The IQR is a resistant measure of spread.


INTER QUARTILE RANGE

Consider the data: {4,8,15,16,23,42}

•Sort the data set in ascending order


•Find the first quartile (Q1): This is the
25th percentile of the data set
•Find the third quartile (Q3): This is the
75th percentile of the data set
•Calculate the IQR by subtracting Q1
from Q3
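A sketch of these steps using `statistics.quantiles` from the Python standard library. Its default "exclusive" method is only one of several quartile conventions, so treating it as the intended one is an assumption. Finding Q1 and Q3 as the medians of the lower and upper halves would give Q1 = 8, Q3 = 23 and IQR = 15 for the same data.

```python
from statistics import quantiles

data = [4, 8, 15, 16, 23, 42]

# quantiles() sorts the data and returns the three cut points Q1, Q2, Q3 for n=4.
q1, q2, q3 = quantiles(data, n=4)   # default method="exclusive"
iqr = q3 - q1

print(q1, q2, q3)   # 7.0 15.5 27.75
print(iqr)          # 20.75
```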
NUMERICAL MEASURES OF
SPREAD
Outliers: extreme observations that fall
well outside the overall pattern of the
distribution.

• An outlier may be the result of a


– Recording error,
– An observation from a different population,
– An unusual extreme observation
(biological diversity)
OUTLIERS

Consider the data set: {4,8,15,16,23,42}


1.Calculate Q1 and Q3 (as shown in the previous
example).
2. Calculate the IQR:
IQR=𝑄3−𝑄1
3. Determine the lower and upper bounds for outliers:

Lower Bound=𝑄1−1.5×IQR
Upper Bound=𝑄3+1.5×IQR

4.Identify any data points that fall below the lower bound
or above the upper bound. These points are considered
outliers.
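The 1.5 × IQR rule can be sketched as a small function. The quartiles again come from `statistics.quantiles` with its default method, which is an assumed convention; `find_outliers` is a hypothetical helper name. The second call reuses the seven blood pressure readings from the mean and median example to show a case where an outlier is flagged.

```python
from statistics import quantiles

def find_outliers(data):
    """Return (lower bound, upper bound, outliers) using the 1.5 * IQR rule."""
    q1, _, q3 = quantiles(data, n=4)      # step 1: Q1 and Q3 (assumed convention)
    iqr = q3 - q1                         # step 2: IQR = Q3 - Q1
    lower = q1 - 1.5 * iqr                # step 3: lower bound
    upper = q3 + 1.5 * iqr                #         upper bound
    outliers = [x for x in data if x < lower or x > upper]   # step 4
    return lower, upper, outliers

print(find_outliers([4, 8, 15, 16, 23, 42]))
# (-24.125, 58.875, [])

print(find_outliers([120, 135, 115, 110, 105, 140, 280]))
# (65.0, 185.0, [280])
```

For the first data set no value falls outside the bounds, so no point is flagged as an outlier; the value 42 is large but still inside the upper bound.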
NUMERICAL MEASURES OF
SPREAD
ASSOCIATION BETWEEN
VARIABLES
• Explanatory (exposure) variable
“X”

• Response (outcome) variable


“Y”
MEASUREMENT OF
CORRELATION
CORRELATION IS NOT
ASSOCIATION
REGRESSION
