DATA PRESENTATION
Methods of Presenting Data
Textual
Tabular
Graphical
Qualitative Data
Nominal Scale: Frequency Table
Sex        frequency    %
Male       12           40
Female     18           60
Total      30           100
Ordinal Scale: Frequency Table
Level of pain    frequency    %
Mild             8            26.7
Moderate         12           40
Severe           10           33.3
Total            30           100
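As a rough illustration, the frequency and percentage columns of the level-of-pain table can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the function name `frequency_table` and the raw list `pain` (rebuilt from the counts in the table) are assumptions made for the example.

```python
from collections import Counter

def frequency_table(values, order=None):
    """Return rows of (category, frequency, percent) followed by a total row."""
    counts = Counter(values)
    categories = order if order is not None else sorted(counts)
    n = len(values)
    rows = [(c, counts[c], round(100 * counts[c] / n, 1)) for c in categories]
    rows.append(("Total", n, 100.0))
    return rows

# Hypothetical raw data rebuilt from the counts in the table
pain = ["Mild"] * 8 + ["Moderate"] * 12 + ["Severe"] * 10
pain_table = frequency_table(pain, order=["Mild", "Moderate", "Severe"])

for category, f, pct in pain_table:
    print(f"{category:<10}{f:>4}{pct:>7.1f}")
```

Passing `order` keeps the ordinal categories in their natural sequence; for a nominal variable such as sex, any order is acceptable.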
[Figures: bar graph and pie graph of Sex (Male, Female); one-color bar graph of Level of pain (Mild, Moderate, Severe)]
Cross Tabulation and Contingency Table
• The study of patterns that may exist between two or more categorical variables is common in almost all fields.
• A contingency table presents the results for two categorical variables.
Contingency Table
                  Gender
Opinion       Male            Female          Total
              f      %        f      %        f      %
Agree         130    56.52    50     18.52    180    36
Undecided     60     26.08    70     25.93    130    26
Disagree      40     17.4     150    55.55    190    38
Total         230    100      270    100      500    100
Contingency Table
                  Quality of Work Life
Civil Status  High            Moderate        Total
              f      %        f      %        f      %
Single        48     94.1     3      5.9      51     100
Married       70     92.1     6      7.9      76     100
Total         118    92.9     9      7.1      127    100
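The civil status by quality of work life table can be rebuilt from raw records as a sketch of how cross tabulation works. The helper names `crosstab` and `row_percentages` and the list `records` (reconstructed from the cell counts in the table) are assumptions for illustration only.

```python
from collections import Counter

def crosstab(pairs):
    """Count (row, column) pairs into a nested dict: table[row][column] = f."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

def row_percentages(table):
    """Express each cell as a percentage of its row total."""
    out = {}
    for r, cells in table.items():
        total = sum(cells.values())
        out[r] = {c: round(100 * f / total, 1) for c, f in cells.items()}
    return out

# Hypothetical raw records rebuilt from the cell counts in the table
records = ([("Single", "High")] * 48 + [("Single", "Moderate")] * 3
           + [("Married", "High")] * 70 + [("Married", "Moderate")] * 6)

table = crosstab(records)
pct = row_percentages(table)
print(table)
print(pct)
```

Here the percentages are taken within each row (each civil status sums to 100), which matches the second table; the opinion by gender table instead takes percentages within each column.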
Quantitative Data: Graphical Presentation
Histogram and Frequency Polygon
• Histogram
• Frequency Polygon
[Figures: a histogram and a frequency polygon]
Summary Measures for Quantitative Data
• Location: Minimum, Maximum, Central Tendency (Mean, Median, Mode)
• Variation: Range, Standard Deviation, Variance
• Skewness
• Kurtosis
Measures of Location
A measure of location summarizes a data set
by giving a “typical value” within the range of
the data values that describes its location
relative to the entire data set.
Some Common Measures:
Minimum, Maximum
Central Tendency
Maximum and Minimum
• Minimum is the smallest value in
the data set, denoted as MIN.
• Maximum is the largest value in the
data set, denoted as MAX.
Measure of Central Tendency
• A single value that is used to identify the
“center” of the data
• it is thought of as a typical value of the
distribution
• precise yet simple
• most representative value of the data
Mean
• Most common measure of the center
• Also known as arithmetic average
Population Mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$
Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
Properties of the Mean
• may not be an actual observation
in the data set
• can be applied to data measured on at
least an interval level
• easy to compute
• every observation contributes to
the value of the mean
Properties of the Mean
• subgroup means can be combined to come
up with a group mean
• easily affected by extreme values
Example: Mean = $\frac{1 + 3 + 5 + 7 + 9}{5} = 5$
With an extreme value: Mean = $\frac{1 + 3 + 5 + 7 + 14}{5} = 6$
[Figures: two dot plots of the data, with the mean marked at 5 and at 6]
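The two properties above (sensitivity to extreme values, and combining subgroup means) can be checked with a short Python sketch. The function names `sample_mean` and `combined_mean` and the split into subgroups are assumptions introduced for this example.

```python
def sample_mean(xs):
    """Arithmetic mean: the sum of the observations divided by their count."""
    return sum(xs) / len(xs)

def combined_mean(groups):
    """Combine subgroup means into one group mean, weighting each by its size."""
    total_n = sum(len(g) for g in groups)
    return sum(len(g) * sample_mean(g) for g in groups) / total_n

original = [1, 3, 5, 7, 9]
with_extreme = [1, 3, 5, 7, 14]   # 9 replaced by the extreme value 14

mean_original = sample_mean(original)
mean_extreme = sample_mean(with_extreme)

# Hypothetical split of the first data set into two subgroups
mean_from_subgroups = combined_mean([[1, 3, 5], [7, 9]])

print(mean_original, mean_extreme, mean_from_subgroups)
```

Changing a single observation from 9 to 14 moves the mean from 5 to 6, while the size-weighted combination of the subgroup means (3 and 8) recovers the overall mean of 5.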
Median
• Divides the observations into two
equal parts
• If the number of observations is
odd, the median is the middle
number.
• If the number of observations is
even, the median is the average of
the 2 middle numbers.
Properties of a Median
• may not be an actual observation in the
data set
• can be applied to data measured on at least an ordinal level
• a positional measure; not affected by
extreme values
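Locating the median (n = 5): $\frac{n + 1}{2} = \frac{5 + 1}{2} = 3$, or the 3rd term. Median = 5
Locating the median (n = 6): $\frac{n + 1}{2} = \frac{6 + 1}{2} = 3.5$, or the 3.5th term (the average of the 3rd and 4th terms). Median = 5
[Figures: two dot plots of the data, each with the median marked at 5]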
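The positional rule can be sketched in Python as follows. The function names are assumptions for illustration, and the six-value data set is hypothetical, chosen only to show the even case.

```python
def median_position(n):
    """Position of the median in the sorted data, counting from 1."""
    return (n + 1) / 2

def median(xs):
    """Middle value for odd n; average of the two middle values for even n."""
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [1, 3, 5, 7, 9]
odd_with_extreme = [1, 3, 5, 7, 14]
even_data = [1, 3, 4, 6, 7, 14]   # hypothetical six-value data set

print(median_position(5), median(odd_data), median(odd_with_extreme))
print(median_position(6), median(even_data))
```

Replacing 9 with 14 leaves the median at 5, which illustrates why the median is preferred when extreme observations are present.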
Mode
• occurs most frequently
• nominal average
• may or may not exist
[Figures: two dot plots, one with no mode and one with mode = 9]
Properties of a Mode
• can be used for qualitative as well
as quantitative data
• may not be unique
• not affected by extreme values
• can be computed for ungrouped
and grouped data
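A minimal Python sketch of finding the mode is given here. It assumes the convention that a data set has no mode when every distinct value occurs equally often; the function name `modes` and the sample lists are hypothetical.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values; [] if every value is equally frequent."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    # Assumed convention: several distinct values, all tied, means "no mode"
    if len(counts) > 1 and all(f == top for f in counts.values()):
        return []
    return [v for v, f in counts.items() if f == top]

no_mode = modes([0, 1, 2, 3, 4, 5, 6])                 # every value occurs once
one_mode = modes([2, 9, 9, 11, 13])                    # hypothetical data
two_modes = modes([1, 1, 2, 2, 3])                     # the mode need not be unique
nominal_mode = modes(["Male"] * 12 + ["Female"] * 18)  # qualitative data

print(no_mode, one_mode, two_modes, nominal_mode)
```

Because the function only counts occurrences, it works for qualitative as well as quantitative data.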
Mean, Median & Mode
Use the mean when:
• sampling stability is desired
• other measures are to be computed
Use the median when:
• the exact midpoint of the distribution is desired
• there are extreme observations
Use the mode when:
• the "typical" value is desired
• the data set is measured on a nominal scale
Measures of Variation
• A measure of variation is a single value that is used to
describe the spread of the distribution
• A measure of central tendency alone does not uniquely
describe a distribution
Measures of Dispersion:
Range
Standard Deviation
Variance
Range (R)
The difference between the maximum and
minimum value in a data set, i.e.
R = MAX – MIN
Example: Pulse rates of 15 male residents of a
certain village
54 58 58 60 62 65 66 71
74 75 77 78 80 82 85
R = 85 - 54 = 31
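The pulse-rate example can be checked with a short Python sketch; the function name `data_range` is an assumption for the example.

```python
def data_range(xs):
    """Range R = MAX - MIN."""
    return max(xs) - min(xs)

# Pulse rates of the 15 male residents from the example
pulse = [54, 58, 58, 60, 62, 65, 66, 71, 74, 75, 77, 78, 80, 82, 85]

r = data_range(pulse)
print(min(pulse), max(pulse), r)
```

Only the two extreme observations enter the computation, which is why the range is quick to obtain but only a rough measure of dispersion.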
Some Properties of the Range
The larger the value of the
range, the more dispersed the
observations are.
It is quick and easy to
understand.
A rough measure of
dispersion.
Standard Deviation (SD)
• most important measure of variation
• square root of Variance
• has the same units as the original data
Population SD: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$
Sample SD: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
Variance
• important measure of variation
• shows variation about the mean
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Standard Deviation and Variance
Standard Deviation (SD) = $\sqrt{\text{Variance}}$
Variance = $(\text{SD})^2$
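Example 1
n    X     (X − x̄)    (X − x̄)²
1    75    -1          1
2    75    -1          1
3    76    0           0
4    77    1           1
5    77    1           1
n = 5, $\bar{x} = 76$, $\sum (x_i - \bar{x})^2 = 4$
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{4}{4}} = \sqrt{1} = 1$
Variance = 1
Example 2
n    X     (X − x̄)    (X − x̄)²
1    74    -3          9
2    74    -3          9
3    77    0           0
4    80    3           9
5    80    3           9
n = 5, $\bar{x} = 77$, $\sum (x_i - \bar{x})^2 = 36$
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{36}{4}} = \sqrt{9} = 3$
Variance = 9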
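The two worked examples can be verified with a short Python sketch of the sample formulas (divisor n − 1). The function names are assumptions made for this example.

```python
from math import sqrt

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)

def sample_sd(xs):
    """Sample standard deviation: the square root of the sample variance."""
    return sqrt(sample_variance(xs))

first = [75, 75, 76, 77, 77]    # mean 76, squared deviations sum to 4
second = [74, 74, 77, 80, 80]   # mean 77, squared deviations sum to 36

var_first, sd_first = sample_variance(first), sample_sd(first)
var_second, sd_second = sample_variance(second), sample_sd(second)

print(var_first, sd_first)
print(var_second, sd_second)
```

For the population versions, the divisor `n - 1` would be replaced by `n`, as in the population formulas.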
Standard Deviation and Variance
Standard Deviation    Variance
1                     1
0                     0
5                     25
10                    100

Standard Deviation    Variance
6                     ?
?                     49
8                     ?
Remarks on Standard Deviation
◼ If there is a large amount of variation,
then on average, the data values will be
far from the mean. Hence, the SD will be
large.
◼ If there is only a small amount of
variation, then on average, the data
values will be close to the mean. Hence,
the SD will be small.
Comparing Standard Deviation
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.9258
Data C: Mean = 15.5, s = 4.57
[Figures: dot plots of Data A, B, and C on a common axis from 11 to 21]
Comparing Standard Deviation
Example: Team A - Heights of five marathon players in inches
65" 65" 65" 65" 65"
Mean = 65"
SD = 0"
Example: Team B - Heights of five marathon players in inches
60" 60" 65" 70" 70"
Mean = 65"
SD = 5.0"
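The team comparison can be reproduced with Python's standard `statistics` module. This is only a sketch: `stdev` computes the sample standard deviation (divisor n − 1), which is assumed here to be the version the example uses.

```python
from statistics import mean, stdev

team_a = [65, 65, 65, 65, 65]   # heights in inches
team_b = [60, 60, 65, 70, 70]

mean_a, sd_a = mean(team_a), stdev(team_a)
mean_b, sd_b = mean(team_b), stdev(team_b)

print(mean_a, sd_a)
print(mean_b, sd_b)
```

Both teams have the same mean height, so the mean alone cannot tell them apart; only the standard deviation shows that Team B's heights are spread out while Team A's are identical.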
Properties of Standard Deviation
• It is the most widely used measure of dispersion.
• It is based on all the items and is rigidly defined.
• It is used to test the reliability of measures
calculated from samples.
• The standard deviation is sensitive to the presence
of extreme values.
• It is not easy to calculate by hand (unlike the range).
Measure of Skewness
▪ Describes the degree of departure of the
distribution of the data from symmetry.
▪ The degree of skewness is measured by the
coefficient of skewness, denoted as SK and
computed as
$SK = \frac{3(\text{Mean} - \text{Median})}{SD}$
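A minimal Python sketch of the coefficient of skewness follows. The function name `skewness_coefficient` is an assumption, as is the use of the sample standard deviation (`statistics.stdev`) for SD; the third data set is hypothetical, chosen as a mirror image of the second.

```python
from statistics import mean, median, stdev

def skewness_coefficient(xs):
    """SK = 3 * (mean - median) / SD, using the sample SD (an assumption)."""
    return 3 * (mean(xs) - median(xs)) / stdev(xs)

symmetric = [1, 3, 5, 7, 9]
right_skewed = [1, 3, 5, 7, 14]   # one extreme high value pulls the mean up
left_skewed = [-4, 3, 5, 7, 9]    # hypothetical mirror image of the above

sk_symmetric = skewness_coefficient(symmetric)
sk_right = skewness_coefficient(right_skewed)
sk_left = skewness_coefficient(left_skewed)

print(sk_symmetric, sk_right, sk_left)
```

The sign of SK follows the sign of (mean − median): an extreme high value gives a positive coefficient, and an extreme low value gives a negative one.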
What is Symmetry?
A distribution is said to be
symmetric about the mean if
the distribution to the left of
the mean is the “mirror image” of
the distribution to the right of
the mean. A symmetric
distribution has SK = 0, since
its mean is equal to its median
(and, when the distribution has
a single peak, to its mode).
Measure of Skewness
SK > 0
positively skewed
“Skewed to the Right”
(many are low)
SK < 0
negatively skewed
“Skewed to the left”
(many are high)
Measure of Kurtosis
• Describes the extent of peakedness or
flatness of the distribution of the data.
• Measured by coefficient of kurtosis (K)
computed as,
$K = \frac{\sum_{i=1}^{N} (X_i - \mu)^4}{N \sigma^4} - 3$
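The coefficient of kurtosis can be sketched in Python using the population mean and population variance, as in the formula. The function name `kurtosis_coefficient` and both data sets are assumptions chosen for illustration.

```python
def kurtosis_coefficient(xs):
    """K = sum((x - mu)^4) / (N * sigma^4) - 3, with population mu and sigma."""
    n = len(xs)
    mu = sum(xs) / n
    variance = sum((x - mu) ** 2 for x in xs) / n      # sigma squared
    fourth_moment = sum((x - mu) ** 4 for x in xs) / n
    return fourth_moment / variance ** 2 - 3           # sigma^4 = variance^2

flat = [1, 3, 5, 7, 9]                           # evenly spread values
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]       # hypothetical: most values at the mean

k_flat = kurtosis_coefficient(flat)
k_peaked = kurtosis_coefficient(peaked)

print(k_flat, k_peaked)
```

The evenly spread data give a negative K and the data concentrated at the mean give a positive K.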
Measure of Kurtosis
K = 0: mesokurtic (“Normal”)
K > 0: leptokurtic (many of the scores are close to the mean)
K < 0: platykurtic (many of the scores are away from the mean)
End of Chapter 3