INTRODUCTION
Statistics
Mathematics provides the foundation of statistical
methods. Statistics is the study of techniques
and procedures for collecting, classifying,
summarizing, analyzing and interpreting numerical
data.
Success in research is possible only with
a proper knowledge of statistics. It helps in the
interpretation and proper use of the results of
research.
BIOSTATISTICS
Biostatistics is the application of statistics to the
analysis of biological and medical data. The term is
used when the tools of statistics are applied to
data derived from the biological sciences,
such as medical and paramedical data.
DEFINITION
According to Croxton and Cowden, statistics may be
defined as the “science of collection, presentation,
analysis and interpretation of numerical data”.
According to W.G. Sutcliffe, statistics
comprises the collection, tabulation, presentation
and analysis of an aggregate of facts, collected in a
methodical manner, without bias, and related to
predetermined purposes.
USE OF STATISTICS IN THE
FIELD OF NURSING
To record vital measurements such as weight (Wt), height (Ht)…
To keep complete records
To educate the public
To help in planning, summarizing and
interpreting data
To survey, collect and tabulate
information
To maintain health records
To draw general conclusions
To make predictions
FUNCTIONS OF STATISTICS
Presents facts in a definite form (numerical
figures)
Presents complex figures in a simplified form
Facilitates comparisons
Helps in formulating and testing hypotheses
Helps in forecasting
Helps in formulating suitable policies (decision
making)
Types
Descriptive statistics
Inferential statistics
Descriptive Statistics
Descriptive Statistics include the
techniques that are used to summarize and
describe numerical data. These methods can
either be graphical or involve computational
analysis.
Inferential Statistics
Inferential statistics include those techniques
by which decisions about a statistical population are
made based on an observed sample or,
possibly, on managerial judgment.
Because such decisions are made under
conditions of uncertainty,
the use of probability concepts
is required.
MEASURES OF CENTRAL
TENDENCY
Introduction
A measure of central tendency (MCT) is a single value
that attempts to describe a set of data by identifying
the central position within that set of data. As
such, measures of central tendency are sometimes
called measures of central location.
NEED OF MCT
The average represents all the
measurements made on a group and gives
a concise description of the group as a whole.
When two or more groups are measured, the
measure of central tendency provides the basis for
comparison between them.
WAYS of MCT
MEAN
◦ This is the mathematical average of a set of
numbers
MEDIAN
◦ This is the middle value of a set of data that has
been arranged from lowest to highest
MODE
◦ The value that occurs the most in a set of data
Mean
The mean, or average, is the most popular and
important measure of central tendency (CT). The mean can be defined as
the sum of the values of all the items in a series
divided by the number of items.
Formula:
$\bar{x} = \frac{\sum x}{n}$
Where:
◦ $\bar{x}$ (x-bar) is the mean
◦ x are the data points
◦ n is the sample size
Arithmetic Mean (AM)
AM = $\bar{x}$ = sum of the values / number of values
The mean is the sum of all of the values in the
data set divided by the number of values.
Ex. Find the mean of the systolic BP readings
120, 80, 90, 110 and 95.
Solution:
1. Add all the values together.
2. Divide by the number of values to obtain
the mean.
Example: Mean
n = 5 systolic blood pressures (mmHg)
x1 = 120
x2 = 80
x3 = 90
x4 = 110
x5 = 95
$\bar{x} = \frac{\sum x}{n} = \frac{120 + 80 + 90 + 110 + 95}{5} = \frac{495}{5} = 99$ mmHg
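The two steps of the mean calculation (add the values, divide by their count) can be sketched in Python. This is only an illustrative sketch, not part of the original slides: the variable names are assumptions, and the data are the five systolic readings from the example.

```python
from statistics import fmean

# Systolic blood pressures (mmHg) from the example
systolic_bp = [120, 80, 90, 110, 95]

# Step 1: add all the values; Step 2: divide by the number of values
manual_mean = sum(systolic_bp) / len(systolic_bp)

# The standard library's fmean gives the same result as a float
library_mean = fmean(systolic_bp)

print(manual_mean, library_mean)  # both are 99.0
```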
Mean (Individual Data)
The heights (Ht) of 10 B.Sc. Nursing students, in cm, are:
160, 162, 175, 158, 156, 169, 173, 192, 165 and
167. Find the mean height of the students.
Formula:
$\bar{x} = \frac{\sum x}{n}$
= (160+162+175+158+156+169+173+192+165+167) / 10
= 1677 / 10
Answer: the mean height is 167.7 cm
Mean of grouped data (discrete)
Calculate the average number of children
per family from this data
No. of children    No. of families
0                  30
1                  52
2                  60
3                  65
4                  18
5                  10
6                  5
Mean: Grouped data
No. of children (x)   No. of families (f)   Total (fx)
0                     30                    0 × 30 = 0
1                     52                    1 × 52 = 52
2                     60                    2 × 60 = 120
3                     65                    3 × 65 = 195
4                     18                    4 × 18 = 72
5                     10                    5 × 10 = 50
6                     5                     6 × 5 = 30
Total                 240                   519
Solution
First calculate fx by multiplying the
number of children (x) by the
frequency (f), then divide the total of fx by the total frequency.
$\bar{x} = \frac{\sum fx}{\sum f}$
Answer:
$\bar{x} = \frac{519}{240} = 2.1625 \approx 2.16$ children per family
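The calculation of the total of fx divided by the total frequency can be checked with a short Python sketch. The list names `children` and `families` are illustrative assumptions; the values come from the table of children per family.

```python
# Number of children per family (x) and number of families (f), from the table
children = [0, 1, 2, 3, 4, 5, 6]
families = [30, 52, 60, 65, 18, 10, 5]

# Multiply each x by its frequency f and add up the products
sum_fx = sum(x * f for x, f in zip(children, families))   # 519
sum_f = sum(families)                                      # 240

# Mean of grouped (discrete) data = sum of fx / sum of f
grouped_mean = sum_fx / sum_f

print(sum_fx, sum_f, grouped_mean)
```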
Mean: Grouped data (continuous)
Example: The following data show the marks obtained by
nursing students in research. Calculate the mean marks
obtained by the students.
MARKS      40-44   44-48   48-52   52-56   56-60
STUDENTS   8       10      20      15      7
Formula:
$\bar{x} = \frac{\sum fx}{\sum f}$, where x is the mid point of each class
Mid point = (L + U) / 2, where L and U are the lower and upper limits of the class
MARKS    f    x (mid point)   fx
40-44    8    42              336
44-48    10   46              460
48-52    20   50              1000
52-56    15   54              810
56-60    7    58              406
∑f = 60, ∑fx = 3012
$\bar{x} = \frac{3012}{60} = 50.2$
Ans. The mean marks obtained is 50.2
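For continuous classes the only extra step is replacing each class by its mid point. The following Python sketch illustrates this under the assumption that each class is stored as a (lower, upper) pair; the names `classes`, `students` and `midpoints` are hypothetical.

```python
# Class intervals of marks and the number of students in each, from the table
classes = [(40, 44), (44, 48), (48, 52), (52, 56), (56, 60)]
students = [8, 10, 20, 15, 7]

# Mid point of each class = (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Mean = sum of (f * mid point) / sum of f
sum_fx = sum(f * x for f, x in zip(students, midpoints))
mean_marks = sum_fx / sum(students)

print(midpoints, sum_fx, mean_marks)
```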
What is the Median?
If the data have been sorted (ascending or
descending), the median is the middle value
(for an odd number of points) or the average of
the two middle values (for an even number of
points).
The median is used to characterize data sets with a
few extreme values that distort the relevance of
the mean, such as house values or family
incomes.
Median = $\left(\frac{n+1}{2}\right)$th item in the data array
Sample MEDIAN
The median is the middle number
The sample median is not sensitive to extreme
values.
For example: in the systolic BP data (120, 80, 90, 110, 95),
if 120 became 200, the median would
remain the same (95), but the mean would change from 99 to 115.
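A small Python sketch, offered as an illustration rather than as part of the slides, shows what happens to the systolic BP readings used earlier when 120 is replaced by 200. The variable names are assumptions.

```python
from statistics import fmean, median

# Systolic BP readings from the earlier mean example
original = [120, 80, 90, 110, 95]
# Same readings with the value 120 replaced by an extreme value, 200
with_extreme = [200, 80, 90, 110, 95]

# The median is unchanged by the extreme value ...
print(median(original), median(with_extreme))
# ... while the mean is pulled upward
print(fmean(original), fmean(with_extreme))
```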
Exercise
Following are the marks of 7 students. Find
out the median marks.
Roll No.   1    2    3    4    5    6    7
Marks      40   30   18   55   60   25   45
SOLUTION-
Arrange the marks in ascending order.
Position   1    2    3    4    5    6    7
Marks      18   25   30   40   45   55   60
Median = (7+1)/2 th item
= 8/2 th item
= 4th item
Ans. The median is 40.
Median from Even Series
Find out the median from the following data:
60, 55, 62, 40, 35, 65, 70, 68
SOLUTION-
Arrange the data in ascending order.
35, 40, 55, 60, 62, 65, 68, 70
Median = size of (n+1)/2 th item
= size of (8+1)/2 th item
= size of 4.5th item
Size of 4.5th item = (4th item + 5th item)/2
= (60+62)/2
= 122/2
= 61
Answer: the median mark is 61
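The (n+1)/2 rule used in the odd and even exercises can be sketched as one Python function. This is a minimal sketch assuming a non-empty list of numbers; the function name `median_by_position` is hypothetical.

```python
def median_by_position(values):
    """Median via the ((n + 1) / 2)th item rule applied to sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2              # 1-based position; ends in .5 when n is even
    lower = ordered[int(position) - 1]  # item at the whole-number part of the position
    if position.is_integer():
        return lower                    # odd n: the middle item itself
    upper = ordered[int(position)]      # even n: the next item up
    return (lower + upper) / 2          # average of the two middle items


print(median_by_position([40, 30, 18, 55, 60, 25, 45]))      # 7 marks
print(median_by_position([60, 55, 62, 40, 35, 65, 70, 68]))  # 8 marks
```

For the seven marks the position is the 4th item; for the eight marks it is the 4.5th item, so the 4th and 5th items are averaged.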
Calculation of median for
discrete frequency table
Example – daily wages of employees
Wage (x)               145   170   180   190   200   210
No. of employees (f)   3     16    8     20    6     2
SOLUTION-
Median = $\left(\frac{n+1}{2}\right)$th item in the data array
x       f     cf
145     3     3
170     16    19
180     8     27
190     20    47
200     6     53
210     2     55
Total   55
Median = (55+1)/2 th observation in the data array
= 56/2 th observation
= 28th observation
The 28th observation lies just beyond the cumulative
frequency 27, so it falls in the row whose cumulative frequency
is 47; the x value in that row is 190.
Answer - Hence median = 190
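The cumulative-frequency lookup for the wage data can be sketched in Python as follows. The variable names are assumptions, and the sketch assumes (as in this example) that the total frequency is odd, so the median position is a whole number.

```python
from itertools import accumulate

# Daily wages (x) and number of employees (f), from the table
wages = [145, 170, 180, 190, 200, 210]
employees = [3, 16, 8, 20, 6, 2]

# Running total of the frequencies gives the cumulative frequency (cf)
cumulative = list(accumulate(employees))   # [3, 19, 27, 47, 53, 55]
n = cumulative[-1]
position = (n + 1) / 2                     # 28.0, i.e. the 28th observation

# The median is the first wage whose cf reaches the median position
median_wage = next(x for x, cf in zip(wages, cumulative) if cf >= position)

print(position, median_wage)
```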
Calculating Median for
continuous frequency table
Wages     80-100   100-120   120-140   140-160   160-180
Workers   8        12        16        8         6
Median = $L + \frac{\frac{N}{2} - \text{C.F.}}{F} \times h$
where L is the lower limit of the median class, N is the total frequency,
C.F. is the cumulative frequency of the class preceding the median class,
F is the frequency of the median class and h is the class width.
Class interval   f    cf
80-100           8    8
100-120          12   20
120-140          16   36
140-160          8    44
160-180          6    50
Median class = class which contains the (N/2)th observation
= 50/2 th item
= 25th item
The 25th item lies in the 120-140 class interval (cf = 36).
L = 120, upper limit = 140, h = 20, F = 16, C.F. = 20
Median = 120 + (25-20) × 20 / 16
= 120 + (5 × 20) / 16
= 120 + 100/16
= 120 + 6.25
= 126.25
Inference: the median is 126.25
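The interpolation for the median of a continuous frequency table can be sketched in Python. This is a sketch under stated assumptions, not a definitive implementation: classes are given as (lower, upper) pairs in ascending order, and the function name `grouped_median` is hypothetical.

```python
def grouped_median(classes, freqs):
    """Median = L + ((N/2 - C.F.) / F) * h for a continuous frequency table."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("frequency table is empty")
    half = n / 2
    cum_before = 0                       # C.F. of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cum_before + f >= half:       # this class contains the (N/2)th observation
            width = upper - lower        # h, the class width
            return lower + (half - cum_before) / f * width
        cum_before += f
    raise ValueError("median class not found")


wage_classes = [(80, 100), (100, 120), (120, 140), (140, 160), (160, 180)]
workers = [8, 12, 16, 8, 6]
print(grouped_median(wage_classes, workers))  # 120 + (25 - 20) * 20 / 16
```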
What is the Mode?
If the data are discrete, or have been
grouped into discrete intervals, the
mode is the value that occurs most
often.
In other words, it is the value most likely
to occur.
Ex. The following data give the ages of 10 patients:
46, 47, 49, 47, 52, 47, 53, 49, 46, 47
Solution: the value 47 occurs the
highest number of times (4 times).
Hence mode (Z) = 47
Mode for discrete frequency table
The following data give the wages of employees in a hospital:
WAGES       145   170   180   190   200   210
EMPLOYEES   3     16    8     20    6     2
The highest frequency is 20.
Answer: 190 is the value with the highest frequency
(20); therefore the mode (Z) is 190.
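Both mode examples (the patient ages and the wage table) can be sketched in Python. The variable names are illustrative assumptions, and the sketch assumes a single most frequent value in each data set.

```python
from collections import Counter

# Ages of 10 patients (individual data)
ages = [46, 47, 49, 47, 52, 47, 53, 49, 46, 47]
# most_common(1) returns the value with the highest count and that count
mode_age, times = Counter(ages).most_common(1)[0]

# Wages and number of employees (discrete frequency table)
wages = [145, 170, 180, 190, 200, 210]
employees = [3, 16, 8, 20, 6, 2]
# The mode is the wage sitting next to the highest frequency
mode_wage = wages[employees.index(max(employees))]

print(mode_age, times, mode_wage)
```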
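Calculating the mode for continuous
frequency table
Z = $l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$
where l is the lower limit of the modal class, f1 is the frequency of the modal class,
f0 and f2 are the frequencies of the preceding and succeeding classes, and h is the class width.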
Example – calculating the mode for continuous frequency table
Calculate the mode for the following data:
Hb%       8-9   9-10   10-11   11-12   12-13   13-14   14-15
Mothers   8     14     21      25      15      10      7
Solution: the modal class is 11-12 (highest frequency, 25), so
f0 = 21, f1 = 25, f2 = 15, l = 11, h = 1
Z = 11 + {(25-21) × 1} / (2 × 25 - 21 - 15)
= 11 + 4 / (50 - 36)
= 11 + 4/14
= 11 + 0.29
= 11.29 is the mode (Z)
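The worked Hb% example can be reproduced with a short Python sketch. It assumes a single modal class and treats a missing neighbouring class as having frequency 0; the function name `grouped_mode` and the (lower, upper) class representation are hypothetical.

```python
def grouped_mode(classes, freqs):
    """Mode Z = l + (f1 - f0) / (2*f1 - f0 - f2) * h for a continuous table."""
    i = freqs.index(max(freqs))                      # index of the modal class
    f1 = freqs[i]                                    # frequency of the modal class
    f0 = freqs[i - 1] if i > 0 else 0                # frequency of the preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # frequency of the succeeding class
    lower, upper = classes[i]
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)


hb_classes = [(8, 9), (9, 10), (10, 11), (11, 12), (12, 13), (13, 14), (14, 15)]
mothers = [8, 14, 21, 25, 15, 10, 7]
z = grouped_mode(hb_classes, mothers)
print(round(z, 2))  # 11 + 4/14, rounded to two decimals
```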
Measure of variations
Variation is the spread of a data set, that is, its
departure from central tendency.
OR
The degree to which numerical
data tend to spread about an average
value is called the variation or dispersion
of the data.
Types
The following measures of variations are
commonly used:
Range
Variance
Quartile Deviation
Standard deviation
What Is the Range?
Range: the distance between the lowest
and the highest values in the set.
For example: The time to drive to
Church gate varies from 105 to 135 minutes. Thus
the range is 30 minutes.
Measures of Dispersion or Spread
(ungrouped)
Range = H - L
◦ H = the highest value
◦ L = the lowest value
◦ From our last example, the
range would be: 135 – 105 = 30 minutes
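As a minimal Python sketch, the range is simply the highest value minus the lowest. Only the two extremes of the driving times are given in the example, so the list below contains just those two values; the variable names are assumptions.

```python
# Shortest and longest driving times (minutes) from the example;
# the intermediate times are not given, so only the extremes are listed
drive_times = [105, 135]

highest = max(drive_times)   # H
lowest = min(drive_times)    # L
value_range = highest - lowest

print(value_range)  # 30
```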
What is the Variance?
The Variance of a population is the sum of the
squares of the differences between the mean and
the individual data points divided by the number of
data points.
The Variance of a sample is the sum of the
squared differences divided by the number of data
points less one.
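The two definitions differ only in the divisor (n for a population, n - 1 for a sample). The Python sketch below illustrates this; reusing the five systolic BP readings from the mean example is an assumption made here for illustration, since no worked variance example is given, and the variable names are hypothetical.

```python
from statistics import pvariance, variance

# Systolic BP readings reused from the mean example (an illustrative choice)
readings = [120, 80, 90, 110, 95]
n = len(readings)
mean = sum(readings) / n

# Sum of the squared differences between each data point and the mean
squared_diffs = sum((x - mean) ** 2 for x in readings)

population_variance = squared_diffs / n        # divide by the number of data points
sample_variance = squared_diffs / (n - 1)      # divide by the number of data points less one

# Cross-check against the standard library
print(population_variance, pvariance(readings))
print(sample_variance, variance(readings))
```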
Standard Deviation
This is, roughly, the average distance of your
values from the mean score.
The standard deviation is the square
root of the variance:
$SD = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$
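The standard deviation as the square root of the variance can be sketched in Python. The data are the systolic BP readings from the mean example, reused here as an assumption for illustration; the variable names are hypothetical. The sketch also shows the sample version (divisor N - 1) from the standard library for comparison.

```python
from math import sqrt
from statistics import pstdev, stdev

# Systolic BP readings reused from the mean example (an illustrative choice)
readings = [120, 80, 90, 110, 95]
n = len(readings)
mean = sum(readings) / n

# SD = square root of (sum of squared deviations from the mean / N)
sd_population = sqrt(sum((x - mean) ** 2 for x in readings) / n)

# Sample standard deviation, which divides by N - 1 instead of N
sd_sample = stdev(readings)

print(round(sd_population, 2), round(pstdev(readings), 2))
print(round(sd_sample, 2))
```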
Uses of Std. Deviation
Summarizes the deviations of a large
distribution from the mean into one figure, as a unit
of variation
Helps in the computation of the standard error
Used in sample size calculations
It is widely used in biological studies
It is used in fitting a normal curve to a frequency distribution
It is the most widely used measure of dispersion
Merits of Std. Deviation
It is rigidly defined
It is based on all observations
It does not ignore the algebraic signs of
deviations
It is capable of further mathematical treatment
It is less affected by sampling fluctuations than other measures of dispersion
It is a single number which gives us an
idea of the “spread” of the variable of interest in
the sample
It is the most commonly used measure of variability
The larger the SD, the greater the dispersion of
values about the mean
Demerits of Std. Deviation
Its calculation is harder to understand and not
as easy as that of the range
It cannot be calculated for qualitative data or for
distributions with open-ended classes
It is unduly affected by extreme deviations
It always depends on the arithmetic mean (AM)