Descriptive Stat

This document provides an overview of descriptive statistics, including measures of central tendency (mean, median, mode), measures of spread/dispersion (standard deviation, variance, percentile, quartiles, interquartile range), skewness, and kurtosis. It defines each concept and provides examples of calculating and interpreting various descriptive statistics for sample data sets. The goal of descriptive statistics is to summarize and organize data in a way that is easily understood, without making inferences to the overall population.

Uploaded by

Joshrel Cielo

Statistics is a branch of mathematics that deals with collecting, organizing, analyzing and interpreting data.

When we first get a data set, instead of applying fancy algorithms and making predictions right away, we first try to read and understand the data by applying statistical techniques. Doing so tells us, among other things, what type of distribution the data has.

This blog aims to answer the following questions:

1. What is Descriptive Statistics?

2. Types of Descriptive Statistics?

3. Measure of Central Tendency (Mean, Median, Mode)

4. Measure of Spread / Dispersion (Standard Deviation, Mean Deviation, Variance, Percentile, Quartiles, Interquartile Range)

5. What is Skewness?

6. What is Kurtosis?

7. What is Correlation?

Today, let’s understand descriptive statistics once and for all. Let’s start.

What is Descriptive Statistics?
Descriptive statistics involves summarizing and organizing data so that it can be easily understood. Unlike inferential statistics, descriptive statistics seeks only to describe the data and does not attempt to make inferences from the sample to the whole population. Here, we typically describe the data in a sample. This generally means that descriptive statistics, unlike inferential statistics, is not developed on the basis of probability theory.

Types of Descriptive Statistics?

Descriptive statistics are broken down into two categories: measures of central tendency and measures of variability (spread).

Measure of Central Tendency

Central tendency refers to the idea that there is one number that
best summarizes the entire set of measurements, a number that is
in some way “central” to the set.

Mean / Average
The mean, or average, is a measure of the central tendency of the data, i.e. a number around which the whole data set is spread out. In a way, it is a single number that can stand in for the whole data set. It is calculated by adding up all the values and dividing by the number of values.

Let’s calculate the mean of a data set of 8 integers.

Image 2
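The data set itself appears only in the figure. As a minimal sketch, assume the eight integers below; they are an assumption chosen to agree with the values quoted in this post (median 59, mode 67, range 99 - 12, mean deviation 23.75) and are not necessarily the figure's data.

```python
from statistics import mean

# Assumed data set: eight integers consistent with the results quoted in the text.
data = [12, 24, 41, 51, 67, 67, 85, 99]

# Mean = sum of all values / number of values
manual_mean = sum(data) / len(data)

print(manual_mean)   # 55.75
print(mean(data))    # 55.75, same result from the standard library
```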

Median
The median is the value that divides the data into 2 equal parts, i.e. the number of terms on its right is the same as the number of terms on its left when the data is arranged in either ascending or descending order.

Note: Sorting the data in descending order won’t affect the median. Quartiles and the IQR, which we will talk about later in this blog, are defined on data in ascending order.

The median is the middle term if the number of terms is odd.

The median is the average of the middle 2 terms if the number of terms is even.

Image 3

The median is 59, which divides the set of numbers into two equal parts. Since the set has an even number of values, the median is the average of the two middle numbers, 51 and 67.

Note: When the values are in arithmetic progression (the difference between consecutive terms is constant; here it is 2), the median is always equal to the mean.

Image 4

The mean of these 5 numbers is 6, and so is the median.
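The odd/even rule above can be sketched in a few lines. Both lists are assumptions: the first is an eight-value data set consistent with the median of 59 quoted above, and the second is a hypothetical arithmetic progression with step 2 and mean 6.

```python
from statistics import mean

def median_sorted(values):
    """Middle term for an odd count, average of the two middle terms for an even count."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [12, 24, 41, 51, 67, 67, 85, 99]   # assumed data set (even count)
progression = [2, 4, 6, 8, 10]            # hypothetical arithmetic progression (odd count)

print(median_sorted(data))                               # 59.0, average of 51 and 67
print(median_sorted(progression), mean(progression))     # 6 6
```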

Mode
The mode is the term that appears the maximum number of times in the data set, i.e. the term with the highest frequency.

Image 5

In this data set, the mode is 67 because it appears more often than the rest of the values, i.e. twice.

But there could be a data set with no mode at all, because all values appear the same number of times. If two values appear the same number of times, and more often than the rest of the values, the data set is bimodal. If three values do so, the data set is trimodal, and a data set with several modes is in general called multimodal.
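A small sketch of these cases follows. The helper name `modes` and all three lists are illustrative assumptions; the first list is the assumed eight-value data set whose mode is 67.

```python
from collections import Counter

def modes(values):
    """Return every value with the highest frequency, or [] when there is no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    # If every distinct value occurs equally often, no value stands out as a mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

data = [12, 24, 41, 51, 67, 67, 85, 99]   # assumed data set

print(modes(data))               # [67]
print(modes([1, 2, 3, 4]))       # [] (no mode)
print(modes([1, 1, 2, 2, 3]))    # [1, 2] (bimodal)
```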

Measure of Spread / Dispersion

Measure of Spread refers to the idea of variability within your
data.

Standard deviation
Standard deviation measures how far, on average, each value lies from the mean (strictly, it is the square root of the average squared distance). That is, it describes how the data is spread out around the mean. A low standard deviation indicates that the data points tend to be close to the mean of the data set, while a high standard deviation indicates that the data points are spread out over a wider range of values.

There are situations when we have to choose between the sample and the population standard deviation (SD).
When we are asked to find the SD of some part of a population, a segment of the population, we use the sample standard deviation.

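Image 6: s = √( Σ (xᵢ - x̅)² / (n - 1) )

where x̅ is the mean of the sample and n is the sample size.

But when we have to deal with a whole population, we use the population standard deviation.

Image 7: σ = √( Σ (xᵢ - µ)² / N )

where µ is the mean of the population and N is the population size.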

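Although a sample is part of a population, the two SD formulas are not the same: the sample formula divides by n - 1 rather than by the number of values. The deviations are measured from the sample mean instead of the unknown population mean, which tends to make them too small, and dividing by n - 1 compensates for that (this is known as Bessel’s correction).

As you know, in descriptive statistics we generally deal with the data available in a sample, not in a population. So if we use the previous data set and substitute the values into the sample formula,

Image 8

the answer is 29.62.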
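To make the sample versus population choice concrete, here is a minimal sketch. The data set is an assumption (eight integers consistent with the results quoted in this post); the only difference between the two results is the divisor, n - 1 versus n.

```python
from math import sqrt
from statistics import pstdev, stdev

# Assumed data set, consistent with the sample SD of 29.62 quoted in the text.
data = [12, 24, 41, 51, 67, 67, 85, 99]

n = len(data)
x_bar = sum(data) / n
squared_deviations = sum((x - x_bar) ** 2 for x in data)

sample_sd = sqrt(squared_deviations / (n - 1))   # divide by n - 1
population_sd = sqrt(squared_deviations / n)     # divide by n

print(round(sample_sd, 2))       # 29.62
print(round(population_sd, 2))   # 27.71
```

The standard library functions `statistics.stdev` and `statistics.pstdev` compute the same two quantities.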

Mean Deviation / Mean Absolute Deviation

It is the average of the absolute differences between each value in a set of values and the mean of all the values in that set, that is, Σ |xᵢ - x̅| / n.

Mean Deviation [Image 9] (Image courtesy: My Photoshopped Collection)

So if we use the previous data set and substitute the values,

Image 10

the answer is 23.75.
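The same calculation as a short sketch, again on an assumed data set (eight integers consistent with the results quoted in this post):

```python
# Assumed data set, consistent with the mean deviation of 23.75 quoted in the text.
data = [12, 24, 41, 51, 67, 67, 85, 99]

x_bar = sum(data) / len(data)                        # 55.75
absolute_deviations = [abs(x - x_bar) for x in data]
mean_deviation = sum(absolute_deviations) / len(data)

print(mean_deviation)   # 23.75
```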

Variance
Variance is the average of the squared distances between each value and the mean. That is, it is the square of the standard deviation.

Image 11

The answer is 877.34 (that is, 29.62 squared).
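A quick check of the relationship between variance and standard deviation, as a sketch on an assumed data set (eight integers consistent with the SD of 29.62 quoted above). Squaring the rounded SD gives 877.34, while computing the sample variance directly from the assumed data gives a slightly different second decimal.

```python
from statistics import stdev, variance

# Assumed data set, consistent with the results quoted in the text.
data = [12, 24, 41, 51, 67, 67, 85, 99]

sample_variance = variance(data)      # sum of squared deviations / (n - 1)

print(round(sample_variance, 2))      # 877.36
print(round(stdev(data) ** 2, 2))     # 877.36, variance is the SD squared
print(round(29.62 ** 2, 2))           # 877.34, from the SD already rounded to 29.62
```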

Range
Range is one of the simplest techniques of descriptive statistics. It is the difference between the highest and the lowest value.

Image 12

Range is 99 - 12 = 87
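In code the range is a one-liner. The data set below is an assumption whose smallest and largest values match the 12 and 99 used above.

```python
# Assumed data set with minimum 12 and maximum 99, as in the text.
data = [12, 24, 41, 51, 67, 67, 85, 99]

value_range = max(data) - min(data)
print(value_range)   # 87
```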

Percentile
A percentile is a way to represent the position of a value in a data set. To calculate percentiles, the values in the data set should always be in ascending order.

Image 13

The median, 59, has 4 values less than itself out of 8. This can also be stated as: in this data set, 59 is the 50th percentile, because 50% of the total terms are less than 59. In general, if k is the nth percentile, it implies that n% of the total terms are less than k.
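The "n% of the terms are less than k" reading can be sketched as below. The helper name `percent_below` and the data set are assumptions for illustration; libraries often use slightly different percentile conventions, so treat this as the definition used in this post rather than a universal one.

```python
def percent_below(values, k):
    """Percentage of values strictly less than k."""
    below = sum(1 for v in values if v < k)
    return 100 * below / len(values)

# Assumed data set, consistent with the median of 59 quoted in the text.
data = [12, 24, 41, 51, 67, 67, 85, 99]

print(percent_below(data, 59))   # 50.0, so 59 sits at the 50th percentile
```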

Quartiles
In statistics and probability, quartiles are values that divide your data into quarters, provided the data is sorted in ascending order.

Quartiles [Image 14] (Image courtesy: https://statsmethods.wordpress.com/2013/05/09/iqr/)

There are three quartile values. The first quartile is at the 25th percentile, the second quartile is at the 50th percentile, and the third quartile is at the 75th percentile. The second quartile (Q2) is the median of the whole data. The first quartile (Q1) is the median of the lower half of the data, and the third quartile (Q3) is the median of the upper half of the data.

So here, by analogy,

Q2 = 67: is the 50th percentile of the whole data and is the median.

Q1 = 41: is the 25th percentile of the data.

Q3 = 85: is the 75th percentile of the data.

Interquartile range (IQR) = Q3 - Q1 = 85 - 41 = 44

Note: Quartiles are defined on data in ascending order, so Q3 is never smaller than Q1 and the IQR is never negative. If you read the quartile positions off a list sorted in descending order, Q1 and Q3 swap places and the subtraction gives -44; the magnitude is the same and only the sign differs. Because we subtract the smaller value from the larger one, always compute the IQR as Q3 - Q1 on data in ascending order.
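As a rough sketch of the "median of each half" idea, the helper below sorts the data first, so the input order never affects the result. The function name and the seven-value list are hypothetical, and this is only one of several quartile conventions in use; other conventions, including the one behind the figure above, can give different values.

```python
from statistics import median

def quartiles(values):
    """Q1, Q2, Q3 using medians of the lower and upper halves.

    For an odd count the overall median is left out of both halves.
    """
    ordered = sorted(values)          # quartiles are defined on ascending order
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(lower), median(ordered), median(upper)

sample = [13, 1, 9, 5, 11, 3, 7]      # hypothetical data, deliberately unsorted

q1, q2, q3 = quartiles(sample)
print(q1, q2, q3)   # 3 7 11
print(q3 - q1)      # 8, the interquartile range
```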

Skewness
Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive, negative, zero, or undefined.

In a perfect normal distribution, the tails on either side of the curve are exact mirror images of each other.

When a distribution is skewed to the left, the tail on the curve’s left-hand side is longer than the tail on the right-hand side, and the mean is typically less than the mode. This situation is also called negative skewness.

When a distribution is skewed to the right, the tail on the curve’s right-hand side is longer than the tail on the left-hand side, and the mean is typically greater than the mode. This situation is also called positive skewness.

Skewness [Image 16] (Image courtesy: https://www.safaribooksonline.com/library/view/clojure-for-data/9781784397180/ch01s13.html)

How to calculate the skewness coefficient?

To calculate the skewness coefficient of a sample, there are two methods:

1] Pearson First Coefficient of Skewness (Mode skewness)

Image 17: Sk1 = (mean - mode) / standard deviation

2] Pearson Second Coefficient of Skewness (Median skewness)

Image 18: Sk2 = 3 × (mean - median) / standard deviation

Interpretations

• The direction of skewness is given by the sign. A zero means no skewness at all.
• A negative value means the distribution is negatively skewed. A positive value means the distribution is positively skewed.
• The coefficient compares the sample distribution with a normal distribution. The larger the magnitude of the value, the more the distribution differs from a normal distribution.

Sample problem: Use Pearson’s Coefficient #1 and #2 to find the skewness for data with the following characteristics:

• Mean = 50.
• Median = 56.
• Mode = 60.
• Standard deviation = 8.5.

Pearson’s First Coefficient of Skewness: (50 - 60) / 8.5 ≈ -1.18.

Pearson’s Second Coefficient of Skewness: 3 × (50 - 56) / 8.5 ≈ -2.118.
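The sample problem can be checked with a short sketch. The function names are illustrative; the formulas are the standard Pearson mode and median skewness coefficients, (mean - mode) / SD and 3 × (mean - median) / SD.

```python
def pearson_first_skewness(mean, mode, sd):
    """Pearson's first (mode) coefficient of skewness."""
    return (mean - mode) / sd

def pearson_second_skewness(mean, median, sd):
    """Pearson's second (median) coefficient of skewness."""
    return 3 * (mean - median) / sd

# Values from the sample problem above.
sk1 = pearson_first_skewness(mean=50, mode=60, sd=8.5)
sk2 = pearson_second_skewness(mean=50, median=56, sd=8.5)

print(round(sk1, 3))   # -1.176
print(round(sk2, 3))   # -2.118
```

Both coefficients are negative, so the data in the sample problem is negatively skewed.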

Note: Pearson’s first coefficient of skewness uses the mode. Therefore, if the mode occurs only a few times, it is not a stable measure of central tendency and the coefficient is unreliable. For example, the mode in both of these sets of data is 4:

1, 2, 3, 4, 4, 5, 6, 7, 8, 9.

In the first set of data, the mode appears only twice, so it is not a good idea to use Pearson’s First Coefficient of Skewness. But in the second set,

1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 6, 7, 8, 9, 10, 12, 12, 13.

the mode 4 appears 8 times. Therefore, Pearson’s First Coefficient of Skewness will likely give you a reasonable result.

Kurtosis
The exact interpretation of kurtosis used to be disputed, but is now settled: it is about the existence of outliers. Kurtosis is a measure of whether the data are heavy-tailed (profusion of outliers) or light-tailed (lack of outliers) relative to a normal distribution.

Kurtosis [Image 19] (Image courtesy: https://mvpprograms.com/help/mvpstats/distributions/SkewnessKurtosis)

There are three types of kurtosis:

Mesokurtic

A mesokurtic distribution has kurtosis similar to that of a normal distribution, whose excess kurtosis is zero (its kurtosis is 3).

Leptokurtic

A leptokurtic distribution has kurtosis greater than that of a mesokurtic distribution. The tails of such distributions are thick and heavy. If the curve of a distribution is more peaked than a mesokurtic curve, it is referred to as a leptokurtic curve.

Platykurtic

A platykurtic distribution has kurtosis less than that of a mesokurtic distribution. The tails of such distributions are thinner. If the curve of a distribution is less peaked than a mesokurtic curve, it is referred to as a platykurtic curve.

The main difference between skewness and kurtosis is that skewness refers to the degree of asymmetry, whereas kurtosis refers to the degree of presence of outliers in the distribution.
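To see the link between kurtosis and outliers, here is a minimal sketch using the simple moment-based estimate m4 / m2² - 3 (excess kurtosis, where 0 corresponds to a normal distribution). The function name and both lists are hypothetical, and statistical packages often apply additional small-sample corrections that this sketch omits.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (no small-sample correction)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n   # second central moment
    m4 = sum((x - mu) ** 4 for x in values) / n   # fourth central moment
    return m4 / m2 ** 2 - 3

evenly_spread = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # hypothetical data, no outliers
with_outlier = [5, 5, 5, 5, 5, 5, 5, 5, 5, 50]    # hypothetical data, one extreme value

print(excess_kurtosis(evenly_spread) < 0)   # True: light tails (platykurtic)
print(excess_kurtosis(with_outlier) > 0)    # True: heavy tail (leptokurtic)
```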

Correlation
Correlation is a statistical technique that can show whether and how strongly pairs of variables are related.

Correlation [Image 20] (Image courtesy: http://www.statisticshowto.com/what-is-correlation/)

The main result of a correlation is called the correlation coefficient (or “r”). It ranges from -1.0 to +1.0. The closer r is to +1 or -1, the more closely the two variables are related.

If r is close to 0, it means there is little or no linear relationship between the variables. If r is positive, it means that as one variable gets larger the other gets larger. If r is negative, it means that as one gets larger, the other gets smaller (often called an “inverse” correlation).
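A minimal sketch of the correlation coefficient follows, assuming the usual Pearson formula (the sum of the products of paired deviations divided by the product of the two root sums of squared deviations). The function name and the three small series are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)

xs = [1, 2, 3, 4, 5]
rises_with_x = [2, 4, 6, 8, 10]   # hypothetical: grows as x grows
falls_with_x = [10, 8, 6, 4, 2]   # hypothetical: shrinks as x grows

print(pearson_r(xs, rises_with_x))   # close to +1
print(pearson_r(xs, falls_with_x))   # close to -1 (an "inverse" correlation)
```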
