Data Analysis 1

Uploaded by

bradleymakaure

Data Analysis

The purpose of data analysis is to:

• Produce descriptive statistics to summarize the data.

• Create graphics which help to visualize the data.

• Use inferential statistics to distinguish between significant and non-significant effects.

• Create predictive models which can be used to predict future results within a given experimental domain.
Data Analysis
Descriptive statistics can be categorized into two groups:
1. Measures of centrality.
2. Measures of variation.

Measure of centrality   Advantage                                                   Disadvantage                                Formula
Arithmetic mean         Can be used for inferential statistics                      Sensitive to outliers
Geometric mean          Damps the effect of outliers; useful for changing data      Cannot be used for inferential statistics
Harmonic mean           Damps the effect of outliers; useful for rates and ratios   Cannot be used for inferential statistics
Median                  Insensitive to outliers                                     Insensitive to the distribution of data     Exact center of the distribution
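The four measures of centrality can be compared on a small data set. The sketch below uses Python's standard `statistics` module; the yield values are hypothetical and include one deliberate outlier to show how differently the measures react to it.

```python
import statistics

# Hypothetical yields in grams; the last value is a deliberate outlier.
yields = [4.0, 5.0, 5.0, 6.0, 7.0, 40.0]

arithmetic = statistics.mean(yields)            # sum divided by n
geometric = statistics.geometric_mean(yields)   # n-th root of the product; needs positive data
harmonic = statistics.harmonic_mean(yields)     # n divided by the sum of reciprocals
median = statistics.median(yields)              # middle of the sorted data

for name, value in [("arithmetic mean", arithmetic),
                    ("geometric mean", geometric),
                    ("harmonic mean", harmonic),
                    ("median", median)]:
    print(f"{name:>15}: {value:.2f}")
```

For positive data the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, so the outlier pulls the arithmetic mean furthest from the bulk of the data.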
Data Analysis

Measure of dispersion   Advantage                                           Disadvantage
Standard deviation      Very useful parameter; properties are well known    Not additive
Variance                Useful parameter; the variance is additive          Squares the true dispersion
Relative STD            Useful when comparing dissimilar data sets          Cannot be used for statistical inference
Standard error          Used when calculating uncertainties                 Not additive
Range                   Simple to calculate                                 Based on only two data points
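As a minimal sketch, the measures of dispersion listed above can be computed as follows. The measurements are hypothetical, and the sample (n - 1) form of the variance is an assumption, since the formulas of the original slide are not available here.

```python
import math
import statistics

data = [10.1, 9.8, 10.3, 10.0, 9.9]  # hypothetical repeated measurements

n = len(data)
mean = statistics.mean(data)
variance = statistics.variance(data)      # sample variance, divisor n - 1 (assumed convention)
std = statistics.stdev(data)              # square root of the sample variance
relative_std = 100 * std / mean           # relative STD, in percent of the mean
standard_error = std / math.sqrt(n)       # standard error of the mean
data_range = max(data) - min(data)        # uses only the two extreme points

print(f"variance       = {variance:.4f}")
print(f"std deviation  = {std:.4f}")
print(f"relative STD   = {relative_std:.2f} %")
print(f"standard error = {standard_error:.4f}")
print(f"range          = {data_range:.2f}")
```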
Types of Variables

Type of variable    Definition                                                          Examples
Continuous          Variable which can take on any value between two specified limits   Mass, concentration, temperature
Nominal             Categorical variable in which there is no order                     Type of catalyst, method of analysis, binary variable (pass/fail)
Ordinal             Ordered categorical variable                                        Rating scale, diagnosis

For many methods of data analysis, it is important to identify the independent variables (factors) and the dependent variable (response).
Exploratory Data Analysis (EDA)
EDA is used for the following purposes:
• To help the researcher formulate relevant hypotheses.
• To suggest the appropriate statistical tools to analyze the data.

Many EDA techniques involve graphical displays of the data, such as:

• Histograms,
• Box-and-whisker plots,
• Pareto charts,
• Stem-and-leaf plots,
• Multi-vari charts.
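Two of these displays can be approximated in plain text. The sketch below computes the five-number summary that a box-and-whisker plot draws and prints a simple text histogram; the yield values and the bin width of 2 g are hypothetical choices.

```python
import statistics
from collections import Counter

# Hypothetical yields in grams.
yields = [3.2, 4.1, 4.8, 5.0, 5.3, 5.9, 6.1, 6.4, 7.0, 7.7, 8.5, 11.2]

# Quartiles (default "exclusive" method of statistics.quantiles).
q1, q2, q3 = statistics.quantiles(yields, n=4)
summary = (min(yields), q1, q2, q3, max(yields))
print("min, Q1, median, Q3, max:", summary)

# Text histogram with bins [2, 4), [4, 6), ... keyed by their left edge.
bin_width = 2
counts = Counter(int(y // bin_width) * bin_width for y in yields)
for left in sorted(counts):
    print(f"{left:>2}-{left + bin_width:<2} {'#' * counts[left]}")
```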
Exploratory Data Analysis (EDA)
Example 1: Histogram and box plot of Yield (g)

[Figure: histogram of Yield (g), with the number of observations on the vertical axis, next to a box plot of Yield (g).]
Exploratory Data Analysis (EDA)
Example 2: Box plots and correlation matrix of IQ and 4 test marks (2000 students)

[Figure: box-and-whisker plot of IQ, T1, T2, T3 and T4 on a common scale from 0 to 120.]

Correlation Matrix
      T1     T2     T3     T4
IQ    0.51   0.82   0.02   0.52
T1           0.42   0.03   0.60
T2                  0.04   0.55
T3                         0.02
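Each entry of a correlation matrix is a Pearson correlation coefficient between two variables. A minimal sketch of that calculation follows; the scores are hypothetical values for five students, not the 2000-student data set, and `pearson_r` is a helper name introduced here for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for five students.
scores = {
    "IQ": [95, 100, 105, 110, 120],
    "T1": [50, 55, 53, 62, 70],
    "T2": [48, 52, 57, 61, 72],
}

# Print the upper triangle of the correlation matrix.
names = list(scores)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(f"r({a}, {b}) = {pearson_r(scores[a], scores[b]):.2f}")
```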
Exploratory Data Analysis (EDA)

Other EDA techniques:

• Cluster Analysis: collects “similar” variables in clusters.
• Principal Component Analysis: reduces the number of independent variables to the essential variables.
• Factor Analysis: used to detect the relationship between variables.
• Discriminant Analysis: used to detect variables which discriminate between naturally occurring groups.
• Categorical Data Analysis: studies the relationship between nominal and ordinal variables.
Exploratory Data Analysis (EDA)
Example: Cluster Analysis

[Figure: cluster diagram of the four tests, listed as Test 1, Test 4, Test 2, Test 3; horizontal axis: linkage distance, 400 to 1400.]
Exploratory Data Analysis (EDA)
Example: Categorical Data Analysis
Contingency Tables

             Diagnosis
Treatment    No improvement    Little improvement    Good improvement
A            12*               25                    30
B            4                 7                     8
C            34                35                    36

* The numbers in the cells are patient counts.

From this contingency table we can determine, by performing a chi-squared test, whether there is a significant difference between the treatments.
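The chi-squared test on this table can be sketched with the standard library alone. The counts are those of the contingency table above; the helper `chi2_sf_even_dof` is a name introduced here and uses a closed form that is valid only for an even number of degrees of freedom, which holds for a 3 × 3 table (4 degrees of freedom).

```python
import math

# Patient counts: rows = treatments A, B, C;
# columns = no, little, good improvement.
observed = [
    [12, 25, 30],
    [4, 7, 8],
    [34, 35, 36],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected,
# with expected = row total * column total / grand total.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (count - expected) ** 2 / expected

dof = (len(observed) - 1) * (len(observed[0]) - 1)

def chi2_sf_even_dof(x, dof):
    """Upper-tail probability of the chi-squared distribution (even dof only)."""
    if dof % 2 != 0:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(dof // 2))

p_value = chi2_sf_even_dof(chi2, dof)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.3f}")
```

With these counts the statistic comes out at roughly 4.9 and the p-value at roughly 0.3, so at a 0.05 significance level the test would not indicate a significant difference between the treatments.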
Statistical Inference:
Estimating the parameters of a population from the statistics of a representative sample.

Examples

Statistic (from sample)     Parameter
Sample mean: X̄              Population mean: μ
Sample STD: S               Population STD: σ
Sample proportion: p        Population proportion: ρ

Statistical Inference
The following statement always applies:

Measurement = Parameter ± Experimental error

• Parameters can only be estimated within a calculated uncertainty.

• Whenever an estimated parameter is given, the uncertainty associated with it must be given as well.

• The actual calculation of the uncertainty depends on the distribution of the data.

• The uncertainty can be visualized by using error bars.
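A minimal sketch of reporting an estimate together with its uncertainty is given below. It assumes approximately normal data and uses a normal quantile for a 95 % interval; for small samples a t quantile would give a wider interval. The measurements are hypothetical.

```python
import math
import statistics

measurements = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]  # hypothetical replicates

mean = statistics.mean(measurements)
standard_error = statistics.stdev(measurements) / math.sqrt(len(measurements))

# Assumption: normal approximation for the 95 % interval.
z = statistics.NormalDist().inv_cdf(0.975)   # about 1.96
half_width = z * standard_error              # half-length of the error bar

lower, upper = mean - half_width, mean + half_width
print(f"estimate = {mean:.2f} ± {half_width:.2f}  (interval {lower:.2f} to {upper:.2f})")
```

The value `half_width` is what an error bar around the plotted mean would show.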

Statistical Inference
Analysis Wanted                                     Methods Available
Compare 2 independent samples                       t-test for normal data
                                                    Mann-Whitney test for non-normal data
Compare 2 related samples                           Paired t-test
Compare n (n > 2) independent samples               ANOVA for normal data
                                                    Kruskal-Wallis ANOVA for non-normal data
Compare trends                                      Regression with indicator variables
Detect the effects of factors on a response         Multiple regression
Find the levels of the factors for which maximum    Response surface modeling
and/or minimum responses are achieved

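Definition: Significant effect = an effect that cannot plausibly be attributed to experimental error alone.

Whether an effect is significant is decided by using p-values: the effect is called significant when the p-value falls below a chosen significance level (commonly 0.05).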
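To illustrate what a p-value measures, the sketch below compares two independent samples with a permutation test. This is a swapped-in technique chosen because it needs only the standard library; it is not one of the tests named in the table. The data, the function name `permutation_p_value`, and the 0.05 significance level are assumptions made for the example.

```python
import random
import statistics

def permutation_p_value(sample_a, sample_b, n_permutations=10_000, seed=0):
    """Two-sided p-value for a difference in means, by random relabelling."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(sample_a) - statistics.fmean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical yields (g) obtained with two catalysts.
catalyst_1 = [5.1, 4.9, 5.3, 5.0, 5.2, 5.1]
catalyst_2 = [5.9, 6.1, 5.8, 6.0, 6.2, 5.9]

p = permutation_p_value(catalyst_1, catalyst_2)
alpha = 0.05  # assumed significance level
print(f"p = {p:.4f}:", "significant" if p < alpha else "not significant")
```

The p-value estimates how often a difference at least as large as the observed one would arise if the catalyst labels were irrelevant, that is, if the difference were due to experimental error alone.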
