Starting A Correlation Project

Uploaded by

Filipe Madeira

Starting a correlation project

Correlation projects are very popular with students – the work is accessible and on the
syllabus, data can be collected from either primary or secondary sources, and simple and
further mathematical processes can be demonstrated.

The usual approach is to take two sets of data, assume a linear correlation, and proceed on
that basis. However, this approach does not always show that the student has a proper
understanding of the techniques being used.

The following is a suggested approach that would better demonstrate a sophisticated
understanding of the material.

1. Rather than looking only at two variables and establishing a correlation between
these, it is better that the student compares the effect of two independent variables on
a third by comparing the correlation coefficients between them. For example, if one
were interested in the rate of infant mortality in various countries, one could consider
the relative influence of “the number of doctors per 1000 people” compared to
“percentage access to potable water”.

2. It may well be that a better fit to the data is supplied by a non-linear model, for
example an exponential model. Once a correlation coefficient of sufficiently large
value has been found and a linear model has been considered, a different model can
easily be fitted and assessed using the graphic display calculator (GDC).

3. It may well be that the correlation coefficient has such a low value that the conclusion
is that there is “no correlation” between the variables. In this case the student should
be encouraged to investigate whether the variables are in fact statistically independent
and use a chi-squared test to test this hypothesis.

Note: Should this approach be followed, care must be taken that enough data has been
collected to make the chi-squared test a valid option. In order to have two degrees of
freedom (to avoid the necessity of using Yates’s continuity correction) and enough
data in each category (to avoid expected frequencies of less than 5), more than 30 data
points are needed. This may well involve some further sampling.
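The validity conditions in the note above can be checked before the test is carried out. The following is a minimal sketch, assuming a 2 × 3 contingency table (which gives two degrees of freedom); the counts and the function names are hypothetical and serve only as an illustration.

```python
def expected_frequencies(observed):
    """Return the table of expected counts under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Return (statistic, degrees of freedom, smallest expected count)."""
    expected = expected_frequencies(observed)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    smallest = min(min(row) for row in expected)
    return stat, df, smallest


# Hypothetical counts for 50 data points:
# rows = low/high on one variable, columns = low/medium/high on the other.
table = [[8, 9, 8],
         [9, 8, 8]]

stat, df, smallest = chi_squared(table)
print("degrees of freedom:", df)
print("smallest expected frequency:", smallest)
print("chi-squared statistic:", round(stat, 3))
```

If the smallest expected frequency is below 5, categories should be merged or more data collected before the test is used. With two degrees of freedom, the statistic is compared with the 5% critical value of 5.991.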

The following approach is suggested to best fit the assessment criteria.

1. Decide upon the factor to be investigated and the two factors that might influence it.
At this point write down the title of the project in such a way that the investigation is
focussed.

2. Collect the data. If primary data is being used, ensure that enough is collected so that
a chi-squared test will be valid (50 in the set). Randomly selecting a subset of this
data set for an initial investigation is always an option. If a website is used to collect
data, further sampling is easier. The method of sampling – random, stratified – should
be stated and justified.

3. Before engaging in any calculations, a scatter diagram should be drawn. Not only
does this constitute simple mathematics, relevant to the investigation, but it also gives
some indication as to the direction of the project – two linear models, a different
model or independence. The student can also make an initial assessment of the levels
of correlation, thus making a conclusion based on a mathematical technique.

4. The correlation coefficient with one of the factors is then calculated.

There are many different formulas that can be used to calculate the correlation
coefficient r. It is suggested that the most useful formula is
$$r = \frac{1}{n} \sum_{k=1}^{n} z_{x_k} z_{y_k}$$
where n is the number of data points used, and
$$z_x = \frac{x - \bar{x}}{sd_x} \quad \text{and} \quad z_y = \frac{y - \bar{y}}{sd_y}$$
are the standardised scores of each data point.

The above formula makes use of two pieces of simple mathematics – the mean and
the standard deviation (sd) – and so makes relevant their calculation. Use of
spreadsheets is envisaged here.

Agreement with the answer obtained on the GDC can be shown.

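Note: When using a spreadsheet, care must be taken with the above formula to use the
appropriate version of the standard deviation as specified in the mathematical studies
SL guide. With the factor 1/n, the formula requires the population standard deviation
(divisor n); the sample version (divisor n − 1) gives a value of r that is slightly too
small in magnitude.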

5. At this stage, the direction of the project becomes apparent. The equation of the
regression line of y on x can be calculated if this is appropriate, or the χ² test for
independence can be used if the correlation coefficient is close to zero.

It may well be that neither route is applicable. This is part of the student’s assessment
of the appropriateness of the mathematics being applied.

6. Investigation of the second factor should now be undertaken. Full details of the
calculations for the chi-squared value, correlation coefficient or regression line will be
required the first time that they occur in the project. Thereafter the graphic display
calculator may be utilized.

7. A comparison of the results and final conclusion then complete the project.
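The calculations in steps 4 and 5 can be sketched as follows, as a check on spreadsheet or GDC work. This is an illustration under stated assumptions, not a prescribed method: the data are hypothetical, the standard deviation uses divisor n to match the factor 1/n in the formula for r, and the gradient is taken from the standard result gradient = r × sd_y / sd_x, which is not stated in the text above.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def population_sd(values):
    """Standard deviation with divisor n (not n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


def correlation(xs, ys):
    """r as the mean of the products of the standardised scores."""
    mx, my = mean(xs), mean(ys)
    sx, sy = population_sd(xs), population_sd(ys)
    zx = [(x - mx) / sx for x in xs]
    zy = [(y - my) / sy for y in ys]
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def regression_line(xs, ys):
    """Gradient and intercept of the regression line of y on x."""
    gradient = correlation(xs, ys) * population_sd(ys) / population_sd(xs)
    intercept = mean(ys) - gradient * mean(xs)
    return gradient, intercept


# Hypothetical data set, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
gradient, intercept = regression_line(xs, ys)
print("r =", round(r, 4))
print("y =", round(gradient, 3), "x +", round(intercept, 3))
```

Running the same functions on the second factor gives the two correlation coefficients that are compared in step 7.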

A variation on the theme

Using the GDC, other models for the data set can be examined and the relative merits of each
assessed. For example, the relationship between the concentration of reactants and the time
to complete a chemical reaction, or between the speed of microprocessor chips and the time
to complete a given task, might be better modelled as an exponential function.

The method of approach in this case would be to assume a linear model and then try the
different types of model as part of the validation process. The start of the method is
essentially the same as in the traditional approach, differing only at the very end.
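One way to compare a linear model with an exponential one without a GDC is to correlate x with ln y: if y = a·bˣ, then ln y is a linear function of x. The sketch below assumes this log transformation as the comparison method (the text above leaves the method to the GDC) and uses hypothetical data.

```python
from math import log, sqrt


def pearson_r(xs, ys):
    """Correlation coefficient from sums of squares and products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data in which y roughly halves at each step.
xs = [0, 1, 2, 3, 4, 5]
ys = [100, 52, 24, 13, 6, 3]

r_linear = pearson_r(xs, ys)
r_exponential = pearson_r(xs, [log(y) for y in ys])

print("linear model:      r =", round(r_linear, 3))
print("exponential model: r =", round(r_exponential, 3))
```

A value of r closer to ±1 for the transformed data suggests that the exponential model fits better. This requires all y values to be positive.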
