1004B Tutorial 7 Slides (With Answers) - 1

This tutorial focuses on correlation and regression methods in psychology, specifically using Jamovi for data analysis. It covers the differences between correlation and regression, the importance of scatterplots, and how to interpret correlation coefficients and regression outputs. Additionally, it emphasizes the significance of assumptions in these analyses and provides practical exercises for application.

Uploaded by

logan.wong

PSYC1004B

Introduction to Quantitative Methods in Psychology
2024 – 2025 2nd Semester
Tutorial 7
Correlation & regression

Please download Tutorial 7 materials on Moodle


So far in this course…

Categorical / group-based IV (continuous DV)
➢ mean differences

● (One-sample) z-test
● One-sample t-test
● Paired samples t-test
● Independent samples t-test
● One-way between-subjects Analysis of Variance (ANOVA)

How is correlation/regression different from these approaches?

Continuous IV, and continuous DV!
➢ seeing whether there is any association between them
Today’s Tutorial
Overview + Jamovi:

Scatterplots

Correlation

Bivariate linear regression a.k.a. Simple linear regression

For these approaches, should your data be independent or paired?

Data should be Paired!


Scatterplots
Scatterplots

● Used for visual inspection of the data prior to statistical analyses, particularly to check assumptions about the form of the association/relationship (e.g., linear, monotonic)

● For both Pearson’s correlation and simple linear regression → inspection of linearity
Scatterplots - Jamovi
● Download ‘scatr’ module

○ ‘Analyses’ → ‘Modules’
Scatterplots - Jamovi
Exercise
Make a scatterplot for these pairs of variables:

1. Hours spent revising and final exam scores


2. Hours spent watching TV and final exam scores

Describe the relationship for each pair.


Scatterplots - Exercise
Describe the relationship for each pair of variables.
Scatterplots - Exercise
Which relationship seems to be stronger?
Correlation
Correlation

● Generally: Measures the association between two continuous variables

● Measure of correlation → correlation coefficient

● Many different correlation types

● Correlation ≠ causation!
Correlation - Pearson’s

● Linear association between two continuous variables

● Correlation coefficient → r

● Direction of relationship → look at the sign of r: is it positive or negative?

● Strength of relationship → look at the magnitude of r (see benchmarks in later slide)
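
As a rough illustration of what Jamovi computes behind the scenes, the sketch below calculates Pearson's r from deviation scores. The function name `pearson_r` and the data values are hypothetical (they are not the tutorial data set), and this is only one of several equivalent formulas for r.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r for two paired lists of equal length."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of the deviation scores (numerator of r)
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Sums of squared deviations for each variable (denominator of r)
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical paired data, made up for illustration only
hours_revising = [5, 10, 15, 20, 25]
exam_score = [50, 54, 63, 66, 72]

r = pearson_r(hours_revising, exam_score)
print(f"r = {r:.3f}")  # the sign gives the direction, the magnitude the strength
```

The sign of the returned value shows the direction of the relationship, and its absolute size shows the strength.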


Direction of correlation

Image from https://statistics.laerd.com/spss-tutorials/linear-regression-using-spss-statistics.php


Strength of correlation

● Value of r (vs 0)

● Based on the spread of the data points

● Ranges from –1.00 to +1.00

○ ± 1 → closely packed data points along a straight line

○ ≈ 0 → data points are randomly spread


Strength of correlation

Image from https://statistics.laerd.com/spss-tutorials/linear-regression-using-spss-statistics.php


Strength of correlation - benchmarks

Correlation coefficient    Strength of correlation
≤ ±.19                     No or very weak
±.2 - ±.39                 Weak
±.4 - ±.59                 Moderate
±.6 - ±.79                 Strong
≥ ±.8                      Very strong

BUT these are just very general benchmarks – whether or not a correlation is considered ‘strong’ depends on the research topic/area!
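
The benchmarks can be written as a small lookup on the absolute value of r. This is a minimal sketch: the function name `strength_label` is hypothetical, and placing the cut-offs at exactly .2, .4, .6 and .8 is an assumption that closes the small gaps in the table (e.g., between .19 and .2).

```python
def strength_label(r):
    """Return the general benchmark label for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and +1.00")
    size = abs(r)  # strength ignores the sign; the sign only gives direction
    if size < 0.2:
        return "no or very weak"
    if size < 0.4:
        return "weak"
    if size < 0.6:
        return "moderate"
    if size < 0.8:
        return "strong"
    return "very strong"

print(strength_label(0.959))   # revision hours and final exam score
print(strength_label(-0.100))  # TV hours and final exam score
```

As the slide notes, these labels are general conventions only and should be read against the norms of the research area.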
Jamovi

Running correlational analyses on the following pairs of variables:

1. Hours spent revising and final exam scores


2. Hours spent watching TV and final exam scores
Some assumptions of Pearson’s correlation
(and simple linear regression)
➢ The two variables are measured on a continuous level (interval or ratio), and are paired

1. The two variables are linearly associated

2. Independence of errors (no particular association between residuals)

3. The residuals (errors) should be approximately normally distributed

4. The residuals (errors) should have approximately equal variances across values of the IV

(+ no significant outliers present)


Correlation - Jamovi
● ‘Analyses’ → ‘Regression’ → ‘Correlation matrix’
Correlation - Jamovi
Null hypothesis: ρ = 0
Alternative/research hypothesis: ρ ≠ 0

(Note. ρ (Greek letter rho) is used to denote the population parameter; r is used to denote the sample statistic.)

(df = n – 2)

Reporting in APA format: r(df) = r-value, p-value.

e.g., correlation between revision and final: r(33) = .959, p < .001.
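
The reporting pattern above can be sketched as a small formatting helper. The function names are hypothetical; the sample size n = 35 is inferred from the reported df of 33 (df = n – 2), and the exact p-value passed for the first correlation is an assumed placeholder, since the output only shows p < .001.

```python
def drop_leading_zero(value, decimals=3):
    """Format a statistic that cannot exceed 1 in APA style (no leading zero)."""
    text = f"{abs(value):.{decimals}f}"
    if text.startswith("0."):
        text = text[1:]
    return ("-" if value < 0 else "") + text

def apa_correlation(r, n, p):
    """Build an APA-style string r(df) = value, p-value, with df = n - 2."""
    df = n - 2
    p_text = "p < .001" if p < 0.001 else f"p = {drop_leading_zero(p)}"
    return f"r({df}) = {drop_leading_zero(r)}, {p_text}"

# n = 35 is inferred from df = 33; 0.0001 is a placeholder for "p < .001"
print(apa_correlation(0.959, 35, 0.0001))
print(apa_correlation(-0.100, 35, 0.568))
```

The helper prints a plain hyphen for negative values; in a written report, a proper minus sign would normally be used.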
Reporting correlation in APA format

A significant positive correlation between the number of hours spent revising during the revision period and final exam scores was found, r(33) = .959, p < .001. However, the correlation between the number of hours spent watching TV during the revision period and final exam scores was not significant, r(33) = –.100, p = .568.


Bivariate linear regression
a.k.a. Simple Linear Regression
Bivariate linear regression

● Predicting an outcome (Y) based on one predictor (X)

○ (More than one predictor → multiple regression)

● Prediction is based on the linear relationship between X and Y


Bivariate linear regression

Regression equation (unstandardised):

Y’ = a + b(X)

Y’ (or Ŷ): predicted Y
a: Y-intercept/constant
b: regression coefficient/slope


Bivariate linear regression
Regression equation (standardised):

z(Y’) = 𝛽z(X)

z(Y’) (or z(Ŷ)): predicted z(Y)
𝛽 (beta): standardised regression coefficient/slope

Note: the intercept is always 0 in a standardised regression equation.

Also, for bivariate/simple linear regression, the correlation coefficient r between your X and Y is equivalent to the standardized regression coefficient 𝛽!
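
The equivalence between 𝛽 and r can be checked numerically. The sketch below, using hypothetical data and hypothetical helper names, converts X and Y to z-scores, fits the least-squares slope on the z-scores, and compares it with Pearson's r computed on the raw scores.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def z_scores(values):
    """Standardise values using the sample SD (n - 1 in the denominator)."""
    m = mean(values)
    sd = sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))
    return [(v - m) / sd for v in values]

def slope(x, y):
    """Least-squares slope for predicting y from x."""
    mx, my = mean(x), mean(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sp / sum((xi - mx) ** 2 for xi in x)

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ss_x = sum((xi - mx) ** 2 for xi in x)
    ss_y = sum((yi - my) ** 2 for yi in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical paired data, made up for illustration only
x = [5, 10, 15, 20, 25]
y = [50, 54, 63, 66, 72]

beta = slope(z_scores(x), z_scores(y))  # standardised slope
r = pearson_r(x, y)                     # correlation on the raw scores
print(f"beta = {beta:.4f}, r = {r:.4f}")
```

The two printed values agree up to floating-point error, which holds only when there is a single predictor.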


Bivariate linear regression

● One of the key assumptions of simple linear regression → linearity between the variables

● Our example meets this assumption, so we can try using the number of hours spent revising to predict final exam score.

○ Is the number of hours spent revising a significant predictor?


Jamovi

Running a simple linear regression analysis to predict final exam score using hours spent revising.
Bivariate linear regression
● Null hypothesis?

The number of hours spent revising during the revision period cannot predict the final exam score.

● Alternative/research hypothesis?

The number of hours spent revising during the revision period can predict the final exam score.
Bivariate linear regression - Jamovi

● ‘Analyses’ → ‘Regression’ → ‘Linear regression’


Bivariate linear regression - Jamovi
Bivariate linear regression - Jamovi
Interpreting the outputs

R²: proportion of variation in Y explained by X

Also equivalent to r² for bivariate linear regression


Quick Recap: Variability in regression

Total variability: actual Y value(s) – mean of Y
Explained variability: predicted Y value(s) – mean of Y
Unexplained variability (a.k.a. “residuals” or “error”): actual Y value(s) – predicted Y value(s)

** For this course, you do not need to know how to hand-calculate R² using this variance approach. But you should have some idea of what the different variances reflect.
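
To make the three kinds of variability concrete, the sketch below fits a least-squares line to hypothetical data and sums the squared deviations for each component. The names `fit_line`, `ss_total`, `ss_explained` and `ss_residual` are illustrative, not course terminology, and the data are made up.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for predicting y from x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    b = sp / ss_x
    a = mean_y - b * mean_x
    return a, b

# Hypothetical paired data, made up for illustration only
x = [5, 10, 15, 20, 25]
y = [50, 54, 63, 66, 72]

a, b = fit_line(x, y)
predicted = [a + b * xi for xi in x]
mean_y = sum(y) / len(y)

# Total: actual Y - mean of Y
ss_total = sum((yi - mean_y) ** 2 for yi in y)
# Explained: predicted Y - mean of Y
ss_explained = sum((pi - mean_y) ** 2 for pi in predicted)
# Unexplained (residual): actual Y - predicted Y
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

r_squared = ss_explained / ss_total
print(f"total = {ss_total:.1f}, explained = {ss_explained:.1f}, "
      f"residual = {ss_residual:.1f}, R^2 = {r_squared:.3f}")
```

The explained and unexplained sums of squares add up to the total, so R² is the explained share of the total variability.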
Interpreting the outputs:
Unstandardized regression coefficient

Unstandardized regression coefficient interpretation:

[ When X increases by 1 raw-score unit, Y is predicted to increase by 1.17 raw-score units ]

Significance: (e.g., for alpha = .05)

p > .05 → not a significant predictor
p < .05 → significant predictor

*Note: the SE value shown here is the standard error of the unstandardized regression coefficient, which you don’t need to know. Don’t confuse this with the standard error of the estimate (i.e., SE of the predicted Y value) taught in lecture, which you do need to know!
Interpreting the outputs:
Unstandardized regression equation

Predicted final exam score = a + b(revision)


Predicted final exam score = 43.38 + 1.17(revision)

What would be the predicted final exam score of a student who studied 38
hours during revision week?
Predicted final exam score = 43.38 + 1.17(38)
Predicted final exam score = 87.84
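
The same prediction can be written as a one-line function. This is a minimal sketch using the intercept (43.38) and slope (1.17) from the regression equation above; the function and constant names are hypothetical.

```python
INTERCEPT = 43.38  # a: Y-intercept/constant from the regression equation
SLOPE = 1.17       # b: unstandardized regression coefficient

def predict_final_score(hours_revising):
    """Predicted final exam score = a + b(revision)."""
    return INTERCEPT + SLOPE * hours_revising

# Student who studied 38 hours during revision week
print(round(predict_final_score(38), 2))  # 87.84
```

Predictions are most trustworthy for revision hours within the range observed in the data.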
Interpreting the outputs:
Standardized regression coefficient & equation

z(predicted final exam score) = 𝛽z(revision)


z(predicted final exam score) = 0.959 z(revision)
Standardised coefficient interpretation:
[ When X increases by 1 SD unit, Y is predicted to increase by 0.959 SD units ]

Remember: 𝛽 = Pearson’s r in bivariate regression!


Reporting the results in APA format

The number of hours spent revising during the revision period significantly and positively predicted final exam score, b = 1.17, β = .959, p < .001. The number of hours spent revising during the revision period explains 91.9% of the variability in final exam score.

Note: Does doing a simple linear regression analysis ‘tell us more’ about causality than doing a correlation analysis?

Not necessarily! Causality needs to be addressed in other ways (i.e., through research design), not just through the choice of statistical analysis.
Hand-calculation steps (correlation)
Raw data
Judging significance (if required)

• df = n – 2 (n = number of participants, or number of pairs of data points)

• Compare whether calculated r value is more extreme than ±critical r value (from table)

Note: there are multiple formulas you can use to calculate correlation coefficient r (see lecture slides!)
Hand-calculation steps (simple linear regression)
Raw data

AFTER calculating correlation coefficient r:

Unstandardized regression coefficients:

Standardized regression coefficient:

β = r
(for simple linear regression)
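
The formula images from this slide did not survive extraction. As a hedged reminder, the standard textbook forms for simple linear regression are given below; the symbols $s_X$, $s_Y$ (standard deviations) and $\bar{X}$, $\bar{Y}$ (means) are assumed notation, so check them against the lecture slides.

```latex
\begin{aligned}
b &= r \cdot \frac{s_Y}{s_X}     && \text{(slope)} \\
a &= \bar{Y} - b\,\bar{X}        && \text{(Y-intercept)} \\
\beta &= r                       && \text{(standardised slope, one predictor only)}
\end{aligned}
```

The slope rescales r from SD units back into raw-score units, and the intercept makes the line pass through the point of means.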
FINAL QUIZ REMINDERS!
Check the announcement on Moodle!

Length and time: 100 minutes (12:30pm to 2:20pm, 9th May)

Venue: Meng Wah T1 (MWT1)

Content coverage: everything that is covered in this course (cumulative)

Format: similar to midterm: closed-book and closed-notes

You may also be required to interpret and extract information from Jamovi outputs!
If you have any questions, please do let us know

And keep checking your email! ☺


Additional practice questions
Answer the following questions based on the regression analysis run in the previous exercise:

1. What is the proportion of variation in final exam score that is explained by the number of hours spent revising?
2. If a student spent 25 hours revising, what is their expected final exam score?
3. Can it be concluded that the number of hours spent revising is a significant predictor of final exam score? Why?
Additional practice questions - Answers
1. What is the proportion of variation in final exam score that is explained by the number of hours spent revising? 91.9%
2. If a student spent 25 hours revising, what is their expected final exam score? 72.63
3. Can it be concluded that the number of hours spent revising is a significant predictor of final exam score? Why? Yes, as the p-value of its regression coefficient is less than .05.
