Notes - Predicitve Analystics - Multiple Regression - S

Uploaded by

WaSifAliRajput

BUS9040 – DECISION ANALYSIS FOR MANAGERS

SEMINAR 3
Predictive Analytics

MULTIPLE REGRESSION
Types of Data Analytics
• Descriptive – “What happened?” Looks at data to examine, understand, and describe something that has already happened.
• Diagnostic – “Why did this happen?” Goes deeper than descriptive analytics by seeking to understand the “why” behind what happened.
• Predictive – “What might happen in the future?” Relies on historical data, past trends, and assumptions to answer questions about what might happen in the future.
• Prescriptive – “What should we do next?” Identifies specific actions an individual or organization should take to reach future targets or goals.
Introduction

Regression looks at the relationships between quantitative (continuous) variables. These might be:
• Advertising expenditure and sales revenue
• Tourist arrival numbers and oil prices
• Electricity consumption and temperature
• Auction price and number of bidders, etc.

We can examine the relationship between two such variables through a scatter diagram (plot one against the other).

In this seminar, we will look at how to examine such relationships more formally.
Example: Price of wine

The price of a bottle of wine is thought to depend on many factors, such as its age, the quality of the grapes used to produce it, the amount of rainfall during the growing season, where the wine was produced, etc.

The table below shows the price of 10 randomly selected bottles of wine from an online wine merchant. Also shown is the age of each wine selected.

Table 1: Price and age of wine
Bottle            1     2      3     4     5     6      7     8     9      10
Age (X, years)    3.5   5      3     2.5   3     2      2.5   1     10     4
Price (Y, £)      4.50  12.95  6.50  4.99  7.50  14.95  8.25  3.95  18.99  10.00

Is there any relationship between age and price of wine? If so, can you DESCRIBE this relationship?
Dr. Lee Fawcett
Figure 1: Scatter plot of price and age of wine
Quantifying the relationship: Correlation

Scatterplots such as Figure 1 can be difficult to interpret using words alone, since different people might say different things.

Some might think there is a moderate/fairly strong relationship between X and Y here, whilst others might conclude that there is a relatively weak relationship between these two variables.

Interpreting such relationships with words alone can be subjective; quantifying such relationships numerically can circumvent this problem of subjectivity.


Quantifying the relationship: Correlation

The correlation coefficient r always lies between -1 and +1.

• If r is close to +1, there is a strong positive linear relationship.
• If r is close to -1, there is a strong negative linear relationship.
• If r is close to zero, there is no linear relationship between the variables.
Note that r ≈ 0 does not imply no relationship at all, simply no linear relationship.
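
As a rough illustration of how r is obtained, the sketch below computes the Pearson correlation coefficient for the wine data in plain Python. The function name `pearson_r` is made up for this example, and the ages are taken as the decimal values listed in the full wine dataset later in these notes; the seminar itself relies on software output rather than hand-written code.

```python
import math

# Wine data (age in years, price in pounds), as listed in the notes.
age = [3.5, 5, 3, 2.5, 3, 2, 2.5, 1, 10, 4]
price = [4.50, 12.95, 6.50, 4.99, 7.50, 14.95, 8.25, 3.95, 18.99, 10.00]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(age, price)
print(f"r = {r:.2f}")  # about 0.73
```

A value of about 0.73 suggests a fairly strong positive linear relationship between age and price, which is less subjective than describing the scatter plot in words.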
Quantifying the relationship: Correlation
• Figure 2 shows four scatter plots of X against Y, each labelled with its correlation coefficient (r).
• Note that r is almost zero in the bottom-right plot, yet there is apparently a strong non-linear relationship. Again, r ≈ 0 does not imply no relationship, simply no linear relationship.

Figure 2: Scatter plots of two variables X and Y
Panels: r = 1 (top left), r = −0.899 (top right), r = 0.699 (bottom left), r = 0.064 (bottom right)
In Summary;
Simple Linear Regression
Simple linear regression

A correlation analysis helps to establish whether or not there is a linear relationship between two variables. However, it doesn’t allow us to use this linear relationship.

Regression analysis allows us to use the linear relationship between variables. For example, with a regression analysis, we can predict the value of one variable given the value of another.

To perform a regression analysis, we must assume that
• the scatter plot of the two variables (roughly) shows a straight line, and
• the spread in the Y-direction is roughly constant with X.
Recall Figure 1: Scatter plot of price and age of wine
The Regression Equation
The simple (univariate) linear regression model is given by
Y = β0 + β1 X + ε

where
• Y is the response variable (also called dependent variable),
• X is the explanatory variable (also called independent variable),
• β0 represents the intercept of the regression line (the point where the line “cuts” the Y-axis),
• β1 represents the slope of the regression line (i.e. how steep the line is), and
• ε is known as “random error”.
In practice, we assume ε is zero on average, and so the only things we need to find are β0 and β1. But how?
Note: Instead of β0 and β1, the regression equation may also be written with α and β (so α = β0 and β = β1):
Y = α + βX + ε
Simple linear regression

• For the wine price data, we can find the values of α and β and hence the line of best fit.
• In practice we use a computer (e.g. Excel) to do this (next slide).
• That is, the computer calculates α and β and gives us the regression equation.
• Recall that the regression equation is of the form Y = β0 + β1 X, with α = β0 and β = β1.
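
For readers curious about what the computer does behind the scenes, here is a minimal sketch of the usual least-squares formulas for the intercept and slope, applied to the wine data. The helper name `fit_line` is hypothetical, and this assumes ordinary least squares, which is what spreadsheet regression tools typically use; the notes themselves do not spell out the method.

```python
# Wine data (age in years, price in pounds), as listed in the notes.
age = [3.5, 5, 3, 2.5, 3, 2, 2.5, 1, 10, 4]
price = [4.50, 12.95, 6.50, 4.99, 7.50, 14.95, 8.25, 3.95, 18.99, 10.00]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


alpha, beta = fit_line(age, price)
print(f"Y = {alpha:.3f} + {beta:.3f} X")  # Y = 3.905 + 1.467 X
```

The slope is the co-variation of age and price divided by the variation in age, and the intercept is then chosen so that the line passes through the average age and average price.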
Simple linear regression

• Using Excel (see Excel output below), we obtain the regression equation for the wine data as: Y = 3.905 + 1.467X
• The plot in Figure 3 (next slide) shows the scatter diagram for the wine data again, but now with the regression line superimposed. We can use the regression line (or equation) to make predictions of the dependent variable (price of wine in this case).
Figure 3: Scatter plot of price and age of wine (with regression line)
Modelling the relationship: simple linear regression

We can use the estimated regression equation to make predictions of wine price given a certain age.

For example, suppose we produce a bottle of wine that has been ageing for 4½ years. How much should we sell it for?

We can take a reading from our graph, or, more accurately, use our regression equation:

Y = 3.905 + 1.467 × 4.5 ≈ 10.5,

i.e. about £10.50.


Modelling the relationship: simple linear regression

Note that we should only use our regression equation to make predictions using X-values that lie within the range of the data observed (here, ages between 1 and 10 years).

So, for example, we should not use this regression equation to estimate the selling price of a bottle of wine that has been ageing for 12 years.

We can also interpret the regression equation in the following way: for every one year increase in age, the selling price of a bottle of wine increases by about £1.47.
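
The prediction step and the warning about extrapolation can be sketched together as below. The function name `predict_price` and the choice to raise an error outside the observed range are illustrative assumptions, not part of the seminar material; the coefficients are the rounded values quoted in the notes, and the age limits are the smallest and largest ages in the wine data.

```python
# Rounded coefficients quoted in the notes for the wine data.
INTERCEPT = 3.905
SLOPE = 1.467

# Smallest and largest ages (in years) observed in the wine data.
MIN_AGE = 1.0
MAX_AGE = 10.0


def predict_price(age_years):
    """Predict price (in pounds) from age, refusing to extrapolate."""
    if not MIN_AGE <= age_years <= MAX_AGE:
        raise ValueError(
            f"age {age_years} is outside the observed range "
            f"{MIN_AGE} to {MAX_AGE} years"
        )
    return INTERCEPT + SLOPE * age_years


print(f"{predict_price(4.5):.1f}")  # 10.5
```

Calling `predict_price(12)` raises a `ValueError`, mirroring the advice not to estimate the price of a 12-year-old wine from this equation.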
Multiple Regression
Multiple Linear Regression: Wine Example
• Recall that we investigated the relationship between the price of wine and its age.
• However, the price of a bottle of wine might also depend on other factors, for example, the amount of rainfall during the growing season, average temperature during the growing season, etc. Below is the full dataset showing price of wine, with corresponding age, total rainfall and average temperature during the growing season.

Bottle          1     2      3     4     5     6      7     8     9      10
Price (Y)       4.50  12.95  6.50  4.99  7.50  14.95  8.25  3.95  18.99  10.00
Age (X1)        3.5   5      3     2.5   3     2      2.5   1     10     4
Rainfall (X2)   126   121    125   106   107   112    124   105   116    108
Temp (X3)       16    20     17    18    18    22     19    15    21     20
Multiple Linear Regression: Wine Example
• As before, the regression equation is:
Y = β0 + β1 X1 + β2 X2 + β3 X3 + ε
• The β’s are parameters that need to be estimated, but now we have four of them. Again, we use software (here SPSS) to calculate them for us (next slide).
• β0 can be thought of as the intercept term as before (labelled as “constant” in SPSS output)
• β1 is the ‘age coefficient’
• β2 is the ‘rainfall coefficient’
• β3 is the ‘temperature coefficient’
• As before, ε is the ‘random error’ term, assumed to be zero on average
Multiple Linear Regression
• Model: Y = β0 + β1 X1 + β2 X2 + β3 X3
• Estimated equation: Y = −22.54 + 0.81 X1 − 0.0004 X2 + 1.55 X3

• β1 = 0.81 is positive, which indicates a positive relationship between age and price (i.e. generally, older wines are more expensive). It is statistically significant (p-value < 0.05), which suggests age is an important predictor of wine price. For every one year increase in age, with rainfall and temperature held constant, the selling price of a bottle of wine increases by about £0.81.
• β2 = −0.0004 is negative, which indicates a negative relationship between rainfall and price (i.e. generally, wines from higher rainfall regions are cheaper). However, it is not statistically significant (p-value > 0.05), so there is no clear evidence that rainfall affects price.
• β3 = 1.55 is positive, which indicates a positive relationship between temperature and price (i.e. generally, wines from warmer regions are more expensive). It is statistically significant (p-value < 0.05), which suggests temperature is an important predictor of wine price. For every one degree increase in temperature, with age and rainfall held constant, the selling price of a bottle of wine increases by about £1.55.
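
To show roughly how software estimates the four β’s, the sketch below fits the multiple regression by ordinary least squares using only the Python standard library. The helper names `solve` and `fit_multiple` are invented for this example, and solving the normal equations directly is just one simple way to do it (statistical packages generally use more numerically robust methods). The printed coefficients can be compared with the estimated equation quoted in the notes; p-values are not computed here.

```python
# Full wine dataset from the notes.
price = [4.50, 12.95, 6.50, 4.99, 7.50, 14.95, 8.25, 3.95, 18.99, 10.00]
age = [3.5, 5, 3, 2.5, 3, 2, 2.5, 1, 10, 4]
rainfall = [126, 121, 125, 106, 107, 112, 124, 105, 116, 108]
temp = [16, 20, 17, 18, 18, 22, 19, 15, 21, 20]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple(columns, y):
    """Least-squares coefficients [b0, b1, ...] from (X'X) b = X'y."""
    # Each row is [1, x1, x2, ...]; the leading 1 gives the intercept.
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


b0, b1, b2, b3 = fit_multiple([age, rainfall, temp], price)
print(f"Y = {b0:.2f} + {b1:.2f} X1 + {b2:.4f} X2 + {b3:.2f} X3")
```

Each coefficient is estimated with the other variables present in the model, which is why the age coefficient here differs from the slope obtained when age was the only explanatory variable.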
What about the rest of the output?
The Excel output (below) also gives R-Square = 0.91 (or 91%).

R² measures the percentage of variability in the Y data that is explained by the explanatory variable(s) X.
• If all our data lie on a straight line, X tells us everything about Y, with no deviations from the line, and so R² = 100%.
• The closer R² is to 100%, the better!
Here, we see that about 91% of the variation in wine price is explained by the age of the wine, rainfall, and temperature. The rest of the variation (9%) may be explained by other factors.
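
As a small worked illustration of the definition, the sketch below computes R² as one minus the ratio of unexplained variation to total variation. The function name `r_squared` is made up for this example, and it is applied to the simple age-only line Y = 3.905 + 1.467X from earlier in the notes rather than to the three-variable model.

```python
# Wine data (age in years, price in pounds), as listed in the notes.
age = [3.5, 5, 3, 2.5, 3, 2, 2.5, 1, 10, 4]
price = [4.50, 12.95, 6.50, 4.99, 7.50, 14.95, 8.25, 3.95, 18.99, 10.00]


def r_squared(y, fitted):
    """Share of the variation in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1 - ss_resid / ss_total


# Fitted prices from the age-only regression line quoted in the notes.
fitted = [3.905 + 1.467 * a for a in age]
print(f"R-squared (age only) = {r_squared(price, fitted):.2f}")  # about 0.54
```

On this calculation, age alone explains only about 54% of the variation in price, compared with the 91% reported once rainfall and temperature are included.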


Now apply your knowledge of regression analysis to the seminar case studies:
• Part 1: Deciding where to locate a business
• Part 2: Predicting the cost of running a business
