Regression Diagnostics Guide

Basic Regression Analysis 3

Uploaded by

Abhorn

Regression diagnostics

 Diagnostic plots
 Regression diagnostic plots can be created with the
base R function plot() or with the autoplot() function
from the ggfortify package, which produces ggplot2-based
graphics.
 Create the diagnostic plots with the base R function:
 par(mfrow = c(2, 2))
 plot(model)
 For the second option, use autoplot() from ggfortify:
 library(ggfortify)
 autoplot(model)

H.M.F 10
 The diagnostic plots show residuals in four different ways:
 Residuals vs Fitted. Used to check the linearity assumption. A roughly
horizontal line with no distinct pattern indicates a linear relationship,
which is good.

 Normal Q-Q. Used to examine whether the residuals are normally distributed.
It is good if the residual points follow the straight dashed line.

 Scale-Location (or Spread-Location). Used to check the homogeneity of
variance of the residuals (homoscedasticity). A horizontal line with equally
spread points is a good indication of homoscedasticity.

 Residuals vs Leverage. Used to identify influential cases, that is, extreme
values that might influence the regression results when included in or excluded
from the analysis. This plot is described further in the next sections.
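
The quantities behind the first three panels can also be extracted by hand, which may help when reading the plots. The following is a minimal sketch, not the model used in these slides: the built-in mtcars data and the formula mpg ~ wt + hp are assumptions chosen only so that the code runs on its own.

```r
# Assumption: mtcars and this formula are stand-ins; the slides do not
# show how `model` was fitted.
model <- lm(mpg ~ wt + hp, data = mtcars)

fitted_vals <- fitted(model)     # fitted values (x-axis of panels 1 and 3)
std_res     <- rstandard(model)  # standardized residuals

par(mfrow = c(1, 3))

# Residuals vs Fitted: look for a flat, patternless band around zero
plot(fitted_vals, residuals(model),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal Q-Q: points should lie close to the reference line
qqnorm(std_res)
qqline(std_res, lty = 2)

# Scale-Location: sqrt(|standardized residual|) should be evenly spread
plot(fitted_vals, sqrt(abs(std_res)),
     xlab = "Fitted values", ylab = "sqrt(|standardized residuals|)")
```

These are roughly the same quantities that plot(model) draws, without the smoothing lines and point labels that the built-in panels add.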

Outliers and high leverage points
 Outliers:
 An outlier is a point that has an extreme outcome variable value. The presence of
outliers may affect the interpretation of the model, because it increases the
residual standard error (RSE).

 Outliers can be identified by examining the standardized residual (or studentized
residual), which is the residual divided by its estimated standard error.

 Observations whose standardized residuals are greater than 3 in absolute value are
possible outliers (James et al. 2014).
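
The rule of thumb above can be applied directly with rstandard(). This is a minimal sketch under assumptions: the mtcars data and the formula are illustrative stand-ins for the model in these slides.

```r
# Assumption: mtcars and this formula are illustrative only.
model <- lm(mpg ~ wt + hp, data = mtcars)

# Standardized residuals: residual / (sigma * sqrt(1 - leverage))
std_res <- rstandard(model)

# The same quantity computed by hand, to make the definition explicit
by_hand <- residuals(model) / (sigma(model) * sqrt(1 - hatvalues(model)))

# Flag observations with |standardized residual| > 3 as possible outliers
possible_outliers <- names(std_res)[abs(std_res) > 3]
print(possible_outliers)
```

An empty result simply means that no observation exceeds the threshold for this particular fit.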

 High leverage points:
 A data point has high leverage if it has extreme predictor (x) values. This can be
detected by examining the leverage statistic, also called the hat-value. A value of
this statistic above 2(p + 1)/n indicates an observation with high leverage
(P. Bruce and Bruce 2017), where p is the number of predictors and n is the number
of observations.
 Outliers and high leverage points can be identified by inspecting the Residuals vs
Leverage plot:
 plot(model, 5)
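
The leverage threshold 2(p + 1)/n can also be checked numerically with hatvalues(). This is a sketch under assumptions: mtcars and the formula below are illustrative and are not the data behind the slides.

```r
# Assumption: mtcars and this formula are illustrative only.
model <- lm(mpg ~ wt + hp, data = mtcars)

h <- hatvalues(model)          # leverage (hat-value) of each observation
p <- length(coef(model)) - 1   # number of predictors (intercept excluded)
n <- nobs(model)               # number of observations

threshold <- 2 * (p + 1) / n   # rule of thumb quoted in the text

# Names of the observations whose leverage exceeds the threshold
high_leverage <- names(h)[h > threshold]
print(high_leverage)
```

A useful sanity check is that the hat-values sum to p + 1, so the threshold is twice the average leverage.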
Influential values
 An influential value is a value whose inclusion or exclusion can alter the
results of the regression analysis. Such a value is associated with a large
residual.
 Not all outliers (or extreme data points) are influential in linear regression
analysis.
 Statisticians have developed a metric called Cook’s distance to determine
the influence of a value. A rule of thumb is that an observation has high
influence if Cook’s distance exceeds 4/(n - p - 1) (P. Bruce and Bruce 2017),
where n is the number of observations and p the number of predictor
variables.
 The Residuals vs Leverage plot can help us to find influential observations,
if any.
 On this plot, outlying values are generally located at the upper right corner
or at the lower right corner. Those are the regions where data points
can be influential with respect to the regression line.
 The following plots illustrate the Cook’s distance and the leverage of our
model:
 summary(influence.measures(model))
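
The Cook's distance rule of thumb can be applied with cooks.distance(). This is a minimal sketch under assumptions: mtcars and the formula are illustrative stand-ins, and the cutoff is the 4/(n - p - 1) rule quoted above.

```r
# Assumption: mtcars and this formula are illustrative only.
model <- lm(mpg ~ wt + hp, data = mtcars)

cd <- cooks.distance(model)    # Cook's distance of each observation
n  <- nobs(model)              # number of observations
p  <- length(coef(model)) - 1  # number of predictors (intercept excluded)

cutoff <- 4 / (n - p - 1)      # rule of thumb from the text

# Observations whose Cook's distance exceeds the cutoff, largest first
influential <- sort(cd[cd > cutoff], decreasing = TRUE)
print(influential)

# Cook's distance combines residual size and leverage:
# D = r^2 * h / ((1 - h) * (p + 1)), with r the standardized residual
h <- hatvalues(model)
by_hand <- rstandard(model)^2 * h / ((1 - h) * (p + 1))
```

The hand computation shows why an observation needs both a sizeable residual and noticeable leverage to reach a large Cook's distance.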
Modelling cycle
This leads us into a modelling cycle:
 Fit
 Examine residuals
 Transform data or change model if necessary

This cycle is repeated until we are “happy” with the fitted model.
Diagrammatically ...
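
One pass through the cycle might look like the sketch below. Everything specific here is an assumption made for illustration: the mtcars data, the formula, and the choice of a log transform of the outcome are not taken from the slides.

```r
# Assumption: mtcars, the formula and the log transform are illustrative only.

# Step 1: fit an initial model
fit_raw <- lm(mpg ~ hp, data = mtcars)

# Step 2: examine the residuals (Residuals vs Fitted panel)
par(mfrow = c(1, 2))
plot(fit_raw, which = 1)

# Step 3: transform the data (here the outcome) and refit
fit_log <- lm(log(mpg) ~ hp, data = mtcars)

# Back to step 2: examine the residuals of the new model
plot(fit_log, which = 1)
```

Whether the transformed model is an improvement has to be judged from the residual plots; if it is not, another transformation or a different model is tried and the cycle repeats.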
