10/20/24, 12:23 AM expt3
In [1]: # EXPERIMENT -3 :: Multiple Linear Regression
# Using multiple linear regression perform the following tasks on the Autodata s
# (a) Produce a scatter plot matrix which includes all of the variable sin the d
# (b) Compute the matrix of correlations between the variables using the f
# You will need to exclude the name variable, cor() which is qualitat
# (c) Use the lm() function to perform a multiple linear regression with m
# and all other variables except name as the predictors. Use the summ
# print the results.
# Comment on the output, That is
# i. Is there a relationship between the predictors and the response?
# ii. Which predictors appear to have a statistically
# significant relationship to the response?
# iii. What does the coefficient for the year variable suggest?
# (d) Use the plot() function to produce diagnostic plots of the linear regres
# Comment on any problems you see with the fit.
# Do the residual plots suggest any unusually large outliers?
# Does the leverage plot identify any observations with unusually high leverage?
# (e) Use the * and : symbols to fit linear regression models with interaction
# any interactions appear to be statistically significant?
# (f) Try a few different transformations of the variables, such aslog(X),√ X,
# Comment on your findings
#This question involves the use of multiple linear regression on the Auto data s
library(ISLR)
library(MASS)
data("Auto")
typeof(data)
head(Auto)
#Produce a scatterplot matrix which includes all of the variables in the data se
pairs(Auto)
#Compute the matrix of correlations between the variables using the function cor
Auto$name<-NULL
cor(Auto,method = c("pearson"))
lm.fit<-lm(mpg~.,data=Auto)
summary(lm.fit)
which.max(hatvalues(lm.fit))
## 14
## 14
par(mfrow = c(2,2))
plot(lm.fit)
localhost:8888/lab/tree/expt3/expt3.ipynb 1/5
10/20/24, 12:23 AM expt3
#Use the plot() function to produce diagnostic plots of the linear regression fi
#Comment on any problems you see with the fit.
lm.fit = lm(mpg ~.-name+displacement:weight, data = Auto)
summary(lm.fit)
#Use the * and : symbols to fit linear regression models with interaction effect
#Do any interactions appear to be statistically significant?
lm.fit = lm(mpg ~.-name+I((displacement)^2)+log(displacement)+displacement:weigh
summary(lm.fit)
#Try a few different transformations of the variables, such as log(X),√X, X2. Co
lm.fit = lm(mpg ~.-name+I((displacement)^2)+log(displacement)+displacement:weigh
summary(lm.fit)
Warning message:
"package 'ISLR' was built under R version 4.4.1"
'closure'
A data.frame: 6 × 9
mpg cylinders displacement horsepower weight acceleration year origin
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
che
1 18 8 307 130 3504 12.0 70 1 ch
m
2 15 8 350 165 3693 11.5 70 1 s
plym
3 18 8 318 150 3436 11.0 70 1
sa
4 16 8 304 150 3433 12.0 70 1
re
5 17 8 302 140 3449 10.5 70 1
6 15 8 429 198 4341 10.0 70 1 g
localhost:8888/lab/tree/expt3/expt3.ipynb 2/5
10/20/24, 12:23 AM expt3
A matrix: 8 × 8 of type dbl
mpg cylinders displacement horsepower weight acceleration
mpg 1.0000000 -0.7776175 -0.8051269 -0.7784268 -0.8322442 0.4233285
cylinders -0.7776175 1.0000000 0.9508233 0.8429834 0.8975273 -0.5046834
displacement -0.8051269 0.9508233 1.0000000 0.8972570 0.9329944 -0.5438005
horsepower -0.7784268 0.8429834 0.8972570 1.0000000 0.8645377 -0.6891955
weight -0.8322442 0.8975273 0.9329944 0.8645377 1.0000000 -0.4168392
acceleration 0.4233285 -0.5046834 -0.5438005 -0.6891955 -0.4168392 1.0000000
year 0.5805410 -0.3456474 -0.3698552 -0.4163615 -0.3091199 0.2903161
origin 0.5652088 -0.5689316 -0.6145351 -0.4551715 -0.5850054 0.2127458
Call:
lm(formula = mpg ~ ., data = Auto)
Residuals:
Min 1Q Median 3Q Max
-9.5903 -2.1565 -0.1169 1.8690 13.0604
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -17.218435 4.644294 -3.707 0.00024 ***
cylinders -0.493376 0.323282 -1.526 0.12780
displacement 0.019896 0.007515 2.647 0.00844 **
horsepower -0.016951 0.013787 -1.230 0.21963
weight -0.006474 0.000652 -9.929 < 2e-16 ***
acceleration 0.080576 0.098845 0.815 0.41548
year 0.750773 0.050973 14.729 < 2e-16 ***
origin 1.426141 0.278136 5.127 4.67e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.328 on 384 degrees of freedom
Multiple R-squared: 0.8215, Adjusted R-squared: 0.8182
F-statistic: 252.4 on 7 and 384 DF, p-value: < 2.2e-16
14: 14
localhost:8888/lab/tree/expt3/expt3.ipynb 3/5
10/20/24, 12:23 AM expt3
Warning message in terms.formula(formula, data = data):
"'varlist' has changed (from nvar=8) to new 9 after EncodeVars() -- should no lon
ger happen!"
Error in eval(predvars, data, env): object 'name' not found
Traceback:
1. lm(mpg ~ . - name + displacement:weight, data = Auto)
2. eval(mf, parent.frame())
3. eval(mf, parent.frame())
4. stats::model.frame(formula = mpg ~ . - name + displacement:weight,
. data = Auto, drop.unused.levels = TRUE)
5. model.frame.default(formula = mpg ~ . - name + displacement:weight,
. data = Auto, drop.unused.levels = TRUE)
6. eval(predvars, data, env)
7. eval(predvars, data, env)
localhost:8888/lab/tree/expt3/expt3.ipynb 4/5
10/20/24, 12:23 AM expt3
In [ ]:
localhost:8888/lab/tree/expt3/expt3.ipynb 5/5