Data Mining & Predictive Modeling
Lab Assignment 02: Linear Regression Model for Toyota Used-Car Prices
● Read the dataset "ToyotaCorolla.csv" that is provided to you.
● Build a suitable linear regression model using R.
● Analyze the predicted values of the response variable.
● Compute the residuals and plot the residual values.
● Compute suitable metrics to assess the accuracy of your regression model.
● Save your R command file.
● Save your code and results in a Word file and submit it on the VOLP.
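The prediction, residual, and accuracy steps above can be sketched as follows. This is only an illustration under stated assumptions: the data frame `cars` is simulated (it is not ToyotaCorolla.csv), only two predictors are used, and the names `fit`, `pred`, `res`, `rmse`, `mae`, and `r2` are chosen here for convenience. RMSE, MAE, and R-squared are common choices of metric, not the only acceptable ones.

```r
# Simulated stand-in data (NOT the real ToyotaCorolla.csv), so the sketch runs on its own
set.seed(3)
n <- 100
cars <- data.frame(Age = sample(1:80, n, replace = TRUE),
                   KM  = round(runif(n, 1000, 200000)))
# Made-up price rule plus noise, only so that lm() has something to fit
cars$Price <- 20000 - 120 * cars$Age - 0.02 * cars$KM + rnorm(n, sd = 1000)

fit  <- lm(Price ~ Age + KM, data = cars)
pred <- predict(fit, newdata = cars)   # predicted values of the response
res  <- cars$Price - pred              # residuals = observed - predicted

# Residual plot: points should scatter around zero with no clear pattern
plot(pred, res, xlab = "Predicted price", ylab = "Residual")
abline(h = 0, lty = 2)

# Accuracy metrics
rmse <- sqrt(mean(res^2))              # root mean squared error
mae  <- mean(abs(res))                 # mean absolute error
r2   <- 1 - sum(res^2) / sum((cars$Price - mean(cars$Price))^2)
c(RMSE = rmse, MAE = mae, R2 = r2)
```

For the lab itself, replace the simulated `cars` with the data read from ToyotaCorolla.csv and use the full set of predictors.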
Notes
1. The dataset includes sale prices and vehicle characteristics of used Toyota Corollas. The
objective here is to predict the sale price of a used automobile.
2. Variable Description
a. Price - Offer price in euros
b. Age - Age in months
c. KM - Accumulated kilometers on odometer
d. FuelType - Fuel type (petrol, diesel, CNG)
e. HP - Horsepower
f. MetColor - Metallic color (Yes=1, No=0)
g. Automatic - Automatic (Yes=1, No=0)
h. CC - Cylinder volume in cubic centimeters
i. Doors - Number of doors
j. Weight - Weight in kilograms
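A quick way to check that the columns match the descriptions above is to read the file and inspect it. The sketch below writes a few made-up rows to a temporary file so that it runs without the course data; the rows, the temporary file, and the name `cars` are assumptions for illustration only. In the lab, pass "ToyotaCorolla.csv" to read.csv() instead.

```r
# Made-up rows using the column names listed above (NOT real data)
csv_text <- "Price,Age,KM,FuelType,HP,MetColor,Automatic,CC,Doors,Weight
12000,24,45000,Diesel,90,1,0,2000,3,1160
15500,18,30000,Petrol,110,0,1,1600,5,1100
9000,60,120000,Petrol,86,1,0,1300,4,1015
8500,66,98000,CNG,110,0,0,1600,4,1120"
tmp <- tempfile(fileext = ".csv")
writeLines(csv_text, tmp)

cars <- read.csv(tmp, stringsAsFactors = FALSE)
cars$FuelType <- factor(cars$FuelType)  # treat fuel type as categorical

str(cars)              # column names and types
summary(cars$Price)    # distribution of the response
table(cars$FuelType)   # number of cars per fuel type
```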
3. We use price as the response and the following as the explanatory variables (predictors):
age (in months), accumulated kilometers on the odometer, fuel type (petrol, diesel, or
compressed natural gas, CNG), horsepower, color (metallic = 1, otherwise 0), transmission
(automatic = 1, otherwise 0), cylinder volume (in cubic centimeters), number of doors, and
weight (in kilograms).
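The model in note 3 can be written as an R formula with Price on the left and the nine predictors on the right. The sketch below is one possible way to do this; it uses simulated data with the same column names (not the real file), and the price rule, the values sampled for each column, and the names `cars` and `fit` are assumptions made only so that the code runs.

```r
# Simulated stand-in data (NOT the real ToyotaCorolla.csv)
set.seed(1)
n <- 200
cars <- data.frame(
  Age       = sample(1:80, n, replace = TRUE),
  KM        = round(runif(n, 1000, 200000)),
  FuelType  = factor(sample(c("Petrol", "Diesel", "CNG"), n, replace = TRUE)),
  HP        = sample(c(69, 86, 97, 110), n, replace = TRUE),
  MetColor  = rbinom(n, 1, 0.6),
  Automatic = rbinom(n, 1, 0.1),
  CC        = sample(c(1300, 1400, 1600, 2000), n, replace = TRUE),
  Doors     = sample(3:5, n, replace = TRUE),
  Weight    = round(rnorm(n, 1070, 50))
)
# Made-up price rule plus noise, only so that lm() has something to fit
cars$Price <- 20000 - 120 * cars$Age - 0.02 * cars$KM + rnorm(n, sd = 1000)

# Response on the left of ~, predictors on the right
fit <- lm(Price ~ Age + KM + FuelType + HP + MetColor + Automatic +
            CC + Doors + Weight, data = cars)
summary(fit)   # coefficients, standard errors, R-squared
```

Because FuelType is a factor with three levels, R replaces it with two indicator (dummy) variables and treats the first level in alphabetical order (here CNG) as the baseline, so the fitted model has 11 coefficients including the intercept.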
4. The R function lm is used to fit linear (regression) models. The syntax for this command is
given below:
lm(formula, data, subset, weights, na.action, method = "qr", model = TRUE,
   x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL, offset, ...)
Type help(lm) at the console prompt in RStudio for detailed help on the lm command.
In the help page, also find the passage that begins “An object of class "lm" is a list
containing at least the following components”; it lists the components of an lm object
that you can retrieve.
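As a minimal sketch of retrieving those components, the example below fits a one-predictor model to simulated data and pulls items out of the fitted object with the $ operator. The data and the names `cars` and `fit` are assumptions for illustration; the component names are those listed on the lm help page.

```r
# Simulated stand-in data (NOT the real ToyotaCorolla.csv)
set.seed(2)
n <- 50
cars <- data.frame(Age = sample(1:80, n, replace = TRUE))
cars$Price <- 20000 - 150 * cars$Age + rnorm(n, sd = 800)

fit <- lm(Price ~ Age, data = cars)

names(fit)                # all components stored in the lm object
fit$coefficients          # same values as coef(fit)
head(fit$fitted.values)   # same values as fitted(fit)
head(fit$residuals)       # same values as residuals(fit)
fit$df.residual           # residual degrees of freedom (n minus 2 here)
```

The extractor functions coef(), fitted(), and residuals() return the same values and are usually preferred over direct $ access.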