
Ds Lab Assignment 4

The document outlines an experiment on regression analysis conducted by a student at Vishwakarma Institute of Technology, focusing on constructing simple and multiple linear regression models using a Toy Sales dataset in R. The results indicate that the R-squared value for simple linear regression is 0.619, while for multiple linear regression it is 0.8588, suggesting that multiple variables provide a better fit. The conclusion emphasizes the importance of checking assumptions in regression analysis and identifies scenario a as the optimal choice for maximizing unit sales.

Uploaded by

Swaroop Deokar

Bansilal Ramnath Agarwal Charitable Trust’s

VISHWAKARMA INSTITUTE OF TECHNOLOGY – PUNE


Department of Multidisciplinary Engineering

MD2201: Data Science


Name of the student: Swaroop Deokar Roll No. 16

Div: CS-AIML-A Batch: 1

Date of performance: 17/08/2023

Experiment No.4

Title: Regression

Aim: i. To construct a simple linear regression model.
ii. To construct a multiple linear regression model.

Software used: Programming language R.

Data Set: Toy Sales Dataset

Code Statement:
1. Simple Linear Regression
i. Consider the Toy Sales data set.
ii. Apply a simple linear model with Unit sales as the response and Price as the
explanatory variable.
iii. Plot the scatter plot and draw the regression line.
iv. What are the values of R-squared and residual standard error? (Write in conclusion)
v. Display all predicted values from the designed model and the corresponding values of
error.
2. Multiple Linear Regression
i. Consider the Toy Sales data set.
ii. Consider all variables to fit the regression model.
iii. Compare the R-squared of SLR with that of MLR. (Write in conclusion)
iv. Which of the variables is most significant? Why? (Write in conclusion)
v. Can you reject the null hypothesis for the promotion expenditure variable? (Write in conclusion)
vi. Which of the following scenarios would you select to get the maximum
number of Unit sales? (Write in conclusion)
a. Price = $9.1, Adexp = $52,000, Promexp = $61,000
b. Price = $8.1, Adexp = $50,000, Promexp = $60,000

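Code:
# SLR ----
# Load the Toy Sales data set.
f1 <- read.csv("Toy_sales_csv.csv")
# print(f1)
# Fit unit sales on price and print the model summary.
l1 <- lm(Unitsales ~ Price, data = f1)
s1 <- summary(l1)
print(s1)
library(ggplot2)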

p <- ggplot(f1, aes(Price, Unitsales)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, colour = "red", se = FALSE)
print(p)

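# Predicted (fitted) values for the observations used to build the model.
pred1 <- predict(l1)
cat("\nPredicted value\n", pred1)
# Errors (residuals): observed minus predicted unit sales.
err <- f1$Unitsales - pred1
cat("\n\nErrors", err)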

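# MLR ----
# Fit unit sales on all three explanatory variables.
l2 <- lm(Unitsales ~ Price + Adexp + Promexp, data = f1)
s2 <- summary(l2)
print(s2)

# Scenarios a and b. Adexp and Promexp are entered in thousands of dollars
# (52 = $52,000), which assumes the dataset stores them in that unit.
df <- data.frame(Price = c(9.1, 8.1), Adexp = c(52, 50), Promexp = c(61, 60))
pred2 <- predict(l2, df)
cat("\nPredicted value\n", pred2)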
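
The R-squared and residual standard error asked for in the code statement can also be read directly from the summary object instead of the printed output. The sketch below is a minimal illustration under stated assumptions: the data frame `toy` is a hypothetical stand-in with made-up values (the real Toy_sales_csv.csv is not reproduced here), and the names `fit`, `results`, `r_squared` and `rse` are illustrative.

```r
# Hypothetical stand-in data; these values are NOT from Toy_sales_csv.csv.
toy <- data.frame(
  Price     = c(9.1, 8.1, 8.5, 9.5, 7.9, 8.8, 9.9, 7.5),
  Unitsales = c(70, 85, 80, 62, 90, 74, 55, 96)
)

# Simple linear regression of unit sales on price.
fit <- lm(Unitsales ~ Price, data = toy)
fit_summary <- summary(fit)

# R-squared and residual standard error are stored in the summary object.
r_squared <- fit_summary$r.squared
rse <- fit_summary$sigma

# Predicted values next to the observed values and their errors.
results <- data.frame(
  Price     = toy$Price,
  Actual    = toy$Unitsales,
  Predicted = fitted(fit),
  Error     = residuals(fit)
)

print(results)
cat("R-squared:", r_squared, "\nResidual standard error:", rse, "\n")
```

Putting predictions and errors in one data frame keeps each error on the same row as the observation it belongs to, which is easier to read than two separate `cat()` outputs.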

Results:

Conclusion: Constructing a simple linear regression model starts with checking that the scatterplot shows a
linear pattern and that the correlation between the independent and dependent variables is statistically
significant. The assumptions of linear regression should be checked: linearity, independence of
observations, normality of the residuals, and homogeneity of variance. A multiple linear regression model
analyzes the relationship between a dependent variable and two or more independent variables, and the
same assumptions and limitations apply. When these are respected and the results are interpreted
appropriately, linear regression is a useful tool for predicting trends and estimating values of
variables.
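
As a rough illustration of how the residual-based assumptions might be checked in R, the sketch below runs a Shapiro-Wilk normality test on the residuals. The data frame `toy` is a hypothetical stand-in with made-up values, and this check is an assumption about how one could proceed, not a step that was part of the experiment.

```r
# Hypothetical stand-in data; these values are NOT from Toy_sales_csv.csv.
toy <- data.frame(
  Price     = c(9.1, 8.1, 8.5, 9.5, 7.9, 8.8, 9.9, 7.5),
  Unitsales = c(70, 85, 80, 62, 90, 74, 55, 96)
)
fit <- lm(Unitsales ~ Price, data = toy)

# Residuals of a least-squares fit with an intercept average to zero.
res <- residuals(fit)

# Shapiro-Wilk test: a small p-value is evidence against normal residuals.
normality <- shapiro.test(res)
print(normality)
```

Calling `plot(fit, which = 1:2)` additionally draws the residuals-versus-fitted plot (linearity and homogeneity of variance) and the normal Q-Q plot (normality), which are usually inspected alongside the test.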

1) Simple Linear Regression


The value of R-squared for SLR is 0.619.
The residual standard error is 1997.
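
For reference, both quantities follow from the residuals of the fitted line. The symbols here are assumptions introduced for illustration: $y_i$ is the observed unit sales, $\hat{y}_i$ the predicted value, $\bar{y}$ the mean of the observed values, and $n$ the number of observations; the $n - 2$ applies to a simple linear regression with an intercept.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\text{RSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}
```

An R-squared of 0.619 therefore means that about 62% of the variation in unit sales is explained by price alone.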

2) Multiple Linear Regression


a) The value of R-Squared for SLR is 0.619 while that for MLR is 0.8588

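b) The most significant variable is the predictor with the smallest p-value in the coefficient table of
the MLR summary. The multiple R-squared is not itself a variable; its higher value (0.8588 against 0.619)
means that a larger share of the variation in the dependent variable is explained by the independent variables.
c) Yes, the null hypothesis for the promotion expenditure variable is rejected, as its p-value is less than 0.05.
d) Scenario a (Price = $9.1, Adexp = $52,000, Promexp = $61,000) will be selected to get the maximum unit sales.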
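
The answers to the MLR questions can be derived programmatically from the model, as the sketch below suggests. It is a minimal illustration under stated assumptions: `toy` is a hypothetical data frame with made-up values, so its p-values and predictions say nothing about the real Toy Sales data, and the names `predictors`, `most_significant`, `reject_null`, `scenarios` and `best` are illustrative.

```r
# Hypothetical stand-in data; these values are NOT from Toy_sales_csv.csv.
toy <- data.frame(
  Price     = c(9.1, 8.1, 8.5, 9.5, 7.9, 8.8, 9.9, 7.5, 8.3, 9.3),
  Adexp     = c(52, 50, 48, 55, 47, 53, 49, 51, 54, 46),
  Promexp   = c(61, 60, 58, 63, 57, 62, 59, 64, 56, 65),
  Unitsales = c(72, 86, 79, 66, 88, 77, 54, 98, 81, 69)
)
fit <- lm(Unitsales ~ Price + Adexp + Promexp, data = toy)

# p-values of the t-tests on each coefficient, without the intercept.
p_values <- coef(summary(fit))[, "Pr(>|t|)"]
predictors <- p_values[names(p_values) != "(Intercept)"]

# The predictor with the smallest p-value is the most significant one.
most_significant <- names(which.min(predictors))

# Reject H0 (coefficient equals zero) for Promexp at the 5% level?
reject_null <- unname(predictors["Promexp"] < 0.05)

# Compare the two scenarios by their predicted unit sales.
scenarios <- data.frame(
  Price = c(9.1, 8.1), Adexp = c(52, 50), Promexp = c(61, 60),
  row.names = c("a", "b")
)
predicted <- predict(fit, newdata = scenarios)
best <- names(which.max(predicted))

print(predictors)
cat("Most significant:", most_significant,
    "\nReject H0 for Promexp:", reject_null,
    "\nBest scenario:", best, "\n")
```

Selecting the scenario with `which.max()` makes the choice follow from the predictions rather than from reading the two numbers by eye.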
