Exp 2 (Multiple Linear Regression)

The document discusses the application of multiple linear regression (MLR) to analyze datasets, particularly focusing on the Boston Housing dataset. It outlines the theory behind MLR, its limitations, and applications, followed by a code implementation for model training and evaluation. The results indicate that while both models perform well, the Boston Housing model demonstrates superior prediction accuracy based on mean squared error (MSE).

Uploaded by

piyushdohare143


Aim:

To perform multiple linear regression on multiple datasets, compare the results, and determine
which dataset yields the better model.

Theory:

Multiple Linear Regression: Theory and Understanding

Multiple linear regression (MLR) is a statistical technique used to model the relationship between a
single dependent variable (what you want to predict) and multiple independent variables (features
that influence the dependent variable). It assumes a linear relationship between these variables and
builds a linear equation to capture this relationship.

Key Concepts:

Equation:

y_hat = β₀ + β₁x₁ + β₂x₂ + ... + β_p x_p

• y_hat is the predicted value of the dependent variable.

• β₀ is the intercept term (the predicted value when all independent variables are zero).

• β_i is the coefficient of the independent variable x_i, for i = 1, ..., p.

• p is the number of independent variables.
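
The equation can be illustrated with a small NumPy sketch. The data and the coefficient values (2.0, 3.0, -1.0) are made up for illustration and are not taken from either dataset in this experiment. The sketch estimates the coefficients by ordinary least squares and then evaluates y_hat for one new observation.

```python
import numpy as np

# Hypothetical data: 50 observations, p = 2 independent variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))

# Assumed "true" coefficients, chosen only for this illustration.
beta_true = np.array([2.0, 3.0, -1.0])  # [b0, b1, b2]
y = beta_true[0] + X @ beta_true[1:]    # noise-free, so the fit is exact

# Prepend a column of ones so the intercept b0 is estimated with the rest.
X_design = np.column_stack([np.ones(len(X)), X])
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# y_hat = b0 + b1*x1 + b2*x2 for a new observation with x1 = 1.0, x2 = 2.0
x_new = np.array([1.0, 2.0])
y_hat = beta_hat[0] + x_new @ beta_hat[1:]

print("Estimated coefficients:", beta_hat)
print("Prediction for x_new:", y_hat)
```

Because the synthetic target has no noise, the estimated coefficients match the assumed ones up to floating-point error, and the prediction is approximately 2.0 + 3.0(1.0) - 1.0(2.0) = 3.0.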

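Limitations of MLR:

• Cannot capture non-linear relationships unless the features are transformed first (for example, with polynomial terms).

• Sensitive to its assumptions (linearity, independent errors, constant error variance, and low multicollinearity among the features); violating them can lead to inaccurate results.

• Cannot establish causation; only identifies correlations.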
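
The first limitation can be seen in a minimal sketch, assuming a made-up target that depends only on the square of a single feature. A straight-line fit explains almost none of the variance, while adding the squared feature as an extra column (the same idea as polynomial features) recovers the relationship. The helper names r_squared and fit_predict are hypothetical, introduced only for this example.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_predict(X_design, y):
    """Ordinary least squares fit followed by in-sample prediction."""
    beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return X_design @ beta

# Hypothetical data with a purely quadratic relationship.
x = np.linspace(-3.0, 3.0, 61)
y = x ** 2

ones = np.ones_like(x)
linear_design = np.column_stack([ones, x])             # b0 + b1*x
quadratic_design = np.column_stack([ones, x, x ** 2])  # b0 + b1*x + b2*x^2

r2_linear = r_squared(y, fit_predict(linear_design, y))
r2_quadratic = r_squared(y, fit_predict(quadratic_design, y))

print("R2 with linear term only:", r2_linear)
print("R2 with squared term added:", r2_quadratic)
```

The model stays linear in its coefficients in both cases; only the feature set changes.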

Applications of MLR:

• Predicting house prices based on features like size, location, and amenities.

• Understanding how factors like age, income, and education affect job satisfaction.

• Analysing the impact of advertising campaigns on sales.

Code:

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LinearRegression

from sklearn.preprocessing import StandardScaler, PolynomialFeatures

from sklearn.feature_selection import SelectFromModel

from sklearn.ensemble import RandomForestRegressor

from sklearn.metrics import mean_squared_error, r2_score

# Load the dataset (replace 'boston.csv' with the path to your own file)

df = pd.read_csv('boston.csv')

# Handle outliers using IQR (adjust based on your data's characteristics)

numeric_cols = df.select_dtypes(include=[np.number]).columns

Q1 = df[numeric_cols].quantile(0.25)

Q3 = df[numeric_cols].quantile(0.75)

IQR = Q3 - Q1

df = df[~((df[numeric_cols] < (Q1 - 1.5 * IQR)) | (df[numeric_cols] > (Q3 + 1.5 * IQR))).any(axis=1)]

# Extract features and target variable (using the provided column names)

X = df.drop(['TOWN', 'TRACT', 'LON', 'LAT', 'MEDV'], axis=1)

y = df['MEDV']

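# Train-test split (split first so the scaler never sees the test data)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Feature scaling (fit on the training set only, then apply to both sets)

scaler = StandardScaler()

X_train = scaler.fit_transform(X_train)

X_test = scaler.transform(X_test)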

# Feature selection (experiment with different thresholds and methods)

rf_model = RandomForestRegressor(random_state=0)

rf_model.fit(X_train, y_train)

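# prefit=True tells the selector that rf_model is already fitted, so transform() can be called directly
sfm = SelectFromModel(rf_model, threshold=0.1, prefit=True) # Adjust threshold if needed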

X_train = sfm.transform(X_train)

X_test = sfm.transform(X_test)

# Polynomial features (consider different degrees)

poly = PolynomialFeatures(degree=2, include_bias=False) # Adjust degree if needed

X_train_poly = poly.fit_transform(X_train)

X_test_poly = poly.transform(X_test)

# Model fitting

regressor = LinearRegression()

regressor.fit(X_train_poly, y_train)

# Evaluation

y_pred = regressor.predict(X_test_poly)

mse = mean_squared_error(y_test, y_pred)

r2 = r2_score(y_test, y_pred)

print('Train Score: ', regressor.score(X_train_poly, y_train))

print('Test Score: ', regressor.score(X_test_poly, y_test))

print('Mean Squared Error (MSE): ', mse)

print('R-squared (R2): ', r2)

# Visualization (optional)

plt.scatter(y_test, y_pred)

plt.xlabel("Actual Medv")

plt.ylabel("Predicted Medv")

plt.title("Actual Medv vs Predicted Medv")

plt.show()
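
One way to keep every preprocessing step from seeing the test data is to wrap the steps in a scikit-learn Pipeline. The sketch below is not the code that produced the results in this report. It runs on synthetic data (an assumption, since boston.csv is not reproduced here) and uses threshold="mean" rather than 0.1 so that at least one feature is always kept.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for the housing data (assumption: not the real dataset).
# Only the first two of the five features influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Each step is fitted on the training data only when model.fit() is called.
model = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectFromModel(
        RandomForestRegressor(n_estimators=50, random_state=0),
        threshold="mean",
    )),
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("regress", LinearRegression()),
])

model.fit(X_train, y_train)
train_r2 = model.score(X_train, y_train)
test_r2 = model.score(X_test, y_test)

print("Train R2:", train_r2)
print("Test R2:", test_r2)
```

With a pipeline, calling model.score(X_test, y_test) applies the scaler, selector, and polynomial expansion fitted on the training set, so no statistics from the test rows leak into training.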
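
Performance Metrics:

Multiple Regression Dataset: [metrics screenshot missing from source]

Boston Housing Dataset: [metrics screenshot missing from source]

Output:

Multiple Regression Dataset: [plot missing from source]

Boston Housing Dataset: [plot missing from source]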

Comparison:

Comparing the performance of models trained on a multiple regression dataset and the Boston
Housing dataset:

Train Score:

The multiple regression model achieves a very high train score (0.983), indicating an excellent fit to
the training data.

The Boston Housing model also demonstrates a reasonably high train score (0.822), suggesting a
good fit to its training data.

Test Score:

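Both models exhibit high test scores, with the multiple regression model at 0.887 and the Boston
Housing model at 0.877, indicating strong generalization performance. The multiple regression
model's drop from 0.983 (train) to 0.887 (test) hints at some overfitting, whereas the Boston
Housing model scores slightly higher on the test set than on the training set (0.877 versus 0.822).

Mean Squared Error (MSE):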

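The multiple regression model has an MSE of 2,611,228, while the Boston Housing model has an MSE
of 5.379. MSE is expressed in squared units of the dependent variable, so these values mainly reflect
the scale of each dataset's target and cannot be compared directly across datasets.

Within its own units, the Boston Housing model's MSE of 5.379 corresponds to a root mean squared
error of about 2.32 in the units of MEDV.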

R-squared (R2):

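The multiple regression model and the Boston Housing model both achieve high R-squared values
(0.887 and 0.877 respectively), indicating good explanatory power over the variance in their
respective dependent variables. These values match the test scores above because the score method
of LinearRegression returns R-squared. R-squared does not depend on the scale of the target, so it
can be compared across the two datasets.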
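
The difference between the two metrics can be checked with a short sketch. The numbers are made up for illustration: the same targets and predictions are expressed once in thousands and once in single units. The helper names mse and r_squared are hypothetical, defined only for this example.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error, in squared units of the target."""
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical targets and predictions, expressed in thousands.
y_true = np.array([20.0, 25.0, 30.0, 35.0, 40.0])
y_pred = np.array([21.0, 24.0, 31.0, 33.0, 41.0])

# The same quantities expressed in single units (1000 times larger).
scale = 1000.0
y_true_big = y_true * scale
y_pred_big = y_pred * scale

mse_small = mse(y_true, y_pred)
mse_big = mse(y_true_big, y_pred_big)
r2_small = r_squared(y_true, y_pred)
r2_big = r_squared(y_true_big, y_pred_big)

print("MSE (thousands):", mse_small, "| MSE (units):", mse_big)
print("R2 (thousands):", r2_small, "| R2 (units):", r2_big)
```

Rescaling the target by 1000 multiplies the MSE by 1000 squared while leaving R-squared unchanged, which is why a large gap in MSE between two datasets says little on its own about which model predicts better.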

Conclusion:

Both models exhibit strong performance in terms of train and test scores. The Boston Housing
model has a far lower MSE (5.379 versus 2,611,228), but because MSE is measured in squared units
of the dependent variable, this gap largely reflects the different scales of the two targets rather
than a difference in prediction accuracy.

On R-squared, which does not depend on scale, the multiple regression model is marginally ahead
(0.887 versus 0.877), while the Boston Housing model shows the smaller gap between its train and
test scores, suggesting more consistent behaviour on unseen data.

Therefore, for accurate prediction of housing prices, the Boston Housing model is preferred.
In terms of explaining variance in the dependent variable, the two models are nearly equal, with
the multiple regression model slightly ahead.
