

Dav 3rd

This document walks through multiple linear regression on the California Housing dataset: importing the required libraries, loading the dataset, selecting features, splitting the data into training and test sets, training a linear regression model, making predictions, and visualizing the results in a 3D plot. With two input features the fitted model is a plane rather than a line, so the plot shows the best-fit plane relating median income and average rooms to house prices.

Uploaded by

sejaldeshmukh93

1. Importing Libraries

We will import NumPy, pandas, Matplotlib, and scikit-learn for this.

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LinearRegression

from sklearn.datasets import fetch_california_housing

2. Load Dataset

• Fetch the California Housing dataset from sklearn.datasets.

• The features (such as median income and average rooms) are stored in X, and the target (median house value, in units of $100,000) is stored in y.

california_housing = fetch_california_housing()

X = pd.DataFrame(california_housing.data, columns=california_housing.feature_names)

y = pd.Series(california_housing.target)

3. Select Features for Visualization

Select two features (MedInc for median income and AveRooms for average rooms) so that both features and the target can be shown together in a single three-dimensional plot.

X = X[['MedInc', 'AveRooms']]

4. Train-Test Split

We will use 80% of the data for training and 20% for testing.

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
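To make the 80/20 split concrete, here is a minimal sketch of a shuffled hold-out split written with NumPy only. The helper name shuffled_split is hypothetical and this is not scikit-learn's implementation; it only illustrates the idea of shuffling row indices with a fixed seed and cutting them into two groups. The row count of 20,640 is the size of the California Housing dataset.

```python
import numpy as np


def shuffled_split(n_samples, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) for a shuffled hold-out split.

    Hypothetical helper for illustration only; train_test_split may
    shuffle differently and will not give the same indices.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)  # shuffled row indices
    n_test = int(np.ceil(test_fraction * n_samples))
    return order[n_test:], order[:n_test]


# The California Housing dataset has 20,640 rows.
train_idx, test_idx = shuffled_split(20640)
print(len(train_idx), len(test_idx))
```

Fixing the seed (random_state=42 in the step above) is what makes the split reproducible from one run to the next.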

5. Initialize and Train Model

model = LinearRegression()
model.fit(X_train, y_train)
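With two features, the model being fitted has the form price = b0 + b1 * MedInc + b2 * AveRooms, and the coefficients are chosen to minimize the squared error. The sketch below shows the same idea with np.linalg.lstsq on synthetic data. The data, the "true" coefficients, and the variable names are assumptions made up for illustration; they are not values from the California Housing dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-ins for the two features (not real housing data).
med_inc = rng.uniform(1.0, 10.0, n)
ave_rooms = rng.uniform(2.0, 8.0, n)

# Synthetic target built from known coefficients plus a little noise.
price = 0.5 + 0.4 * med_inc - 0.03 * ave_rooms + rng.normal(0.0, 0.01, n)

# Design matrix with a leading column of ones for the intercept.
A = np.column_stack([np.ones(n), med_inc, ave_rooms])

# Least-squares solution of A @ coef ~= price.
coef, *_ = np.linalg.lstsq(A, price, rcond=None)
intercept, b_inc, b_rooms = coef
print(intercept, b_inc, b_rooms)
```

After model.fit, the corresponding values for the real data are available as model.intercept_ and model.coef_.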

6. Make Predictions

y_pred = model.predict(X_test)
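Once predictions are available, they can be compared against y_test. The sketch below computes the mean squared error and the R² score by hand with NumPy on a tiny made-up example. The helper name mse_and_r2 and the sample numbers are assumptions for illustration; scikit-learn provides mean_squared_error and r2_score in sklearn.metrics for the same purpose.

```python
import numpy as np


def mse_and_r2(y_true, y_pred):
    """Return (mean squared error, R^2) for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    ss_res = np.sum(residual ** 2)                    # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total variation
    return ss_res / len(y_true), 1.0 - ss_res / ss_tot


# Made-up values, not taken from the dataset.
mse, r2 = mse_and_r2([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(mse, r2)
```

In the workflow above, the call would be mse_and_r2(y_test, y_pred).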

7. Visualizing the Best Fit Plane in 3D

fig = plt.figure(figsize=(10, 7))

ax = fig.add_subplot(111, projection='3d')

ax.scatter(X_test['MedInc'], X_test['AveRooms'], y_test,
           color='blue', label='Actual Data')

x1_range = np.linspace(X_test['MedInc'].min(), X_test['MedInc'].max(), 100)

x2_range = np.linspace(X_test['AveRooms'].min(), X_test['AveRooms'].max(), 100)

x1, x2 = np.meshgrid(x1_range, x2_range)

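# Build the grid as a DataFrame with the same column names used in fit(),
# so predict() does not warn about missing feature names.
grid = pd.DataFrame({'MedInc': x1.ravel(), 'AveRooms': x2.ravel()})

z = model.predict(grid).reshape(x1.shape)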

ax.plot_surface(x1, x2, z, color='red', alpha=0.5, rstride=100, cstride=100)

ax.set_xlabel('Median Income')

ax.set_ylabel('Average Rooms')

ax.set_zlabel('House Price')

ax.set_title('Multiple Linear Regression Best Fit Plane (3D)')

plt.show()
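The surface in the plot is drawn by evaluating the model on a grid of feature values. The sketch below repeats the meshgrid, ravel, and reshape steps on a very small grid so the shapes are easy to follow. The plane coefficients are hypothetical numbers standing in for a fitted model.

```python
import numpy as np

x1_range = np.linspace(0.0, 2.0, 3)     # 3 values for the first feature
x2_range = np.linspace(10.0, 13.0, 4)   # 4 values for the second feature

# meshgrid pairs every x1 value with every x2 value; both are shape (4, 3).
x1, x2 = np.meshgrid(x1_range, x2_range)

# Flatten to one row per grid point, the layout predict() expects.
grid_points = np.c_[x1.ravel(), x2.ravel()]   # shape (12, 2)

# Hypothetical coefficients standing in for a fitted model.
intercept, b1, b2 = 1.0, 2.0, 0.5
z_flat = intercept + grid_points @ np.array([b1, b2])

# Reshape back to the grid so z lines up with x1 and x2 for plot_surface.
z = z_flat.reshape(x1.shape)
print(x1.shape, grid_points.shape, z.shape)
```

Because the model is linear, z is a flat plane over the grid, which is why a coarse stride in plot_surface is enough to draw it.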
