Department of Computer Science and Engineering, AISC Lab [CS3131]
Project Report
Suhani Talreja (229301425)
Predict Customer Ad Clicks Using Logistic Regression and Gradient Boosting
Overview:
This project focuses on predicting whether a customer will click on an advertisement using
historical data. Leveraging machine learning techniques like Logistic Regression and Gradient
Boosting, the project aims to identify patterns in customer behavior and provide actionable insights
for marketing strategies.
Objectives:
- To analyse customer data and identify the factors influencing ad clicks.
- To build machine learning models (Logistic Regression and Gradient Boosting) for click prediction.
- To evaluate model performance using appropriate metrics and visualizations.
- To visualize decision boundaries for better interpretability of the models.
Dataset Description:
The dataset consists of customer information and their interaction with ads. Key attributes include:
- Time Spent on Site: Time (in minutes) the user spent on the advertiser’s website.
- Estimated Salary: Customer’s estimated income.
- Clicked: Target variable indicating whether the customer clicked on the ad (1 for clicked, 0 for not clicked).
During preprocessing, the non-essential columns Names, emails, and Country were removed, leaving only the features needed for modelling.
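A quick sanity check of the raw file can confirm the available columns and the class balance before modelling. The snippet below is a minimal sketch; the file name clicks_dataset.csv, the ISO-8859-1 encoding, and the Clicked column follow the source code later in this report.

import pandas as pd

# Load the raw advertising dataset
data = pd.read_csv('clicks_dataset.csv', encoding='ISO-8859-1')

# Inspect the first few rows and the column names
print(data.head())
print(data.columns.tolist())

# Check how balanced the target variable is
print(data['Clicked'].value_counts(normalize=True))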
Methodology:
The project followed a systematic approach:
1. Data Loading and Preprocessing:
   - Data was loaded using Pandas and cleaned by removing irrelevant columns.
   - Features were scaled using StandardScaler to standardize values and improve model performance.
2. Exploratory Data Analysis (EDA):
   - Scatter plots, histograms, and box plots were used to visualize relationships between the features and the target variable (a minimal EDA sketch is given after this list).
   - Key finding: a significant correlation exists between time spent on the site and ad clicks.
3. Model Building:
   - Logistic Regression: a linear model to predict the binary outcome (clicked or not clicked).
   - Gradient Boosting Classifier: an ensemble model to enhance prediction accuracy.
4. Model Evaluation:
   - Metrics such as the confusion matrix, accuracy, precision, recall, and F1-score were used for evaluation.
   - Decision boundaries were plotted to visualize the models' predictions.
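The exploratory plots described in step 2 are not part of the source code listing below. The following is a minimal sketch of how they could be produced, assuming the feature columns in the CSV are named 'Time Spent on Site' and 'Estimated Salary' (as suggested by the axis labels used later in the code).

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('clicks_dataset.csv', encoding='ISO-8859-1')

# Scatter plot of the two features, coloured by click outcome
sns.scatterplot(data=data, x='Time Spent on Site', y='Estimated Salary', hue='Clicked')
plt.show()

# Distribution of time spent on site for clickers vs. non-clickers
sns.histplot(data=data, x='Time Spent on Site', hue='Clicked', kde=True)
plt.show()

# Box plot of time spent on site by click outcome
sns.boxplot(data=data, x='Clicked', y='Time Spent on Site')
plt.show()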
Source Code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report, confusion_matrix
from matplotlib.colors import ListedColormap
# Setting up theme for plots (optional; requires the jupyterthemes package)
def set_plot_theme():
    try:
        from jupyterthemes import jtplot
        jtplot.style(theme='monokai', context='notebook', ticks=True, grid=False)
    except ImportError:
        pass  # keep matplotlib's default style if jupyterthemes is not installed
# Load data
def load_data(filepath):
    return pd.read_csv(filepath, encoding='ISO-8859-1')

# Preprocess data
def preprocess_data(data):
    data.drop(['Names', 'emails', 'Country'], axis=1, inplace=True)
    X = data.drop('Clicked', axis=1).values
    y = data['Clicked'].values
    return X, y
# Scale features
def scale_features(X):
    scaler = StandardScaler()
    return scaler.fit_transform(X)
# Split dataset
def split_data(X, y, test_size=0.2):
    return train_test_split(X, y, test_size=test_size, random_state=42)
# Train logistic regression model
def train_logistic_regression(X_train, y_train):
    model = LogisticRegression()
    model.fit(X_train, y_train)
    return model
# Train Gradient Boosting model
def train_gradient_boosting(X_train, y_train):
    model = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0)
    model.fit(X_train, y_train)
    return model
# Evaluate model
def evaluate_model(model, X_test, y_test):
    y_pred = model.predict(X_test)
    print("Confusion Matrix:")
    cm = confusion_matrix(y_test, y_pred)
    sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
    plt.show()
    print("\nClassification Report:")
    print(classification_report(y_test, y_pred))
# Visualize decision boundary
def visualize_boundary(X, y, model, title):
    X1, X2 = np.meshgrid(np.arange(start=X[:, 0].min() - 1, stop=X[:, 0].max() + 1, step=0.01),
                         np.arange(start=X[:, 1].min() - 1, stop=X[:, 1].max() + 1, step=0.01))
    plt.contourf(X1, X2, model.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
                 alpha=0.75, cmap=ListedColormap(('magenta', 'blue')))
    plt.xlim(X1.min(), X1.max())
    plt.ylim(X2.min(), X2.max())
    for i, j in enumerate(np.unique(y)):
        plt.scatter(X[y == j, 0], X[y == j, 1],
                    c=ListedColormap(('magenta', 'blue'))(i), label=j)
    plt.title(title)
    plt.xlabel('Time Spent on Site')
    plt.ylabel('Estimated Salary')
    plt.legend()
    plt.show()
# Main pipeline
def main():
    set_plot_theme()
    # Load dataset
    data_path = r'AD_CLICKS_Project\clicks_dataset.csv'
    data = load_data(data_path)
    # Preprocess data
    X, y = preprocess_data(data)
    # Scale features
    X = scale_features(X)
    # Split dataset
    X_train, X_test, y_train, y_test = split_data(X, y)
    # Train logistic regression
    print("Training Logistic Regression...")
    lr_model = train_logistic_regression(X_train, y_train)
    print("Logistic Regression Evaluation:")
    evaluate_model(lr_model, X_test, y_test)
    # Visualize decision boundaries
    visualize_boundary(X_train, y_train, lr_model, "Logistic Regression (Training Set)")
    visualize_boundary(X_test, y_test, lr_model, "Logistic Regression (Test Set)")
    # Train Gradient Boosting
    print("\nTraining Gradient Boosting...")
    gb_model = train_gradient_boosting(X_train, y_train)
    print("Gradient Boosting Evaluation:")
    evaluate_model(gb_model, X_test, y_test)
    # Visualize decision boundaries
    visualize_boundary(X_train, y_train, gb_model, "Gradient Boosting (Training Set)")
    visualize_boundary(X_test, y_test, gb_model, "Gradient Boosting (Test Set)")
if __name__ == "__main__":
    main()
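For reference, the classification report printed by evaluate_model summarizes the standard metrics below, where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives:

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad
\text{Precision} = \frac{TP}{TP + FP}, \quad
\text{Recall} = \frac{TP}{TP + FN}, \quad
F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}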
Output:
(Figures omitted here: confusion-matrix heatmaps, classification reports, and decision-boundary plots for the Logistic Regression and Gradient Boosting models on the training and test sets.)