Classification Models
Logistic Regression:
Explanation:
Used for binary classification problems (i.e., the response variable is binary, 0 or 1).
It models the probability that an instance belongs to a particular category.
In this case, we are predicting whether a car's miles per gallon (mpg) is above or below the mean value.
The logistic function (sigmoid) is used to map predictions to probabilities.
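For intuition, a minimal sketch of the sigmoid in base R (the helper name sigmoid is illustrative only, not from any package):
sigmoid <- function(z) 1 / (1 + exp(-z))
sigmoid(c(-3, 0, 3))  # roughly 0.05, 0.50, 0.95: large negative scores map near 0, large positive near 1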
When to Use:
When the relationship between the predictor variables and the log-odds of the response is
approximately linear.
Logistic Regression is chosen when the response variable is categorical, and in this example, it's whether
the mpg is above or below the mean.
Suitable for problems where the outcome is binary, like whether an email is spam or not.
Predictors:
Predictor variables should be numeric or categorical.
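If some predictors are categorical, they are typically expanded into 0/1 indicator columns before modelling. A minimal sketch with caret::dummyVars, treating mtcars$am as a factor purely for illustration (mtcars is all-numeric by default):
library(caret)
df <- mtcars
df$am <- factor(df$am, labels = c("automatic", "manual"))   # pretend 'am' is categorical
dummies <- dummyVars(~ ., data = df)                        # builds the indicator-column encoding
df_encoded <- as.data.frame(predict(dummies, newdata = df))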
# =======================================
# R code: Logistic Regression
# =======================================
Step 1: Load Libraries
library(caret)
library(dplyr)
library(zoo) # used in finding and replacing NA values with mean
Step 2: Load Dataset
data <- mtcars
Step 3: Handle Missing Values, Scaling, and Normalization
# Check for missing values
summary(data)
# If there are missing values:
# Options: 1) drop rows with na.omit() (loses data), or 2) impute with the mean or median (preferred)
# Option A: replace NAs with the column mean
# data[] <- lapply(data, function(x) ifelse(is.na(x), mean(x, na.rm = TRUE), x))
# (zoo::na.aggregate offers similar mean-based replacement)
# Option B: replace NAs with the column median via caret
preprocess_params <- preProcess(data, method = "medianImpute")
# Apply the pre-processing to replace missing values
data <- predict(preprocess_params, newdata = data)
# If scaling or normalization is needed, you can use:
# data <- as.data.frame(scale(data))                          # centre and scale (standardize)
# data <- predict(preProcess(data, method = "range"), data)   # rescale to the [0, 1] range
Step 4: Data Splitting
# Set seed for reproducibility
set.seed(123)
# Split the data into training (80%) and testing (20%) sets
train_index <- createDataPartition(data$mpg, p = 0.8, list = FALSE)
train_data <- data[train_index, ]
test_data <- data[-train_index, ]
Step 5: Build Logistic Regression Model
# Create the binary response: is mpg above the overall mean?
train_data$mpg_high <- factor(train_data$mpg > mean(data$mpg), levels = c(FALSE, TRUE))
test_data$mpg_high  <- factor(test_data$mpg > mean(data$mpg), levels = c(FALSE, TRUE))
# Exclude the raw mpg column from the predictors
# (with so few rows, glm may warn about fitted probabilities of 0 or 1)
log_model <- glm(mpg_high ~ . - mpg, data = train_data, family = "binomial")
Step 6: Model Summary or Plots
# Summary statistics
summary(log_model)
# Or you can create plots if applicable
Step 7: Make Predictions
predictions <- predict(log_model, newdata = test_data, type = "response")
Step 8: Model Evaluation Metrics
# Evaluate model accuracy and performance
conf_matrix <- confusionMatrix(factor(predictions > 0.5, levels = c(FALSE, TRUE)),
                               test_data$mpg_high)
# Display the confusion matrix and other metrics
conf_matrix
=======================
Discriminant Analysis:
Explanation:
Discriminant Analysis is used when there are two or more classes and the goal is to find the linear
combination of features that best separates them.
Assumes normal distribution of predictor variables within each class.
When to Use:
When you have two or more classes and you want to classify new observations into one of them.
Predictors:
Assumes continuous predictors that are normally distributed.
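A minimal sketch of linear discriminant analysis using MASS::lda on the iris data (chosen here only because it has three classes; it is not part of the notes above):
library(MASS)    # lda()
library(caret)   # createDataPartition(), confusionMatrix()
set.seed(123)
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train <- iris[idx, ]
test  <- iris[-idx, ]
lda_model <- lda(Species ~ ., data = train)           # linear combinations that best separate the classes
lda_pred  <- predict(lda_model, newdata = test)$class
confusionMatrix(lda_pred, test$Species)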
Naive Bayes Classifier:
Explanation:
Naive Bayes is a probabilistic algorithm based on Bayes' theorem, assuming independence between
predictors.
Despite its "naive" assumption, it performs surprisingly well in many real-world situations.
When to Use:
Particularly effective for text classification (spam detection, sentiment analysis).
Predictors:
Works well with both categorical and continuous predictors.
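A minimal sketch with e1071::naiveBayes (the e1071 package is an assumption; it is not loaded in the walkthrough above). Continuous predictors are modelled with class-conditional Gaussians by default:
library(e1071)   # naiveBayes()
set.seed(123)
idx <- sample(nrow(iris), 0.8 * nrow(iris))
train <- iris[idx, ]
test  <- iris[-idx, ]
nb_model <- naiveBayes(Species ~ ., data = train)
nb_pred  <- predict(nb_model, newdata = test)
table(predicted = nb_pred, actual = test$Species)     # simple confusion table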
Support Vector Machines (SVM):
Explanation:
SVM is a powerful classification algorithm that finds the hyperplane that best separates data points of
different classes.
It works well in high-dimensional spaces and is effective in cases where the number of dimensions is
greater than the number of samples.
When to Use:
Useful for both linear and non-linear data.
Effective when there is a clear margin of separation between classes.
Predictors:
Works with numeric predictors; it's essential to scale the data for SVM.
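A minimal sketch with e1071::svm (an assumption, not from the notes above). Note that svm() scales numeric predictors by default (scale = TRUE), which matches the advice above to scale the data:
library(e1071)   # svm()
set.seed(123)
idx <- sample(nrow(iris), 0.8 * nrow(iris))
train <- iris[idx, ]
test  <- iris[-idx, ]
svm_model <- svm(Species ~ ., data = train, kernel = "radial", scale = TRUE)
svm_pred  <- predict(svm_model, newdata = test)
table(predicted = svm_pred, actual = test$Species)
# 2-D slice of the decision boundary (remaining predictors held at fixed values)
plot(svm_model, train, Petal.Width ~ Petal.Length,
     slice = list(Sepal.Width = 3, Sepal.Length = 5))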
Plots:
Logistic Regression and Discriminant Analysis:
Commonly used plots include ROC curves, confusion matrices, and decision boundaries (a minimal ROC sketch is given after this list).
SVM:
SVM often involves visualizing decision boundaries in feature space.
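A minimal ROC sketch for the logistic regression fit above, assuming the predictions vector and test_data (with its mpg_high column) from Steps 5-7, and the pROC package, which is not loaded earlier:
library(pROC)
roc_obj <- roc(response = test_data$mpg_high, predictor = predictions)
plot(roc_obj)    # ROC curve
auc(roc_obj)     # area under the curve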