
Topic 9:

Machine Learning:
Supervised Learning
Term 2-ARTI 106
Computer Track
2024-2025
Learning outcomes

The main learning objectives of this topic are:

❑ Define ML and supervised learning.
❑ Explain the tasks of ML, such as classification and regression.
❑ Explore some evaluation methods in ML, such as cross-validation and accuracy.
❑ Explain the steps of the KNN algorithm.
Outline

❑ Define machine learning
❑ Define supervised learning
❑ SL process
❑ Classification and regression in SL
❑ Evaluation methods
❑ K-Nearest Neighbors algorithm
What is ML?
❑ Machine learning (ML) is a branch of artificial intelligence (AI) and
computer science that focuses on using data and algorithms to
enable AI to imitate the way that humans learn, gradually
improving its accuracy.
❑ ML provides machines the ability to automatically learn from data
and past experiences to identify patterns and make predictions
with minimal human intervention.
❑ ML applications are fed with new data, and they can
independently learn, grow, develop, and adapt.
Types of ML

❑ Supervised Machine Learning

❑ Unsupervised Machine Learning


What is supervised learning?
❑ Supervised learning is the type of ML in which
machines are trained using “labelled” training data, and
on the basis of that data, machines predict the output.
❑ In supervised learning, the training data provided to the
machines works as the supervisor that teaches the
machines to predict the output correctly. It applies the
same concept as a student learning under the supervision
of a teacher.
Example 1
Example 2
Steps involved in
supervised learning
❑ Determine the type of dataset.
❑ Collect the labelled data.
❑ Split the dataset into a training dataset and a test dataset.
❑ Identify a suitable algorithm for the model.
❑ Execute the algorithm on the training dataset.
❑ Evaluate the accuracy of the model by providing the test set.
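A minimal end-to-end sketch of these steps in Python with scikit-learn; the synthetic dataset and the decision-tree model are illustrative assumptions, not prescribed by the slides:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# 1-2. Obtain a labelled dataset (here: generated synthetically for illustration).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# 3. Split into a training dataset and a test dataset.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# 4-5. Choose a suitable algorithm and execute (train) it on the training dataset.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# 6. Evaluate the accuracy of the model on the test set.
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```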
Types of SL algorithms
Classification tasks
❑ Classification algorithms refer to algorithms that address
classification problems where the output variable is
categorical; for example, yes or no, true or false, male or
female, etc. Real-world applications of this category are
evident in spam detection and email filtering.
❑ Some known classification algorithms include the Random
Forest Algorithm, Decision Tree Algorithm, Logistic
Regression Algorithm, and Support Vector Machine
Algorithm.
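A small, hedged illustration of one of the classification algorithms named above (logistic regression) predicting a categorical label; the tiny spam-style dataset is invented purely for demonstration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Features per email: [number of links, count of "free"-like words] (hypothetical)
X = np.array([[0, 0], [1, 0], [5, 3], [7, 4], [0, 1], [6, 5]])
y = np.array(["not spam", "not spam", "spam", "spam", "not spam", "spam"])

clf = LogisticRegression()
clf.fit(X, y)
# The output variable is categorical, e.g. ['spam']
print(clf.predict([[4, 2]]))
```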
Regression
❑ Regression algorithms handle regression problems where
input and output variables have a linear relationship.
These are known to predict continuous output variables.
Examples include weather prediction, market trend
analysis, etc.
❑ Popular regression algorithms include the Simple Linear
Regression Algorithm, Multivariate Regression Algorithm,
Decision Tree Algorithm, and Lasso Regression.
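A hedged sketch of simple linear regression predicting a continuous value; the data points are made up for illustration only:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# X: hour of day, y: temperature in °C (invented values)
X = np.array([[6], [9], [12], [15], [18]])
y = np.array([14.0, 18.5, 24.0, 26.5, 22.0])

reg = LinearRegression()
reg.fit(X, y)
# The prediction is a continuous value, not a class label.
print(reg.predict([[13]]))
```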
Classification vs Regression
❑ The main goal of classification is to predict the target class
(Yes/No).
❑ The main goal of regression algorithms is to predict a
discrete or a continuous value.
❑ If forecasting a target class (Classification)
❑ If forecasting a value (Regression)
Classification vs Regression
Evaluation metric types

❑ Different types of evaluation metrics exist for different
types of ML algorithms (classification, regression,
ranking, etc.).
❑ Some metrics, such as precision-recall, can be useful for
more than one type of algorithm.
❑ Some popular classification metrics are: accuracy, the
confusion matrix, and AUC.
Evaluation metrics…

❑ Methods that determine an algorithm’s performance
and behavior.
❑ Helpful for deciding which model best meets the target
performance.
❑ Helpful for parameterizing the model so that it offers the
best-performing algorithm.
Accuracy vs Confusion matrix
❑ Accuracy is the ratio between the number of correct predictions and the total
number of predictions:
Accuracy = #correct predictions / #total data points
❑ A confusion matrix shows a more detailed breakdown of correct and
incorrect classifications for each class.

Example (1000 predictions in total, of which 30 + 930 are correct):
Accuracy = (30 + 930) / (30 + 10 + 30 + 930)
         = 960 / 1000
         = 0.96
Accuracy = 96%
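The same calculation can be reproduced with scikit-learn’s metrics; the 30/10/30/930 cell layout below is an assumed arrangement consistent with the totals above:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Assumed layout: 40 actual positives and 960 actual negatives (1000 points in total).
y_true = np.array([1] * 40 + [0] * 960)
# 30 true positives, 10 false negatives, 30 false positives, 930 true negatives.
y_pred = np.array([1] * 30 + [0] * 10 + [1] * 30 + [0] * 930)

print(confusion_matrix(y_true, y_pred))  # per-class breakdown of correct/incorrect predictions
print(accuracy_score(y_true, y_pred))    # 0.96, matching the calculation above
```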
Holdout set

❑ Holdout set: the available data set D is divided into two disjoint subsets,
❖ the training set Dtrain (for learning a model)
❖ the test set Dtest (for testing the model)
❑ Important: the training set should not be used in testing, and the test set should
not be used in learning.
❑ The test set is also called the holdout set (the examples in the original data
set D are all labelled with classes).
❑ This method is mainly used when the data set D is large.
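A minimal holdout-split sketch using scikit-learn’s train_test_split; the Iris dataset and the 80/20 ratio are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # a small, fully labelled data set D

# Disjoint split: Dtrain is used only for learning, Dtest (the holdout set) only for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(len(X_train), len(X_test))  # 120 training examples, 30 test examples
```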
Cross-validation: How it works

1. Split the dataset into N equal-sized disjoint subsets (called folds).
2. Iterate N times. In each iteration:
❑ Use N−1 folds for training.
❑ Use the remaining fold for testing.
3. Repeat until every fold has been used as the test set exactly once.
4. Average the evaluation metrics (accuracy, precision, etc.) across all N iterations.
❑ 10-fold and 5-fold cross-validations are commonly used.
❑ This method is used when the available data is not large.
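A short sketch of N-fold cross-validation (here N = 5) using scikit-learn’s cross_val_score; the dataset and model are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(random_state=0)

# Each of the N = 5 folds is used exactly once as the test set.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(scores)          # one accuracy value per fold
print(scores.mean())   # metric averaged across all N iterations
```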
Application using KNN
❑ kNN is a simple algorithm that stores all available cases and
classifies new cases based on a similarity measure (e.g., a distance
function).
❑ k is usually chosen empirically via a validation set or cross-validation
by trying a range of k values.
❑ The distance function is crucial, but depends on the application.
❑ A simple explanation of the K-Nearest Neighbors (KNN) algorithm:
https://www.youtube.com/watch?v=zeFt_JCA3b4
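A hedged KNN classification sketch with scikit-learn; k = 3 and the Euclidean distance are illustrative choices, since the slides note that both are application-dependent and typically tuned via a validation set or cross-validation:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Stores the training cases and classifies new cases by the majority label
# among the k nearest neighbours under the chosen distance function.
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean")
knn.fit(X_train, y_train)
print("Test accuracy:", knn.score(X_test, y_test))
```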
Thank you for your attention
