Lec-7 Intro Machine Learning
Topics
• What is Machine Learning?
• What is Deep Learning?
• Difference between Supervised and Unsupervised Learning
• Supervised Learning Process
• Evaluating performance
• Overfitting
Machine Learning (ML)
• Subset/branch/subfield of Artificial Intelligence (AI)
• Machines learning to imitate human intelligence
• "Focuses on the use of data and algorithms to enable AI to imitate the way that humans learn, gradually improving its accuracy." - IBM
• Allows computers to learn without explicit programming
[Figure: nested circles - Deep Learning inside Machine Learning inside Artificial Intelligence (AI)]
Traditional Programming vs. ML
• Traditional programming: rules (program) + data → output
• Machine learning: data + expected output → learned rules (model)
ML Related Fields
• Related fields: data mining, statistics, control theory, decision theory, information theory, cognitive science, databases, psychological models, evolutionary models, neuroscience
• Machine learning is primarily concerned with the accuracy and effectiveness of the computer system.
ML Applications
• Fraud detection
• Web search results
• Real-time ads on web pages
• Credit scoring
• Network intrusion detection
• Recommendation engines
• Customer segmentation
Types of ML
• Supervised Machine Learning
• Unsupervised Machine Learning
• Semi-Supervised Machine Learning
• Reinforcement Machine Learning
Supervised Machine Learning
• Model gets trained on a "labelled dataset"
• High accuracy, as models are trained on labelled data
• Can be time-consuming and costly, as it relies on labelled data only
• Two main categories:
  • Classification
  • Regression
[Figure: example images, one labelled "America" and one labelled "British"]
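To make this concrete, here is a minimal sketch of supervised classification with scikit-learn. The slides don't name a library or dataset, so the library choice and the built-in iris dataset are assumptions standing in for any labelled dataset.

```python
# Minimal supervised-classification sketch (scikit-learn assumed installed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)            # features and labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)    # hold out labelled test data

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)                  # learn from labelled examples
print("Test accuracy:", model.score(X_test, y_test))
```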
Unsupervised Machine Learning
• Model is trained on unlabelled data and discovers patterns or groupings on its own
Semi-Supervised Machine Learning
• Sits between supervised and unsupervised learning, using both labelled and unlabelled data
• Useful when obtaining labelled data is costly, time-consuming, or resource-intensive
Supervised
• Classification: Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, K-Nearest Neighbors (KNN), Naive Bayes
• Regression: Linear Regression, Polynomial Regression, Ridge Regression, Lasso Regression, Decision Tree, Random Forest

Unsupervised
• Clustering: K-Means, Mean-shift, DBSCAN
• Dimensionality reduction: Principal Component Analysis (PCA), Independent Component Analysis (ICA)
• Association: Apriori, Eclat, FP-growth
Reinforcement Machine Learning
• An agent interacts with the environment by producing actions and discovering errors through feedback
• Trial, error, and delayed reward are its most relevant characteristics
• Popular algorithms:
  • Q-learning
  • SARSA (State-Action-Reward-State-Action)
  • Deep Q-learning
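Since the slides name Q-learning, here is a rough sketch of its update rule on a hypothetical toy environment. The 5-state corridor, reward, and constants below are invented for illustration and are not from the lecture.

```python
import numpy as np

# Hypothetical environment: 5 states in a row, reward at the right end.
n_states, n_actions = 5, 2               # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))      # action-value table
alpha, gamma, epsilon = 0.1, 0.9, 0.5    # high exploration for this toy task

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    r = 1.0 if s2 == n_states - 1 else 0.0
    return s2, r

for episode in range(100):
    s = 0
    while s != n_states - 1:
        # epsilon-greedy action selection (trial and error)
        a = np.random.randint(n_actions) if np.random.rand() < epsilon else int(np.argmax(Q[s]))
        s2, r = step(s, a)
        # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q)  # learned action values; "right" should dominate
```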
ML Applications
Photo No | Tail length (cm) | Neck length (cm) | Has horn? | Is Giraffe?
1 | 5 | 8 | Yes | Yes
2 | 2 | 3 | No | No
3 | 1 | 2 | No | No
4 | 0 | 2 | No | No
Criteria | Labeled Data | Unlabeled Data
Definition | Data with both input features and corresponding output labels | Data with only input features and no output labels
Example | Images labeled with categories like "cat" or "dog" | Images without any category labels
Supervised Learning | Essential for training models | Not used directly in training models
Unsupervised Learning | Not applicable | Essential for discovering patterns and structures
Supervised Machine Learning Process (1)
Data Acquisition → Data Cleaning → Model Training & Building → Model Testing → Model Deployment
(Test Data feeds into Model Testing)
Supervised Machine Learning Process (2)
• Data Acquisition: get your data! Customers, sensors, etc.
Supervised Machine Learning Process (3)
• Data Cleaning: clean and format your data (using Pandas)
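A minimal sketch of what cleaning with Pandas can look like. The file name and column names (customers.csv, age, country) are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical raw file with missing values, duplicates, and messy text.
df = pd.read_csv("customers.csv")

df = df.drop_duplicates()                               # remove duplicate rows
df["age"] = df["age"].fillna(df["age"].median())        # fill missing values
df["country"] = df["country"].str.strip().str.title()   # normalize text
df = df[df["age"].between(0, 120)]                      # drop obvious outliers
```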
Supervised Machine Learning Process (4)
• Split the cleaned data: hold out Test Data, then use the rest for Model Training & Building
Supervised Machine Learning Process (5)
• Model Testing: evaluate the trained model on the held-out Test Data
Supervised Machine Learning Process (6)
• Based on test results, adjust model parameters and retrain (Model Testing → Adjust Model Parameters → Model Training & Building)
Model Parameters
• Parameters: values the model learns from the training data (e.g., regression coefficients, weights)
• Hyperparameters: values set before training that control the learning process (e.g., K in KNN, learning rate)
Supervised Machine Learning Process (7)
• Model Deployment: release the tested model (full pipeline: Data Acquisition → Data Cleaning → Model Training & Building → Model Testing → Model Deployment)
ML Data Sources
Popular Data Sources
Why Data Cleaning/Preprocessing?
• Data in the real world is dirty
• incomplete: lacking attribute values, lacking certain attributes of
interest, or containing only aggregate data
• noisy: containing errors or outliers
• inconsistent: containing discrepancies in codes or names
• No quality data, no quality mining results!
• Quality decisions must be based on quality data
• Data warehouse needs consistent integration of quality data
Data Reduction Strategies
• Data reduction: obtain a reduced representation of the data set that is much smaller in volume but produces the same (or almost the same) analytical results
• Why data reduction?
  • A database/data warehouse may store terabytes of data; complex data analysis may take a very long time to run on the complete data set
• Data reduction strategies
  • Dimensionality reduction, e.g., remove unimportant attributes
    • Principal Components Analysis (PCA) - see the sketch after this list
    • Feature subset selection, feature creation/extraction
  • Compression, sampling, aggregation, filtering, transformation, …
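A minimal PCA sketch with scikit-learn, using the built-in iris dataset as an assumed stand-in; it reduces four features to two principal components while reporting how much variance is retained.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)    # 150 samples, 4 features each
pca = PCA(n_components=2)            # keep 2 principal components
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                  # (150, 2): smaller representation
print(pca.explained_variance_ratio_)    # variance retained per component
```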
Data Splitting
• Training Data
  • Used to train model parameters
• Validation Data
  • Used to determine which model hyperparameters to adjust
• Test Data
  • Used to obtain a final performance metric (a minimal split sketch follows this list)
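One common way to produce the three splits, sketched with scikit-learn's train_test_split. The 60/20/20 ratio and the iris dataset are assumptions for illustration, not from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First split off the test set, then carve validation out of the remainder.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=42)  # 0.25 * 0.8 = 0.2

# Result: 60% training, 20% validation, 20% test
print(len(X_train), len(X_val), len(X_test))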
Cross-validation
• Analogy: Rahim's exam preparation
• Exam questions are drawn from known/unknown chapters
• Result: good/bad
• Is he a good/bad student?
K-Fold Cross Validation
• Divide the dataset into K chunks (i.e., folds) and train K times, using a different fold as the test set each time
• E.g., assume K = 5
K-Fold Cross Validation
• Final model evaluation: (S1 + S2 + S3 + S4 + S5) / 5 (a sketch follows the table)

Iteration (1 to K) | Training Set | Test Set | Performance Score
1 | D2, D3, D4, D5 | D1 | S1
2 | D1, D3, D4, D5 | D2 | S2
3 | D1, D2, D4, D5 | D3 | S3
4 | D1, D2, D3, D5 | D4 | S4
5 | D1, D2, D3, D4 | D5 | S5
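A minimal K-fold sketch using scikit-learn's cross_val_score with K = 5, matching the example above. The classifier (KNN) and dataset (iris) are arbitrary assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
scores = cross_val_score(KNeighborsClassifier(), X, y, cv=5)  # K = 5 folds

print(scores)         # the five fold scores: S1 ... S5
print(scores.mean())  # final evaluation: (S1 + ... + S5) / 5
```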
Overfitting and Underfitting
• Underfitting: Poor performance on the training data and poor
generalization to other (unseen) data
• Overfitting: Good performance on training data but poor
generalization to other (unseen) data (memorizing!!)
Performance Metrics/Model Evaluation
• Classification
  • Confusion Matrix (not a metric)
  • Accuracy
  • Precision
  • Recall
  • F1-Score
• Clustering
  • Elbow Method (not a performance metric, but used to find the optimal number of clusters, K)
• Regression
  • MAE, MSE, RMSE (covered later)
Confusion Matrix (not a metric)

Actual \ Predicted | Positive | Negative
Positive | True Positive (TP) | False Negative (FN)
Negative | False Positive (FP) | True Negative (TN)
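A small sketch of building a confusion matrix with scikit-learn; the dog/cat labels echo the example used in the following slides.

```python
from sklearn.metrics import confusion_matrix

y_true = ["dog", "dog", "dog", "cat", "cat"]  # correct labels from y_test
y_pred = ["dog", "dog", "cat", "cat", "dog"]  # model predictions

# Rows = actual class, columns = predicted class.
print(confusion_matrix(y_true, y_pred, labels=["dog", "cat"]))
# [[2 1]
#  [1 1]]
```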
Evaluating Performance: Classification
Model Evaluation
• Take a test image from X_test and feed it to the TRAINED MODEL
• The model makes a prediction on the test image (e.g., DOG)
• Compare the prediction to the correct label from y_test
  • Prediction DOG, correct label DOG: DOG == DOG? The prediction is correct
  • Prediction CAT, correct label DOG: DOG == CAT? The prediction is wrong
Model Evaluation
● Accuracy
  ○ Accuracy in classification problems is the number of correct predictions made by the model divided by the total number of predictions.
  ○ For example, if the X_test set has 100 images and our model correctly predicts 80 of them, accuracy = 80/100 = 0.8, or 80%.
  ○ Accuracy is useful when target classes are well balanced; in our example, we would have roughly the same number of cat images as dog images.
  ○ Accuracy is not a good choice with unbalanced classes! Imagine we had 99 images of dogs and 1 image of a cat: a model that simply always predicted "dog" would get 99% accuracy!
  ○ In this situation we'll want to understand recall and precision (a quick sketch follows).
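A tiny sketch of the unbalanced-class trap, assuming scikit-learn for the accuracy computation.

```python
from sklearn.metrics import accuracy_score

# 99 dogs and 1 cat; a "model" that always answers "dog".
y_true = ["dog"] * 99 + ["cat"]
y_pred = ["dog"] * 100

print(accuracy_score(y_true, y_pred))  # 0.99 -- misleadingly high
```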
Model Evaluation
● Recall/Sensitivity
  ○ Recall = TP / (TP + FN)
  ○ The ability of a model to find all the relevant cases within a dataset.
  ○ Precisely: the number of true positives divided by the number of true positives plus the number of false negatives.
Model Evaluation
● Assume a dataset of 100 possible cancer patients, of whom only 10 actually have cancer
● Recall = TP / (TP + FN)
● Suppose our model correctly identifies 5 patients as having cancer. So True Positive = 5, False Negative = 5
● Recall = 5/10 = 0.5 = 50%
● Recall focuses on false negatives: it reveals how many real cancer patients went unidentified
Model Evaluation
● Precision
  ○ Precision = TP / (TP + FP)
  ○ The ability of a classification model to identify only the relevant data points.
  ○ Precisely: the number of true positives divided by the number of true positives plus the number of false positives.
Model Evaluation
● Assume a dataset of 100 possible cancer patients, of whom only 5 actually have cancer
● Precision = TP / (TP + FP)
● Suppose our model predicts that all 100 patients have cancer. So True Positive + False Positive = 100
● But only 5 patients have cancer, so True Positive = 5
● Precision = 5/100 = 0.05 = 5%
● Precision focuses on false positives: of all patients predicted to have cancer, it measures how many actually do
Model Evaluation: Recall and Precision
● F1-Score
  ○ In cases where we want to find an optimal blend of precision and recall, we can combine the two metrics using what is called the F1 score.
Model Evaluation
● F1-Score
  ○ The F1 score is the harmonic mean of precision and recall, taking both metrics into account in the following equation:
  ○ F1 = 2 × (precision × recall) / (precision + recall)
Model Evaluation
● F1-Score
  ○ We use the harmonic mean instead of a simple average because it punishes extreme values.
  ○ A classifier with a precision of 1.0 and a recall of 0.0 has a simple average of 0.5 but an F1 score of 0.
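A short sketch computing all three metrics with scikit-learn. The labels below are a made-up binary example in the spirit of the cancer-screening slides (1 = cancer, 0 = healthy).

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Hypothetical screening results: 4 true cancer cases among 10 patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]  # TP = 2, FN = 2, FP = 1

print(precision_score(y_true, y_pred))  # 2 / (2 + 1) ≈ 0.67
print(recall_score(y_true, y_pred))     # 2 / (2 + 2) = 0.50
print(f1_score(y_true, y_pred))         # harmonic mean ≈ 0.57
```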
Evaluating Performance: Regression
Evaluating Regression
● Let's take a moment now to discuss evaluating regression models
● Regression is a task where a model attempts to predict continuous values (unlike categorical values, which is classification)
Evaluating Regression
● You may have heard of evaluation metrics like accuracy or recall.
● These sorts of metrics aren't useful for regression problems; we need metrics designed for continuous values!
Evaluating Regression
● For example, attempting to predict the price of
a house given its features is a regression
task.
● Attempting to predict the country a house is in
given its features would be a classification
task.
Evaluating Regression
● Most common evaluation metrics:
○ Mean Absolute Error (MAE)
○ Mean Squared Error (MSE)
○ Root Mean Square Error (RMSE)
Evaluating Regression: Mean Absolute Error (MAE)
• This is the mean of the absolute value of the errors: MAE = (1/n) Σ |y_i - ŷ_i|
• Easy to understand
Evaluating Regression
● MAE won't punish large errors, however.
● Mean Squared Error (MSE) is the mean of the squared errors: MSE = (1/n) Σ (y_i - ŷ_i)². Large errors are punished more, but the units are the square of y's units.
Evaluating Regression: Root Mean Square Error (RMSE)
• This is the root of the mean of the squared errors: RMSE = √((1/n) Σ (y_i - ŷ_i)²)
• Most popular (has the same units as y)
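A minimal sketch computing MAE, MSE, and RMSE with scikit-learn and NumPy; the house-price numbers are invented for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical house prices (in $1000s) vs. model predictions.
y_true = np.array([200, 340, 410, 500])
y_pred = np.array([210, 330, 450, 480])

mae = mean_absolute_error(y_true, y_pred)  # mean of |error|
mse = mean_squared_error(y_true, y_pred)   # mean of error^2
rmse = np.sqrt(mse)                        # back to y's units
print(mae, mse, rmse)
```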
Machine Learning
● Clustering
○ Grouping together unlabeled data points
into categories/clusters
○ Data points are assigned to a cluster based
on similarity
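A minimal K-Means clustering sketch with scikit-learn; the six 2-D points are invented and form two obvious groups, so the algorithm can assign each point to a cluster by similarity.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabelled 2-D points forming two loose groups.
X = np.array([[1, 2], [1, 4], [2, 3],
              [8, 8], [9, 10], [10, 9]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster assignment per point
print(kmeans.cluster_centers_)  # learned cluster centres
```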
Machine Learning
● Anomaly Detection
○ Attempts to detect outliers in a dataset
○ For example, fraudulent transactions on a
credit card.
Machine Learning
● Dimensionality Reduction
  ○ Data processing techniques that reduce the number of features in a data set, either for compression or to better understand underlying trends within a data set.
Machine Learning
● Unsupervised Learning
○ It’s important to note, these are situations
where we don’t have the correct answer
for historical data!
○ Which means evaluation is much harder
and more nuanced!
Unsupervised Process
Data Acquisition → Data Cleaning → Model Training & Building → Transformation → Model Deployment
(with held-out Test Data)