Lecture 1
Resampling Methods
Manoj Kumar
Machine Learning
YouTube
October 30, 2024
Goal: Learn y = w^T f(x) + b, where f(·) is a polynomial basis function.
x y
0.0000 1.4544
0.1579 2.1039
0.3158 1.6518
0.4737 1.5701
0.6316 2.1284
Goal: Learn y = w^T f(x) + b, where f(·) is a polynomial basis function.
y      x      x^2
1.4544 0.0000 0.0000
2.1039 0.1579 0.0249
1.6518 0.3158 0.0998
1.5701 0.4737 0.2244
2.1284 0.6316 0.3989
Goal: Learn y = w^T f(x) + b, where f(·) is a polynomial basis function.
y      x      x^2      x^3
1.4544 0.0000 0.0000 0.0000
2.1039 0.1579 0.0249 0.0039
1.6518 0.3158 0.0998 0.0315
1.5701 0.4737 0.2244 0.1063
2.1284 0.6316 0.3989 0.2519
Does a higher polynomial degree mean a better model?
Goal: Learn y = w^T f(x) + b, where f(·) is a polynomial basis function.
y      x      x^2      ···      x^9
1.4544 0.0000 0.0000 ··· 0.0000
2.1039 0.1579 0.0249 ··· 0.0000
1.6518 0.3158 0.0998 ··· 0.0003
1.5701 0.4737 0.2244 ··· 0.0013
2.1284 0.6316 0.3989 ··· 0.0101
Goal: Learn y = w^T f(x) + b, where f(·) is a polynomial basis function.
y      x      x^2      ···      x^19
1.4544 0.0000 0.0000 ··· 0.0000
2.1039 0.1579 0.0249 ··· 0.0000
1.6518 0.3158 0.0998 ··· 0.0000
1.5701 0.4737 0.2244 ··· 0.0001
2.1284 0.6316 0.3989 ··· 0.0004
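A minimal sketch (not from the lecture) of fitting the five (x, y) pairs above with polynomial bases of increasing degree. scikit-learn's PolynomialFeatures and LinearRegression are assumed tools; the training error drops to essentially zero once the degree is large enough, which is exactly what the question above is getting at.

```python
# Fit y = w^T f(x) + b for polynomial bases of increasing degree,
# using the five (x, y) pairs from the table above.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x = np.array([0.0000, 0.1579, 0.3158, 0.4737, 0.6316]).reshape(-1, 1)
y = np.array([1.4544, 2.1039, 1.6518, 1.5701, 2.1284])

for degree in (1, 2, 3, 9, 19):
    # f(x): expand x into [x, x^2, ..., x^degree]; the intercept b is
    # learned by the linear model itself.
    features = PolynomialFeatures(degree=degree, include_bias=False)
    X = features.fit_transform(x)
    model = LinearRegression().fit(X, y)
    train_mse = np.mean((model.predict(X) - y) ** 2)
    print(f"degree {degree:2d}: training MSE = {train_mse:.4f}")
```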
Given a dataset, begin by splitting it into a training set (TRAIN) and a test set (TEST).
Model assessment: Use TEST to assess the accuracy of the model you output.
Never ever train or choose parameters based on the test data.
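A minimal sketch of this split, assuming scikit-learn's train_test_split and a synthetic dataset (both my own choices, not from the slides):

```python
# Hold out TEST once, up front; it is never used for training or tuning.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                       # toy features
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)    # fit on TRAIN only
test_mse = np.mean((model.predict(X_test) - y_test) ** 2)
print(f"test MSE (model assessment): {test_mse:.4f}")
```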
Validation set Approach
Goal: Estimate the test error for a supervised learning method.
Strategy:
Split the data into two parts
Train the method on the first part
Compute the error on the second part
Validation set approach
Estimates can vary considerably with different train-validation splits.
Only a subset of the points is used to evaluate the model.
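The variability is easy to see empirically. The sketch below uses a hypothetical setup (synthetic data, a degree-3 polynomial model) and repeats the split with different random seeds; each split gives a noticeably different validation error.

```python
# Different random train/validation splits give different error estimates.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * x[:, 0]) + rng.normal(scale=0.3, size=60)

for seed in range(5):
    x_tr, x_val, y_tr, y_val = train_test_split(
        x, y, test_size=0.5, random_state=seed)     # a different split each time
    model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
    model.fit(x_tr, y_tr)
    val_mse = np.mean((model.predict(x_val) - y_val) ** 2)
    print(f"split {seed}: validation MSE = {val_mse:.3f}")
```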
Leave one out cross-validation
For every i = 1, . . . , n:
▶ Train the model on every point except i,
▶ Compute the test error on the held-out point.
Average the test errors.
For a regression problem:
CV(n) = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)^2
where ŷ_i is the prediction for the i-th sample, made by a model fit without the i-th sample.
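A minimal sketch of this computation for regression, assuming scikit-learn's LeaveOneOut splitter and a synthetic linear dataset (both assumptions, not from the slides):

```python
# CV(n) for regression: fit the model n times, each time leaving out one
# point, and average the squared errors on the held-out points.
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=30)

squared_errors = []
for train_idx, test_idx in LeaveOneOut().split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    y_hat_i = model.predict(X[test_idx])[0]          # prediction without sample i
    squared_errors.append((y[test_idx][0] - y_hat_i) ** 2)

cv_n = np.mean(squared_errors)                       # CV(n)
print(f"LOOCV estimate of test MSE: {cv_n:.4f}")
```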
CV(n) = (1/n) Σ_{i=1}^{n} 1(y_i ≠ ŷ_i)
... for a classification problem.
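The same loop with 0-1 loss: CV(n) becomes the fraction of held-out points the model misclassifies. Logistic regression on a toy dataset is my own choice here, not something from the slides.

```python
# LOOCV with 0-1 loss for a classification problem.
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=40, n_features=4, random_state=0)

mistakes = []
for train_idx, test_idx in LeaveOneOut().split(X):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    y_hat_i = model.predict(X[test_idx])[0]
    mistakes.append(int(y_hat_i != y[test_idx][0]))  # 1(y_i != ŷ_i)

print(f"LOOCV misclassification rate: {np.mean(mistakes):.3f}")
```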
Leave one out cross-validation
Computing CV(n) can be computationally expensive, since it involves fitting the model n times.
A single linear regression fit takes O(d^3 + nd^2) time.
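As a rough, machine-dependent illustration of the cost difference, the sketch below times 5-fold CV against LOOCV on synthetic data: LOOCV refits the model n times, 5-fold CV only 5 times.

```python
# Timings will vary by machine; the point is the ratio of model fits (n vs. k).
import time
import numpy as np
from sklearn.model_selection import cross_val_score, LeaveOneOut
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n, d = 2000, 20
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + rng.normal(size=n)

for name, cv in [("5-fold CV", 5), ("LOOCV", LeaveOneOut())]:
    start = time.perf_counter()
    cross_val_score(LinearRegression(), X, y,
                    scoring="neg_mean_squared_error", cv=cv)
    print(f"{name}: {time.perf_counter() - start:.2f} s")
```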
k-fold cross-validation
Split the data into k subsets or folds.
For every i = 1, . . . , k:
▶ Train the model on every fold except the i-th fold,
▶ Compute the test error on the i-th fold.
Average the test errors.
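A minimal sketch with k = 5, assuming scikit-learn's KFold splitter and synthetic data:

```python
# k-fold CV: train on k-1 folds, evaluate on the held-out fold, average.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(scale=0.2, size=100)

fold_errors = []
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])   # train on k-1 folds
    fold_errors.append(
        np.mean((model.predict(X[test_idx]) - y[test_idx]) ** 2))  # error on held-out fold

print(f"5-fold CV estimate of test MSE: {np.mean(fold_errors):.4f}")
```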
5-Fold Cross Validation Demo
Given k folds, to determine the quality of a particular hyperparameter (e.g., alpha = 0.1):
Pick a fold, which we'll call the validation fold. Train the model on all but this fold.
Compute the error on the validation fold.
Repeat the step above for all k possible choices of validation fold.
The quality of the hyperparameter is the average of the k validation-fold errors.
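A sketch of this procedure, where alpha is taken to be the regularization strength of ridge regression; the model choice and the synthetic data are assumptions, since the slide does not name them.

```python
# Pick the hyperparameter whose average validation-fold error is lowest.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=100)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for alpha in (0.01, 0.1, 1.0, 10.0):
    fold_errors = []
    for train_idx, val_idx in kf.split(X):
        model = Ridge(alpha=alpha).fit(X[train_idx], y[train_idx])
        fold_errors.append(
            np.mean((model.predict(X[val_idx]) - y[val_idx]) ** 2))
    # Quality of this alpha = average of the k validation-fold errors.
    print(f"alpha={alpha:>5}: mean validation MSE = {np.mean(fold_errors):.4f}")
```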
Picking K
Typical choices of k are 5, 10, and N, where N is the number of data points.
k = N is also known as "leave-one-out cross-validation," and will typically give you the best results.
▶ In this approach, each validation set is only one point.
▶ Every point gets a chance to be used as the validation set.
k = N is also very expensive, requiring you to fit a huge number of models.
Ultimately, the choice of k trades off the quality of the error estimate against computation time.
LOOCV vs. k-fold cross-validation
The k-fold CV estimate depends on how the data happen to be split into folds.
In k-fold CV, we train the model on less data than is available. This introduces bias into the estimates of test error.
In LOOCV, the n training sets strongly resemble one another, so the fitted models are highly correlated. This increases the variance of the test error estimate.