
2025 Slides7 ML Eng

The document is an introduction to Machine Learning (ML), covering its definition, importance, types, and applications. It emphasizes the significance of ML in the era of big data and outlines various learning methods, including supervised, unsupervised, and reinforcement learning. Additionally, it discusses the machine learning project lifecycle, evaluation metrics, and related disciplines.

Uploaded by

23021685

Introduction to Machine Learning

Instructor: Nguyễn Văn Vinh, UET - Hanoi VNU

4/15/2025 1
Machine Learning and AI
• Improve task performance through observation and teaching
• Acquire knowledge automatically for use in a task
• Learning as a key component of intelligence

[Figure: Artificial Intelligence, Machine Learning, and Deep Learning / Generative AI]

Source Materials
• Textbooks
– Murphy (2022). Probabilistic Machine Learning: An Introduction
– Bishop (2006). Pattern Recognition and Machine Learning
– Hal Daumé III (2017). A Course in Machine Learning
– Vietnamese (2 books)
• Học máy (Machine Learning), Assoc. Prof. Hoàng Xuân Huấn (2015)
• Học máy (Machine Learning), Assoc. Prof. Đinh Mạnh Tường (2016)
• Online courses and course videos
– Machine Learning Specialization (Funix.edu.vn University)
– Andrew Ng (Stanford) on Coursera and YouTube
– http://machinelearningcoban.com/
– DeepLearning.ai
– Fast.ai (Machine Learning + Deep Learning for coders)

A few quotes
• “A breakthrough in machine learning would be worth
ten Microsofts” (Bill Gates, Chairman, Microsoft)
• “Machine learning is the next Internet” (Tony Tether,
Director, DARPA)
• “Machine learning is the hot new thing” (John
Hennessy, President, Stanford)
• “Web rankings today are mostly a matter of machine
learning” (Prabhakar Raghavan, Dir. Research, Yahoo)
• “Machine learning is going to result in a real
revolution” (Greg Papadopoulos, CTO, Sun)

Why is ML important?
• Big data era
– Many hours of video are uploaded to YouTube and TikTok every second.
– Approximately 328.77 million terabytes of data are created each day.
– 181 zettabytes of data are projected to be generated in 2025.
• ML lets us develop systems that are too difficult or expensive to construct manually because they require specific detailed skills or knowledge tuned to a specific task (the knowledge engineering bottleneck).
Source: https://explodingtopics.com/blog/data-generated-per-day

ImageNet (better than human)
• The error rate of the best image classification system fell from 26% in 2011 to 3.1% in 2016, which is on par with or lower than the human error rate.

ImageNet

Speech Recognition

ChatGPT 4.5 / Gemini 2.5 / Llama 4 / DeepSeek-R1

GPT-o1

What is Machine Learning?
• Machine Learning (ML) is a branch of artificial intelligence (AI) and computer science
• Definition of Machine Learning:
– A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. [Mitchell, 1997]
• Formulating the learning task [Mitchell, 1997]
– Improve on task T
– With respect to performance metric P
– Based on experience E

Examples for ML (2)

• Web page classifier
– T: Classify web pages by specified topic.
– P: Percentage (%) of web pages correctly classified.
– E: A database of web pages, each labeled with its topic.

Examples for ML (3)

• Face detection
– T: Detect faces in given pictures.
– P: Percentage (%) of faces correctly detected in a picture.
– E: A database of labeled human faces.

Face Detection

Applications
• Natural Language Processing
• Pattern Recognition: ASR, OCR
• Computer Vision
• Search engines, ranking, information retrieval
• Medical diagnosis
• Bioinformatics
• Physics
• Financial fraud detection
• Stock market analysis
• Games
• Robotics

Traditional algorithms

• The program will likely become a long list of complex rules, which makes it hard to maintain.

Machine learning algorithms

• The program is much shorter, easier to maintain, and most likely more accurate.

Machine learning algorithms

• Machine learning allows us to shift the complexity from the program to the data, which is much easier to obtain (either naturally occurring or via crowdsourcing).

Machine learning algorithms

• Automatically adapting to change

Machine learning algorithms

• Machine Learning can help humans learn

Types of Learning

Supervised: learning with a labeled training set
Example: email classification with already labeled emails

Unsupervised: discovering patterns in unlabeled data
Example: clustering similar documents based on their text

Reinforcement learning: learning to act based on feedback/reward
Example: learning to play Go, where the reward is winning or losing

[Figure: example tasks: classification, regression, clustering, anomaly detection, sequence labeling]

Supervised Learning

Classification

Regression

Unsupervised Learning

• An unlabeled training set for unsupervised learning

Reinforcement Learning

Reinforcement Learning

Example: Image Classifier
input desired output

apple

pear

tomato

cow

dog

horse

Training data

apple

pear

tomato

cow

dog

horse

Architecture of ML System
Learning (training): training samples → features → training (with training labels) → learned model
Inference (prediction): test sample → features → learned model → prediction

Example (Raw or Ripe)

Example (Raw or Ripe)

Importance of the Problem dimensions

Learning phases
• Training: we learn a predictive function f by optimizing it so that it predicts well on the training set.
• Use for prediction: we can then use f on new (test) inputs that were not part of the training set.
The GOAL of learning is NOT to learn the training set perfectly (memorize it).
What matters is the predictor's ability to generalize well to new (future) cases.

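The difference between memorizing and generalizing can be sketched in a few lines of Python. The hidden rule y = 2x + 1, the data points, and the helper names (memorizer, fit_line, mean_squared_error) are assumptions made for this illustration; they do not come from the slides.

```python
# Toy data generated by a hidden rule (assumed for illustration): y = 2*x + 1.
train_set = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]
test_set = [(5, 11), (6, 13), (7, 15)]  # inputs never seen during training

# Predictor 1: memorize the training set; answer 0 for anything unseen.
lookup = dict(train_set)

def memorizer(x):
    return lookup.get(x, 0)

# Predictor 2: fit a line f(x) = w*x + b by least squares on the training set.
def fit_line(data):
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return lambda x: w * x + b

def mean_squared_error(predict, data):
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

line = fit_line(train_set)
for name, predictor in [("memorizer", memorizer), ("fitted line", line)]:
    print(name,
          "train MSE:", mean_squared_error(predictor, train_set),
          "test MSE:", mean_squared_error(predictor, test_set))
```

Both predictors reach zero error on the training set, but only the fitted line keeps a low error on the unseen test inputs.
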
Expected risk vs. empirical risk

• Generalization error = expected risk: the expected loss on new examples drawn from the true data distribution.
• Empirical risk: the average loss on a finite dataset.

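In the usual notation, the two quantities can be written as follows. The symbols are assumptions chosen to match the rest of the deck: f is the predictor, L the loss function, p(x, y) the unknown data distribution, and (x_i, y_i), i = 1, …, n, the examples in the finite dataset.

```latex
R(f) = \mathbb{E}_{(x,y) \sim p(x,y)} \big[ L(f(x), y) \big]
\qquad
\hat{R}_n(f) = \frac{1}{n} \sum_{i=1}^{n} L(f(x_i), y_i)
```
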
Empirical risk minimization
Examples (x, y) are assumed to be drawn i.i.d. from an unknown true distribution p(x, y) (nature or an industrial process).
• We'd love to find a predictor that minimizes the generalization error (the expected risk).
• But we can't even compute it! (It is an expectation over an unknown distribution.)
• Instead: the empirical risk minimization principle. Find the predictor that minimizes the average loss over a training set.

This is the training phase in ML.

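A minimal sketch of empirical risk minimization over a small, finite set of candidate predictors. The data values, the threshold classifiers, and the 0-1 loss are all assumptions chosen to keep the example short; the slides do not prescribe a particular family or loss.

```python
# Training set of (x, y) pairs with y in {0, 1}; values are made up for illustration.
train_set = [(0.5, 0), (1.0, 0), (1.5, 0), (1.8, 1), (2.5, 1), (3.0, 1), (3.5, 1)]

# A small, finite family of candidate predictors: f_t(x) = 1 if x >= t else 0.
candidate_thresholds = [0.0, 1.0, 2.0, 3.0, 4.0]

def predict(threshold, x):
    return 1 if x >= threshold else 0

def zero_one_loss(prediction, y):
    return 0 if prediction == y else 1

def empirical_risk(threshold, data):
    """Average loss of the predictor f_threshold over the dataset."""
    return sum(zero_one_loss(predict(threshold, x), y) for x, y in data) / len(data)

# Empirical risk minimization: keep the candidate with the lowest average training loss.
best_threshold = min(candidate_thresholds, key=lambda t: empirical_risk(t, train_set))
best_risk = empirical_risk(best_threshold, train_set)
print("best threshold:", best_threshold, "empirical risk:", best_risk)
```

With these values the threshold 2.0 is selected; it misclassifies only one of the seven training points (x = 1.8).
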
Learning Function f

A machine learning algorithm

• The choice of a specific function family F (often a parameterized family).
• A way to evaluate the quality of a function f ∈ F (typically using a cost (or loss) function L).
• A way to search for the “best” function f ∈ F (typically an optimization of the function parameters to minimize the overall loss over the training set).

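The three ingredients can be sketched for one concrete, assumed choice: the family of lines f(x) = w*x + b, the squared-error loss, and plain gradient descent as the search procedure. The toy data, learning rate, and step count are illustrative assumptions.

```python
# Toy training set generated by y = 3*x - 2 (assumed for illustration).
train_set = [(-2, -8), (-1, -5), (0, -2), (1, 1), (2, 4)]

# 1. Function family F: lines f(x) = w*x + b, parameterized by (w, b).
def f(params, x):
    w, b = params
    return w * x + b

# 2. Loss L: squared error, averaged over the training set.
def average_loss(params, data):
    return sum((f(params, x) - y) ** 2 for x, y in data) / len(data)

# 3. Search: gradient descent on the parameters (w, b).
def gradient(params, data):
    n = len(data)
    grad_w = sum(2 * (f(params, x) - y) * x for x, y in data) / n
    grad_b = sum(2 * (f(params, x) - y) for x, y in data) / n
    return grad_w, grad_b

def train(data, learning_rate=0.1, steps=500):
    params = (0.0, 0.0)
    for _ in range(steps):
        grad_w, grad_b = gradient(params, data)
        params = (params[0] - learning_rate * grad_w,
                  params[1] - learning_rate * grad_b)
    return params

w, b = train(train_set)
print("learned w, b:", w, b, "training loss:", average_loss((w, b), train_set))
```

Swapping any one ingredient (a richer family, a different loss, another optimizer) gives a different learning algorithm while the overall structure stays the same.
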
Ex. of parameterized function families

Capacity of a learning algorithm

• Choosing a specific Machine Learning algorithm means choosing a specific function family F (e.g. HMM, NN, SVM, …).
• How big (rich, flexible, expressive, complex) that family is defines what is informally called the “capacity” of the ML algorithm.
Ex: Capacity(F_polynomial) > Capacity(F_linear)

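One way to see the capacity ordering is that a degree-2 polynomial family contains every line, so its best training error can never be worse. The sketch below assumes a curved toy target y = x² and uses a hypothetical helper, fit_polynomial, that solves the least-squares normal equations with exact fractions.

```python
from fractions import Fraction

def fit_polynomial(xs, ys, degree):
    """Least-squares fit of a polynomial of the given degree.

    Solves the normal equations with exact rational arithmetic and returns
    the coefficients [w0, w1, ..., w_degree].
    """
    m = degree + 1
    A = [[sum(Fraction(x) ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(Fraction(y) * Fraction(x) ** i for x, y in zip(xs, ys)) for i in range(m)]
    for col in range(m):  # Gauss-Jordan elimination
        pivot = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(m):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return [b[i] / A[i][i] for i in range(m)]

def training_error(coeffs, xs, ys):
    """Mean squared error of the polynomial on the training points."""
    predictions = [sum(w * Fraction(x) ** i for i, w in enumerate(coeffs)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(predictions, ys)) / len(xs)

xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]  # curved target, y = x^2

linear = fit_polynomial(xs, ys, degree=1)
quadratic = fit_polynomial(xs, ys, degree=2)
print("linear training error:   ", training_error(linear, xs, ys))
print("quadratic training error:", training_error(quadratic, xs, ys))
```

The best line leaves a non-zero training error on this data, while the quadratic family fits it exactly. Higher capacity lowers training error, which is not the same as better generalization.
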
Popular classifiers: their parameters and hyperparameters

Tuning the capacity

• Capacity must be optimally tuned to ensure good generalization
– by choosing the algorithm and hyperparameters
– to avoid under-fitting and over-fitting.

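A common way to tune capacity is to compare hyperparameter values on held-out validation data. The sketch below uses a k-nearest-neighbour classifier purely as an illustration (the slides do not name it here); the data, the candidate values of k, and the helper names are assumptions.

```python
# Labeled 1-D data, made up for illustration. The point x=3 carries a noisy label.
train_set = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 0), (6, 0), (7, 1), (8, 1), (9, 1)]
validation_set = [(2.9, 0), (3.2, 0), (6.4, 0), (6.6, 1), (7.4, 1), (8.6, 1)]

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes_for_one = sum(label for _, label in neighbors)
    return 1 if 2 * votes_for_one > k else 0

def accuracy(train, data, k):
    return sum(knn_predict(train, x, k) == y for x, y in data) / len(data)

# Small k = high capacity (can memorize noise); large k = low capacity.
candidate_ks = [1, 3, 9]
for k in candidate_ks:
    print("k =", k,
          "train accuracy:", round(accuracy(train_set, train_set, k), 2),
          "validation accuracy:", round(accuracy(train_set, validation_set, k), 2))

# Pick the hyperparameter value that does best on held-out data.
best_k = max(candidate_ks, key=lambda k: accuracy(train_set, validation_set, k))
print("selected k:", best_k)
```

Here k = 1 fits the training set perfectly but copies the noisy label (over-fitting), k = 9 always predicts the majority class (under-fitting), and k = 3 does best on the validation set.
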
Analysis of generalization error

• Bias and variance problem

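For squared loss, the standard bias-variance decomposition of the expected test error at a point x reads as follows. The notation is an assumption, not taken from the slides: y = f(x) + ε with zero-mean noise of variance σ², and \hat{f} is the predictor learned from a random training set (the expectation is over training sets and noise).

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{noise}}
```
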
Under- and over-parameterization

Bias-variance trade-off in classical Machine Learning and Deep Learning
• Classical ML shows the expected curve: as the number of parameters in the model increases, the test error eventually starts to increase again. DL, however, can show a double descent curve: a certain amount of overparameterization is needed before the test error decreases in a sustained way.

No Free Lunch Theorem

Machine Learning Project lifecycle

Evaluation of ML system

• Classification Accuracy
• Solution correctness
• Solution quality (length, efficiency)
• Speed of performance

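Two of the criteria above, classification accuracy and speed, are easy to measure directly. The following is a small sketch; the toy spam classifier, the test sentences, and the helper names are hypothetical and serve only to make the example runnable.

```python
import time

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def timed_predictions(predict, inputs):
    """Run the predictor on every input and report the elapsed wall-clock time."""
    start = time.perf_counter()
    predictions = [predict(x) for x in inputs]
    elapsed = time.perf_counter() - start
    return predictions, elapsed

# Hypothetical spam classifier and test set, made up for illustration.
def toy_classifier(text):
    return "spam" if "free" in text.lower() else "ham"

test_inputs = ["Free tickets now", "Meeting at 10", "free money",
               "Lunch tomorrow?", "Project update"]
test_labels = ["spam", "ham", "spam", "ham", "spam"]

predictions, seconds = timed_predictions(toy_classifier, test_inputs)
print("accuracy:", accuracy(predictions, test_labels))
print("seconds for", len(test_inputs), "predictions:", seconds)
```

The accuracy is computed on test examples only; the timing figure depends on the machine and is meaningful mainly for comparing systems under the same conditions.
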
Why Study Machine Learning?
The Time is Ripe
• Many basic, effective, and efficient algorithms are available.
• Large amounts of online data are available.
• Large amounts of computational resources are available.

Related Disciplines
• Artificial Intelligence
• Data Mining
• Probability and Statistics
• Information theory
• Numerical optimization
• Computational complexity theory
• Control theory (adaptive)
• Psychology (developmental, cognitive)
• Neurobiology
• Linguistics
• Philosophy

Data & Source of Training Data
• Data: the fuel powering AI and digital transformation
• High-quality data is as critical to the success of your AI solution as software quality is to your mission-critical programs.
• Standard datasets (UCI, LDC, Penn Treebank, …)
• Website for data mining and machine learning competitions: https://www.kaggle.com/
• The learner can construct an arbitrary example and query an expert for its label.

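The last source of training data, where the learner builds its own examples and asks for labels, is often studied under the name active learning. A minimal sketch, assuming a one-dimensional task with a hidden decision threshold and a function expert_label standing in for the human expert (both hypothetical):

```python
HIDDEN_THRESHOLD = 0.37  # known only to the expert; value assumed for illustration

def expert_label(x):
    """Stand-in for a human expert: returns the true label of any queried example."""
    return 1 if x >= HIDDEN_THRESHOLD else 0

def learn_threshold(query, low=0.0, high=1.0, num_queries=20):
    """Learn a 1-D decision threshold by constructing examples and querying their labels."""
    for _ in range(num_queries):
        candidate = (low + high) / 2  # the learner constructs this example itself
        if query(candidate) == 1:
            high = candidate  # threshold is at or below the candidate
        else:
            low = candidate   # threshold is above the candidate
    return (low + high) / 2

estimate = learn_threshold(expert_label)
print("estimated threshold:", estimate)
```

Because the learner chooses which examples to ask about, each query halves the remaining uncertainty, so far fewer labels are needed than with randomly collected examples.
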
Summary

• ML is everywhere.
• ML is a fascinating, fast-moving area to study and apply right now.
• There's no magic in ML! It's all about representation, optimization, probability, and linear algebra.

References

• Chapter 1, Murphy (2022). Probabilistic Machine Learning: An Introduction
• Slides from machine learning courses for CS at universities around the world and from ML summer schools.
