2025 Slides7 ML Eng
Machine Learning
4/15/2025 1
Machine Learning and AI
• Improve task performance through
observation, teaching
[Figure labels: Artificial Intelligence, Machine Learning, Deep Learning / Generative AI]
Source Materials
• Textbooks
– Murphy (2022). Probabilistic Machine Learning: An Introduction
– Bishop (2006). Pattern Recognition and Machine Learning
– Hal Daumé III (2017). A Course in Machine Learning
– Vietnamese (2 books)
• Học máy (Machine Learning), Assoc. Prof. Hoàng Xuân Huấn (2015)
• Học máy (Machine Learning), Assoc. Prof. Đinh Mạnh Tường (2016)
• Online Courses and Course Videos
– Machine Learning Specialization (Funix.edu.vn University)
– Andrew Ng (Stanford) on Coursera and YouTube
– http://machinelearningcoban.com/
– DeepLearning.ai
– Fast.ai (Machine Learning + Deep Learning for coders)
A few quotes
• “A breakthrough in machine learning would be worth
ten Microsofts” (Bill Gates, Chairman, Microsoft)
• “Machine learning is the next Internet” (Tony Tether,
Director, DARPA)
• “Machine learning is the hot new thing” (John
Hennessy, President, Stanford)
• “Web rankings today are mostly a matter of machine
learning” (Prabhakar Raghavan, Dir. Research, Yahoo)
• “Machine learning is going to result in a real
revolution” (Greg Papadopoulos, CTO, Sun)
Why is ML important?
• Big Data era
– Many hours of video are uploaded to YouTube and TikTok
every second.
– Approximately 328.77 million terabytes of data are
created each day.
– 181 zettabytes of data will be generated in 2025
• Develop systems that are too difficult/expensive
to construct manually because they require
specific detailed skills or knowledge tuned to a
specific task (knowledge engineering bottleneck).
Source: https://explodingtopics.com/blog/data-generated-per-day
ImageNet (better than human)
The error rate of the best image classification system fell
from 26% in 2011 to 3.1% in 2016, matching and then dropping
below the human error rate.
ImageNet
Speech Recognition
ChatGPT 4.5 / Gemini 2.5 / Llama 4 / DeepSeek-R1
OpenAI o1
What is Machine Learning?
• Machine Learning (ML) is a branch of artificial intelligence
(AI) and computer science
• Definition of Machine Learning:
– A computer program is said to learn from experience E with respect to
some class of tasks T, and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E. [Mitchell,
1997]
• Formulate the Learning Task [Mitchell, 1997]
– Improve on task T
– With respect to performance metric P
– Based on experience E
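A minimal sketch of the T / P / E formulation, under assumptions that are not in the slides: the task T is separating two synthetic 2D Gaussian classes, the performance measure P is accuracy on a held-out test set, and the experience E is the number of training examples per class. The nearest-centroid learner is only a stand-in; any learner could be used here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n_per_class):
    """Draw n_per_class points per class from two unit-variance 2D Gaussians."""
    x0 = rng.normal(loc=-1.0, size=(n_per_class, 2))  # class 0, centered at (-1, -1)
    x1 = rng.normal(loc=+1.0, size=(n_per_class, 2))  # class 1, centered at (+1, +1)
    x = np.vstack([x0, x1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return x, y

def fit(x, y):
    """Learning: store the mean point (centroid) of each class."""
    return np.stack([x[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, x):
    """Assign each point to the class of its nearest centroid."""
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

# Fixed test set used to measure P for every amount of experience E.
x_test, y_test = sample(2500)

performance = {}
for experience in (1, 10, 100, 1000):
    x_train, y_train = sample(experience)
    model = fit(x_train, y_train)
    performance[experience] = float((predict(model, x_test) == y_test).mean())
    print(f"E = {experience:4d} examples per class -> P = {performance[experience]:.3f}")
```

With very little experience the measured accuracy depends heavily on which examples happen to be drawn; with more examples it typically settles near the best accuracy this model can reach on this data.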
Examples for ML (2)
• Web classifier
– T: Classify web pages into specified topics.
– P: Percentage (%) of web pages classified correctly
– E: A database of web pages, each labeled with its topic
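The performance measure P in this example is a plain percentage of correct classifications. A small sketch of how it could be computed; the function name `percent_correct` and the topic labels are hypothetical, chosen only for illustration.

```python
def percent_correct(true_topics, predicted_topics):
    """Return the percentage of items whose predicted topic matches the true one."""
    if len(true_topics) != len(predicted_topics):
        raise ValueError("both lists must have the same length")
    if not true_topics:
        raise ValueError("at least one labeled page is required")
    correct = sum(t == p for t, p in zip(true_topics, predicted_topics))
    return 100.0 * correct / len(true_topics)

# Hypothetical labels for four web pages and a classifier's guesses.
true_topics = ["sports", "news", "sports", "tech"]
predicted_topics = ["sports", "news", "tech", "tech"]

score = percent_correct(true_topics, predicted_topics)
print(f"P = {score:.1f}% of pages classified correctly")
```

Here three of the four pages are classified correctly, so P is 75%.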
Examples for ML (3)
• Face Detection
– T: Detect faces in given pictures.
– P: Percentage (%) of faces detected correctly
in the pictures.
– E: A database of pictures in which the human faces have been labeled.
Face Detection
Applications
• Natural Language Processing
• Pattern Recognition: ASR, OCR
• Computer Vision
• Search Engines, Ranking, Information Retrieval
• Medical diagnosis
• Bioinformatics
• Physics
• Financial fraud detection
• Stock market analysis
• Games
• Robotics
Traditional algorithms
Machine learning algorithms
• Machine Learning
can help humans
learn
Types of Learning
[Figure: types of learning. Surviving labels: class A, Anomaly Detection, Sequence labeling, …]
Supervised Learning
Classification
Regression
Unsupervised Learning
Reinforcement Learning
Example: Image Classifier
input (image) → desired output (label)
apple
pear
tomato
cow
dog
horse
Training data
apple
pear
tomato
cow
dog
horse
Architecture of ML System
• Learning (training): training samples + labels → features → training → learned model
• Inference (prediction): test sample → features → learned model → prediction
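The two phases of such a system, learning from labeled training samples and then predicting for a test sample, can be sketched as follows. Everything concrete here is an assumption made for illustration: the fruit samples, the `redness` and `firmness` features, and the choice of a nearest-mean model.

```python
def extract_features(sample):
    """Map a raw sample to a feature vector (hypothetical features)."""
    return (sample["redness"], sample["firmness"])

def train(samples, labels):
    """Learning phase: average the feature vectors of each label."""
    grouped = {}
    for sample, label in zip(samples, labels):
        grouped.setdefault(label, []).append(extract_features(sample))
    return {
        label: tuple(sum(column) / len(column) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }

def predict(model, sample):
    """Inference phase: return the label whose mean vector is closest."""
    features = extract_features(sample)

    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, model[label]))

    return min(model, key=squared_distance)

# Training samples and their labels (made-up values in [0, 1]).
training_samples = [
    {"redness": 0.9, "firmness": 0.2},
    {"redness": 0.8, "firmness": 0.3},
    {"redness": 0.2, "firmness": 0.9},
    {"redness": 0.3, "firmness": 0.8},
]
training_labels = ["ripe", "ripe", "unripe", "unripe"]

model = train(training_samples, training_labels)

# A test sample that was not part of the training set.
prediction = predict(model, {"redness": 0.7, "firmness": 0.4})
print(prediction)
```

The same `extract_features` function is used in both phases, which mirrors the shared "features" step in the diagram.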
Example (Unripe or Ripe)
Importance of the Problem dimensions
Learning phases
• Training: we learn a predictive function f by
optimizing it so that it predicts well on the
training set.
• Use for prediction: we can then use f on new
(test) inputs that were not part of the training
set.
The GOAL of learning is NOT to learn the training set
perfectly (memorize it).
What matters is the ability of the predictor to
generalize well to new (future) cases.
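A toy illustration of memorizing versus generalizing, under assumptions chosen only to keep the example deterministic: the inputs are the integers 0 to 99, the true label is whether x is at least 50, even numbers form the training set, and odd numbers form the test set.

```python
def true_label(x):
    """Ground truth for this toy problem (an assumption of the example)."""
    return x >= 50

train_set = [(x, true_label(x)) for x in range(0, 100, 2)]  # even inputs
test_set = [(x, true_label(x)) for x in range(1, 100, 2)]   # odd inputs, never seen in training

# Predictor 1: memorize the training set in a lookup table.
table = dict(train_set)

def memorizer(x):
    # Unseen inputs fall back to a fixed default guess.
    return table.get(x, False)

# Predictor 2: learn a single threshold between the two classes.
largest_negative = max(x for x, y in train_set if not y)
smallest_positive = min(x for x, y in train_set if y)
threshold = (largest_negative + smallest_positive) / 2

def threshold_rule(x):
    return x > threshold

def accuracy(predictor, data):
    return sum(predictor(x) == y for x, y in data) / len(data)

for name, predictor in (("memorizer", memorizer), ("threshold", threshold_rule)):
    print(name,
          "train:", accuracy(predictor, train_set),
          "test:", accuracy(predictor, test_set))
```

Both predictors are perfect on the training set, but only the threshold rule carries over to the unseen test inputs; the memorizer does no better than a constant guess there.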
Expected risk vs. empirical risk
Empirical risk minimization
Examples (x, y) are assumed to be drawn i.i.d. from an unknown
true distribution p(x, y) (nature or an industrial process)
• We’d love to find a predictor that minimizes the
generalization error (the expected risk)
• But we can’t even compute it! (It is an expectation over an
unknown distribution.)
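The two quantities can be written out as follows. The notation is an assumption, since the slide text gives no symbols beyond (x, y) and p(x, y): f is the predictor, L is a loss function, D_n is a training set of n examples, and F is the set of candidate predictors.

```latex
% Expected risk (generalization error): an expectation over the unknown p(x, y)
R(f) = \mathbb{E}_{(x, y) \sim p}\big[ L(f(x), y) \big]
     = \int L(f(x), y)\, p(x, y)\, dx\, dy

% Empirical risk: the average loss over a training set D_n = \{(x_i, y_i)\}_{i=1}^{n}
\hat{R}(f, D_n) = \frac{1}{n} \sum_{i=1}^{n} L(f(x_i), y_i)

% Empirical risk minimization: pick the predictor in F with the lowest empirical risk
\hat{f} = \arg\min_{f \in F} \hat{R}(f, D_n)
```

The empirical risk is computable because it only needs the observed examples, which is why it stands in for the expected risk during training.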
A machine learning algorithm
Capacity of a learning algorithm
Tuning the capacity
Analysis of generalization error
Under- and over-parameterization
Bias-variance trade-off of Classical Machine
Learning and Deep Learning
• Classical ML shows the expected U-shaped curve: as the
number of parameters in the model increases, the test error
eventually starts to rise again. DL, however, shows a double
descent curve: a certain amount of overparameterization is
necessary to decrease the test error sustainably.
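A small experiment for the classical part of this picture, using polynomial degree as a stand-in for the number of parameters. The target function, noise level, and sample sizes are assumptions of this sketch, and it does not attempt to reproduce double descent.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of sin(2*pi*x) on [0, 1] (a synthetic target, assumed here)."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=n)
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(1000)

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

train_mse, test_mse = {}, {}
for degree in (1, 3, 9):
    # Least-squares polynomial fit; a higher degree means more parameters.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse[degree] = mse(coeffs, x_train, y_train)
    test_mse[degree] = mse(coeffs, x_test, y_test)
    print(f"degree {degree}: train MSE {train_mse[degree]:.4f}, "
          f"test MSE {test_mse[degree]:.4f}")
```

Training error can only go down as the degree grows, because each larger model contains the smaller ones. Test error is the quantity to watch: the degree-1 model underfits, and whether the degree-9 model overfits on a given draw depends on the noise and the sample, so the printed numbers should be read as one run rather than a general result.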
No Free Lunch Theorem
Machine Learning Project lifecycle
Evaluation of ML system
• Classification Accuracy
• Solution correctness
• Solution quality (length, efficiency)
• Speed of performance
Why Study Machine Learning?
The Time is Ripe
• Many basic effective and efficient
algorithms available.
• Large amounts of on-line data available.
• Large amounts of computational resources
available.
Related Disciplines
• Artificial Intelligence
• Data Mining
• Probability and Statistics
• Information theory
• Numerical optimization
• Computational complexity theory
• Control theory (adaptive)
• Psychology (developmental, cognitive)
• Neurobiology
• Linguistics
• Philosophy
Data & Source of Training Data
• Data: The Fuel Powering AI & Digital
Transformation
• High-quality data is as critical to the success of
your AI solution as software quality is to your
mission-critical programs.
• Standard datasets (UCI, LDC, Penn Treebank, …)
• Website for data mining and machine learning
competitions: https://www.kaggle.com/
• A learner can construct an arbitrary example and
query an expert for its label.
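The last point, a learner that builds its own examples and asks an expert for their labels, can be sketched for a one-dimensional problem. The setup is an assumption of this example: the expert labels a number in [0, 1] as positive when it is at or above a hidden threshold, and the learner chooses its queries by bisection.

```python
def make_expert(hidden_threshold):
    """Return a labeling function plus a counter of how often it was queried."""
    calls = {"n": 0}

    def expert(x):
        calls["n"] += 1
        return x >= hidden_threshold

    return expert, calls

def learn_threshold(expert, lo=0.0, hi=1.0, tol=1e-3):
    """Locate the decision threshold by querying constructed examples.

    Assumes the expert labels lo as negative and hi as positive.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2   # construct a new example
        if expert(mid):       # query the expert for its label
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

expert, calls = make_expert(hidden_threshold=0.37)  # 0.37 is an arbitrary choice
estimate = learn_threshold(expert)
print(f"estimated threshold {estimate:.4f} after {calls['n']} queries")
```

Because the learner chooses which examples to label, each query halves the remaining uncertainty, so a precision of 0.001 takes 10 queries here instead of the many randomly drawn labeled examples that would otherwise be needed.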
Summary
• ML is everywhere
• ML is a fascinating, fast-moving area to
study and apply right now.
• There's no magic in ML! It's all about
representation, optimization, probability and
linear algebra.
References