ME5418 – Machine Learning in Robotics
SARTORETTI Guillaume, PhD
Assistant Professor
Mechanical Engineering, NUS
© Copyright National University of Singapore. All Rights Reserved.
Outline
1. Course Introduction and Mechanics
Instructor, Resources, Mechanics, Assignments, Learning Objectives
2. Introduction to Machine Learning
Artificial Intelligence, Machine Learning, Deep Learning
3. Robotics and ML
Overview and recent successes
About me…
PhD, EPFL (Switzerland), 2016.
Postdoc, Biorobotics Lab, Carnegie Mellon University, USA, 2016-2019.
Assistant Professor (MARMot Lab), NUS, 2019-now.
Research interests: multi-agent systems (articulated robots & multi-robot systems), decentralized/distributed control, artificial intelligence.
Office: E2-02-01, Phone: 6516 4882, Email: mpegas@nus.edu.sg
Website: http://marmotlab.org
Topics
Markov Decision Processes and Tabular Methods
Value Estimation, Q-Learning
Neural Networks
MLPs, CNNs, RNNs, etc.
Deep Reinforcement Learning (dRL)
DQNs, Policy Gradient Methods
Advanced dRL and Robotics Applications
Distributed RL, MBRL, MARL
Motion Planning, Articulated Robots
Python and Deep Learning Libraries
Self-Learning
There is already a lot of content in this course
Wide variance in prior knowledge/experience
Online Python tutorials are there for you :-)
For example: https://docs.python.org/3/tutorial/index.html
What level is required?
Example of code: https://github.com/marmotlab/ME5406_exampleSAPP
In particular, the sapp_gym.py file.
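To gauge the level required, here is a minimal sketch of a Gym-style environment in the spirit of that example (class and method names here are illustrative, not the contents of sapp_gym.py):

```python
import numpy as np

class ToyGridEnv:
    """Minimal Gym-style environment: an agent on a 1-D track, goal at the right end."""

    def __init__(self, length=10):
        self.length = length
        self.position = 0

    def reset(self):
        # Put the agent back at the left end and return the initial observation.
        self.position = 0
        return self.position

    def step(self, action):
        # action: 0 = move left, 1 = move right.
        move = 1 if action == 1 else -1
        self.position = int(np.clip(self.position + move, 0, self.length - 1))
        done = self.position == self.length - 1     # episode ends at the goal
        reward = 1.0 if done else -0.01             # small step penalty, goal bonus
        return self.position, reward, done, {}

# Roll out one episode with random actions.
env = ToyGridEnv()
obs, done = env.reset(), False
while not done:
    obs, reward, done, _ = env.step(np.random.randint(2))
```

If reading and writing code at this level feels comfortable, you have the Python background the course assumes.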
Main Resources
Richard Sutton and Andrew Barto. Reinforcement Learning: An Introduction. MIT Press, 2018 (with newer revisions since then).
Available online: https://www.andrew.cmu.edu/course/10-703//textbook/BartoSutton.pdf
Videos and Readings (see CANVAS Syllabus).
Other references will be announced in class and/or on CANVAS.
Course Mechanics
Flipped Classroom: Learning by doing
Weekly pre-class Videos/Readings & Quizzes.
Lecture time: Hands-on interactive activities (code review, demos, and exercises).
Assessment Modes
Weekly Quizzes (10%)
Unlimited attempts, highest one counts.
Project Proposal and Peer Feedback exercise (10%)
All details and deadlines available on CANVAS.
Final Project (15% + 15% + 15% + 35% = 80%)
Self-proposed team-based project
3 milestones (15% each), then final submission end of reading week.
Final Code/Video (same for each team), but individual report.
All course details, videos/readings, quizzes, assignments, and
lecture/code material will be found on CANVAS.
Key Learning Outcomes (KLO)
By the end of this course, you should be able to
1. Explain the main classes of Machine Learning methods,
and their use in robotics.
2. Cast a specific robotic control problem into the
reinforcement learning (RL) framework.
3. Implement and train a deep RL agent for a standard
benchmark task in Python.
4. Describe, diagnose, and debug the standard issues that
can hinder agent learning.
5. Debate the use of AI/data-driven methods for a given
robotic problem.
Learning Alignment
Formative assessments: Weekly quizzes (10%), Proposal + Peer Review (10%). Summative assessments: Gym Code and Report (C&R) (15%), Neural Network C&R (15%), Learning Agent C&R (15%), Project Code and Report (35%).

| Learning outcome | Quizzes (10%) | Proposal + Peer Review (10%) | Gym C&R (15%) | Neural Network C&R (15%) | Learning Agent C&R (15%) | Project C&R (35%) |
| Explain the main classes of Machine Learning methods, and their use in robotics. | xxx | x | | | | x |
| Cast a specific robotic control problem into the reinforcement learning (RL) framework. | xx | | xxx | | | |
| Implement and train a deep RL agent for a standard benchmark task in Python. | | | xx | xx | xx | xxx |
| Describe, diagnose, and debug the standard issues that can hinder agent learning. | x | | x | x | xx | xxx |
| Debate the use of AI/data-driven methods for a given robotic problem. | | xx | | | | xxx |
https://canvas.nus.edu.sg/courses/63269/assignments/syllabus
Outline
1. Course Introduction and Mechanics
Instructor, Resources, Mechanics, Assignments, Learning Objectives
2. Introduction to Machine Learning
Artificial Intelligence, Machine Learning, Deep Learning
3. Robotics and ML
Overview and recent successes
AI, ML, DL
https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/
Artificial Intelligence
Term “officially” coined in 1956, but much older concept.
Oxford Dictionary:
“The theory and development of computer systems able to perform tasks
normally requiring human intelligence, such as visual perception, speech
recognition, decision-making, and translation between languages.”
Early on: heuristics/domain knowledge
Then: “principled” automation/computation
Recently: “brutish”/unexplainable techniques
(AI scope: Narrow AI → General AI (AGI) → Super Intelligence)
A little history of Chess AI
1770: The Turk (Wolfgang von Kempelen), fake chess automaton.
A little history of Chess AI
1770: The Turk (Wolfgang von Kempelen), fake chess automaton.
1967: Mac Hack (Richard Greenblatt), first computer AI
checkmates a (weak) human player in 37 moves.
Limited forward search.
50 handcrafted heuristics (based on Greenblatt’s expertise)
1979: Belle (Ken Thompson), first “National Chess Champion” AI!
Openings library, hash map
180,000 positions/s, horizon depth of up to 9.5 moves!
A little history of Chess AI
1770: The Turk (Wolfgang von Kempelen), fake chess automaton.
1967: Mac Hack (Richard Greenblatt), first computer AI checkmates a (weak) human player.
1979: Belle (Ken Thompson), first “National Chess Champion” AI!
1997: Deep Blue (Hsu Feng-Hsiung, IBM Research), first Chess AI to beat a reigning World Champion in a match: Garry Kasparov.
IBM Supercomputer
Massively parallel tree search
Horizon of up to 20 moves!
Minimax with α-β pruning
Hypertuned evaluation function
(Library of end games…)
A little history of Chess AI
1770: The Turk (Wolfgang von Kempelen), fake chess automaton.
1967: Mac Hack (Richard Greenblatt), first computer AI checkmates a (weak) human player.
1979: Belle (Ken Thompson), first “National Chess Champion” AI!
1997: Deep Blue (Hsu Feng-Hsiung, IBM Research), first Chess AI to beat a reigning World Champion in a match.
2017: AlphaZero (DeepMind): trained policy to guide MCTS.
Hybrid Approach
Superhuman performance
Can adapt to other games
Go
Shogi
etc.
A little history of Chess AI
[Timeline figure:]
AI Infancy (heuristics, low compute)
Golden Age of Tree Search: driven by Moore’s Law (CPU)
Deep Learning: unexplainable, driven by GPU
Stockfish today: α-β tree search + tricks + lookup table (memory)
“7” Levels of AI
1. Rule-Based Systems (“if-else AI”)
2. Context Awareness and Retention (Memory / Learning)
3. Domain-Specific Expertise (Narrow AI) ← us, now
4. Reasoning Machines (Theory of Mind)
5. Self Aware Systems (AGI)
6. Artificial Superintelligence (ASI)
7. Singularity and Transcendence (…)
Machine Learning
Oxford Dictionary:
“The use and development of computer systems that are able to learn
and adapt without following explicit instructions, by using algorithms and
statistical models to analyze and draw inferences from patterns in data.”
“Three” Main categories:
Supervised learning
Unsupervised learning
Reinforcement Learning (RL)
Semi-/self-supervised learning, ...
Supervised Learning
Requires labelled data: ground truth, expert information; correct input/output pairs
Learns by example
Typical tasks/methods: Classification / Regression, Statistical Modelling, Ensemble methods
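To make “learning by example” concrete, here is a minimal supervised-learning sketch (scikit-learn is used purely as an illustration; the course does not prescribe this library or dataset):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Labelled data: 1-D inputs with ground-truth class = 1 whenever x > 0.
X = np.linspace(-5, 5, 100).reshape(-1, 1)   # inputs
y = (X.ravel() > 0).astype(int)              # expert-provided labels (ground truth)

model = LogisticRegression().fit(X, y)       # learn from correct input/output pairs
print(model.predict([[-2.0], [3.0]]))        # -> [0 1]
```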
Unsupervised Learning
Unlabelled data
Tries to find structure/patterns
Learns by association/similarity
Typical tasks/methods: Clustering / Compression, Generative Models, Anomaly Detection
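For contrast, a minimal unsupervised sketch: clustering unlabelled points with k-means (again purely illustrative, using scikit-learn):

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabelled data: two well-separated groups of 2-D points (no labels given).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
               rng.normal(5.0, 0.5, (50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10).fit(X)   # finds structure by similarity alone
print(kmeans.cluster_centers_)                    # roughly (0, 0) and (5, 5)
```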
Reinforcement Learning (RL)
Usually, no dataset: data obtained from free exploration of the agent’s state-action space
Learns from training signal (reward)
Typical framework/methods: Sequential decision-making, Markov Decision Processes, Dynamic Programming, Value Estimation / Policy Methods
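And a minimal RL sketch: tabular Q-learning on a toy 5-state chain, learning purely from the reward signal (all numbers and names here are illustrative; this is the kind of update we will study properly from MDPs onward):

```python
import numpy as np

# Toy deterministic MDP: 5 states in a row, goal at state 4; actions 0 = left, 1 = right.
n_states, n_actions, goal = 5, 2, 4
Q = np.zeros((n_states, n_actions))          # tabular action-value estimates
alpha, gamma, epsilon = 0.1, 0.95, 0.1       # learning rate, discount, exploration rate

for episode in range(500):
    s = 0
    while s != goal:
        # epsilon-greedy action selection (free exploration of the state-action space)
        a = np.random.randint(n_actions) if np.random.rand() < epsilon else int(Q[s].argmax())
        s_next = min(s + 1, goal) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == goal else -0.01             # reward is the only training signal
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))   # greedy policy: states 0-3 should pick action 1 (move right)
```

With only 5 states and 2 actions the table is tiny; the deep RL part of the course replaces this table with a neural network.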
Machine Learning - Summary
https://arshren.medium.com/supervised-unsupervised-and-reinforcement-learning-245b59709f68
Deep Learning
Oxford Dictionary:
“A type of machine learning based on artificial neural networks in which
multiple layers of processing are used to extract progressively higher-level features from data.”
Subset of ML: works with (un)supervised, or RL techniques
Main difference is in the model used to analyze/learn from data
End-to-End DL for Computer Vision
[Figure: input image → trained network (“black magic”) → predicted class “Japanese Spitz”]
Naïve Approach: FC Layers?
Impossibly large network:
Each neuron in the first layer has access to the entire image as input
(200x200x3 px) x (10 layers) x (500 neurons/layer) = 600M parameters!!
(see the quick sanity check below)
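A quick back-of-the-envelope check of that parameter count in Python (numbers taken straight from the slide):

```python
# Numbers from the slide: a 200x200 RGB image and fully-connected layers.
pixels = 200 * 200 * 3                       # 120,000 inputs seen by each neuron
neurons_per_layer = 500
layers = 10

# If every neuron in every layer saw the whole image:
print(f"{pixels * neurons_per_layer * layers:,}")   # 600,000,000 parameters
# Even the first layer alone is already enormous:
print(f"{pixels * neurons_per_layer:,}")            # 60,000,000 parameters
```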
Convolutional Neural Networks
Exploit low-dimensional, spatially-correlated structure:
MLP: one input weight per pixel for every neuron. Large weight matrix, no spatial relationship between pixels.
CNN: only uses neighboring (spatially-related) pixels.
CNNs: usually composed of convolution layers, often interleaved with max-pooling operations to iteratively reduce the dimension of the input (minimal sketch below).
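A minimal sketch of that convolution + max-pooling pattern, written here with PyTorch (layer sizes are illustrative and the course does not mandate this specific library):

```python
import torch
import torch.nn as nn

# Two conv + max-pool stages, then a small classifier head.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # each output only sees a 3x3 neighbourhood
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 200x200 -> 100x100
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 100x100 -> 50x50
    nn.Flatten(),
    nn.Linear(32 * 50 * 50, 2),                   # e.g. Spitz vs Corgi logits
)

x = torch.randn(1, 3, 200, 200)                   # one fake 200x200 RGB image
print(model(x).shape)                             # torch.Size([1, 2])
print(sum(p.numel() for p in model.parameters())) # far fewer weights than the naive MLP
```

Here the convolutional part only needs a few thousand weights; most of the remaining parameters sit in the final classifier layer.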
Training Basics: Example
1. Compute a forward pass of the network to determine the output.
At the beginning of training the network is unsure: .5 Spitz / .5 Corgi.
2. Compute the user-defined loss for that output: it describes the deviation from the correct answer (here, loss = 0.5).
3. Compute a backwards pass through the network, calculating how much to change each parameter to minimize the loss. This is the gradient (of the loss with respect to the parameters).
4. Update the network weights.
After one update the output shifts toward the correct answer: .51 Spitz / .49 Corgi, loss = 0.48.
(A minimal training-loop sketch follows below.)
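Those four steps map almost line-for-line onto code; here is a minimal PyTorch-style sketch with made-up data standing in for the Spitz/Corgi images (illustrative only):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(100, 32), nn.ReLU(), nn.Linear(32, 2))   # toy classifier
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()                 # user-defined loss

x = torch.randn(8, 100)                           # fake batch standing in for images
y = torch.randint(0, 2, (8,))                     # fake labels: 0 = Spitz, 1 = Corgi

for step in range(10):
    logits = model(x)                             # 1. forward pass
    loss = criterion(logits, y)                   # 2. loss: deviation from the correct answer
    optimizer.zero_grad()
    loss.backward()                               # 3. backward pass -> gradient of loss w.r.t. parameters
    optimizer.step()                              # 4. update the network weights
    print(step, round(loss.item(), 3))            # the loss should decrease over the 10 steps
```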
AlexNet (2012)
First network fully taking
advantage of large datasets
+ GPU training
Used dropout + ReLU
+ data augmentation
Trained on 1000 ImageNet
classes
~60M parameters, 8 learned layers (5 convolutional + 3 fully-connected)
Deep Reinforcement Learning (dRL)
Deep Learning + RL
+ Larger-scale problems
+ Better interpolation/generalization
+ More complex data processing
- Approximations / lower precision
- Harder to analyze/guarantee
Outline
1. Course Introduction and Mechanics
Instructor, Resources, Mechanics, Assignments, Learning Objectives
2. Introduction to Machine Learning
Artificial Intelligence, Machine Learning, Deep Learning
3. Robotics and ML
Overview and recent successes
Robotics
What is a robot?
A robot is what you call something before it starts being useful.
– Howie Choset (CMU).
A robot is a physically embodied artificially intelligent agent that
can take actions that have effects on the physical world.
– Anca Dragan (UC Berkeley).
Robot Capabilities
Sense: Robotic Perception
Localization and/or Mapping (SLAM)
Scene Understanding
Move: Motion Planning
Navigation / Obstacle Avoidance
Exploration / Mapping
Articulated Locomotion
Do: Manipulation
Surgery
Assembly / Construction
Pick-and-Place
Robotic Control ≈ Sequential Decision-Making
Robot Types
Static Robots:
robotic arms, static/partial humanoids
Mobile Robots
ground, aerial, surface, underwater
Task Specifics
sensors, actuators, on/offboard compute
autonomy level, degree of collaboration, ...
Dimensions
Nano-/micro-robots
Vacuum cleaner/dog size
Car/Truck size
Larger?
Robot Tasks
+ Surgical robots, bio-inspiration/mimicry, space exploration, ...
Multi-Robot Systems
Cooperative
Path Planning / Obstacle Avoidance
Search / Coverage
Traffic Signal Control
Object Transportation
(Mixed Cooperative-)Competitive
Game AI
Evader-Pursuit
Robotics and AI
First-principle approaches (MPC, MCTS): Explainable & Tractable / Slow & Limited
Data-driven Methods (dRL): Blackbox / Fast & Powerful
Robotics and AI
First-Principle Approaches | Data-driven Methods
Safety-Critical tasks | When conventional methods fail
Explainability | Improved performance/scalability
Domain knowledge | “Easier” development (learning)
Driven by CPU progress | Driven by GPU progress
Hybrid Approaches
Learning agent within principled framework
Guided graph-/tree-search
Neurosymbolic approaches
Safe/Explainable RL
Summary
Artificial Intelligence: heuristics, principled search, hybrid neural
Machine Learning: supervised, unsupervised, RL
Deep Learning: wonders of artificial neural networks
Deep RL: DL + RL
Robotics: Sense, Think, Act
Frequent Control/Replanning ~ Sequential Decision-Making
Deep RL as driver for many recent successes
dRL is not a silver bullet; be critical about if/when to use it!
Let’s see if everything was clear :-)
Um, Actually...
Each of these prompts is nearly correct. Can you find the mistake?
1. In many deep learning applications, a simple one-layer FC may
suffice and should always be tried first.
2. Behaviour Cloning is a famous subset of reinforcement learning,
in which an agent learns to copy the optimal actions of an expert.
3. Why do I need to learn about tree/graph search or MPC?
Everything can be solved by throwing some clever dRL at it.
http://www.incompleteideas.net/IncIdeas/BitterLesson.html
Homework for Next Week
Watch the first video (link on CANVAS, Syllabus)
Markov Decision Processes: Definition, Analysis, and first Approaches
Complete Quiz 1 (graded)
Unlimited attempts, due before next week’s class
Bring your laptop
If possible, dual-boot with Linux (Ubuntu)
At a minimum, install Windows Subsystem for Linux (WSL)
References
https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/
https://builtin.com/artificial-intelligence/chess-ai
https://blog.paessler.com/the-history-of-chess-ai
https://chrisbutner.github.io/ChessCoach/high-level-explanation.html
https://www.the-next-tech.com/artificial-intelligence/future-of-ai-7-stages-of-evolution-you-need-to-know-about/
https://www.chessprogramming.org/Deep_Learning
https://www.deepmind.com/blog/alphazero-shedding-new-light-on-chess-shogi-and-go
https://www.techtarget.com/searchenterpriseai/definition/machine-learning-ML
https://arshren.medium.com/supervised-unsupervised-and-reinforcement-learning-245b59709f68
Alex Krizhevsky et al. Imagenet classification with deep convolutional neural networks. NeurIPS 2012.
Ke Chen et al. Mvlidarnet: Real-time multi-class scene understanding for autonomous driving using multiple views. IROS 2020.
Yuhong Cao et al. ARiADNE: A reinforcement learning approach using attention-based deep networks for exploration. ICRA 2023.
https://www.youtube.com/watch?v=cLu4YKCLhIE
https://www.youtube.com/watch?v=W1LWMk7JB80
https://www.caltech.edu/about/news/new-bioinspired-robot-flies-rolls-walks-and-more
https://www.youtube.com/watch?v=K926HAKRFvw
https://www.youtube.com/watch?v=SEuFfONryL0
https://www.intellspot.com/artificial-intelligence-robots/
Yutong Wang et al. SCRIMP: Scalable Communication for Reinforcement-and Imitation-Learning-Based Multi-Agent Pathfinding. IROS 2023.
https://www.youtube.com/watch?v=Y02juH6BDxo
Guillaume Sartoretti et al. Distributed planar manipulation in fluidic environments. ICRA 2016.
Yutong Wang, Yizhuo Wang, & Guillaume Sartoretti. Full Communication Memory Networks for Team-Level Cooperation Learning. JAAMAS.