Edge AIoT & Microelectronics (EdgeAIoTM) Engineering Skills for Gifted Students
P5L1.1 Getting Familiar with GPU/NPU Architectures,
Edge Platforms and Developer Tools
& Object Detection
Today’s Lesson
❖ Overview of GPU/NPU architecture, applications and types
❖ Getting familiar with GPU/NPU hardware tools and platforms
❖ Interfacing and testing GPU/NPU hardware with a camera
❖ Overview of YOLO object detection
❖ Lab session:
○ Object Detection with AI PC and Jetson AGX Orin GPU Kit
2
Intended Learning Outcomes
At the end of this lesson, students are expected to:
1. Understand the architecture, applications, and types of GPU (Graphics
Processing Unit) and NPU (Neural Processing Unit)
2. Recognise the role of GPU/NPU in accelerating AI and edge computing tasks
3. Get familiar with GPU/NPU hardware tools and platforms, including Jetson
AGX Orin and Ryzen AI NPU
4. Interface and test GPU/NPU hardware with a camera for practical tasks
5. Understand the basics of YOLO object detection for real-time vision
applications
6. Evaluate the suitability of GPU/NPU platforms for different AI workloads and
deployment scenarios
3
Program Structure
First lesson of
Phase 5
4
Further Reading/Recommended Books
[1] Multicore and GPU Programming: An Integrated Approach, Second Edition, by Gerassimos Barlas
[2] GPU Programming with C++ and CUDA, by Paulo Motta
[3] Efficient Processing of Deep Neural Networks, by Vivienne Sze, Yu-Hsin Chen, et al.
[4] https://developer.nvidia.com/embedded/jetpack
[5] https://developer.nvidia.com/embedded/learn/tutorials
5
AMD Ryzen AI
● AMD Ryzen AI features an AMD Ryzen processor core, AMD Radeon integrated
graphics engine, and a dedicated Neural Processing Unit (NPU) based on the AMD
XDNA architecture.
● The NPU is designed to efficiently handle AI tasks, enhancing performance and
enabling new features.
6
What is an NPU?
● A Neural Processing Unit (NPU) is a
specialized processor designed
specifically for AI and Machine Learning
tasks.
● It is optimized to perform AI
computations with outstanding energy
efficiency, making it ideal for
applications such as image recognition,
natural language processing, and other
AI-driven functions that run locally on
your laptop.
● Because the data is processed locally rather than
sent to the cloud, privacy is improved.
7
Ryzen AI processor
8
NPU / CPU / GPU → All in one
9
Cores and Memory
10
Architectural
11
Floating point vs Integer
FP64
12
Block FP NPU
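As a rough illustration of the block floating-point idea named on this slide: a block of values shares a single exponent, and each value keeps only a small integer mantissa. The sketch below is a generic Python illustration under that assumption. The function names, the 8-bit mantissa width, and the rounding scheme are hypothetical choices for teaching purposes, not the exact format used by any particular NPU.

```python
import math


def quantize_block(values, mantissa_bits=8):
    """Quantize a non-empty block of floats to one shared exponent
    plus one signed integer mantissa per value (illustrative scheme)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0] * len(values)
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1
    _, e = math.frexp(max_abs)
    # Pick the exponent so the largest value fits the signed mantissa range
    shared_exp = e - (mantissa_bits - 1)
    scale = 2.0 ** shared_exp
    limit = 2 ** (mantissa_bits - 1) - 1
    # Round to the nearest step and clamp to the representable range
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return shared_exp, mantissas


def dequantize_block(shared_exp, mantissas):
    """Recover approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** shared_exp
    return [m * scale for m in mantissas]
```

For the block 1.0, 0.5, -0.25 with 8-bit mantissas, this sketch picks a shared exponent of -6 and mantissas 64, 32, -16, which reproduce the inputs exactly. Values much smaller than the largest one in the block lose precision, which is the trade-off against storing a full exponent per value.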
13
Versal AI
14
YOLO (You Only Look Once)
15
Object Detection
16
Remember Neural Network?
17
Problem setting
18
YOLO Overview
19
Model Training
20
Model Architecture
21
Model Prediction
22
Suppression and Object Function
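The suppression step can be sketched as greedy non-maximum suppression (NMS), assuming that is the technique this slide refers to: keep the highest-scoring box, drop every remaining box that overlaps it too much (measured by intersection over union, IoU), and repeat. The function names, the (x1, y1, x2, y2) box layout, and the 0.5 threshold are illustrative assumptions, not taken from the slides.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept
```

In practice a detector applies this per class, so that a person box never suppresses an overlapping bicycle box.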
23
Remarkable Achievements in AI
• AI model complexity is increasing dramatically and demands more powerful hardware
24
AI at the Frontier of Autonomous Machines
25
How Much Autonomy Has Been Achieved?
26
Nvidia: The AI Computing Company
GPU Computing Visual Computing Artificial Intelligence
27
Nvidia GPU: More than Graphics
• Huge Capital
28
Nvidia GPU: Ecosystem
Amazon, Baidu, eBay, Facebook, Flickr, Google, iQIYI, JD.com, Microsoft, Netflix,
Periscope, Pinterest, Qihoo 360, Shazam, Skype, Sogou, Tencent, Twitter, Yandex, Yelp
AI-powered Consumer Services, AI-as-a-Service, AI for Enterprise, AI for Auto, >1,500 AI Startups
29
Various GPU Platforms
30
Why Edge AI?
31
Why Edge AI?: Huge demand
32
Edge AI Smart Industries
33
Edge AI Redefines Robotics
34
JETSON : AI AT THE EDGE
Serge Palaric, VP Sales & Marketing EMEA - Embedded
Nvidia Jetson Family: Edge GPU for AI
• Jetson is a family of GPU-based modules for edge AI
• The modules combine high performance with low power consumption
• All modules share the same architecture
• Performance and memory vary across the family
36
37
Jetson Ecosystem
38
Open Framework Support
39
Nvidia Jetson Family Edge GPU
• Available in a wide range of
performance, power-efficiency, and
form factors.
40
Jetson family Boards
41
Jetson family Comparison
42
Jetson Orin Boards Specifications
43
Example Use Cases
44
Example Use Cases
45
Example Use Cases
46
AI Redefines Robotics
47
Jetson AGX Orin
48
https://www.youtube.com/watch?v=eFgsOeHMAW4&t=2s
Jetson AGX Orin
• Server-class AI performance at the edge
• Up to 275 TOPS (64GB module) or 200 TOPS (32GB module), INT8
• Many peripherals: Wi-Fi, USB, PCIe, DisplayPort
• Provides a giant leap forward for robotics and
edge AI.
Customers can now deploy large and complex models to solve problems such as
natural language understanding, 3D perception and multi-sensor fusion.
https://www.youtube.com/watch?v=eFgsOeHMAW4&t=2s 49
Jetson AGX Orin Layout/Parts
Mark  Name                  Note
0     White LED
1     Power button
2     Force Recovery button
3     Reset button
4     USB Type-C port       DFP only
5     DC power jack
6     Ethernet port
7     USB Type-A ports      2x USB 3.2 Gen 2
8     DisplayPort output    This is the only display interface on Jetson AGX Orin Developer Kit
9     USB micro-B port      For debug
10    USB Type-C port       Also for flashing (UFP and DFP)
11    40-pin connector
12    USB Type-A ports      2x USB 3.2 Gen 1
https://developer.nvidia.com/embedded/learn/jetson-agx-orin-devkit-user-guide/developer_kit_layout.html 50
Carrier Board Layout
51
Jetson AGX Orin Series System-On-Module
52
Jetson AGX Orin Series System-On-Module
53
Jetson AGX Orin Series System-On-Module
54
Jetson AGX Orin
• Jetson AGX Orin delivers up to 8x the AI performance of Jetson AGX Xavier.
55
Jetson AGX Orin Quick Setup
• Connect the display cable (DP cable)
• Connect the mouse and keyboard
• Connect the power cable
• Power on the board
• Follow the on-screen instructions to configure Ubuntu
• This setup installs only Ubuntu; the AI software stack is not yet installed
56
Jetson SDK for Edge AI
57
Jetson SDK for Edge AI
58
JetPack
NVIDIA SDK accelerates every major framework
59
Jetpack 4.1 Components
60
Jetson Jetpack Supports
61
Compute Demands
It is necessary to accelerate every major framework
62
Accelerate inferencing on the GPU using NVIDIA TensorRT
✔ NVIDIA® TensorRT is an SDK for
high-performance deep learning
inference.
✔ It includes a deep learning inference
optimizer and runtime that delivers low
latency and high throughput for deep
learning inference applications.
✔ With TensorRT, developers can focus
on creating novel AI-powered
applications rather than performance
tuning for inference deployment.
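Latency and throughput, the two figures mentioned above, can be measured with a small timing harness. The sketch below is a generic Python illustration, not part of TensorRT: `benchmark` and its parameters are hypothetical names, and `infer` stands in for whatever callable runs one inference (a TensorRT engine, an ONNX model, or a CPU model).

```python
import statistics
import time


def benchmark(infer, warmup=10, iters=100):
    """Time repeated calls to `infer` and report latency and throughput."""
    # Warm-up runs let caches and lazy initialisation settle before timing
    for _ in range(warmup):
        infer()
    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        infer()
        latencies.append(time.perf_counter() - start)
    total = sum(latencies)
    return {
        "mean_latency_ms": 1000.0 * statistics.mean(latencies),
        "max_latency_ms": 1000.0 * max(latencies),
        "throughput_per_s": iters / total if total > 0 else float("inf"),
    }
```

Running the same harness on each backend with the same input gives a like-for-like comparison, provided the input preparation is kept outside the timed call.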
63
TensorRT
Achieves better inference performance than ONNX and CPU execution
64
TensorRT-Compatible Hardware
66
Jetson Community: Comprehensive Developer Site
67
Additional learning Opportunities
68
Setting the Jetson environment for AI
✔ Jetson Nano, Orin Nano, and Xavier NX: use the SD card image installation method
✔ Jetson TX1/TX2, AGX Xavier, and AGX Orin: the SDK Manager installation method is
recommended; the default setup flow can also be used
70
https://developer.nvidia.com/embedded/learn/jetson-agx-orin-devkit-user-guide/two_ways_to_set_up_software.html
Install Jetson Software with SDK Manager
❖ NVIDIA SDK Manager is an all-in-one tool that bundles developer software
❖ Provides an end-to-end development environment setup solution for NVIDIA
SDKs.
❖ Allows you to flash and set up the Jetson from a host PC
1. Create an NVIDIA Developer Account
https://developer.nvidia.com/login
2. Download SDK Manager
https://developer.nvidia.com/sdk-manager
3. Run the setup
https://docs.nvidia.com/sdk-manager/install-with-sdkm-jetson/index.html
71
Install Jetson Software with SDK Manager
❖ Put the Jetson into Force Recovery mode
(Press and hold the Force Recovery button, press and release the Reset button
while still holding it, then release the Force Recovery button)
❖ Run the lsusb command in a terminal on the host PC to confirm that the board is detected
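The lsusb check can also be scripted on the host PC. The sketch below assumes a Linux host with lsusb installed and relies on NVIDIA's USB vendor ID, 0955; the function names are hypothetical. The product ID after the colon differs between Jetson modules, so the sketch matches on the vendor ID only.

```python
import subprocess

NVIDIA_USB_VENDOR_ID = "0955"  # USB vendor ID assigned to NVIDIA


def find_nvidia_devices(lsusb_output):
    """Return the lsusb lines that belong to an NVIDIA USB device."""
    marker = f"ID {NVIDIA_USB_VENDOR_ID}:"
    return [line for line in lsusb_output.splitlines() if marker in line]


def list_connected_nvidia_devices():
    """Run lsusb on the host and return the matching lines (may be empty)."""
    result = subprocess.run(
        ["lsusb"], capture_output=True, text=True, check=True
    )
    return find_nvidia_devices(result.stdout)
```

An empty list from `list_connected_nvidia_devices()` suggests the board is not in recovery mode or the USB cable is not on the flashing port.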
72
Install Jetson Software with SDK Manager
Follow the guide to complete the installation:
https://docs.nvidia.com/sdk-manager/install-with-sdkm-jetson/index.html
73
Last Week's Lab Exercise
Did you enjoy it?
PYNQ, Adafruit
74
Lab Exercise
❖ Object detection using AMD AI PC and the Jetson AGX Orin GPU Kit
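A minimal sketch of what the lab pipeline could look like, assuming the Ultralytics YOLO package and OpenCV are installed on the board and a camera is available at index 0. The function names, the `yolov8n.pt` weights file, and the 0.5 confidence threshold are assumptions for illustration; the actual lab materials may use a different model or camera setup.

```python
def summarize_detections(class_ids, confidences, names, min_conf=0.5):
    """Count detections per class label above a confidence threshold."""
    counts = {}
    for class_id, conf in zip(class_ids, confidences):
        if conf >= min_conf:
            label = names[int(class_id)]
            counts[label] = counts.get(label, 0) + 1
    return counts


def run_camera_demo(camera_index=0, weights="yolov8n.pt"):
    """Show live YOLO detections from a camera until 'q' is pressed."""
    # Imported here so the helper above works without these packages
    import cv2
    from ultralytics import YOLO

    model = YOLO(weights)  # assumed weights name; downloaded if missing
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError(f"Cannot open camera {camera_index}")
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            result = model(frame, verbose=False)[0]
            counts = summarize_detections(
                result.boxes.cls.tolist(),
                result.boxes.conf.tolist(),
                result.names,
            )
            cv2.setWindowTitle("YOLO", f"YOLO {counts}")
            cv2.imshow("YOLO", result.plot())  # frame with boxes drawn
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

On the board, calling `run_camera_demo()` opens the first camera and displays annotated frames; the per-class counts appear in the window title.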
75
Control your home
• Some systems can control your home appliances remotely, in a centralized and
automated way
76
Lab Exercise:
Board Username: eeuser
Board Password: eeuser
77
Other Resources
1. https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html
2. https://developer.nvidia.com/embedded-computing
3. https://developer.nvidia.com/embedded/jetpack
4. https://docs.ultralytics.com/guides/nvidia-jetson/
5. https://developer.nvidia.com/embedded/develop/software
6. https://developer.nvidia.com/embedded/learn/tutorials
7. https://docs.ultralytics.com/guides/object-counting/
8. https://dipankarmedh1.medium.com/real-time-object-detection-with-yolo-and-webcam-enhancing-your-computer-vision-skills-861b97c78993
9. https://docs.nvidia.com/sdk-manager/index.html
78
Happy learning! Please ask questions.