Machine Learning
[Tutorial: Environment Setup]
I-Ching Tseng
d08922025@csie.ntu.edu.tw
mlta-2022-spring@googlegroups.com
National Taiwan University
March 2022
Outline
q Overview
q Package Management Tools
q GPU
q Docker
q Conclusion
Overview
q To run a machine learning (ML) model
Ø You have to set up an environment first
Ø Using virtualization or package management tools is a good practice
• You can migrate the code and reproduce the result easily
• Different applications will not affect each other
• If your environment is broken, just create a new environment
q In this tutorial
Ø We will provide some guidelines for setting up environment
Ø We will help you understand the environment
• The software stack
• NVIDIA GPUs
Outline
q Overview
q Package Management Tools
Ø Prerequisites
Ø Conda
Ø Pipenv
Ø Summary
q GPU
q Docker
q Conclusion
Prerequisites
q Package management tools
Ø Help you manage the environment
Ø Do not manage the GPU driver
q To utilize GPUs, make sure the GPU driver is installed
[Figure: software stack — the application sits on Conda/Pipenv and PyTorch, which sit on the NVIDIA driver (software), which sits on the NVIDIA GPU (hardware)]
Conda
q Conda
Ø An open source package and environment management system
Ø Supports Windows, macOS, and Linux
q We take Anaconda as an example
Quick Start - Anaconda
Steps | Linux Command
Install Anaconda with the installer (check the documentation for details) | bash Anaconda3-2021.11-Linux-x86_64.sh
Create an environment (replace test_env with your desired environment name) | conda create -n test_env
Install packages (you can find the command on the PyTorch official website) | conda install -n test_env pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
Activate the environment | conda activate test_env
Run your application | python ml.py
Leave the environment | conda deactivate
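A quick sanity check after these steps (a sketch; assumes the test_env above and a working GPU driver):

conda activate test_env
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
# e.g., prints "1.10.2 11.3 True"; False in the last field usually points to a driver problem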
Pipenv
q Pipenv
Ø A tool that creates and manages a virtualenv
Quick Start - Pipenv
q To learn more about Pipenv, please check the documentation
Steps | Linux Command
Install Pipenv with pip3 | pip3 install pipenv
Install packages | pipenv install numpy torchvision torch --index https://download.pytorch.org/whl/cu113
Activate the environment | pipenv shell
Run your application | python ml.py
Leave the environment | Ctrl + D
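The same kind of sanity check works here (a sketch; run it inside pipenv shell, or prefix it with pipenv run):

pipenv run python -c "import torch; print(torch.version.cuda)"
# should print 11.3, matching the cu113 index above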
Summary
q To utilize a GPU, you must install the driver on your host machine
q Using Conda or Pipenv to build environments is recommended
Ø Portable
Ø Reproducible
Ø Applications do not affect each other
q You can stop here if you just want to finish the homework
q Why is PyTorch so convenient?
Ø "We ship with everything in-built (PyTorch binaries include CUDA,
CuDNN, NCCL, MKL, etc.)." [Reference]
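You can inspect the bundled versions yourself (a sketch; assumes one of the environments built above):

python -c "import torch; print(torch.version.cuda, torch.backends.cudnn.version())"
# prints the CUDA and cuDNN versions shipped inside the PyTorch binaries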
Outline
q Overview
q Package Management Tools
q GPU
Ø NVIDIA GPUs
Ø Software Stack
Ø NVIDIA Driver
Ø CUDA
q Docker
q Conclusion
NVIDIA GPUs
q General-Purpose Graphics Processing Units (GPGPU)
Ø GPUs were originally designed for computer graphics applications
Ø GPUs are good at parallelizing "simple and repetitive" computations
• E.g., matrix multiplication
Ø ML models involve massive amounts of matrix multiplication
• We therefore use GPUs to accelerate ML model training, as sketched below
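A tiny illustration (a sketch; assumes PyTorch with CUDA support and a working driver):

python -c "import torch; x = torch.randn(1024, 1024, device='cuda'); print((x @ x).sum())"
# x @ x runs on the GPU because x lives on device 'cuda'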
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
Software Stack
[Figure: software stack — applications (translation, image classification, regression) run on frameworks (Caffe, TensorFlow, PyTorch, etc.), which use either a generic convolution layer or a cuDNN-optimized one; the frameworks call BLAS libraries (OpenBLAS, MKL2019, cuDNN/cuBLAS), which sit on the NVIDIA driver and the hardware (CPU, FPGA, GPU)]
NVIDIA Driver
q NVIDIA driver
Ø The software that allows operating systems (OS) to communicate with
GPUs
Ø Includes kernel modules
[Figure: frameworks, the cuDNN convolution layer, BLAS libraries (cuDNN/cuBLAS), and the CUDA Runtime API all run in user space; the NVIDIA driver (the CUDA driver) runs in kernel space, directly above the GPU hardware]
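On Linux, you can inspect the driver's kernel-space side yourself (a sketch; assumes the NVIDIA driver is installed):

lsmod | grep nvidia              # kernel modules loaded by the NVIDIA driver
cat /proc/driver/nvidia/version  # driver version reported by the kernel module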
CUDA
q Compute Unified Device Architecture (CUDA)
Ø "A parallel computing platform and application programming interface
that allows software to use NVIDIA GPUs" [Wikipedia]
q CUDA Runtime API vs. CUDA Driver API
Ø The driver CUDA version must be ≥ the runtime CUDA version
Ø Check the driver CUDA version with nvidia-smi
Ø When we "install CUDA"
• We usually mean the CUDA runtime
• You should check the framework compatibility
• The runtime version should not be greater than the driver CUDA version
• You should choose the runtime CUDA version carefully, as shown below
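In practice, you can compare the two versions like this (a sketch; assumes PyTorch is installed):

nvidia-smi                                           # header reports the driver CUDA version, e.g., "CUDA Version: 11.4"
python -c "import torch; print(torch.version.cuda)"  # runtime CUDA version bundled with PyTorch
# the first number must be >= the second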
Outline
q Overview
q Package Management Tools
q GPU
q Docker
Ø Virtualization
Ø Why use Containers?
Ø Containerization with Docker
Ø Pulling Docker Images
Ø NVIDIA Docker
q Conclusion
Virtualization
q Virtual machine (VM) and container
q You only have to know that
Ø Containers only virtualize software layers above the OS level
• It is a good choice if we only focus on specific hardware (e.g., NVIDIA GPUs)
Ø Containers are relatively lightweight
https://www.docker.com/resources/what-container
Why use Containers?
q Containers can virtualize more complex environments
Ø Even if you "only want to train models"
• You may use other frameworks that do not ship with CUDA and cuDNN
• You may need NCCL to perform efficient parallel and distributed training
• You may need to run an old version of PyTorch whose default CUDA version is
too old to communicate with the latest powerful GPUs
q Slurm and Kubernetes are popular server management tools
in both academia and industry
Ø Slurm supports Singularity containers
Ø Kubernetes runs applications in Docker containers
Containerization with Docker
q Docker
Ø A platform for building and running containers
Ø Docker installation
• Docker Desktop (for Mac and Windows) runs a VM
q Docker image
Ø A read-only template with instructions for creating a Docker container
q Steps for setting up an environment with Docker (sketched in commands below)
Ø Install Docker
• One-time effort
Ø Build/pull an image
• There are lots of built images
Ø Run the container
Ø Run your application
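In commands (a sketch; <image_tag> is the same placeholder used on the next slide):

docker pull <image_tag>                # fetch a prebuilt image from a registry
docker run -it --rm <image_tag> bash   # start a container and open a shell in it
python ml.py                           # inside the container, run your application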
Pulling Docker Images
q Docker Hub
Ø A place for finding and sharing Docker images
• E.g., Docker Hub repository of PyTorch
q Check Docker Hub and find the image tag
Ø 1.9.1-cuda11.1-cudnn8-devel vs. 1.9.1-cuda11.1-cudnn8-runtime?
• devel tags include the CUDA build toolchain (e.g., nvcc) for compiling custom extensions; runtime tags ship only the libraries needed to run
Ø Run "docker pull <image_tag>"
NVIDIA Docker (1/2)
q Using GPUs in a Docker container makes the container less portable
Ø Containers work in user space
• Root privilege only means you can use some privileged system calls
Ø Using NVIDIA GPUs requires kernel modules and user-level libraries
• The CUDA version of the driver's user-space modules must exactly match the
CUDA version of the driver's kernel modules
• The runtime CUDA version, by contrast, can be lower than the driver CUDA version
Ø Hence the host driver must exactly match the driver version installed in the container
q We should use NVIDIA Docker
Ø Install NVIDIA Docker
Ø You do not have to install the NVIDIA driver in the container
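Once NVIDIA Docker is installed, a common smoke test (a sketch; the image tag follows the NVIDIA Docker README and may need updating):

docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
# nvidia-smi inside the container comes from the host driver injected by NVIDIA Docker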
https://github.com/NVIDIA/nvidia-docker
NVIDIA Docker (2/2)
q Steps (a command sketch follows this list)
Ø Install the latest NVIDIA driver
• One-time effort
Ø Install NVIDIA Docker
• One-time effort
Ø Build/pull an image
Ø Run the container
Ø Run your application
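Putting the steps together (a sketch; the image tag and ml.py are examples, and /workspace is assumed to be the image's working directory):

docker pull pytorch/pytorch:1.9.1-cuda11.1-cudnn8-runtime
docker run --rm --gpus all -v "$PWD":/workspace pytorch/pytorch:1.9.1-cuda11.1-cudnn8-runtime python ml.py
# --gpus all exposes the host GPUs; -v mounts your code into the container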
Outline
q Overview
q Package Management Tools
q GPU
q Docker
q Conclusion
Conclusion
q Whether or not you virtualize your environment
Ø You must install the NVIDIA driver on the host to utilize NVIDIA GPUs
Ø The runtime CUDA version must be less than or equal to the driver
CUDA version
q If you want to use NVIDIA GPUs in containers
Ø Using NVIDIA Docker makes your life easier
• You do not need to install NVIDIA drivers in containers
• Containers are more portable
Ø You often only need to pull a prebuilt Docker image from Docker Hub
• You do not have to set up CUDA, cuDNN, and frameworks yourself
• This is useful especially when the environment is complex
Q&A
Thank You!