Chapter 4 Introduction To Huawei Cloud ModelArts

Uploaded by

williamlaw
Introduction to Huawei Cloud ModelArts

Foreword
⚫ ModelArts is a one-stop AI development platform. It provides data
preprocessing, semi-automated data labeling, large-scale
distributed training, automated model generation, and model
deployment on devices, edge devices, and the cloud. It helps you
develop and deploy models quickly and manage the AI
development lifecycle for machine learning and deep learning.
⚫ This chapter describes the main functions of Huawei ModelArts,
helping you better understand and use it.

Objectives
⚫ Upon completion of this course, you will understand:
• Basic concepts of ModelArts.
• Functions and usage of ModelArts.

Contents
1. ModelArts Overview
2. ModelArts Functions

From AI+ to +AI: AI Explores Industry Best Practices
AI+: Exploring AI capabilities
⚫ Image classification: EfficientNet model accuracy of 98.7% for the top 5 labels, surpassing the human-level accuracy of 96%
⚫ Speech recognition: RNN-T model accuracy of 96.8%, surpassing the human-level accuracy of 94.17%
⚫ Reading comprehension: BERT model accuracy of 90.9%, surpassing the human-level accuracy of 82%
+AI: AI is empowering enterprises' core production systems
⚫ Smart city, smart campus, industrial Internet, smart healthcare, autonomous driving, smart finance, and more
A Long Way to Go from the Thought of "AI + Industry" to Product-based Applications
AI development difficulties
⚫ Fast algorithm update; difficult algorithm selection and tuning
• 50%: annual growth rate of global papers on AI
• 100+: annual algorithm iterations in a single domain
⚫ AI compute power scarcity; expensive AI model training
• The compute power needed for model training doubles every 3.5 months.
• GPT-3: US$4.6 million; BERT: US$15,000 on average
⚫ AI skill shortage
• Severe shortage of AI talent, uneven distribution (Source: Global AI Talent Report 2020)
AI implementation challenges
⚫ Difficult data acquisition
⚫ Difficult industry knowledge distillation
⚫ Difficult knowledge computing
[Figure: Algorithm vs. precision iteration acceleration (image classification as an example).]
[Figure: Training compute in PFlops/day (log scale, 1e-4 to 1e+4) by year, 2012 to 2019+, for models including AlexNet, GoogleNet, VGG, SEQ2SEQ, ResNet, DeepSpeech2, Xception, TI7 Dota 1v1, Neural Machine Translation, AlphaZero, AlphaGoZero, and GPT-3. A huge gap separates compute power requirements from the AI compute power growth rate.]
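To put the "doubles every 3.5 months" figure in perspective, the short calculation below converts a doubling period into a growth factor over a longer horizon. It is a rough sketch that assumes steady exponential growth; the function name is hypothetical, and only the 3.5-month period comes from the slide.

```python
DOUBLING_PERIOD_MONTHS = 3.5  # figure quoted on the slide


def growth_factor(months, doubling_period=DOUBLING_PERIOD_MONTHS):
    """Multiplicative growth in compute demand after `months`, assuming steady doubling."""
    return 2 ** (months / doubling_period)


for years in (1, 2):
    print(f"{years} year(s): about x{growth_factor(12 * years):.1f}")
```

Under this assumption the demand grows by roughly an order of magnitude per year, which is the gap the slide contrasts with the slower growth of available compute power.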

Global AI Research Deployment: Building Core Capabilities Around Computing, Algorithm, and Data Governance
Research locations
⚫ Ireland: Video algorithm
⚫ Minsk, Russia: Inference and decision-making algorithm center and big data competence center
⚫ Germany: Streaming, cryptography and security, and simulator
⚫ France: Mathematics research
⚫ Israel: Algorithm acceleration
⚫ India: Big data open source competence center, CarbonData
⚫ Singapore: Algorithm and security
⚫ Vancouver/Toronto/Ottawa/Montreal: Computer vision algorithm, interpretability, and security
⚫ Vancouver: Big data scheduling algorithm competence center, hardware acceleration, data lake/data lineage, and supercomputing competence center
⚫ Hong Kong, Shenzhen, Beijing, Shanghai, Hangzhou, Xi'an, and Nanjing
Huawei AI core capabilities
⚫ Algorithm: Deep learning, Data mining, Knowledge graph, Cognitive computing
⚫ Data governance: Big data mgmt platform, Mass data storage platform, Intelligent data lake
⚫ Compute power: Ascend processor, Kunpeng processor
⚫ AI platform/Algorithm framework: ModelArts, MindSpore
Key figures
⚫ No. 1: Ranking by patents
⚫ 10%+: Percentage of PhDs
⚫ 15+: R&D centers in 4 continents
⚫ 85+: Cooperation with global organizations
⚫ 5,000+: R&D engineers
⚫ Top community contributor
Building Powerful AI Technologies for ModelArts by Continuous Innovation and Research
Huawei Cloud EI
Intelligent perception
⚫ No. 1 on ImageNet-1000: The highest accuracy of 85.8% on the industry's most widely used, large-scale image classification dataset
⚫ No. 1 on MS COCO: 56.8% single-model accuracy and 58.8% multi-model accuracy on MS-COCO, one of the most widely used large-scale object detection datasets
⚫ No. 1 in WebVision: 82.97% accuracy in image classification with only slightly labeled WebVision dataset
⚫ No. 1 on NuScenes: 64.2% accuracy based on multi-modal data convergence on one of the most widely used large-scale 3D object detection datasets
Intelligent cognition
⚫ WSDM No. 1: Knowledge graph and data mining
⚫ NLPCC No. 1: Built-in language models support
⚫ CCKS No. 1: Event extraction for finance
⚫ CCF BDCI No. 1: Entity-level sentiment analysis
Decision-making
⚫ No. 1 on ESICUP: No. 1 utilization of 81.25% on ESICUP, the most widely used public dataset on cutting and packing, by leveraging Huawei's proprietary decision optimization algorithm

ModelArts: Ascend-based Full-Stack AI Platform
Fully controllable AI technology stack
⚫ Intelligent application ecosystem: Transportation, Education, Manufacturing, Finance, Healthcare, City
⚫ Software platform: ModelArts, full-process AI enablement platform
• Algorithm development environment, ExeML, Pangu models, industry datasets
• Data labeling, distributed debugging, distributed optimization, AI scientific computing, built-in Ascend algorithms
• Data management, AI training, AI inference, AI asset marketplace
⚫ Full-stack AI enablement tool chain (controllable in-house chip enablement tool chain)
• Deep learning computing framework MindSpore
• Ascend operator development tool MindStudio
• Ascend heterogeneous computing driver CANN
⚫ Hardware computing power (in-house chips with intellectual property): Kunpeng (general compute power) and Ascend NPUs (AI compute power)
⚫ Infrastructure construction: planning and design, civil works, electricity, cooling, ventilation, etc.
Advantages
⚫ Future-proof: 100+ AI models, 1000+ algorithms/models, AI-enabled industries
⚫ Scientific research: AI-assisted drug screening, AI-enabled genome analysis
⚫ Double first
• A single task supports ≥ 4,096 cards
• A single cluster supports ≥ 100,000 cards
• Large-scale AI compute cluster: world's first EB-class
⚫ High-performance AI through hardware-software synergy: ResNet-50 throughput on the same infrastructure
• Mainstream training card + TensorFlow: 965 images/second
• Ascend 910 + ModelArts: 1802 images/second
⚫ 800+ projects & practices in 10+ industries, creating new value with AI software

ModelArts: Core Technologies for AI Adoption
⚫ Industry expression (reduce $f_{TI \to TA}$)
• Knowledge graph
• Multimodal knowledge representation
• Auto labeling
• Adversarial learning and sample generation
⚫ AI development (reduce $f_{TA \to SA}$)
• Pre-trained models and NAS
• Transfer learning and federated learning
• Meta-learning/Few-shot learning
• Operations optimization solver
⚫ AI implementation (reduce $f_{SA \to SAI}$)
• Model explainability and security
• Device-cloud synergy and edge AI
• Model compression
• Online learning and lifelong learning

ModelArts: AI Enablement Platform That Drives Intelligent Upgrade Across Industries
⚫ Industries: City, Transport, Manufacturing, Healthcare, Finance, Network, Scientific research, Meteorology, Water, e-Gov
⚫ 800+ Huawei projects, AI entered 30%+ of enterprise production systems, 18% higher profitability
⚫ Platform pillars: Unleashed compute power; New way of industrial AI development; Industry AI deployment; Cultivating AI talent and developing an industry ecosystem
⚫ Components: Resource scheduling engine, Elastic training, MoXing, Data preprocessing, ExeML, Model evaluation, Pangu models, Industry/Domain suites, MLOps, Device-edge-cloud synergy, AI Gallery

ModelArts Empowers RFCx to Conserve
Tropical Rainforests

Contents
1. ModelArts Overview
2. ModelArts Functions
2.1 ExeML
2.2 Development Environment
2.3 Data Management
2.4 Training Platform
2.5 Inference Platform
2.6 AI Gallery

ModelArts Service Overview

ExeML Engine for Creating an AI Model in Three Steps
Zero coding, zero AI experience
⚫ Step 1: Upload data and label it.
⚫ Step 2: Train a model.
⚫ Step 3: Evaluate and publish the model.
The training job is completed within 20 minutes.
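ExeML requires no code, so the following is not its API. Purely to illustrate what the three steps automate, here is a toy sketch in plain Python that labels a handful of points, trains a nearest-centroid classifier, and evaluates it on held-out samples. All names and data are hypothetical.

```python
# Step 1: upload data and label it (here: tiny hand-labeled 2-D points).
labeled_data = [
    ((1.0, 1.2), "cat"), ((0.8, 1.0), "cat"), ((1.1, 0.9), "cat"),
    ((4.0, 4.2), "dog"), ((4.2, 3.9), "dog"), ((3.8, 4.1), "dog"),
]
held_out = [((0.9, 1.1), "cat"), ((4.1, 4.0), "dog")]


# Step 2: train a model (a nearest-centroid classifier).
def train(samples):
    sums, counts = {}, {}
    for (x, y), label in samples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    # The model is simply one mean point (centroid) per label.
    return {label: (sx / counts[label], sy / counts[label])
            for label, (sx, sy) in sums.items()}


def predict(model, point):
    px, py = point
    # Pick the label whose centroid is closest (squared Euclidean distance).
    return min(model, key=lambda label: (model[label][0] - px) ** 2
                                        + (model[label][1] - py) ** 2)


# Step 3: evaluate the model before publishing it.
def evaluate(model, samples):
    correct = sum(predict(model, point) == label for point, label in samples)
    return correct / len(samples)


model = train(labeled_data)
accuracy = evaluate(model, held_out)
print(f"held-out accuracy: {accuracy:.2f}")
```

In ExeML these steps are driven from the console rather than written by hand; the sketch only shows the shape of the workflow.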

ModelArts Notebook for Seamless In-Cloud and On-Premises Collaboration

ModelArts CodeLab Makes AI Exploration and Teaching Easier
In-cloud notebook, case access and sharing in seconds
• Serverless instance management for automated resource reclamation, which is free
• Free compute power and on-demand change of specifications
Keywords: out-of-the-box usage, case access, flexible, free AI compute power, sharing and exploration
In-cloud notebook for exploration and learning from MindSpore cases

ModelArts IDE SDKs
• Code development and debugging: Local IDE + ModelArts plugins for remote development, tailored to your needs
• Cloud-based development environment with AI compute resources, cloud storage, and built-in AI engines
• Customizable runtime environment: Development environment saved as an image for training and inference
Keywords: ModelArts IDE, remote development, multi-person collaboration, flexible AI compute resources, efficient development
[Figure: Developers' local environment (+ ModelArts plugins) connects through remote development to ModelArts on cloud, which integrates the development environment, data management, training system, and inference system (capability integration).]

Data Management

Dual-stack AI Computing Power: Stable and Secure Computing Base, Fast and Simple Model Training
• GPU and Ascend dual-stack AI computing power supports the management of 10,000-node compute clusters.
• Large-scale distributed training accelerates foundation model development.


Integrated Development and Training with Intelligent Fault Diagnosis
⚫ Algorithm debugging: IDE-based debugging and trial run
✓ Quick job submission using IDE plug-ins
✓ Non-blocking trial run
✓ ...
⚫ Before training: FAQs; automatically filtering out abnormal nodes
✓ Hardware failures: GPU/CPU/memory/disk/network
✓ Software failures: driver/domain name/system settings
⚫ In training: Job exception prompts; automatic fault tolerance and resumable training
✓ Process I/O check
✓ Resource usage check
✓ Hardware failures: isolation and rectification
✓ Automatic retry upon random faults
✓ Suspension detection
✓ Performance degradation detection
✓ ...
⚫ After training: Job failure cause prompts; automatic fault classification
✓ Service log analysis
✓ Container return code and event analysis
✓ Analysis of failed tenant tasks
✓ ...

Large-scale training: foundation model development, AIGC, autonomous driving, and more
• Large model size, complex logic, and multiple parallel modes
• Large-scale data trained across multiple nodes
① Multiple nodes, long training time, and increased likelihood of errors
② Lack of tools and inefficient fault locating
③ Few monitoring metrics for performance analysis

Small-scale training: industrial AI application development
• Model development using open-source algorithms
• Lack of AI development professionals
① Insufficient problem locating
② Insufficient performance analysis
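The "automatic retry upon random faults" and "resumable training" items can be pictured with a small simulation. This is a minimal sketch under stated assumptions, not the ModelArts mechanism: progress is checkpointed after every step, a transient fault is injected at chosen steps, and the job is retried from the last checkpoint. Every name here is hypothetical.

```python
class TransientFault(Exception):
    """Stands in for a random, recoverable failure (hypothetical)."""


def train_with_resume(total_steps, checkpoint, fail_at):
    """Run from the last checkpointed step; raise once at each step in fail_at."""
    step = checkpoint["step"]
    while step < total_steps:
        if step in fail_at:
            fail_at.discard(step)      # the fault is transient: it fires only once
            raise TransientFault(f"fault at step {step}")
        step += 1
        checkpoint["step"] = step      # persist progress after every completed step
    return step


def run_job(total_steps, fail_at, max_retries=3):
    """Retry a failed run up to max_retries times, resuming from the checkpoint."""
    checkpoint = {"step": 0}           # in practice this would live on durable storage
    attempts = 0
    while True:
        attempts += 1
        try:
            train_with_resume(total_steps, checkpoint, fail_at)
            return attempts, checkpoint["step"]
        except TransientFault:
            if attempts > max_retries:
                raise                  # retry budget exhausted: surface the failure


attempts, final_step = run_job(total_steps=10, fail_at={3, 7})
print(f"finished at step {final_step} after {attempts} attempts")
```

Because each retry resumes from the checkpoint instead of step 0, the two injected faults cost only a restart each, not the work already done.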

Flexible Model Deployment on Devices, Edge Devices, or the Cloud
AI model
⚫ Edge model optimization (devices, latency, and accuracy)
• Model compression: network distillation, model pruning, model quantization
• Test bed verification
Deployment options
⚫ Real-time service (API)
▪ High throughput, low latency, and automatic scale-in
▪ Inference optimization
⚫ Batch service (Batch)
▪ Batch data inference task
▪ Efficient distributed computing
⚫ Edge service
▪ In-depth integration with IEF
▪ Support for Huawei Ascend AI chips
⚫ Edge service
▪ Support for Huawei HiLens, SDC, and CloudLink
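Model quantization, one of the compression techniques named above, trades a little accuracy for a smaller and faster model. The sketch below shows one common scheme, symmetric 8-bit quantization with a single shared scale, in plain Python. It illustrates the idea only and is not necessarily the method ModelArts applies; the weights are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to int8 codes with one shared scale (symmetric quantization)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


weights = [0.82, -1.27, 0.003, 0.45, -0.61]  # hypothetical layer weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, f"scale={scale:.4f}", f"max error={max_error:.4f}")
```

Each weight now needs one byte instead of four, and the rounding error per weight is bounded by half the scale, which is the accuracy cost the "devices, latency, and accuracy" trade-off refers to.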

Model Repositories for Unified Management of Models with Different Frameworks and Functions from Different Vendors
Highlights
⚫ AI algorithms with different frameworks and functions from different vendors can be centrally managed.
⚫ Third-party AI algorithms can be quickly released as inference services through Docker images.
⚫ Algorithms can be quickly released as inference services in the form of model files.
⚫ Large model deployment and model update within seconds
⚫ Separate maintenance and upgrade of different AI models
⚫ You can customize the model inference specifications according to your needs. The supported granularity of CPU, GPU, and Ascend 310 is 0.01. The memory allocation has a granularity of 1 MB.
⚫ Easy-to-use operations and management on UI
Model sources
⚫ Customer models, Huawei-built models, and third-party algorithm vendors feed the model repository as training jobs, models, or images.
Ways to import a model
⚫ 01 Importing a model from a training job. Trained models: You can import ModelArts training job results.
⚫ 02 Importing a model from a template. Model templates: Each template corresponds to a specific AI engine and inference mode. With the templates, you can quickly import models to ModelArts.
⚫ 03 Importing a model from a custom image. Custom images: You can create a Docker image using your model or upload your model to OBS and import it as a model specification package.
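The granularity figures in the highlights (0.01 for CPU, GPU, and Ascend 310; 1 MB for memory) can be read as alignment rules on a resource request. The check below is a hypothetical helper written only to illustrate that reading; it is not part of any ModelArts SDK, and the rule that requests must be positive is an added assumption.

```python
from decimal import Decimal

COMPUTE_STEP = Decimal("0.01")  # granularity stated for CPU, GPU, and Ascend 310
MEMORY_STEP_MB = 1              # granularity stated for memory


def is_valid_spec(compute_units: str, memory_mb) -> bool:
    """True if the request is positive and aligned to the stated granularities."""
    # Decimal avoids binary floating-point surprises such as 0.07 % 0.01 != 0.
    units = Decimal(compute_units)
    aligned_compute = units > 0 and units % COMPUTE_STEP == 0
    aligned_memory = (isinstance(memory_mb, int) and memory_mb > 0
                      and memory_mb % MEMORY_STEP_MB == 0)
    return aligned_compute and aligned_memory


print(is_valid_spec("0.25", 512))   # a quarter of a unit with 512 MB
print(is_valid_spec("0.255", 512))  # finer than the 0.01 step
```

Passing the compute amount as a string keeps the value exact, so a request such as "0.07" is judged on its decimal digits and not on its nearest binary float.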

AI Gallery: Bridging Supply and Demand in the AI Ecosystem
⚫ AI implementation: AI use cases (scenario-specific AI asset portfolios, and ways to deploy them)
• Supply: AI enterprises and senior developers provide AI use cases to gain profit and reputation.
• Demand: Industry customers leverage AI use cases to solve business problems.
⚫ AI development: AI assets (atomic components and optimization solutions during AI development, such as datasets, algorithm optimization, and model acceleration)
• Supply: Enterprises and individual developers provide AI assets to gain profit and reputation.
• Demand: Enterprises and individual developers use AI assets to improve AI development efficiency.
• AI assets can be combined to build AI use cases.
⚫ AI learning: AI content (courses, papers, articles, and more)
• Supply: AI experts and AI education and training institutions provide AI content to gain profit and reputation.
• Demand: AI beginners use AI content to obtain advanced capabilities.
• AI content can be provided as practice examples with AI use cases.
⚫ ModelArts provides base support: suppliers use ModelArts to produce AI assets and use cases, and consumers use ModelArts to utilize AI assets and use cases.
• Data processing, code development, algorithm management, model training, model management, service deployment

Building a Comprehensive Learning System for AI Developers
Comprehensive courses, one-stop learning experience, follow-up teaching plans
⚫ AI beginners
• AI concept, structure, and development
• Development and strategic planning of the AI industry
• Huawei's full-stack, all-scenario AI strategies
• AI model creation without writing code
⚫ AI engineers
• Python programming knowledge
• Popular frameworks and tools
• MindSpore zone
• Machine learning
• Deep learning
• Domain knowledge application
⚫ AI application engineers
• Action recognition
• Wildlife identification
• Predictive maintenance
• Face mask detection
• Plant disease detection
• Speech recognition practices
• Lane line detection
• More

Q&A
1. Which of the following function modules are provided by
ModelArts? ( )
A. ExeML
B. Data management
C. Inference platform
D. AI Gallery

Summary
• This chapter describes the main functions of ModelArts, Huawei's one-stop AI development platform, including ExeML, the development environment, data management, training, inference, and AI Gallery.

Acronyms and Abbreviations
⚫ Bidirectional Encoder Representations from Transformers (BERT),
a pre-trained language representation model
⚫ Recurrent Neural Network (RNN)

Thank You.
Copyright © 2024 Huawei Technologies Co., Ltd. All Rights Reserved.
The information in this document may contain predictive statements including,
without limitation, statements regarding the future financial and operating results,
future product portfolio, new technology, etc. There are a number of factors that
could cause actual results and developments to differ materially from those
expressed or implied in the predictive statements. Therefore, such information is
provided for reference purpose only and constitutes neither an offer nor an
acceptance. Huawei may change the information at any time without notice.
