0. Topics
Sunday, May 25, 2025 11:14 AM
1. Model Serialization
2. Pickle vs Joblib
3. Designing Model I/O Schemas
4. Serving ML Models with FastAPI
5. Handling Batch Predictions
1. Model Serialization
Sunday, May 25, 2025 11:23 AM
What is Model Serialization?
• Serialization is the process of converting a trained machine learning model into a byte stream that can be
saved to a file or database
• This can later be deserialized (loaded) to recreate the model in memory without retraining it from scratch (a minimal example is sketched below, after the format list)
Common Libraries:
• Pickle
• Joblib
• Keras (.h5, .keras)
• TensorFlow (SavedModel)
• PyTorch (.pt, .pth)
Common Formats:
• JSON
• Binary
• HDF5
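A minimal sketch of the idea in Python (the dataset, model, and file name are placeholders): a model is trained once, written to disk as a byte stream, and later reloaded without retraining.

import pickle
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)   # the expensive step, done once

# Serialize: convert the trained model into a byte stream and write it to disk
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Deserialize: recreate the exact same model in memory, no retraining needed
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored.predict(X[:3]))   # identical predictions to the original model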
Why Model Serialization is Important:
1. Saves Time and Computational Resources:
• Training ML models, especially deep learning models, can take minutes to hours—or even days
• Serialization allows you to store the trained model once and use it repeatedly without incurring the cost of
retraining
• This is especially critical in:
○ Production deployments
○ Iterative testing
○ Rapid prototyping
○ Resource-constrained environments (like edge devices)
2. Portability Across Platforms and Environments:
• A serialized model can be moved across different machines, operating systems, or cloud services
• It facilitates collaboration across teams—one team trains the model, another team deploys it
• Serialization also allows ML models to be integrated into mobile apps, IoT devices, or cloud containers (ex.
Docker)
3. Reproducibility and Consistency:
• Serialization preserves the trained state exactly—including model parameters, weights, and internal
configurations
• Ensures consistent predictions across different runs, which is crucial for:
○ Model validation
○ A/B testing
○ Regulatory or compliance audits
4. Model Serving and Integration:
• Serialization enables seamless deployment of ML models into real-world systems like REST APIs, web
applications, dashboards, or edge devices
• Allows decoupling of training and inference phases:
○ Training happens offline (e.g., Jupyter notebook)
○ Inference happens online (e.g., real-time request to a FastAPI or Flask server)
5. Foundation for Model Versioning and CI/CD Pipelines:
• Serialization plays a vital role in MLOps:
○ Store models in versioned model registries (e.g., MLflow, DVC, AWS SageMaker); a sketch follows this list
○ Automate model deployment and rollback
• Helps teams track changes and compare performance across versions
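A hedged sketch of how a serialized model might be logged to a registry such as MLflow (assumes MLflow is installed and a tracking store is configured; run, parameter, and artifact names are placeholders):

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

# Log the trained model plus its parameters/metrics so it can be versioned,
# compared across runs, and rolled back from the registry
with mlflow.start_run(run_name="rf-iris-v1"):
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, artifact_path="model")

# A deployment pipeline can later reload the exact same artifact for serving:
# loaded = mlflow.sklearn.load_model("runs:/<run_id>/model")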
2. Pickle vs Joblib
Sunday, May 25, 2025 12:38 PM
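A minimal sketch of the two APIs side by side (file names are placeholders): pickle ships with the standard library and handles arbitrary Python objects, while joblib is usually preferred for objects that hold large NumPy arrays, such as scikit-learn models, and supports compression.

import pickle
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Pickle: file-handle based, part of the standard library
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
with open("model.pkl", "rb") as f:
    model_from_pickle = pickle.load(f)

# Joblib: same idea, but optimized (and optionally compressed) for large arrays
joblib.dump(model, "model.joblib", compress=3)
model_from_joblib = joblib.load("model.joblib")

# Both restored models behave identically to the original
assert (model_from_pickle.predict(X[:5]) == model_from_joblib.predict(X[:5])).all()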
3. Designing Model I/O Schemas
Sunday, May 25, 2025 12:24 PM
Why Define Input and Output Schemas?
1. Data Validation and Type Safety:
• Without schema validation, you’re blindly trusting that the incoming data is well-formed—which is risky
• Can implement automatic type checking with Pydantic models and validation rules with Field
2. Clear API Contracts:
• Schemas define a contract for how your API should be used
○ InputSchema = what users should send
○ OutputSchema = what your app will return
3. Improved Developer Experience:
• FastAPI auto-generates beautiful interactive docs (Swagger UI)
• Built-in validation error messages help frontend/backend developers debug easily
• Less guesswork = faster development and fewer bugs
4. Cleaner Code and Reusability
5. Secure and Robust APIs
6. Managing Nested & Complex Structures:
• ML applications often involve:
○ Nested JSON structures
○ Optional fields
○ Lists of structured items
• Schemas make these easy to define and validate using BaseModel, Optional, List, etc. (see the sketch at the end of this section)
7. Logging, Auditing, and Testing:
• Well-defined schemas simplify structured logging
• Makes it easier to write tests with known input/output formats
• Help trace issues back to specific schema validation failures
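A minimal sketch of such schemas (field names, validation rules, and the nested structure are placeholders, not a prescribed design):

from typing import List, Optional
from pydantic import BaseModel, Field

class Passenger(BaseModel):
    age: float = Field(..., ge=0, le=120, description="Age in years")
    fare: float = Field(..., gt=0)
    cabin: Optional[str] = None              # optional field

class InputSchema(BaseModel):                # what users should send
    passengers: List[Passenger]              # list of structured (nested) items

class OutputSchema(BaseModel):               # what the app will return
    predictions: List[int]
    model_version: str

When these models back a FastAPI endpoint, a malformed request (e.g., a negative age) is rejected with a 422 validation error before it ever reaches the model.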
4. Serving ML Models with FastAPI
Sunday, May 25, 2025 12:39 PM
Why Serve ML Models via FastAPI?
1. Decoupling Model from Application Logic:
• Keeps ML logic separate from UI or business logic
• Enables modular, reusable models that can be consumed by multiple applications (web apps, mobile
apps, internal tools, etc.) via HTTP requests
2. Real-Time Predictions:
• Clients can send input and receive predictions instantly via a /predict endpoint (example sketched after this list)
• Essential for use cases like fraud detection, recommendation engines, or medical triaging where immediate inference is needed
3. Platform-Agnostic Integration:
• The model is accessible through a standard REST API
• Any frontend, mobile app, or backend service, regardless of language (JS, Java, etc.), can access the
model using HTTP
4. Production-Readiness:
• FastAPI supports ASGI, async I/O, and is built for performance
• FastAPI is generally faster than Flask under concurrent load thanks to its async ASGI design, making it scalable for real-world traffic
5. Built-in Validation with Pydantic:
• Ensures clean, validated data before it reaches the model (InputSchema)
• The response is returned in a structured format (OutputSchema)
• Prevents runtime errors due to bad input, reduces bugs, and improves model reliability
6. Docker & Cloud Friendly:
• FastAPI apps can be containerized and deployed on AWS, Azure, GCP, Hugging Face, Render, etc.
• Perfect fit for CI/CD pipelines, Kubernetes, and serverless deployment models
7. Scalable Infrastructure:
• Works with ASGI servers like Uvicorn for high concurrency
• Allows serving thousands of predictions per second with proper load balancing
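A minimal sketch of a /predict endpoint (the schema fields, feature order, and model file name are placeholders; the model is assumed to have been serialized earlier, e.g., to model.joblib):

from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI()
model = joblib.load("model.joblib")   # deserialize once at startup, not per request

class InputSchema(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

class OutputSchema(BaseModel):
    prediction: int

@app.post("/predict", response_model=OutputSchema)
def predict(data: InputSchema):
    # Validated input -> feature row -> model inference -> structured response
    features = [[data.sepal_length, data.sepal_width,
                 data.petal_length, data.petal_width]]
    return OutputSchema(prediction=int(model.predict(features)[0]))

Assuming the file is named main.py, it can be served with an ASGI server such as Uvicorn, e.g. uvicorn main:app --host 0.0.0.0 --port 8000.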
5. Handling Batch Predictions
Sunday, May 25, 2025 2:03 PM
Vectorized Predictions for Speed:
• When making predictions using a machine learning model, especially for multiple data points, it's inefficient to loop over
each input and call the .predict() method one-by-one
• Instead, we should leverage vectorized predictions, i.e., send a whole batch into the model at once
• This drastically improves performance due to:
○ Optimized linear algebra libraries under the hood
○ Reduced I/O overhead
○ Parallelized CPU/GPU execution
Accepting List of Inputs:
• To handle batch predictions via API, the endpoint should accept a list of input objects (typically JSON dictionaries)
• This means instead of a single input, the user will POST a list of inputs to the endpoint
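A minimal sketch of a batch endpoint (schema fields and the model file name are placeholders): the whole list is converted into one (n_samples, n_features) array and passed to .predict() in a single vectorized call.

from typing import List
from fastapi import FastAPI
from pydantic import BaseModel
import numpy as np
import joblib

app = FastAPI()
model = joblib.load("model.joblib")

class Item(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

class BatchInput(BaseModel):
    items: List[Item]

class BatchOutput(BaseModel):
    predictions: List[int]

@app.post("/predict/batch", response_model=BatchOutput)
def predict_batch(batch: BatchInput):
    # One vectorized call instead of looping over .predict() per item
    X = np.array([[i.sepal_length, i.sepal_width, i.petal_length, i.petal_width]
                  for i in batch.items])
    return BatchOutput(predictions=[int(p) for p in model.predict(X)])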
Benefits:
• One vectorized model call replaces many single-item .predict() calls, so throughput is much higher
• Fewer HTTP round trips and less per-request overhead for clients sending many inputs