
Why Docker Compose?

Docker is great, because it makes your ML code 100% portable

- from your local development environment, to
- the production environment, for example a Kubernetes cluster.

However, Docker alone is often not enough when you develop real-world ML apps.

Why?

Because ML applications

- are not all-in-one monoliths that you can dockerize into a single container,
- but multiple independent services that communicate through a shared message bus, like Apache Kafka or Redpanda.

This is the microservices way of building software, and it applies to ML engineering as well.

So, when you develop your services locally and want to test that they work as expected, you need a tool that helps you define, build and run multi-container Docker applications in your local environment.

And this is precisely what Docker Compose helps you with.

Let’s go through a step-by-step example.

Example
All the code that I show here is available in this GitHub repository.
→ Give it a star ⭐ on GitHub to support my work.

Let’s develop and build a real-time feature pipeline with the help
of Docker Compose.

Our real-time feature pipeline has 3 steps:

- trade_producer (producer) → reads trades from the Kraken Websocket API and saves them in a Kafka topic.
- trade_to_ohlc (transformation) → reads trades from that Kafka topic, computes Open-High-Low-Close (OHLC) candles using Stateful Window Operators, and saves them in another Kafka topic.
- ohlc_to_feature_store (consumer) → saves these final features to an external Feature Store.

Each of these steps is implemented as an independent service


real-time-feature-pipeline
├── ohlc_to_feature_store
│   ├── main.py
│   └── requirements.txt
├── trade_producer
│   ├── main.py
│   └── requirements.txt
└── trade_to_ohlc
    ├── main.py
    └── requirements.txt
and communication between them in production happens through a
message bus like Redpanda.
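
To make the first step more concrete, here is a minimal sketch of what trade_producer's main.py could look like. It is a hypothetical example, assuming the websocket-client and kafka-python packages; the actual repository may use different libraries, and the broker address and topic name below are assumptions.

# Hypothetical sketch of trade_producer/main.py, not the repository's actual code
import json

from kafka import KafkaProducer          # from the kafka-python package
from websocket import create_connection  # from the websocket-client package

KAFKA_BROKER = "redpanda:9092"  # hypothetical broker address (Compose service name)
TOPIC = "trades"                # hypothetical topic name


def main():
    # Subscribe to the public Kraken websocket trade feed
    ws = create_connection("wss://ws.kraken.com/")
    ws.send(json.dumps({
        "event": "subscribe",
        "pair": ["XBT/USD"],
        "subscription": {"name": "trade"},
    }))

    producer = KafkaProducer(
        bootstrap_servers=KAFKA_BROKER,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    while True:
        msg = json.loads(ws.recv())
        # Kraken sends trade updates as lists; heartbeats and events come as dicts
        if isinstance(msg, list):
            producer.send(TOPIC, value=msg)


if __name__ == "__main__":
    main()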

Let’s fully dockerize this stack, using Docker Compose, in 4 steps.

Step 1 → Write a Dockerfile for each service

For each service you need to write and commit a Dockerfile that defines its Docker image:
real-time-feature-pipeline
├── ohlc_to_feature_store
│   ├── Dockerfile
│   ├── main.py
│   └── requirements.txt
├── trade_producer
│   ├── Dockerfile
│   ├── main.py
│   └── requirements.txt
└── trade_to_ohlc
    ├── Dockerfile
    ├── main.py
    └── requirements.txt
In this case, our Dockerfiles would look as follows:
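
A minimal, hypothetical sketch of one of them, e.g. trade_producer/Dockerfile (the base image and Python version are assumptions, and the real Dockerfiles may differ):

# Hypothetical Dockerfile for trade_producer; the other two services follow the same pattern
FROM python:3.10-slim

WORKDIR /app

# Install the Python dependencies first, so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the service code and define the command that runs the service
COPY . .
CMD ["python", "main.py"]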

Step 2 → Write the docker-compose.yml file


The Docker Compose file is a YAML file that you write and commit in the root of your project directory:
real-time-feature-pipeline
├── docker-compose.yml
├── ohlc_to_feature_store
│   ├── Dockerfile
│   ├── main.py
│   └── requirements.txt
├── trade_producer
│   ├── Dockerfile
│   ├── main.py
│   └── requirements.txt
└── trade_to_ohlc
    ├── Dockerfile
    ├── main.py
    └── requirements.txt
In the docker-compose.yml you specify the list of services you need to spin up, in this case:

- The 3 pipeline steps: trade_producer, trade-to-ohlc and ohlc-to-feature-store, and
- The Redpanda message broker used by these services to communicate, which requires two independent services: redpanda and redpanda-console.

Now, for each of these services, you need to provide

- build instructions, including the path to its Dockerfile or any required environment variables, and
- runtime information, like the number of replicas or forced restarts for your containers.

For example, this is what the trade-producer section looks like:
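
A hypothetical version of it, where the service name, the environment variable name and the Redpanda address are assumptions (the real docker-compose.yml may differ):

  trade_producer:
    build:
      context: ./trade_producer
      dockerfile: Dockerfile
    environment:
      - KAFKA_BROKER_ADDRESS=redpanda:9092   # hypothetical variable name
    restart: always
    depends_on:
      - redpanda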


Step 3 → Build your services
From the root directory of your project, run
$ docker-compose build
to build your full stack, including the 5 services and the necessary
networking for them to communicate.
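You can also rebuild a single service while you iterate on it, for example (assuming trade_producer is the service name defined in docker-compose.yml, as in the sketch above):
$ docker-compose build trade_producer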
Step 4 → Run your entire stack
To run and test your entire stack locally, you simply run
$ docker-compose up -d
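To verify that all 5 containers are actually up, and to follow the logs of one of them, the standard Compose commands work as usual (the service name is again assumed to be trade_producer):
$ docker-compose ps
$ docker-compose logs -f trade_producer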
And if it works, YOU ARE DONE.

Because this is the magic of Docker and Docker Compose.

If it works on your laptop, it also works in production.
