
🐳 Docker Integration

What you'll learn

How to run pipelines in isolated Docker containers, eliminating "it works on my machine" bugs for good.

Containerize your pipelines for reproducible execution anywhere, from a laptop to a Kubernetes cluster.


Why Docker?

| Feature | Benefit |
|---|---|
| Isolation | Each step runs in a clean environment |
| Reproducibility | Identical code and dependencies in dev, staging, prod |
| Portability | Move from local Docker to K8s or cloud without code changes |
| Dependency Control | No conflicts between different step requirements |

🐳 Running on Docker

FlowyML can automatically build and run your steps in Docker containers:

```python
from flowyml.integrations.docker import DockerOrchestrator

pipeline.run(
    orchestrator=DockerOrchestrator(
        image="python:3.11-slim",  # Base image
        install_deps=True,         # Auto-install requirements.txt
    )
)
```

Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| `image` | `str` | `python:3.11-slim` | Base Docker image |
| `install_deps` | `bool` | `True` | Auto-install requirements.txt |
| `dockerfile` | `str` | `None` | Path to custom Dockerfile |
| `build_context` | `str` | `"."` | Docker build context |
| `volumes` | `dict` | `{}` | Volume mounts (host:container) |
| `env_vars` | `dict` | `{}` | Environment variables |
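To make the `volumes` and `env_vars` parameters concrete, here is a small sketch of how such dicts conceptually map onto `docker run` flags. The `docker_run_args` helper is hypothetical, written for illustration only, and is not part of FlowyML's API:

```python
# Hypothetical helper (not FlowyML code): shows how volumes and env_vars
# dicts would translate into equivalent `docker run` command-line flags.

def docker_run_args(image, volumes=None, env_vars=None):
    """Build a docker run argument list from orchestrator-style parameters."""
    args = ["docker", "run", "--rm"]
    for host, container in (volumes or {}).items():
        args += ["-v", f"{host}:{container}"]   # -v host_path:container_path
    for key, value in (env_vars or {}).items():
        args += ["-e", f"{key}={value}"]        # -e KEY=VALUE
    args.append(image)
    return args

print(docker_run_args(
    "python:3.11-slim",
    volumes={"/data/training": "/app/data"},
    env_vars={"LOG_LEVEL": "debug"},
))
```

Passing the mounts and variables as dicts keeps the pipeline definition declarative; the orchestrator handles the translation to container flags.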

πŸ›  Custom Dockerfiles

For complex dependencies, provide your own Dockerfile:

```python
orchestrator = DockerOrchestrator(
    dockerfile="./Dockerfile",
    build_context=".",
)
```

Example Dockerfile

```dockerfile
FROM python:3.11-slim

# System dependencies
RUN apt-get update && apt-get install -y gcc libgomp1 && rm -rf /var/lib/apt/lists/*

# Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code
COPY . /app
WORKDIR /app
```

πŸ”— Volume Mounts

Mount local directories into the container for data access:

```python
orchestrator = DockerOrchestrator(
    image="python:3.11-slim",
    volumes={
        "/data/training": "/app/data",     # Host → Container
        "/models/registry": "/app/models",
    },
)
```

Best Practices

Pin image versions

Use python:3.11.7-slim instead of python:3.11-slim for reproducible builds.

Multi-stage builds

Use multi-stage Dockerfiles to keep images small: build dependencies in one stage, copy only artifacts to the final stage.
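A minimal multi-stage sketch of this pattern (paths and stage names are illustrative): compilers exist only in the builder stage, and the final image receives just the installed packages.

```dockerfile
# Stage 1: build wheels with compilers available
FROM python:3.11-slim AS builder
RUN apt-get update && apt-get install -y gcc && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: runtime image without build tools
FROM python:3.11-slim
COPY --from=builder /install /usr/local
COPY . /app
WORKDIR /app
```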

GPU support

For GPU steps, use NVIDIA base images (e.g., nvidia/cuda:12.0-runtime) and install the NVIDIA Container Toolkit (the successor to nvidia-docker2).
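A hedged sketch of a GPU-ready image (the exact CUDA tag must match your driver version; check the available `nvidia/cuda` tags on Docker Hub):

```dockerfile
# Illustrative tag only - pick one compatible with your host driver
FROM nvidia/cuda:12.0.1-runtime-ubuntu22.04

RUN apt-get update && apt-get install -y python3 python3-pip && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

COPY . /app
WORKDIR /app
```

With the NVIDIA Container Toolkit installed, containers built from such an image are launched with GPU access via `docker run --gpus all`.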