Skip to content

πŸ“š Quick Reference

πŸš€ Quick Reference

The essential cheat sheet for FlowyML. Find the syntax you need in seconds β€” CLI commands, Python API patterns, YAML configuration, and common recipes.

πŸ’» CLI 🐍 Python API πŸ“„ YAML Config πŸ“‹ Recipes

What this page is

A cheat sheet for common FlowyML commands and patterns. You know what you want to do, you just need the syntax.

πŸ“ Decision Guide: Which Command Do I Need?

I want to... Use this command
Create a new project flowyml init
Run a pipeline locally python pipeline.py or flowyml run pipeline.py
Run with different config flowyml run pipeline.py --context key=value
Deploy to production flowyml run pipeline.py --stack production
Use GPUs flowyml run pipeline.py --resources gpu_training
See what would run flowyml run pipeline.py --dry-run
List available stacks flowyml stack list
Start the web UI flowyml ui start or flowyml go
Import all ZenML components flowyml zenml import-all
Build a Docker image flowyml docker build
Build + push to registry flowyml docker build --push --registry URI
Preview the Dockerfile flowyml docker generate
See auto-detected config flowyml docker inspect
Login to registry flowyml docker login REGISTRY

CLI Commands

Initialize Project

flowyml init                    # Create flowyml.yaml in current directory

When to use: Starting a new FlowyML project. Creates configuration scaffolding.

Stack Management

flowyml stack list             # List all configured stacks
flowyml stack show STACK_NAME  # Show detailed stack configuration
flowyml stack set-default NAME # Set which stack runs by default

When to use: Managing multiple deployment targets (local, staging, production).

Tip

Run flowyml stack list to verify your configuration before deploying to production.

ZenML Integration

Import and use the entire ZenML ecosystem with simple commands:

flowyml zenml status          # Check if ZenML is installed
flowyml zenml list            # List available integrations
flowyml zenml list --installed # Show only installed integrations
flowyml zenml install mlflow   # Install an integration
flowyml zenml import mlflow    # Import components from an integration
flowyml zenml import-all       # Import ALL installed integration components

When to use: Leveraging ZenML's 50+ integrations (MLflow, Kubernetes, AWS, etc.) in FlowyML.

Python equivalent:

from flowyml.stacks import import_all_zenml
components = import_all_zenml()  # One-liner to import everything

Run Pipelines

# Basic run (uses default stack from flowyml.yaml)
flowyml run pipeline.py

# Specify stack
flowyml run pipeline.py --stack production
flowyml run pipeline.py -s production

# Specify resources
flowyml run pipeline.py --resources gpu_training
flowyml run pipeline.py -r gpu_training

# Pass context variables
flowyml run pipeline.py --context data_path=/path/to/data
flowyml run pipeline.py --context key1=val1 --context key2=val2
flowyml run pipeline.py -ctx data_path=/path -ctx model_id=123

# Custom config file
flowyml run pipeline.py --config custom.yaml
flowyml run pipeline.py -c custom.yaml

# Dry run (show what would be executed)
flowyml run pipeline.py --stack production --dry-run

# Combined example
flowyml run pipeline.py --stack production --resources gpu_training --context data_path=gs://bucket/data.csv

Decision guide: - Use --stack when deploying to different environments (local/staging/prod) - Use --resources when you need specific compute (CPU vs GPU) - Use --context to override parameters without changing code - Use --dry-run to verify configuration before expensive runs

Configuration File (flowyml.yaml)

Minimal Configuration

stacks:
  local:
    type: local

default_stack: local

Full Configuration

# Stack definitions
stacks:
  local:
    type: local
    artifact_store:
      path: .flowyml/artifacts
    metadata_store:
      path: .flowyml/metadata.db

  production:
    type: gcp
    project_id: ${GCP_PROJECT_ID}
    region: us-central1
    artifact_store:
      type: gcs
      bucket: ${GCP_BUCKET}
    container_registry:
      type: gcr
      uri: gcr.io/${GCP_PROJECT_ID}
    orchestrator:
      type: vertex_ai

# Default stack
default_stack: local

# Resource presets
resources:
  default:
    cpu: "2"
    memory: "8Gi"

  gpu_training:
    cpu: "8"
    memory: "32Gi"
    gpu: "nvidia-tesla-v100"
    gpu_count: 2

# Docker configuration
docker:
  dockerfile: ./Dockerfile     # Auto-detect existing Dockerfile
  use_poetry: true            # Or use Poetry from pyproject.toml
  # requirements_file: requirements.txt  # Or use requirements.txt
  base_image: python:3.11-slim
  env_vars:
    PYTHONUNBUFFERED: "1"

Environment Variables

Create .env file:

GCP_PROJECT_ID=my-project
GCP_BUCKET=my-artifacts
GCP_SERVICE_ACCOUNT=my-sa@project.iam.gserviceaccount.com

Reference in flowyml.yaml:

stacks:
  production:
    project_id: ${GCP_PROJECT_ID}

Pipeline Code (Clean & Simple)

from flowyml import Pipeline, step, context

@step(outputs=["result"])
def my_step(input_data: str):
    # Your logic
    return input_data.upper()

# NO infrastructure code needed!
ctx = context(input_data="hello")
pipeline = Pipeline("my_pipeline", context=ctx)
pipeline.add_step(my_step)

if __name__ == "__main__":
    result = pipeline.run()
    print(f"Result: {result.outputs['result']}")  # "HELLO"

Common Workflows

Installation

# Basic
pip install flowyml

# With GCP support
pip install flowyml[gcp]

# With ML frameworks
pip install flowyml[tensorflow]
pip install flowyml[pytorch]

# Everything
pip install flowyml[all]

🎯 Real-World Workflow Examples

Scenario 1: Development β†’ Production

Goal: Test locally, then deploy to production with GPUs.

# 1. Develop and test locally
flowyml run train.py

# 2. Verify on staging with production-like resources
flowyml run train.py --stack staging --resources gpu_small

# 3. Deploy to production with full resources
flowyml run train.py --stack production --resources gpu_large

Why this pattern works: Same code, different infrastructure. Zero rewrites.

Scenario 2: Multi-Region Deployment

Goal: Deploy the same pipeline to different cloud regions.

# flowyml.yaml
stacks:
  us-prod:
    type: gcp
    region: us-central1
  eu-prod:
    type: gcp
    region: europe-west1
# Deploy to US region
flowyml run pipeline.py --stack us-prod

# Deploy to EU region
flowyml run pipeline.py --stack eu-prod

Why this pattern works: Data residency compliance, latency optimization, disaster recovery.

Scenario 3: Different Workload Types

Goal: Run preprocessing on CPU, training on GPU.

# flowyml.yaml
resources:
  cpu_only:
    cpu: "4"
    memory: "16Gi"
  gpu_large:
    cpu: "16"
    memory: "64Gi"
    gpu: "nvidia-tesla-v100"
    gpu_count: 4
# Step 1: Preprocess data (CPU only)
flowyml run preprocess.py --resources cpu_only

# Step 2: Train model (GPU required)
flowyml run train.py --resources gpu_large

Why this pattern works: Optimize costs β€” only pay for GPUs when you actually need them.


πŸ“œ Configuration Patterns

When to Use Each Docker Pattern

Pattern Use When Example
Existing Dockerfile You have custom setup needs dockerfile: ./Dockerfile
Poetry You use Poetry for dependency management use_poetry: true
requirements.txt Simple dependencies, widely compatible requirements_file: requirements.txt
Dynamic build Quick prototypes, minimal config base_image: python:3.11-slim

Docker: Use Existing Dockerfile

docker:
  dockerfile: ./Dockerfile
  build_context: .

Use when: You have specialized dependencies, custom binary installs, or complex build steps.

Docker: Use Poetry

docker:
  use_poetry: true

Use when: Managing dependencies with Poetry and want deterministic builds.

Docker: Use Requirements.txt

docker:
  requirements_file: requirements.txt

Use when: Simple, traditional Python projects with pip.

Docker: Dynamic Build

docker:
  base_image: python:3.11-slim
  requirements:
    - tensorflow>=2.12.0
    - pandas>=2.0.0
  env_vars:
    PYTHONUNBUFFERED: "1"

Use when: Rapid prototyping or simple dependencies that don't need a Dockerfile.


☁️ Cloud Stack Presets

FlowyML ships with ready-to-use stacks for Google Cloud, AWS, and Azure. Install the matching optional dependency and instantiate the stack directly from Python.

Google Cloud (Vertex AI pipelines + endpoints)

pip install "flowyml[gcp]"
from flowyml import Pipeline, step
from flowyml.stacks.gcp import GCPStack

stack = GCPStack(
    name="vertex-prod",
    project_id="my-gcp-project",
    region="us-central1",
    bucket_name="flowyml-artifacts",
    registry_uri="gcr.io/my-gcp-project",
)

@step
def train():
    ...

pipeline = Pipeline("trainer", stack=stack)
pipeline.add_step(train)
pipeline.run()

# Deploy a model artifact later
stack.vertex_endpoints.deploy_model(
    model_display_name="fraud-detector",
    artifact_uri="gs://flowyml-artifacts/runs/run-123/model",
    serving_image="gcr.io/my-gcp-project/fraud:prod",
    endpoint_display_name="fraud-prod",
)

AWS (SageMaker or Batch)

pip install "flowyml[aws]"
from flowyml import Pipeline
from flowyml.stacks.aws import AWSStack

stack = AWSStack(
    name="aws-prod",
    region="us-east-1",
    bucket_name="flowyml-artifacts",
    account_id="123456789012",
    orchestrator_type="sagemaker",  # or "batch"
    role_arn="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
)

pipeline = Pipeline("trainer-aws", stack=stack)
pipeline.run()

Azure (Azure ML + Blob + ACR + Cloud Run alternative)

pip install "flowyml[azure]"
from flowyml import Pipeline
from flowyml.stacks.azure import AzureMLStack

stack = AzureMLStack(
    name="azure-prod",
    subscription_id="00000000-0000-0000-0000-000000000000",
    resource_group="flowyml-rg",
    workspace_name="flowyml-ws",
    compute="cpu-cluster",
    account_url="https://flowymlstorage.blob.core.windows.net",
    container_name="artifacts",
    registry_name="flowymlacr",
)

pipeline = Pipeline("trainer-azure", stack=stack)
pipeline.run()

Tip

Authenticate with each cloud provider (gcloud, aws configure, az login) before running remote stacks. The optional extras install the required SDKs (google-cloud-aiplatform, boto3, azure-ai-ml, etc.).

πŸ“ˆ Production Metrics API

Enable digital teams to push real-time model health data straight into flowyml.

Log metrics

curl -X POST https://your-flowyml/api/metrics/log \
  -H "Authorization: Bearer <WRITE_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
        "project": "fraud-monitoring",
        "model_name": "fraud-detector-v3",
        "environment": "prod",
        "metrics": {"precision": 0.92, "recall": 0.87},
        "tags": {"region": "eu-west-1"}
      }'

Query latest metrics

curl -H "Authorization: Bearer <READ_TOKEN>" \
  "https://your-flowyml/api/metrics?project=fraud-monitoring&model_name=fraud-detector-v3&limit=20"

Tokens scoped to a project can only write/read metrics for that project.

Tip

The UI and CLI also surface the same data via /api/projects/<project>/metrics?model_name=..., which is perfect for dashboards scoped to a single project.


πŸ› οΈ Troubleshooting Quick Fixes

"Stack not found"

# Check what stacks you have
flowyml stack list

# Verify config file exists and is valid
cat flowyml.yaml

# Set a default stack if none is set
flowyml stack set-default local

"Missing dependencies"

# Install required extras for GCP
pip install "flowyml[gcp]"

# Or install everything
pip install "flowyml[all]"

"Permission denied" on GCP

# Authenticate with Google Cloud
gcloud auth login
gcloud auth application-default login
gcloud config set project YOUR-PROJECT-ID

# Verify authentication
gcloud auth list

"Pipeline runs locally but fails on cloud"

Common causes: 1. Missing dependencies in Docker image 2. Incorrect file paths (use cloud URIs like gs:// not local paths) 3. Authentication not configured

Solution pattern:

# Test with dry-run first
flowyml run pipeline.py --stack production --dry-run

# Check the generated Docker image
docker images | grep flowyml

# Test the Docker image locally
docker run -it <image-id> /bin/bash


βœ… Pro Tips

  1. Use flowyml init to start new projects β€” Creates proper structure
  2. Store flowyml.yaml in version control β€” Reproducible deployments
  3. Use .env for secrets β€” Never commit credentials!
  4. Define resource presets for common workloads β€” Consistency across team
  5. Use --dry-run to verify configuration β€” Catch errors before expensive runs
  6. Keep pipeline code infrastructure-free β€” Business logic separate from deployment
  7. Use environment variables for dynamic values β€” One config, many environments
  8. Pin dependencies in production β€” Avoid surprise breakages
  9. Test stack configs locally first β€” Use local stack with production patterns
  10. Name stacks by purpose, not just environment β€” us-prod-gpu > prod2

πŸ†• New Features Quick Reference

Map Tasks

from flowyml import map_task

@map_task(concurrency=8, retries=2, min_success_ratio=0.95)
def process(item: dict) -> dict:
    return transform(item)

result = process(items)  # MapTaskResult with .successes, .failures, .success_ratio

Dynamic Workflows

from flowyml import dynamic, Pipeline

@dynamic(outputs=["best_model"])
def hp_search(config: dict):
    sub = Pipeline("search")
    for lr in config["lrs"]:
        sub.add_step(train_with_lr(lr))
    return sub

Sub-Pipeline Composition

parent = Pipeline("training")
parent.add_sub_pipeline(preprocess_pipeline, inputs=["raw"], outputs=["clean"])
parent.add_step(train_model)

Artifact Catalog

from flowyml import ArtifactCatalog

catalog = ArtifactCatalog()  # Auto-selects local or remote
aid = catalog.register(name="model", artifact_type="Model", tags={"stage": "prod"})
catalog.search("model")
catalog.get_lineage(aid)

Pipeline Snapshots

from flowyml import freeze_pipeline

snapshot = freeze_pipeline(pipeline)
snapshot.verify()  # True if unchanged

Build-Time Type Validation

pipeline.build()  # Automatically validates type annotations between connected steps

Selective Re-Execution

# Resume from checkpoint
result = pipeline.rerun(run_id="abc-123")
# Resume from a specific step
result = pipeline.rerun(run_id="abc-123", from_step="train_model")

Prompt Asset

from flowyml import Prompt

prompt = Prompt(name="summarize", template="Summarize: {text}", model="gpt-4", temperature=0.7)
rendered = prompt.render(text="Long document...")

# Chat-style
chat = Prompt.create(template=[{"role": "user", "content": "Explain {topic}"}], name="chat")
messages = chat.render(topic="ML pipelines")

Checkpoint Asset

from flowyml import Checkpoint

ckpt = Checkpoint.create(data=state_dict, name="epoch_10", epoch=10, metrics={"loss": 0.23})
ckpt.save("checkpoints/epoch_10.pt")
print(ckpt.epoch, ckpt.checkpoint_metrics, ckpt.is_best)

Stack Hydration

from flowyml.plugins.config import PluginConfig
from flowyml.plugins.stack_config import StackManager

config = PluginConfig("flowyml.yaml")
manager = StackManager(config)
live_stack = manager.get_stack("gcp-prod").to_stack()  # YAML β†’ live Stack
pipeline = Pipeline("train", stack=live_stack)

with manager.use_stack("gcp-prod"):  # Temporary switch
    pipeline.run()

Docker Build & Push

# Auto-detect deps, build image
flowyml docker build

# Build with GPU and push to registry
flowyml docker build --gpu --push --registry myregistry.azurecr.io

# Preview generated Dockerfile
flowyml docker generate --deps poetry

# Inspect auto-detected config
flowyml docker inspect

# Login to registry
flowyml docker login myregistry.azurecr.io -u admin

See full guide: Docker Image Management


πŸ“š Learn More


Tip

Bookmark this page! Use it as your go-to reference when you need quick command syntax.


πŸ“ What's Next?

✨ Features Explorer

Deep dive into all FlowyML capabilities.

Features β†’

πŸ“š Cheatsheet

Detailed API cheatsheet with more examples.

Cheatsheet β†’

πŸ“š Examples

Full working code examples.

Examples β†’