π’ Enterprise Stacks
Centrally define, govern, and version execution stacks β platform teams own infrastructure, data scientists focus on code.
π StackDefinition π StackResolver π‘οΈ PolicyEngine π AuditStore
π’ Enterprise Stacks
What you'll learn
How to define governed execution stacks using YAML manifests, resolve them from multiple sources, enforce policies, and manage stack lifecycles in enterprise environments.
Overview
Enterprise Stacks decouple infrastructure configuration from pipeline code. Platform teams define stacks centrally; data scientists consume them by name.
# Data scientist's code β no infrastructure details
pipeline = Pipeline("training", stack="prod-gpu-large")
pipeline.run()
The stack prod-gpu-large is resolved, validated against policies, and wired into the pipeline automatically.
Stack Resolution Flow
When a pipeline references a stack, FlowyML resolves it through a priority chain:
graph TD
A["Pipeline(stack='...')"] --> B{Explicit stack arg?}
B -->|Yes| C[Use directly]
B -->|No| D{env= parameter?}
D -->|Yes| E[Resolve from project config]
D -->|No| F{FLOWYML_STACK env var?}
F -->|Yes| G[Resolve from registry]
F -->|No| H{flowyml.yaml?}
H -->|Yes| I[Load active stack]
H -->|No| J[Default LocalStack]
The stack argument accepts:
| Input Type | Example | Resolution |
|---|---|---|
str (name) |
"prod-gpu" |
Registry / project config lookup |
str (URI) |
"github://org/repo@v1#stack" |
Remote source resolution |
Stack instance |
my_stack |
Used directly |
StackDefinition |
definition |
Converted via .to_stack() |
StackDefinition YAML
Define stacks as declarative YAML manifests:
apiVersion: flowyml/v1
kind: StackDefinition
metadata:
name: prod-gpu-large
version: "2.1.0"
labels:
team: ml-platform
environment: production
approved: "true"
spec:
compute:
backend: azureml
size: Standard_NC6s_v3
min_nodes: 0
max_nodes: 4
runtime:
python_version: "3.11"
docker_image: myregistry.azurecr.io/flowyml:latest
storage:
artifact_store: abfs://artifacts@mystorage.dfs.core.windows.net
metadata_store: sqlite:///shared/metadata.db
tracking:
experiment_tracker: mlflow
mlflow_uri: https://mlflow.company.com
secrets:
provider: azure_keyvault
vault_url: https://my-vault.vault.azure.net
Stack Sources
The StackResolver can load definitions from multiple sources:
Local File
GitHub / GitLab
# Or programmatically via URI
pipeline = Pipeline("train", stack="github://my-org/ml-stacks@v2.1.0#prod")
HTTP
Stack Registry
from flowyml.stacks.enterprise.resolver import StackResolver
resolver = StackResolver()
definition = resolver.resolve(stack="prod-gpu-large", env="production")
Policy Engine
Enforce organizational policies on all stacks:
# flowyml.yaml β project-level governance
project:
name: fraud-detection
org: fintech-corp
policies:
allowed_stacks:
- prod-gpu-*
- staging-*
required_labels:
- approved
- team
image_policy:
allowed_registries:
- myregistry.azurecr.io
- gcr.io/my-project
required_labels:
- maintainer
- version
The PolicyEngine validates stacks before execution:
- β Stack name matches allowed patterns
- β Required metadata labels are present
- β Docker images come from approved registries
- β Base images pass security checks
Audit & Locking
Audit Store
Every stack operation is recorded:
from flowyml.stacks.enterprise.audit import AuditStore
audit = AuditStore()
events = audit.get_events(stack_name="prod-gpu-large")
# β [StackApplied, StackValidated, PolicyChecked, ...]
Stack Locking
Prevent concurrent modifications:
from flowyml.stacks.enterprise.lock import StackLockManager
lock_manager = StackLockManager()
with lock_manager.acquire("prod-gpu-large"):
# Exclusive access to stack configuration
...
CLI Commands
# List all available stacks
flowyml enterprise stack list
# Show stack details
flowyml enterprise stack show prod-gpu-large
# Validate a stack definition
flowyml enterprise stack validate ./stacks/prod.yaml
# Apply a stack (register/update)
flowyml enterprise stack apply ./stacks/prod.yaml
Environment-Based Resolution
Map environments to stacks in your project config:
# flowyml.yaml
environments:
dev:
stack: local
staging:
stack: staging-cpu
production:
stack: prod-gpu-large
Best Practices
Version your stacks
Pin stack versions in production to ensure reproducibility. Use semantic versioning for stack definitions.
Separate concerns
Platform teams own stack definitions (compute, storage, networking). Data scientists own pipeline code (steps, models, metrics).
Policy enforcement
Always enable policy enforcement in production. Unapproved stacks should be blocked before execution, not after.