🤔 Why FlowyML?

Every ML framework promises to make pipelines easy. FlowyML actually delivers by rethinking the problem from first principles. Here's why teams switch from traditional orchestrators and never look back.

📦 Artifact-Centric ⚡ Zero Boilerplate ☁️ Multi-Cloud 🤖 GenAI Native


🎯 The Problem with Traditional Orchestrators

Most ML orchestrators were born from data engineering: they think in terms of tasks (verbs). You manually wire steps together, manage data handoff paths, and hope nothing breaks when you switch clouds.

The Traditional Way

# Airflow / Prefect / Luigi style
load_task = LoadDataTask()
train_task = TrainModelTask()
eval_task = EvaluateTask()

# Manual wiring: YOU decide the order
load_task >> train_task >> eval_task

# Manual data passing: YOU manage paths
train_task.set_upstream(load_task)
train_task.params["data_path"] = "s3://bucket/data/train.csv"

This approach creates three fundamental problems:

  1. Brittle wiring: Add a step, rewire everything. Remove a step, rewire everything.
  2. Lost lineage: Data flows through opaque paths. "Which model was trained on which dataset?" becomes a detective game.
  3. Cloud lock-in: Hardcoded s3:// paths and cloud-specific APIs everywhere.

💎 The FlowyML Way: Artifacts First

FlowyML flips the paradigm. Instead of telling the system how steps connect, you declare what each step produces and consumes. The DAG builds itself.

The FlowyML Way

from flowyml import step, Pipeline, context, Model, Dataset

@step(outputs=["dataset"])
def load_data() -> Dataset:
    return Dataset.from_csv("data.csv")

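@step(inputs=["dataset"], outputs=["model"])
def train(dataset: Dataset, learning_rate: float) -> Model:
    # learning_rate is not produced by a step; it comes from the pipeline context below.
    # train_classifier stands in for your own training function (not defined here).
    return Model(train_classifier(dataset, lr=learning_rate))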

@step(inputs=["model", "dataset"], outputs=["metrics"])
def evaluate(model: Model, dataset: Dataset) -> dict:
    return {"accuracy": model.score(dataset)}

# No arrows. No wiring. Just run.
pipeline = Pipeline("training", context=context(learning_rate=0.01))
pipeline.add_step(load_data).add_step(train).add_step(evaluate)
pipeline.run()

FlowyML infers that train depends on load_data (train consumes the dataset that load_data produces) and that evaluate depends on both train and load_data. The DAG is built from data dependencies, not manual arrows.
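
To make the idea concrete, here is a minimal sketch of how an execution order can be derived from declared inputs and outputs. It is an illustration of the general technique, not FlowyML's internal implementation: the `STEPS` dictionary, `infer_dependencies`, and `execution_order` are hypothetical names, and the step declarations simply mirror the example above.

```python
from graphlib import TopologicalSorter

# Hypothetical step declarations mirroring the example above.
# This is NOT FlowyML's internal representation, only a sketch.
STEPS = {
    "load_data": {"inputs": [], "outputs": ["dataset"]},
    "train": {"inputs": ["dataset"], "outputs": ["model"]},
    "evaluate": {"inputs": ["model", "dataset"], "outputs": ["metrics"]},
}


def infer_dependencies(steps):
    """Map each step to the set of steps that produce its inputs."""
    # First pass: record which step produces each named artifact.
    producers = {}
    for name, spec in steps.items():
        for artifact in spec["outputs"]:
            producers[artifact] = name
    # Second pass: a step depends on the producer of every input it consumes.
    # Inputs with no producer (for example, context parameters) are skipped.
    return {
        name: {producers[a] for a in spec["inputs"] if a in producers}
        for name, spec in steps.items()
    }


def execution_order(steps):
    """Return one valid run order for the inferred dependency graph."""
    return list(TopologicalSorter(infer_dependencies(steps)).static_order())


deps = infer_dependencies(STEPS)
order = execution_order(STEPS)
print(deps)
print(order)
```

Because the graph comes from artifact names, adding a step only means declaring what it consumes and produces; the order is recomputed without touching the other steps.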


📊 FlowyML vs. The Competition

Feature Comparison

| Capability | Airflow | Prefect | ZenML | Metaflow | FlowyML |
|---|---|---|---|---|---|
| Core paradigm | Task DAGs | Task flows | Pipeline/Step | Flow/Step | Artifact-Centric |
| DAG construction | Manual >> | Manual .submit() | Manual wiring | Linear @step | Auto-inferred |
| Data handoff | XCom / files | Results | Artifact Store | S3 datastore | Typed catalog |
| Type safety | None | None | Runtime | None | Build-time |
| Cloud switching | Rewrite DAGs | Use blocks | Stack swap | @batch only | One env var |
| Built-in UI | Yes | Cloud only | Dashboard | Metadata UI | Full dashboard |
| GenAI observability | No | No | No | No | Built-in |
| Evaluation framework | No | No | No | No | 29+ scorers |
| Model registry | No | No | Plugin | No | Built-in |
| LLM cost tracking | No | No | No | No | Built-in |
| Learning curve | High (YAML) | Medium | Medium | Low | Very Low |
| License | Apache 2.0 | Mixed | Apache 2.0 | Apache 2.0 | Apache 2.0 |

Philosophy Comparison

🔧 Airflow

Built for ETL and data engineering. DAGs defined in Python but execution model is task-centric. Complex scheduler, complex deployment (Kubernetes, Celery). Best for: batch data pipelines at scale.

Weakness for ML: No native artifact types, no model registry, no experiment tracking.

🌀 Prefect

Built for modern data workflows. Python-native with decorators. Cloud-first with Prefect Cloud. Good DX. Best for: data engineering teams who want a modern Airflow alternative.

Weakness for ML: No artifact-centric design, no ML-specific features, cloud lock-in for full features.

🚀 ZenML

Built for MLOps pipelines. Stack-based infrastructure abstraction. Closest to FlowyML in philosophy. Best for: teams who want MLOps without vendor lock-in.

Weakness for ML: Manual step wiring, no built-in GenAI observability, smaller eval ecosystem.


⚡ What Makes FlowyML Different

📦 Artifacts Are First-Class

Models, Datasets, and Metrics aren't just files; they're typed objects with automatic versioning, lineage tracking, and cloud routing. Define the type, and FlowyML handles storage.
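
As a rough illustration of what "typed objects with versioning and lineage" can mean, the sketch below models an artifact as a small dataclass whose version is derived from its content and its parents. Everything here is an assumption for illustration: the `Artifact` class, its fields, and the hashing scheme are hypothetical and are not FlowyML's `Model` or `Dataset` types.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """Hypothetical typed artifact: a payload plus the artifacts it came from."""

    kind: str            # for example "Dataset" or "Model"
    payload: dict        # JSON-serializable content, for illustration only
    parents: tuple = ()  # upstream Artifact objects (the lineage links)

    @property
    def version(self) -> str:
        """Content-derived version: same payload and parents give the same id."""
        body = json.dumps(
            {
                "kind": self.kind,
                "payload": self.payload,
                "parents": [p.version for p in self.parents],
            },
            sort_keys=True,
        )
        return hashlib.sha256(body.encode()).hexdigest()[:12]

    def lineage(self) -> list:
        """Return (kind, version) for every ancestor, nearest first."""
        out = []
        for parent in self.parents:
            out.append((parent.kind, parent.version))
            out.extend(parent.lineage())
        return out


dataset = Artifact("Dataset", {"rows": 3, "source": "data.csv"})
model = Artifact("Model", {"learning_rate": 0.01}, parents=(dataset,))
print(model.version, model.lineage())
```

With this shape, "which model was trained on which dataset?" becomes a lookup on `model.lineage()` rather than a search through storage paths, and changing the dataset changes the model's version as well.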

🔀 DAGs Build Themselves

Declare inputs and outputs on each step. FlowyML analyzes data dependencies and builds the execution graph. Add or remove steps without rewiring anything.

☁️ One Env Var to Production

FLOWYML_STACK=production python pipeline.py β€” same code, different infrastructure. Switch from local to GCP Vertex AI to AWS SageMaker with zero code changes.
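
The pattern behind this can be sketched in a few lines: read the environment variable once and look up a named infrastructure profile. Only the `FLOWYML_STACK` variable name comes from the text above. The `STACKS` table, its fields, the bucket path, the `resolve_stack` helper, and the `local` default are all assumptions made for illustration, not FlowyML's actual configuration format.

```python
import os

# Hypothetical stack definitions; names, fields, and paths are illustrative only.
STACKS = {
    "local": {
        "orchestrator": "local",
        "artifact_store": "./artifacts",
    },
    "production": {
        "orchestrator": "vertex_ai",
        "artifact_store": "gs://example-bucket/artifacts",
    },
}


def resolve_stack(env=None):
    """Pick a stack from FLOWYML_STACK, defaulting to 'local' (an assumption)."""
    env = os.environ if env is None else env
    name = env.get("FLOWYML_STACK", "local")
    if name not in STACKS:
        raise ValueError(f"Unknown stack {name!r}; expected one of {sorted(STACKS)}")
    return STACKS[name]


# Pipeline code asks for "the current stack" and never names a cloud directly.
stack = resolve_stack({"FLOWYML_STACK": "production"})
print(stack["orchestrator"], stack["artifact_store"])
```

Keeping every cloud-specific value in the profile is what lets the pipeline code stay unchanged when the variable changes.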

🤖 GenAI Native

Built-in LLM tracing, cost tracking, and evaluation for LangGraph, LangChain, OpenAI SDK, or any framework. No LangSmith subscription needed; it's all included.

🎯 29+ Evaluation Scorers

Classification, regression, and GenAI scorers with CI/CD quality gates. Adapters for DeepEval, RAGAS, and Phoenix. Judge Arena for A/B testing evaluators.
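
A CI/CD quality gate boils down to comparing scores against minimum thresholds and failing the build when any score falls short. The sketch below shows that idea in plain Python. The `accuracy` and `quality_gate` functions, the sample labels, and the 0.9 threshold are hypothetical and do not represent FlowyML's scorer API.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def quality_gate(scores, thresholds):
    """Return the names of metrics whose score is below its threshold.

    A metric missing from `scores` counts as a failure.
    """
    return [
        name
        for name, minimum in thresholds.items()
        if scores.get(name, float("-inf")) < minimum
    ]


# Illustrative data: three of four predictions are correct.
scores = {"accuracy": accuracy([1, 0, 1, 1], [1, 0, 0, 1])}
failures = quality_gate(scores, {"accuracy": 0.9})

# A CI job would pass this value to sys.exit() so a failed gate fails the build.
exit_code = 1 if failures else 0
print(scores, failures, exit_code)
```

Treating a missing metric as a failure is a deliberate choice in this sketch: a gate that silently passes when a scorer did not run would defeat its purpose.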

🖥️ Beautiful Dashboard

Dark-mode web UI with DAG visualization, experiment comparison, model training curves, GenAI trace viewer, and asset browser, all in real time via WebSocket.


🏁 Ready to Try?

πŸš€ 5-Minute Quick Start

Build your first pipeline from scratch.

pip install flowyml

Get Started β†’

πŸ““ Visual Pipeline Design

Use the reactive notebook companion.

pip install flowyml-notebook

FlowyML Notebook β†’

✨ Explore Features

Deep dive into all 20+ capabilities.

Features Explorer β†’