# Getting Started with FlowyML 🚀
Welcome to FlowyML! In the next 5-10 minutes, you'll go from zero to running your first production-ready pipeline. No prior MLOps experience required.
## What You'll Learn & Why
> [!NOTE]
> **What you'll build:** A complete ML pipeline with data loading, training, and real-time monitoring.
>
> **What you'll master:** The core concepts that make FlowyML powerful: steps, pipelines, context injection, and the visual UI.
>
> **Why this matters:** These same patterns scale from quick prototypes to enterprise deployments serving millions of predictions.
## Installation 📦
FlowyML requires Python 3.9 or higher.
### Basic Installation
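Assuming the package is published as `flowyml` on PyPI:

```bash
pip install flowyml
```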
> [!TIP]
> **Pro Tip:** Use a virtual environment (`venv` or `conda`) to avoid dependency conflicts with other projects.
### Full Installation (Recommended)
Includes UI support and common ML dependencies:
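The extras name below is an assumption; adjust it to match the published package:

```bash
pip install "flowyml[full]"
```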
**What this gets you:** The web dashboard, Keras integration, cloud storage backends, and everything you need for production deployments. Start with this unless you have size constraints.
### Verify Installation
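A quick check (the `--version` flag is assumed to exist on the CLI):

```bash
flowyml --version
```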
You should see the version number. If not, check that your Python PATH is configured correctly.
## Your First Project 🚀
Let's create a new project using the CLI.
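The `init` subcommand shown here is an assumption based on common CLI conventions:

```bash
flowyml init my-first-project
cd my-first-project
```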
This creates a directory structure like this:
```
my-first-project/
├── flowyml.yaml       # Project configuration
├── README.md          # Project documentation
├── requirements.txt   # Python dependencies
└── src/
    └── pipeline.py    # Your pipeline code lives here
```
> [!TIP]
> **Why this structure?** It separates code (`src/`), configuration (`flowyml.yaml`), and dependencies (`requirements.txt`): exactly what you need for clean version control and team collaboration.
## Creating a Pipeline 🧪
Open `src/pipeline.py` and replace its content with this simple example:
```python
from flowyml import Pipeline, step

# 1. Define a step
@step(outputs=["data"])
def fetch_data():
    print("Fetching data...")
    return [1, 2, 3, 4, 5]

# 2. Define another step with inputs
@step(inputs=["data"], outputs=["processed"])
def process_data(data):
    print(f"Processing {len(data)} items...")
    return [x * 2 for x in data]

# 3. Create and configure the pipeline
if __name__ == "__main__":
    pipeline = Pipeline("my_first_pipeline")

    # Add steps in order
    pipeline.add_step(fetch_data)
    pipeline.add_step(process_data)

    # 4. Run it!
    result = pipeline.run()

    if result.success:
        print("✅ Pipeline finished successfully!")
        print(f"Result: {result.outputs}")
    else:
        print("❌ Pipeline failed")
```
### Understanding What Just Happened
Let's break down the key concepts:
- **`@step` decorator**: Turns any Python function into a pipeline step. The `outputs=["data"]` tells FlowyML what this step produces.
- **Data flow**: The `@step(inputs=["data"], ...)` on `process_data` automatically connects it to `fetch_data`'s output. No manual wiring needed.
- **Pipeline assembly**: `pipeline.add_step()` builds your DAG. FlowyML figures out the execution order based on data dependencies.
- **Execution**: `pipeline.run()` executes all steps in the right order and returns a result object with status and outputs.
> [!IMPORTANT]
> **Why this matters:** This same pattern works whether you have 3 steps or 300. The complexity doesn't grow with your pipeline.
## Running the Pipeline ▶️
Execute the script:
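```bash
python src/pipeline.py
```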
You should see output indicating the steps are executing:
```
Fetching data...
Processing 5 items...
✅ Pipeline finished successfully!
Result: {'processed': [2, 4, 6, 8, 10]}
```
> [!TIP]
> **Pro Tip:** Pipelines are idempotent by default. Run it again and watch how caching kicks in: steps that haven't changed won't re-execute.
## Visualizing with the UI 🖥️
Now, let's see your pipeline in the FlowyML UI. This is where the magic happens for debugging and monitoring.
**Step 1: Start the UI server**
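The exact subcommand is an assumption; it typically looks like:

```bash
flowyml ui
```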
You'll see:
```
🚀 FlowyML UI server started
📊 Dashboard: http://localhost:8080
🔌 API: http://localhost:8080/api
```
> [!NOTE]
> **What's running:** A lightweight FastAPI server that displays your pipeline runs, DAG visualizations, and artifact inspection, all in real time.
**Step 2: Run your pipeline (in a separate terminal)**
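```bash
python src/pipeline.py
```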
**Step 3: Watch it live!**
Open your browser to http://localhost:8080. You'll see:
- **Pipeline DAG**: Visual graph showing step dependencies
- **Real-time execution**: Steps highlight as they run
- **Artifact inspection**: Click any step to see its inputs/outputs
- **Run history**: Compare different runs side-by-side
**Why the UI matters:** Imagine debugging a failed step at 3 AM in production. Instead of grepping through logs, you see exactly:

- Which step failed
- What its inputs were
- The full error traceback
- Which downstream steps were skipped
## Adding Context & Parameters 🎛️
Let's make the pipeline configurable using `context`, one of FlowyML's killer features.
Update your pipeline:
```python
from flowyml import Pipeline, step, context

@step(outputs=["data"])
def fetch_data(dataset_size: int = 5):  # ← Parameter with default
    print(f"Fetching {dataset_size} items...")
    return list(range(dataset_size))

@step(inputs=["data"], outputs=["processed"])
def process_data(data, multiplier: int = 2):  # ← Another parameter
    print(f"Processing with multiplier={multiplier}...")
    return [x * multiplier for x in data]

if __name__ == "__main__":
    # Create context with your config
    ctx = context(
        dataset_size=10,
        multiplier=3,
    )

    # Pass context to pipeline
    pipeline = Pipeline("configurable_pipeline", context=ctx)
    pipeline.add_step(fetch_data)
    pipeline.add_step(process_data)

    result = pipeline.run()
    print(f"Result: {result.outputs}")
```
Run it again:
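```bash
python src/pipeline.py
```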
Output:
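Assuming context values are injected by parameter name, you should see something like:

```
Fetching 10 items...
Processing with multiplier=3...
Result: {'processed': [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]}
```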
### The Power of Context Injection
> [!TIP]
> **Why this is revolutionary:** You just separated configuration from code. The same pipeline can run with different configs for:
>
> - **Dev**: Small dataset for fast iteration
> - **Staging**: Medium dataset for integration testing
> - **Production**: Full dataset for real predictions
Change the context, not the code. This is how you go from prototype to production without rewriting.
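To make that concrete, here's a minimal sketch of environment-specific contexts, reusing the steps defined above (the specific sizes are illustrative assumptions):

```python
from flowyml import Pipeline, context
# fetch_data and process_data are the steps from the example above.

dev_ctx = context(dataset_size=10, multiplier=2)           # Dev: fast iteration
staging_ctx = context(dataset_size=10_000, multiplier=2)   # Staging: integration tests
prod_ctx = context(dataset_size=1_000_000, multiplier=2)   # Production: full dataset

# Pick the context for the environment; the step code never changes.
pipeline = Pipeline("configurable_pipeline", context=dev_ctx)
pipeline.add_step(fetch_data)
pipeline.add_step(process_data)
result = pipeline.run()
```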
## Next Steps 🚀
Congratulations! You've built a complete pipeline with monitoring. Here's where to go next based on your goals:
### 🎯 I want to build production pipelines

- **Projects & Multi-Tenancy**: Learn to organize multiple pipelines, isolate environments, and manage teams
- **Scheduling**: Automate your pipelines with cron-style scheduling
- **Versioning**: Track pipeline changes and roll back when needed
### 🚀 I want to optimize performance

- **Caching Strategies**: Save compute time and costs with intelligent caching
- **Parallel Execution**: Run independent steps concurrently
- **Performance Guide**: Benchmark and optimize your pipelines
### 🔬 I want advanced ML features

- **Assets & Lineage**: Work with typed artifacts (Datasets, Models, Metrics)
- **Model Registry**: Version and manage models
- **LLM Tracing**: Track GenAI costs and performance
### 🧠 I want to understand concepts deeply

- **Core Concepts: Pipelines**: Master pipeline design patterns
- **Core Concepts: Steps**: Learn step best practices
- **Core Concepts: Context**: Advanced context injection techniques
### 🔨 I want to integrate with my stack

- **Keras Integration**: Automatic experiment tracking for Keras
- **GCP Integration**: Deploy to Google Cloud Platform
- **Custom Components**: Extend FlowyML for your needs
Questions or stuck? Check out the Resources page for community links, tutorials, and support channels.
Ready to dive deeper? The User Guide is your next stop for production-grade patterns.