β Step
π£ Step API
Steps are the atomic units of work in a FlowyML pipeline β pure functions decorated with @step.
π― Decorator πΎ Caching π Retries π₯οΈ Resources
@step Decorator Parameters
from flowyml import step
@step(
outputs=["model"],
inputs=["data"],
cache=True,
retries=3,
timeout=600,
)
def train(data):
...
| Parameter | Type | Default | Description |
|---|---|---|---|
outputs |
list[str] |
[] |
Named artifacts this step produces. Used to wire the DAG. |
inputs |
list[str] |
[] |
Named artifacts this step consumes. Automatically resolved from upstream steps. |
cache |
bool |
True |
Enable content-addressed caching. Skips re-execution when inputs haven't changed. |
retries |
int |
0 |
Number of automatic retries on failure before the step is marked as failed. |
timeout |
int | None |
None |
Maximum execution time in seconds. None means no limit. |
resources |
dict | None |
None |
Resource hints (e.g., {"gpu": 1, "memory": "8Gi"}). Interpreted by the orchestrator. |
tags |
list[str] |
[] |
Arbitrary tags for filtering and grouping in the UI. |
description |
str | None |
None |
Human-readable description shown in the UI DAG view. |
Step Function Patterns
Single Output
Multiple Outputs
Context Injection
How Context Injection Works
If the pipeline's context contains a key matching a step parameter name, the context value is injected automatically. The function default is used as a fallback.
No Outputs (Side-Effect Steps)
Common Examples
Cached Training Step with Retries
Dynamic Step with map_task
When to Disable Caching
Disable caching (cache=False) for steps that depend on external state β e.g., fetching live data from an API or reading the latest file from a bucket.
Autodoc
Decorator @step
Decorator to define a pipeline step with automatic context injection.
Can be used as @step or @step(inputs=...)
Every decorated function is automatically registered in a global
StepRegistry, enabling Pipeline(auto_discover=True) to build
the DAG without any manual add_step() calls.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
_func
|
Callable | None
|
Function being decorated (when used as @step) |
None
|
inputs
|
list[str] | None
|
List of input asset names |
None
|
outputs
|
list[str] | None
|
List of output asset names |
None
|
cache
|
bool | str | Callable
|
Caching strategy ("code_hash", "input_hash", callable, or False) |
'code_hash'
|
retry
|
int
|
Number of retry attempts on failure |
0
|
timeout
|
int | None
|
Maximum execution time in seconds |
None
|
resources
|
Union[dict[str, Any], ResourceRequirements, None]
|
Resource requirements (ResourceRequirements object or dict for backward compat) |
None
|
tags
|
dict[str, str] | None
|
Metadata tags for the step |
None
|
name
|
str | None
|
Optional custom name for the step |
None
|
condition
|
Callable | None
|
Optional callable that returns True if step should run |
None
|
execution_group
|
str | None
|
Optional group name for executing multiple steps together |
None
|
pipeline
|
str | None
|
Optional pipeline name for scoped auto-discovery.
When set, the step is only auto-discovered by pipelines
that match this name (or by |
None
|
register
|
bool
|
If False, the step is NOT added to the global registry.
Defaults to True. Set to False for helper/utility steps that
should only be used via explicit |
True
|
Example
@step ... def simple_step(): ... ... @step(inputs=["data/train"], outputs=["model/trained"]) ... def train_model(train_data): ... ...
Scoped to a specific pipeline
@step(pipeline="training", outputs=["model"]) ... def train(data): ... ...
With resource requirements
from flowyml.core.resources import ResourceRequirements, GPUConfig @step(resources=ResourceRequirements(cpu="4", memory="16Gi", gpu=GPUConfig(gpu_type="nvidia-v100", count=2))) ... def gpu_train(data): ... ...
Source code in flowyml/core/step.py
272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 | |
Class Step
A pipeline step that can be executed with automatic context injection.
Source code in flowyml/core/step.py
Functions
__call__(*args: Any, **kwargs: Any) -> Any
Execute the step function.
Source code in flowyml/core/step.py
get_cache_key(inputs: dict[str, Any] | None = None, context_params: dict[str, Any] | None = None) -> str
Generate cache key based on caching strategy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
inputs
|
dict[str, Any] | None
|
Input data for the step |
None
|
context_params
|
dict[str, Any] | None
|
Context parameters injected into the step. Included in the cache key to prevent stale cache hits when the same step runs with different context values. |
None
|
Returns:
| Type | Description |
|---|---|
str
|
Cache key string |
Source code in flowyml/core/step.py
get_code_hash() -> str
Compute hash of the step's source code.
Source code in flowyml/core/step.py
get_input_hash(inputs: dict[str, Any]) -> str
Generate hash of inputs for caching.
π What's Next?
π Context API
Inject parameters into steps with environment-specific configs.