Glossary
A comprehensive reference of FlowyML terminology. Whether you're new to the framework or need a quick refresher, this glossary covers every key concept.
Artifact
A typed, versioned data object that flows through a pipeline. Artifacts are the "nouns" in FlowyML's artifact-centric architecture. Common artifact types include Model, Dataset, Metrics, and FeatureSet. Each artifact is automatically tracked with full lineage.
Artifact Catalog
A centralized registry of all artifacts produced across pipeline runs. Enables discovery, tagging, and lineage tracing. See Assets & Lineage.
Artifact Store
The storage backend where FlowyML saves pipeline artifacts (models, datasets, metrics). Can be local filesystem, GCS, S3, or Azure Blob Storage. See Plugins Overview.
Auto-DAG
FlowyML's ability to automatically construct the execution graph (DAG) from step input/output declarations. You declare what each step produces and consumes; FlowyML infers how they connect. No manual wiring needed.
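The inference idea can be sketched in plain Python. This is not FlowyML's API: the `STEPS` dictionary, the step names, and `infer_dag` are hypothetical stand-ins that only show how edges could be derived from declared inputs and outputs.

```python
from graphlib import TopologicalSorter

# Hypothetical step declarations (not FlowyML's API):
# each step lists the artifacts it consumes and produces.
STEPS = {
    "load_data": {"inputs": [], "outputs": ["dataset"]},
    "train": {"inputs": ["dataset"], "outputs": ["model"]},
    "evaluate": {"inputs": ["model", "dataset"], "outputs": ["metrics"]},
}


def infer_dag(steps):
    """Map each step to the set of steps that produce its inputs."""
    producer = {}
    for name, spec in steps.items():
        for artifact in spec["outputs"]:
            producer[artifact] = name
    return {
        name: {producer[artifact] for artifact in spec["inputs"]}
        for name, spec in steps.items()
    }


dag = infer_dag(STEPS)
# TopologicalSorter expects node -> predecessors, which is what infer_dag returns.
order = list(TopologicalSorter(dag).static_order())
print(order)  # ['load_data', 'train', 'evaluate']
```

No edge is written by hand here: `evaluate` depends on `train` and `load_data` only because it consumes what they produce.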
Cache / Cache Key
FlowyML's content-based hashing system that skips re-execution of steps whose code and inputs haven't changed. See Caching Guide.
Content Hash
A deterministic hash computed from a step's source code, input artifacts, and configuration. Used for cache invalidation. See Caching Guide.
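As a rough illustration of how such a hash could drive cache lookups, here is a minimal sketch in plain Python. The choice of SHA-256 over a JSON payload, and the names `content_hash` and `run_cached`, are assumptions for the example and not FlowyML's actual implementation.

```python
import hashlib
import json


def content_hash(source_code, input_hashes, config):
    """Return a deterministic SHA-256 digest of a step's code, inputs, and config."""
    payload = json.dumps(
        {"code": source_code, "inputs": input_hashes, "config": config},
        sort_keys=True,  # key order must not change the digest
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


cache = {}  # content hash -> stored step result


def run_cached(source_code, input_hashes, config, compute):
    """Call compute() only when no result is stored under the content hash."""
    key = content_hash(source_code, input_hashes, config)
    if key not in cache:
        cache[key] = compute()
    return cache[key]
```

Changing any of the three ingredients (code, inputs, configuration) yields a new key, so the step runs again; leaving all three untouched returns the stored result.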
Context
An immutable configuration object that provides parameters to pipeline steps via automatic injection. Eliminates the need for global variables or manual parameter passing. See Context & Parameters.
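A minimal sketch of the injection idea, assuming parameters are matched to function arguments by name. The helpers `make_context` and `call_with_context` are hypothetical and written in plain Python; FlowyML's own context object may behave differently.

```python
import inspect
from types import MappingProxyType


def make_context(**params):
    """Return a read-only view of the given parameters."""
    return MappingProxyType(dict(params))


def call_with_context(func, context, **explicit):
    """Fill in any parameter of func whose name matches a context key."""
    kwargs = {}
    for name in inspect.signature(func).parameters:
        if name in explicit:
            kwargs[name] = explicit[name]  # explicit arguments win
        elif name in context:
            kwargs[name] = context[name]   # otherwise inject from the context
    return func(**kwargs)


def train(data, learning_rate, epochs=1):
    return f"trained on {len(data)} rows, lr={learning_rate}, epochs={epochs}"


ctx = make_context(learning_rate=0.01, epochs=5)
result = call_with_context(train, ctx, data=[1, 2, 3])
print(result)  # trained on 3 rows, lr=0.01, epochs=5
```

Because `ctx` is a read-only mapping, an assignment such as `ctx["epochs"] = 10` raises `TypeError`, which mirrors the immutability described above.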
DAG (Directed Acyclic Graph)
The execution graph of a pipeline. Nodes are steps, edges are data dependencies. FlowyML auto-constructs the DAG from artifact types declared in step decorators.
Decorator
Python decorators (@step, @dynamic, @map_task) that transform regular functions into FlowyML pipeline components. See Steps.
Evaluation Scorer
A callable that measures model quality. FlowyML includes 29+ built-in scorers for classification, regression, and GenAI tasks. See Evaluations.
ExecutionStatus
An enum representing the current state of a pipeline run (PENDING, RUNNING, COMPLETED, FAILED). See Migration Guide.
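For readers unfamiliar with status enums, the four states could be modeled roughly as follows. `ExecutionStatusSketch`, its string values, and `is_finished` are assumptions made for illustration; only the member names come from the definition above.

```python
from enum import Enum


class ExecutionStatusSketch(Enum):
    """Stand-in enum; the string values are assumed, not taken from FlowyML."""
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


def is_finished(status):
    """Return True once a run can no longer change state."""
    return status in (ExecutionStatusSketch.COMPLETED, ExecutionStatusSketch.FAILED)
```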
FlowyML Notebook
A companion reactive notebook environment. Write Python cells with automatic dependency tracking, then promote directly to FlowyML pipelines. See FlowyML Notebook.
Hook
A callback function registered to fire on specific pipeline lifecycle events (on_start, on_success, on_failure). See Migration Guide.
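The lifecycle pattern can be sketched with a small callback registry. `HookRegistry` and `run_with_hooks` are hypothetical names in plain Python; only the event names (`on_start`, `on_success`, `on_failure`) come from the definition above.

```python
from collections import defaultdict


class HookRegistry:
    """Hypothetical registry mapping lifecycle events to callbacks."""

    EVENTS = ("on_start", "on_success", "on_failure")

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, event, callback):
        if event not in self.EVENTS:
            raise ValueError(f"unknown event: {event}")
        self._hooks[event].append(callback)

    def fire(self, event, *args):
        for callback in self._hooks[event]:
            callback(*args)


def run_with_hooks(registry, func):
    """Run func and fire the matching lifecycle hooks around it."""
    registry.fire("on_start")
    try:
        result = func()
    except Exception as exc:
        registry.fire("on_failure", exc)
        raise  # the failure is reported to hooks, then propagated
    registry.fire("on_success", result)
    return result
```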
Judge Arena
An A/B testing framework for evaluation scorers. Run multiple evaluators on the same data and compare their outputs against human labels to find the most reliable judge.
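The comparison itself can be illustrated with a simple agreement rate. This sketch assumes each judge returns a boolean verdict per sample; the judges, samples, and function names are made up for the example and are not part of FlowyML.

```python
def agreement(judge, samples, human_labels):
    """Fraction of samples on which the judge matches the human label."""
    hits = sum(judge(sample) == label for sample, label in zip(samples, human_labels))
    return hits / len(human_labels)


def rank_judges(judges, samples, human_labels):
    """Return (name, agreement) pairs, most reliable judge first."""
    scores = {name: agreement(fn, samples, human_labels) for name, fn in judges.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: model answers to "What is the capital of France?"
samples = [
    "Paris is the capital of France.",
    "I don't know.",
    "The capital is Paris.",
    "France",
]
human_labels = [True, False, True, False]

judges = {
    "long_answer": lambda s: len(s) > 10,
    "mentions_paris": lambda s: "Paris" in s,
}

ranking = rank_judges(judges, samples, human_labels)
print(ranking)  # [('mentions_paris', 1.0), ('long_answer', 0.75)]
```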
Lineage
The tracking of parent-child relationships between artifacts across pipeline runs. See Artifact Catalog.
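Conceptually, lineage is a graph that can be walked from any artifact back to its sources. The sketch below uses a hypothetical `PARENTS` mapping with made-up artifact names; it illustrates the traversal, not how FlowyML stores lineage.

```python
# child artifact -> direct parents (hypothetical artifact names)
PARENTS = {
    "raw_data": [],
    "features": ["raw_data"],
    "model": ["features"],
    "metrics": ["model", "features"],
}


def ancestors(artifact, parents):
    """Return every artifact that the given artifact was derived from."""
    seen = set()
    stack = list(parents.get(artifact, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen
```

For example, `ancestors("metrics", PARENTS)` yields `{"model", "features", "raw_data"}`.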
LLM-as-a-Judge
Using a Large Language Model to evaluate outputs of another model. FlowyML supports this pattern natively with built-in GenAI scorers and custom prompt templates.
Map Task
A pattern for executing a step in parallel across a collection of inputs. Similar to map() in functional programming but distributed across workers. See Advanced Features.
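The pattern can be approximated on one machine with the standard library. This sketch uses threads in a single process, whereas a real map task would be distributed across workers; `map_step` is a hypothetical helper, not the `@map_task` decorator.

```python
from concurrent.futures import ThreadPoolExecutor


def map_step(func, items, max_workers=4):
    """Apply func to every item in parallel, keeping results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


def square(x):
    return x * x


results = map_step(square, [1, 2, 3, 4])
print(results)  # [1, 4, 9, 16]
```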
Materializer
A serializer/deserializer that converts Python objects to/from storage format. Supports custom types. See Materializers.
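To make the save/load round trip concrete, here is a minimal sketch for a custom type stored as JSON. The `Metrics` dataclass and `MetricsMaterializer` class, including the method names `save` and `load`, are assumptions for illustration; FlowyML's materializer interface may differ.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Metrics:
    accuracy: float
    loss: float


class MetricsMaterializer:
    """Hypothetical materializer that stores Metrics objects as JSON files."""

    def save(self, obj, path):
        path.write_text(json.dumps(asdict(obj)))

    def load(self, path):
        return Metrics(**json.loads(path.read_text()))


# Round trip through a temporary directory standing in for an artifact store.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "metrics.json"
    materializer = MetricsMaterializer()
    materializer.save(Metrics(accuracy=0.93, loss=0.21), target)
    restored = materializer.load(target)
```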
Metadata Store
The database backend that stores pipeline run history, step metadata, and artifact references. See Plugins Overview.
Model Registry
A versioned catalog of trained models with deployment status tracking. See Model Registry.
Orchestrator
The execution backend that runs pipeline steps (local, Docker, Vertex AI, SageMaker). See Plugins Overview.
Pipeline
The top-level orchestration unit in FlowyML. A pipeline is composed of steps that process and produce artifacts. Pipelines auto-construct their DAG, manage execution ordering, and provide full observability. See Pipelines.
Plugin
An extension module that adds functionality to FlowyML: storage backends, experiment trackers, cloud integrations, notification channels, and more. Plugins are the primary extensibility mechanism. See Plugin System.
Plugin Registry
FlowyML's discovery system that finds and loads plugins via Python entry points. See Plugins Overview.
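Entry-point discovery in general works through the standard library's `importlib.metadata`. The sketch below shows that mechanism only; `discover_plugins` is a hypothetical helper, and the group name FlowyML actually uses is not stated here, so the caller must supply it.

```python
from importlib.metadata import entry_points


def discover_plugins(group):
    """Return {entry point name: loaded object} for one entry-point group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        selected = eps.select(group=group)  # selectable API on newer Pythons
    else:
        selected = eps.get(group, [])       # older dict-style return value
    return {ep.name: ep.load() for ep in selected}
```

A package advertises a plugin by declaring an entry point under the agreed group in its packaging metadata; installing the package is then enough for the registry to find it.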
Quality Gate
An automated pass/fail check in a CI/CD pipeline based on evaluation metrics. If model quality drops below a threshold, the deployment is blocked.
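A gate of this kind reduces to comparing metrics against minimum thresholds. The following is a minimal sketch with a hypothetical `quality_gate` function and made-up metric names, assuming higher values are better.

```python
def quality_gate(metrics, thresholds):
    """Return (passed, failures) for metrics checked against minimum thresholds."""
    failures = {}
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None or value < minimum:
            # A missing metric counts as a failure, not a silent pass.
            failures[name] = (value, minimum)
    return (not failures), failures


passed, failures = quality_gate(
    {"accuracy": 0.91, "f1": 0.78},
    {"accuracy": 0.90, "f1": 0.80},
)
print(passed, failures)  # False {'f1': (0.78, 0.8)}
```

In a CI job, the script would typically exit with a non-zero status when `passed` is false, which is what blocks the deployment.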
Recipe
A reusable code template available in FlowyML Notebook. There are 43 built-in recipes across 9 categories (Core, Assets, Parallel, Observability, Evals, Data, ML, Viz, Ecosystem).
SmartPrep Advisor
A feature in FlowyML Notebook that auto-detects data quality issues (missing values, skew, outliers, high cardinality) and generates ready-to-run fix code.
Stack
A named collection of plugins (artifact store + metadata store + orchestrator + optional extras) that defines a deployment target. Switch from local to cloud by setting FLOWYML_STACK=production. See Stack Architecture.
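The selection mechanism can be sketched as a lookup keyed by the `FLOWYML_STACK` variable. The `STACKS` table, its component values, the `"local"` default, and `resolve_stack` are all assumptions for illustration; real stacks are defined through FlowyML's own configuration.

```python
import os

# Hypothetical stack definitions; the component names are illustrative only.
STACKS = {
    "local": {
        "artifact_store": "local filesystem",
        "metadata_store": "sqlite",
        "orchestrator": "local",
    },
    "production": {
        "artifact_store": "gcs",
        "metadata_store": "cloud database",
        "orchestrator": "vertex_ai",
    },
}


def resolve_stack(environ=None):
    """Pick the stack named by FLOWYML_STACK, assuming 'local' as the default."""
    environ = os.environ if environ is None else environ
    name = environ.get("FLOWYML_STACK", "local")
    if name not in STACKS:
        raise ValueError(f"unknown stack: {name}")
    return name, STACKS[name]
```

The pipeline code stays the same in both cases; only the environment decides which plugins back it.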
Step
The atomic unit of work in a FlowyML pipeline. A Python function decorated with @step that declares its inputs and outputs. Steps are automatically cached, retriable, and tracked. See Steps.
Step Group
A set of steps that share resources or execution configuration. See Step Grouping.
Sub-Pipeline
A pipeline nested within another pipeline. Enables modular composition of complex workflows from reusable pipeline components.
SubmissionResult
The return type from non-blocking pipeline execution, containing run ID and methods to check status. See Async Execution.
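The general shape of such a handle can be sketched with a standard-library future. `SubmissionSketch`, `submit`, and the method names `done` and `result` are hypothetical; they are not the attributes of FlowyML's `SubmissionResult`.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class SubmissionSketch:
    """Hypothetical handle returned immediately by a non-blocking run."""

    def __init__(self, run_id, future):
        self.run_id = run_id
        self._future = future

    def done(self):
        """Return True once the run has finished (successfully or not)."""
        return self._future.done()

    def result(self, timeout=None):
        """Block until the run finishes and return its result."""
        return self._future.result(timeout=timeout)


_pool = ThreadPoolExecutor(max_workers=2)


def submit(pipeline_fn, *args):
    """Start pipeline_fn in the background and return a handle right away."""
    return SubmissionSketch(uuid.uuid4().hex, _pool.submit(pipeline_fn, *args))
```

A caller would keep `handle = submit(...)`, continue with other work, and poll `handle.done()` or call `handle.result()` when the output is needed.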
Type-Based Routing
The mechanism by which FlowyML automatically routes artifacts to appropriate storage backends based on their type. For example, Model artifacts may go to a model registry while Dataset artifacts go to a data lake.
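One way to picture this is a lookup table from artifact class to backend, with a fallback for unlisted types. The classes, backend names, and `route` function below are hypothetical stand-ins written in plain Python, not FlowyML's routing code.

```python
class Artifact:
    """Stand-in base class for this sketch."""


class Model(Artifact):
    pass


class Dataset(Artifact):
    pass


class Metrics(Artifact):
    pass


# Hypothetical routing table: artifact type -> backend name.
ROUTES = {
    Model: "model_registry",
    Dataset: "data_lake",
    Artifact: "default_store",  # fallback for types without their own route
}


def route(artifact):
    """Return the backend for the most specific matching artifact type."""
    for cls in type(artifact).__mro__:
        if cls in ROUTES:
            return ROUTES[cls]
    raise LookupError(f"no route for {type(artifact).__name__}")
```

Walking the method resolution order means a subclass of `Model` is routed like a `Model` unless it has its own entry.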