Materializers
What you'll learn
How to teach FlowyML to serialize and deserialize your custom objects. If you can't save it, you can't cache it, version it, or inspect it in the UI.
Materializers control how FlowyML persists and loads artifacts. Built-in materializers handle common types automatically; custom materializers let you support any Python object.
Why Custom Serialization Matters
| Without Materializers | With Materializers |
|---|---|
| Relying on pickle for everything (brittle, insecure) | Optimized format per type |
| Saving a model as bytes loses its metadata | Save with hyperparameters and metadata |
| The UI can't show a preview of custom objects | Rich visualization in the dashboard |
| Non-portable: only Python can read pickle | Standard formats (Parquet, ONNX, CSV) |
Built-in Materializers
FlowyML automatically selects the appropriate materializer based on type hints:
| Materializer | Types | Format | When Used |
|---|---|---|---|
| PandasMaterializer | pd.DataFrame, pd.Series | Parquet or CSV | DataFrames and Series |
| NumpyMaterializer | np.ndarray | .npy | NumPy arrays |
| JsonMaterializer | dict, list, str, int, float | JSON | Simple Python types |
| PickleMaterializer | Anything else | Pickle | Fallback for arbitrary objects |
Usage is automatic; just add type hints to your steps.
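As a rough illustration of type-hint-based selection, the sketch below reads a step's return annotation with `typing.get_type_hints` and maps it to a materializer name from the table above. The `pick_materializer` helper and the `JSON_TYPES` tuple are hypothetical and use only the standard library; this is not FlowyML's actual lookup code, and the pandas and NumPy rows are left out to keep the example dependency-free.

```python
from typing import get_origin, get_type_hints

# Hypothetical mirror of the JSON row in the table above.
# Anything not listed falls through to the pickle fallback.
JSON_TYPES = (dict, list, str, int, float)


def pick_materializer(step_fn) -> str:
    """Return the name of the materializer a step's return hint would select."""
    return_type = get_type_hints(step_fn).get("return")
    # Reduce parameterized hints such as dict[str, float] to their base type.
    base_type = get_origin(return_type) or return_type
    if base_type in JSON_TYPES:
        return "JsonMaterializer"
    return "PickleMaterializer"


def load_config() -> dict[str, float]:
    return {"lr": 0.01, "dropout": 0.1}


def make_lookup() -> set:
    # set is not in the table, so the fallback applies.
    return {"a", "b"}


print(pick_materializer(load_config))  # JsonMaterializer
print(pick_materializer(make_lookup))  # PickleMaterializer
```

A step with no return annotation also ends up on the fallback in this sketch, which is one reason to annotate every step.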
Creating Custom Materializers
Subclass BaseMaterializer to support your own types:
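A minimal sketch of a custom materializer, assuming the `handle_return`, `handle_input`, and `ASSOCIATED_TYPES` names listed in the BaseMaterializer API table. The base class here is a local stand-in so the example runs without FlowyML installed. The `uri` constructor argument, the `Vocabulary` type, and the `vocab.json` file name are illustrative assumptions, not confirmed parts of the library.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


class BaseMaterializer:
    """Local stand-in for FlowyML's base class (assumed interface)."""

    ASSOCIATED_TYPES: tuple = ()

    def __init__(self, uri: str):
        # Assumption: each materializer is given a directory to write into.
        self.uri = Path(uri)

    def handle_return(self, obj):
        raise NotImplementedError

    def handle_input(self, data_type):
        raise NotImplementedError


@dataclass
class Vocabulary:
    """Hypothetical custom artifact type."""

    tokens: list
    lowercase: bool = True


class VocabularyMaterializer(BaseMaterializer):
    ASSOCIATED_TYPES = (Vocabulary,)

    def handle_return(self, obj: Vocabulary) -> None:
        # Serialize to JSON so the artifact stays readable outside Python.
        (self.uri / "vocab.json").write_text(json.dumps(asdict(obj)))

    def handle_input(self, data_type: type) -> Vocabulary:
        # Rebuild the object from the stored JSON fields.
        payload = json.loads((self.uri / "vocab.json").read_text())
        return data_type(**payload)


# Round trip: save an object, then load it back from the same directory.
with tempfile.TemporaryDirectory() as tmp:
    materializer = VocabularyMaterializer(tmp)
    materializer.handle_return(Vocabulary(tokens=["a", "b"]))
    restored = materializer.handle_input(Vocabulary)

print(restored)
```

Storing plain JSON fields instead of a pickled object keeps the artifact inspectable with any tool that reads JSON.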
Example: PyTorch Model Materializer
Example: ONNX Model Materializer
Registering Materializers
Register once at startup. FlowyML will then auto-select the materializer whenever the matching type appears.
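To make the auto-selection concrete, here is one way a type-to-materializer registry could work. The `MaterializerRegistry` class and its `register` and `find` methods are hypothetical names for illustration, not FlowyML's registration API; check the library for the actual call. The lookup walks the type's method resolution order so that subclasses of a registered type resolve to the same materializer, with a fallback when nothing matches.

```python
class PickleMaterializer:
    """Stand-in for the fallback materializer."""


class Model:
    """Hypothetical custom type."""


class FineTunedModel(Model):
    """Subclass used to show that lookups follow inheritance."""


class ModelMaterializer:
    """Stand-in custom materializer for Model and its subclasses."""

    ASSOCIATED_TYPES = (Model,)


class MaterializerRegistry:
    def __init__(self, fallback):
        self._by_type = {}
        self._fallback = fallback

    def register(self, materializer_cls):
        # Index the materializer under every type it declares.
        for handled in materializer_cls.ASSOCIATED_TYPES:
            self._by_type[handled] = materializer_cls
        return materializer_cls

    def find(self, data_type):
        # Walk the MRO so the most specific registered base class wins.
        for base in data_type.__mro__:
            if base in self._by_type:
                return self._by_type[base]
        return self._fallback


registry = MaterializerRegistry(fallback=PickleMaterializer)
registry.register(ModelMaterializer)  # done once at startup

print(registry.find(FineTunedModel).__name__)  # ModelMaterializer
print(registry.find(bytes).__name__)           # PickleMaterializer
```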
Using Custom Types in Steps
Once registered, FlowyML automatically uses your materializer when a step returns the associated type.
BaseMaterializer API
| Method | Description |
|---|---|
| handle_input(data_type) | Deserialize an artifact from storage into a Python object |
| handle_return(obj) | Serialize a Python object into an artifact in storage |
| Class Variable | Type | Description |
|---|---|---|
| ASSOCIATED_TYPES | tuple[type, ...] | Types this materializer handles |
Best Practices
Use type hints
FlowyML selects materializers based on type hints. Always annotate your step return types for automatic serialization.
Prefer standard formats
Save models as ONNX, datasets as Parquet, configs as JSON. This makes artifacts usable outside Python.
Avoid pickle in production
The PickleMaterializer is only a fallback: pickle files can execute arbitrary code when loaded and can only be read from Python. Register custom materializers for the types you use in production.
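The security concern is concrete: unpickling can invoke any callable the payload names. The sketch below uses a harmless `eval("6 * 7")` to show that `pickle.loads` runs code chosen by whoever produced the bytes, whereas `json.loads` only ever builds plain data. The `Payload` class is a made-up example and has nothing to do with FlowyML itself.

```python
import json
import pickle


class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('6 * 7')".
        # A real attack would name something destructive instead.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # eval runs while the bytes are being loaded
print(result)  # 42, not a Payload instance

# JSON round-trips plain data only; loading never calls functions named in the file.
config = json.loads(json.dumps({"lr": 0.01, "layers": [64, 32]}))
print(config)
```

Only unpickle artifacts from storage you fully trust.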