Scikit-Learn Integration
Classic ML pipelines made robust and reproducible.
What you'll learn
How to version and deploy sklearn models with automatic metadata extraction. Turn notebook scripts into production pipelines.
Why Scikit-Learn + flowyml?
- Auto-Extracted Metadata: Hyperparameters, feature importances, and coefficients are all captured automatically.
- Pipeline Versioning: Version the entire preprocessing + model chain.
- Model Registry: Promote the best Random Forest to production.
- Easy Serving: Deploy sklearn models as APIs.
Model.from_sklearn() Convenience Method
The easiest way to create Model assets with full metadata extraction:
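A minimal sketch of this workflow follows. It assumes that `Model` is importable from the top-level `flowyml` package and that `from_sklearn()` accepts a fitted estimator as its first argument. Neither detail is confirmed on this page, so check both against your installed version. The helper name `to_model_asset` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Small synthetic dataset so the example runs anywhere.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Fit an ordinary sklearn estimator; nothing flowyml-specific yet.
clf = RandomForestClassifier(n_estimators=20, max_depth=5, random_state=0)
clf.fit(X, y)


def to_model_asset(estimator):
    """Wrap a fitted estimator in a flowyml Model asset (hypothetical helper).

    ASSUMPTION: the import path and the single-argument call below are
    guesses; from_sklearn() may require or accept further arguments.
    """
    from flowyml import Model  # assumed import path

    return Model.from_sklearn(estimator)
```

With flowyml installed, `to_model_asset(clf)` would return the asset. The import sits inside the helper so the training code above still runs without flowyml.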
Pipeline Pattern
Return a sklearn.pipeline.Pipeline object for automatic serialization:
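The pattern can be sketched with plain scikit-learn as shown here. The function name `train_pipeline` is hypothetical, and any flowyml step decorator is omitted because this page does not show one. The pickle round trip is only a stand-in to show that the whole chain travels as one object; it is not necessarily the serializer flowyml uses.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def train_pipeline(X, y):
    """Fit preprocessing and model together and return them as one object."""
    pipeline = Pipeline(
        steps=[
            ("scaler", StandardScaler()),
            ("classifier", LogisticRegression(max_iter=1000)),
        ]
    )
    pipeline.fit(X, y)
    return pipeline


X, y = make_classification(n_samples=200, n_features=6, random_state=0)
pipeline = train_pipeline(X, y)

# Stand-in for serialization: a pickle round trip keeps both steps together.
restored = pickle.loads(pickle.dumps(pipeline))
predictions = restored.predict(X[:5])
```

Returning the fitted `Pipeline` rather than the bare classifier means the scaler's learned statistics are versioned alongside the model, so inference applies the same preprocessing as training.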
Auto-Extracted Properties
The following properties are automatically extracted from sklearn models:
| Property | Description |
|---|---|
| `framework` | Always 'sklearn' |
| `model_class` | Class name (RandomForestClassifier, etc.) |
| `architecture` | Same as model_class |
| `hyperparameters` | All model hyperparameters |
| `is_fitted` | Whether model has been fitted |
| `has_feature_importances` | True for tree-based models |
| `num_features` | Number of features (if fitted) |
| `n_features_in` | Number of input features |
| `n_estimators` | For ensemble models |
| `num_estimators_fitted` | Actual estimators fitted |
| `max_depth` | For tree-based models |
| `num_classes` | For classifiers |
| `classes` | Class labels (for classifiers) |
| `coef_shape` | Coefficient shape (for linear models) |
| `intercept` | Intercept (for linear models) |
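To make the table concrete, the sketch below reads several of these properties straight from scikit-learn's public attributes. The function `sketch_metadata` is hypothetical and the attribute mapping is an assumption about where such values could come from. It is not flowyml's actual extraction code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted


def sketch_metadata(model):
    """Collect table-style properties from a sklearn estimator (illustrative)."""
    try:
        check_is_fitted(model)
        is_fitted = True
    except NotFittedError:
        is_fitted = False

    meta = {
        "framework": "sklearn",
        "model_class": type(model).__name__,
        "hyperparameters": model.get_params(),
        "is_fitted": is_fitted,
        "has_feature_importances": hasattr(model, "feature_importances_"),
    }
    # Attributes with a trailing underscore exist only after fit().
    if hasattr(model, "n_features_in_"):
        meta["n_features_in"] = model.n_features_in_
    if hasattr(model, "classes_"):
        meta["classes"] = list(model.classes_)
        meta["num_classes"] = len(model.classes_)
    if hasattr(model, "estimators_"):
        meta["num_estimators_fitted"] = len(model.estimators_)
    if hasattr(model, "coef_"):
        meta["coef_shape"] = model.coef_.shape
    return meta


X, y = make_classification(n_samples=200, n_features=8, random_state=0)
unfitted = sketch_metadata(RandomForestClassifier(n_estimators=20, max_depth=5))
fitted = sketch_metadata(
    RandomForestClassifier(n_estimators=20, max_depth=5, random_state=0).fit(X, y)
)
```

Comparing `unfitted` and `fitted` shows why several rows in the table are conditional: hyperparameters are available before training, while feature counts, class labels, and fitted estimator counts appear only after `fit()`.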
Supported Model Types
Full auto-extraction works for all sklearn estimators:
- Classifiers: RandomForest, GradientBoosting, SVM, LogisticRegression, etc.
- Regressors: RandomForest, Ridge, Lasso, ElasticNet, etc.
- Transformers: StandardScaler, PCA, etc.
- Ensembles: VotingClassifier, StackingClassifier, etc.
- Pipelines: sklearn.pipeline.Pipeline
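This page does not say how flowyml covers so many estimator types. One likely reason is that every category above inherits from `sklearn.base.BaseEstimator` and therefore exposes the same `get_params()` interface. The sketch below illustrates that shared interface with one example per category; it uses only scikit-learn and makes no claim about flowyml internals.

```python
from sklearn.base import BaseEstimator
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

estimators = [
    RandomForestClassifier(),  # classifier
    Ridge(),  # regressor
    PCA(n_components=2),  # transformer
    VotingClassifier(
        estimators=[("lr", LogisticRegression()), ("rf", RandomForestClassifier())]
    ),  # ensemble
    Pipeline([("scaler", StandardScaler()), ("model", Ridge())]),  # pipeline
]

# Every entry exposes the same get_params() interface, fitted or not.
summary = {
    type(est).__name__: sorted(est.get_params(deep=False)) for est in estimators
}
```

Each key in `summary` is a class name and each value is the sorted list of that estimator's constructor parameter names, which is the kind of information the `model_class` and `hyperparameters` properties describe.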