Google Cloud Platform (GCP)
Scale your pipelines from local prototypes to production workloads on Google Cloud.
What you'll learn
How to run flowyml pipelines on Vertex AI and store data in GCS. Develop locally, then flip a switch to run on a 100-GPU cluster.
Why Use GCP with flowyml?
Local limitations:
- Memory: "OOM Error" on large datasets
- Compute: training takes days on a CPU
- Storage: a hard drive full of model checkpoints
GCP advantages:
- Elastic scale: spin up as many machines as you need
- Managed services: Vertex AI handles the infrastructure
- Unified data: store everything in GCS, accessible from anywhere
GCS Artifact Store
Store your pipeline artifacts (datasets, models) in Google Cloud Storage. This makes them accessible to your team and production systems.
Configuration
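A minimal configuration sketch, assuming the flowyml API exposes a `GCSArtifactStore` component and a `Stack` with a `register()` method; these names are illustrative rather than confirmed, so adapt them to your installation.

```python
# Hypothetical sketch: point flowyml artifacts at a GCS bucket.
# GCSArtifactStore, Stack, and register() are assumed names, not confirmed API.
from flowyml.stacks import Stack
from flowyml.integrations.gcp import GCSArtifactStore

artifact_store = GCSArtifactStore(
    name="gcs_artifacts",
    path="gs://my-team-flowyml-artifacts",  # a bucket you control
)

stack = Stack(name="gcp_stack", artifact_store=artifact_store)
stack.register()  # makes the stack available to later runs
```

Authentication typically goes through Application Default Credentials (`gcloud auth application-default login`) or a service account with read/write access to the bucket.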
Vertex AI Execution
Run your pipeline steps as Vertex AI Custom Jobs. flowyml handles the Dockerization and submission automatically.
Recommended: Stack-Based Execution (Automatic Orchestrator)
The best practice is to use a GCP stack, which automatically provides the orchestrator, executor, artifact store, and the other required components. This is the core idea of stacks: they encapsulate all infrastructure configuration.
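A sketch of what a full GCP stack run could look like; the `@step`/`@pipeline` decorators, `VertexOrchestrator`, and the `stack=` argument are assumptions about the flowyml API rather than documented names.

```python
# Hypothetical sketch: one GCP stack supplies the orchestrator and artifact store.
# All flowyml names below are assumed, not confirmed API.
from flowyml import pipeline, step
from flowyml.stacks import Stack
from flowyml.integrations.gcp import GCSArtifactStore, VertexOrchestrator

@step
def train() -> str:
    return "model-v1"

@pipeline
def training_pipeline():
    train()

gcp_stack = Stack(
    name="gcp_production",
    orchestrator=VertexOrchestrator(project="my-gcp-project", region="us-central1"),
    artifact_store=GCSArtifactStore(path="gs://my-team-flowyml-artifacts"),
)

# Same pipeline code as local development; only the stack changes.
training_pipeline().run(stack=gcp_stack)
```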
Or use the active stack from configuration:
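A sketch assuming flowyml tracks an active stack set outside the code, for example with a hypothetical `flowyml stack set gcp_production` CLI command; the bare `.run()` call is likewise an assumption.

```python
# Hypothetical: with an active stack already configured, the run call
# stays stack-agnostic and picks up GCP settings from configuration.
training_pipeline().run()
```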
Alternative: Explicit Orchestrator Override
If you need to override the stack's orchestrator for a specific run:
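A sketch of a per-run override, assuming the run call accepts an `orchestrator=` keyword; the constructor parameters on `VertexOrchestrator` are also assumptions.

```python
# Hypothetical: keep the active stack, but force this run onto Vertex AI.
from flowyml.integrations.gcp import VertexOrchestrator

training_pipeline().run(
    orchestrator=VertexOrchestrator(
        project="my-gcp-project",
        region="us-central1",
        machine_type="n1-standard-8",  # assumed parameter name
    )
)
```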
Stack-Based is Better
Using a stack ensures all components (orchestrator, artifact store, metadata store, container registry) work together seamlessly.
Cost Control
Vertex AI charges by the second. flowyml ensures resources are only provisioned while your steps are running.
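In practice this usually means requesting resources per step, so an expensive GPU machine exists only while the step that needs it is running; the `resources=` argument and `ResourceSettings` class below are assumptions about the flowyml API.

```python
# Hypothetical per-step resource request: the A100 machine is provisioned for
# this step only and released as soon as it finishes.
from flowyml import step
from flowyml.config import ResourceSettings

@step(resources=ResourceSettings(machine_type="a2-highgpu-1g", gpu_count=1))
def train_on_gpu() -> str:
    return "model-v2"
```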