Implementing MLOps on GCP
Introduction
In this hack, you’ll implement the full lifecycle of an ML project. We’ll provide you with a sample code base and you’ll work on automating continuous integration (CI), continuous delivery (CD), and continuous training (CT) for a machine learning (ML) system.
Note: This gHack is inspired by the methodology from this article.
There’s no coding involved: we’ve already prepared the code to train a simple scikit-learn model. It could have been any other framework, since the model code has no dependencies on any Google services or libraries.
We’re using the New York Taxi dataset to build a RandomForestClassifier that predicts whether the tip for a trip will be more than 20% of the fare.
The first step is all about exploration: running that code in an interactive environment for development and experimentation.
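To make the prediction target concrete, here is a minimal sketch of how such a label could be derived and fed to a RandomForestClassifier. It is not the code base provided with the hack: the function names, the two feature columns, and the synthetic trips are all assumptions made for illustration, and the training step assumes scikit-learn is installed.

```python
import random


def is_big_tip(fare_amount: float, tip_amount: float, threshold: float = 0.2) -> int:
    """Return 1 when the tip is more than `threshold` times the fare, else 0."""
    if fare_amount <= 0:
        return 0  # assumption: non-positive fares are never labeled as a big tip
    return int(tip_amount > threshold * fare_amount)


def make_synthetic_trips(n: int, seed: int = 0):
    """Generate hypothetical (features, labels) rows; this is NOT the real taxi data."""
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(n):
        distance = rng.uniform(0.5, 20.0)      # hypothetical trip distance
        fare = 2.5 + 2.0 * distance            # hypothetical fare formula
        tip = fare * rng.uniform(0.0, 0.4)     # tip between 0% and 40% of the fare
        features.append([distance, fare])
        labels.append(is_big_tip(fare, tip))
    return features, labels


def train_sketch(n: int = 500):
    """Fit a RandomForestClassifier on the synthetic rows (requires scikit-learn)."""
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_synthetic_trips(n)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X_train, y_train)
    # Accuracy on the held-out synthetic rows; not meaningful for real data.
    return model, model.score(X_test, y_test)
```

Because the synthetic tips are drawn independently of the features, the resulting accuracy says nothing about the real dataset. The sketch only shows the shape of the problem: a binary label derived from tip and fare, and a classifier trained on trip features.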
Then we’ll store that code in a version control system so the whole team has access to it and we can keep track of all changes.
After that, in Challenge 3, we’ll automate continuous integration and package builds through build pipelines.
Challenge 4 is all about data-to-model pipelines: orchestrating data extraction, data validation, data preparation, model training, model evaluation, and model validation.
Once the model has been trained, in Challenge 5 we’ll deploy it to an API endpoint for real-time inferencing, or opt for batch inferencing instead.
Challenge 6 is all about monitoring the endpoint or batch predictions and detecting any drift or skew between the training data and the inferencing data.
And finally, in Challenge 7, we’ll bring all of this together by tapping into model monitoring and triggering re-training when the model starts to misbehave.
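As a rough illustration of what detecting skew between training and inferencing data can mean, the sketch below compares the distribution of one numerical feature in both datasets using the Jensen-Shannon divergence. This is a swapped-in, simplified example, not a reproduction of what the managed monitoring service computes; the function names, bin edges, and the 0.3 threshold are assumptions.

```python
import math


def histogram(values, bin_edges):
    """Normalized histogram over half-open bins [edge_i, edge_i+1).

    Values outside the bin range are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def js_divergence(p, q):
    """Jensen-Shannon divergence with log base 2, so the result lies in [0, 1]."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]  # mixture distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def drift_detected(train_values, serving_values, bin_edges, threshold=0.3):
    """Flag drift when the divergence exceeds a (hypothetical) threshold."""
    p = histogram(train_values, bin_edges)
    q = histogram(serving_values, bin_edges)
    return js_divergence(p, q) > threshold
```

For example, `drift_detected(training_fares, serving_fares, [0, 10, 20, 50, 100])` would compare two hypothetical lists of fare amounts. A sensible threshold depends on the feature and on how sensitive the alerting should be.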
Warning: As of June 2024, Cloud Source Repositories (CSR) is end of sale. However, any organization that has created at least one CSR repository in the past will still have access to its existing repositories and will be able to create new ones. If you’re running this in a Qwiklabs environment, you’re good to go. If you’re running it in your own environment, please verify that your organization has access to Cloud Source Repositories.
Learning Objectives
This hack will help you explore the following tasks:
- Using Cloud Source Repositories for version control
- Using Cloud Build for automating continuous integration and delivery
- Vertex AI for:
  - Exploration through an interactive environment
  - Training on diverse hardware
  - Model registration
  - Managed pipelines
  - Model serving
  - Model monitoring
Challenges
- Challenge 1: Let’s start exploring!
- Challenge 2: If it isn’t in version control, it doesn’t exist
- Challenge 3: You break the build, you buy cake
- Challenge 4: Automagic training with pipelines
- Challenge 5: Make it work and make it scale
- Challenge 6: Monitor your models
- Challenge 7: Close the loop
Prerequisites
- Knowledge of Python
- Knowledge of Git
- Basic knowledge of GCP
- Access to a GCP environment
Contributors
- Murat Eken