
How to deploy PyTorch models on Vertex AI

Last Updated : 29 Jul, 2024

PyTorch is a freely available machine learning library that can be imported and used in code to perform machine learning operations as required. The front-end API is written in Python, while the tensor operations are implemented in C++. Developed by Facebook's AI Research lab (FAIR), it is easy to use, flexible, and, most importantly, supports dynamic computation graphs (graphs built from the input at run time).

  • Overview of Vertex AI
  • Installation of libraries
  • Implement Pytorch Model
  • Flask Application and Pytorch Model
  • Dockerizing the Flask Application
  • Setting up Google Cloud Environment
  • Push the Dockerized Flask Image to GCP Container Registry
  • Deploy the GCP Container to GCP Vertex AI
  • Test and Monitor the Deployed Container

Overview of Vertex AI

Vertex AI is a service provided by Google Cloud Platform (GCP) that allows developers to build and deploy machine learning models and, most importantly, to scale them very conveniently. It comprises various tools and services through which a developer can efficiently manage the entire machine learning lifecycle: building the model, deploying it, and scaling it can all be done inside Vertex AI.

Terminologies related to Vertex AI

Model Artifact :- The files and data a machine learning model produces during training are known as model artifacts. They are required because without them the trained model cannot be deployed to production or used for inference.

Vertex AI Model Registry :- A central repository for storing and managing various types of machine learning models, which developers can access throughout the development phase.

Google Cloud Storage (GCS) :- A Google service that offers scalable storage on demand, charged according to usage. Being scalable and efficient, it can also handle huge volumes of data.

Containerization :- Packaging an application together with its dependencies into a container so that it behaves the same in every possible computing environment, regardless of where it is deployed.

Model Endpoint :- A dedicated URL or network location through which a deployed machine learning model can be accessed to make predictions. It plays an important role because a client sends data to the model and receives results from it by calling the endpoint.

Installation of the required libraries

Let's add the required libraries to the requirements.txt file.

Flask==3.0.3
Flask-Cors==4.0.1
numpy==2.0.0
torch==2.3.1
gunicorn==22.0.0

Using the pip install command, one can install the libraries listed in the requirements.txt file. The command is as follows:

pip install -r requirements.txt

Since we are dockerizing the application, we will mention the above installation command in the Dockerfile.

Implementation of the PyTorch model

Let's implement a PyTorch model that applies a linear transformation to the incoming data. We can make use of nn.Linear, one of the fundamental components of PyTorch. The code is as follows:

Python
import torch
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

The nn.Linear module applies a linear transformation to input data using weights and biases. The module takes two parameters, in_features and out_features, which represent the number of input and output features. Upon object creation, it randomly initializes a weight matrix and a bias vector.
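Conceptually, nn.Linear(10, 1) computes y = xWᵀ + b. Here is a minimal plain-Python sketch of that computation (no PyTorch needed; the weights and bias below are made up for illustration, whereas nn.Linear initializes them randomly):

```python
# Plain-Python sketch of what nn.Linear(10, 1) computes: y = x @ W.T + b,
# with W of shape (out_features=1, in_features=10) and b of shape (1,).
def linear_forward(x, W, b):
    # One output feature: dot product of x with the single weight row, plus bias.
    return [sum(xi * wi for xi, wi in zip(x, W[0])) + b[0]]

# Made-up weights and bias for illustration.
W = [[0.5] * 10]   # weight matrix, one row of 10 weights
b = [0.25]         # bias vector, one entry
x = [1.0] * 10     # one input row with 10 features

print(linear_forward(x, W, b))  # -> [5.25]
```

This mirrors what the PyTorch module does internally, just without batching or autograd.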

Let's try a sample prediction before we save the model.

Python
model = SimpleModel()
model.linear

Output

Linear(in_features=10, out_features=1, bias=True)

Here we have initialized a linear PyTorch model. Now let's create random input data of matching shape and make a prediction.

Python
x = torch.randn(1, 10)
t1 = x.to(torch.float)

with torch.no_grad():
    prediction = model(t1).tolist()

prediction

Output

[[-0.26785045862197876]]

So our model works fine. Next, we can save the model so that our Flask application can load it and make predictions.

Saving the PyTorch model

The model can be saved using the following code:

Python
# save the weights of the model we just tested
torch.save(model.state_dict(), 'model.pth')


Flask Application and PyTorch Model

As a next step, we need to create a Flask application and load the PyTorch model. Finally, one can make predictions with the model by invoking a REST API.

Create a Flask Application

Let's create a directory called 'app'. Inside the 'app' folder, create a main.py file containing the code for the Flask application. The main.py file is as follows:

Python
from flask import (
    Flask, request, jsonify)
from flask_cors import CORS

app = Flask(__name__)
CORS(app)

@app.route('/health', methods=['GET'])
def health():
    return jsonify(status='healthy'), 200

@app.route('/predict', methods=['POST'])
def predict():
    return None

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)

Here we created the basic structure needed for a Flask application. For now, the predict() method does nothing; to make it functional we need to load our PyTorch model and make predictions when the user invokes the REST API ('/predict'). We also created a health-monitoring API, used to check the health of the deployed model; the /health route is referenced later when creating the endpoint in GCP Vertex AI.

Loading the PyTorch model in Flask

To load the PyTorch model, it is necessary to define the same model class in our Flask application. The code is as follows:

Python
import torch
import torch.nn as nn

# linear module
class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

# initialize the module
model = SimpleModel()

# load the saved weights
model.load_state_dict(torch.load('model.pth'))

Here we defined the linear PyTorch model class and loaded the saved weights. Now we can implement the predict() method.

Python
@app.route('/predict', methods=['POST'])
def predict():
    data = request.json['inputs']
    data = torch.tensor(data)
    with torch.no_grad():
        prediction = model(data).tolist()
    return jsonify(prediction=prediction)

The complete code is as follows:

Python
from flask import (
    Flask, request, jsonify)
from flask_cors import CORS

import torch
import torch.nn as nn

app = Flask(__name__)
CORS(app)

# linear module
class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        return self.linear(x)

# initialize the module
model = SimpleModel()

# load the saved weights
model.load_state_dict(torch.load('model.pth'))

@app.route('/health', methods=['GET'])
def health():
    return jsonify(status='healthy'), 200

@app.route('/predict', methods=['POST'])
def predict():
    data = request.json['inputs']
    data = torch.tensor(data)
    with torch.no_grad():
        prediction = model(data).tolist()
    return jsonify(prediction=prediction)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)


Dockerizing the Flask Application

To dockerize the Flask application, it is necessary to create a Dockerfile with the necessary installation and run commands.

Creation of a Dockerfile

You need to create a Dockerfile in the same folder that contains the 'app' directory. The Dockerfile is as follows:

FROM python:3.9-slim

# Install libraries
COPY ./requirements.txt ./
RUN pip install -r requirements.txt && \
rm ./requirements.txt

# container directories
RUN mkdir /app

# Copy app directory (code and Pytorch model) to the container
COPY ./app /app

# run server with gunicorn
WORKDIR /app
EXPOSE 8080
CMD ["gunicorn", "main:app", "--timeout=0", "--preload", \
"--workers=1", "--threads=4", "--bind=0.0.0.0:8080"]

Now we need to build a Docker image based on the above Dockerfile. Before that, let's check the directory structure. The directory structure is as follows:

app
|-- main.py
|-- model.pth
Dockerfile
requirements.txt

The app directory contains our Flask-based Python code (main.py) and the PyTorch model (model.pth).

Build a Docker Container

To build the Docker image, execute the command below:

docker build -t flask-torch-docker .

The above command will execute the Dockerfile and build a Docker image named 'flask-torch-docker'.

Run a Docker Container

Let's run the 'flask-torch-docker' image using the command below:

docker run -it -p 8080:8080 flask-torch-docker

Testing a Docker Container Locally

The running container can be tested using the curl command shown below:

curl -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"inputs": [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]]}'

Output

{  "prediction": 0.785}
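The same local test can be driven from Python instead of curl. The sketch below (standard library only) builds the request that would be POSTed to the running container; uncomment the last lines to actually send it while the container is up:

```python
import json
import urllib.request

# Same payload as the curl example: one row of 10 input features.
payload = json.dumps(
    {"inputs": [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]]}
).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:8080/predict",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(req.full_url, req.get_method())  # -> http://localhost:8080/predict POST

# With the container running, send the request and decode the JSON reply:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["prediction"])
```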


Push the dockerized flask image (Pytorch model) to GCP

In the above steps we created a PyTorch model served by Flask, dockerized it, and verified that the dockerized application works locally. Now it's time for the crucial step: deploying the PyTorch model on Vertex AI. In this article we push our Docker image to the Google Container Registry and then deploy the container to Vertex AI. As a first step, we need to set up the Google Cloud environment.

Setting up Google Cloud Environment

To set up the Google Cloud environment, create an account or sign in with a Google account and add payment details; after that, you have access to the Google Cloud CLI (for managing resources and services). Create a Google Cloud project and install the gcloud CLI. Now we can focus on pushing our Docker image to the Google Container Registry (GCR).

Steps to push dockerized image to GCR

Let's look at the steps to push the dockerized image to GCR. The steps are as follows:

Step 1: Initialize the Google Cloud SDK (software development kit)

gcloud init

Step 2: Configure Docker to authenticate requests to GCR (Google Container Registry) using the gcloud command-line tool

gcloud auth configure-docker

Step 3: Build the Docker image

docker build -t flask-torch-docker:latest .

Step 4: Tag the Docker image with your GCP project ID and registry location

docker tag flask-torch-docker:latest gcr.io/your-project-id/flask-torch-docker:latest

In the above command, replace your-project-id with your own project ID. You can use the command below to list all your project IDs:

gcloud projects list

The above command lists the project IDs; choose yours, substitute it in, and run the command.

Step 5: Push the Docker image to GCR (Google Container Registry)

docker push gcr.io/your-project-id/flask-torch-docker:latest

Again, replace your-project-id with your own project ID in the above command.

The pushed image can be checked under the Artifact Registry.

Deploying the GCP container to Vertex AI

Now that we have pushed our dockerized PyTorch model to the Google Container Registry, the next step is to deploy the container to Vertex AI. Log in to your Google Cloud account and search for Vertex AI.

This is the home page of Vertex AI; from here, click 'Enable all APIs'.

Import the Model Using Model Registry

To import the model, choose the Model Registry functionality in Vertex AI, which opens the Model Registry page.

Here the user can create a new model in the registry or import a model from the Container Registry or Artifact Registry. In this article we deploy the model by importing the container from the Artifact Registry and providing the necessary model setting details. The steps are as follows:

Step 1: Create a new model and provide an appropriate name and region

Here we create a new model (for an existing model, you can update the version) and provide a name and an appropriate region.

Step 2: Import an existing custom container

Import a custom container from the Artifact Registry.

Here we choose the option to import an existing custom container from the Artifact Registry and browse to the container that holds the dockerized PyTorch model (the Flask application).

Step 3: Provide Model Setting details


Set the prediction route to /predict and the port to 8080 (as configured in the dockerized Flask app). For the health route, use "/health".

Step 4: Click 'Import model' to create the model in the Model Registry

Define the endpoint and Deploy the Model

In the above steps, we created a model in the Model Registry. Now we need to define an endpoint and deploy the model. In the Model Registry we have the option to deploy to an endpoint: select Endpoints from the navigation menu, click Create, and then configure it.

Step 1: Enter the model name and select the region

After clicking on 'Deploy to endpoint'.

Step 2: Mention details in Model Settings

In the model settings, first set the traffic split, then set the number of compute nodes and click Done at the bottom.

Step 3: Deploy the model

After configuring the necessary endpoint details, deploy the model by clicking 'DEPLOY TO ENDPOINT'.

After deploying, the model will be displayed.

After creating the endpoint, click Deploy, select the model name, configure the remaining settings as required, and click Deploy.

Testing the Endpoints and Monitoring the Model

To test the endpoint, you can use the following curl command (note that a real Vertex AI endpoint also expects an Authorization header carrying a valid access token):

curl -X POST https://<your-endpoint-url>/predict \
-H "Content-Type: application/json" \
-d '{"inputs": [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]]}'

Replace your-endpoint-url with your own endpoint URL and run the command; it will return a JSON output:

{
"prediction": 0.785
}
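Programmatic access is also possible. A sketch using the google-cloud-aiplatform SDK is shown below; the project ID, region, and endpoint ID are placeholders to be replaced with your own values, and the predict call (commented out) assumes the SDK is installed and credentials are configured. Building the endpoint's resource name is plain string formatting:

```python
# Vertex AI endpoints are addressed by a resource name of the form
# projects/<project>/locations/<region>/endpoints/<endpoint-id>.
def endpoint_resource_name(project_id, region, endpoint_id):
    return f"projects/{project_id}/locations/{region}/endpoints/{endpoint_id}"

# Placeholder values for illustration.
name = endpoint_resource_name("your-project-id", "us-central1", "1234567890")
print(name)
# -> projects/your-project-id/locations/us-central1/endpoints/1234567890

# With the SDK installed (pip install google-cloud-aiplatform) and
# credentials configured, a prediction call might look like:
# from google.cloud import aiplatform
# aiplatform.init(project="your-project-id", location="us-central1")
# endpoint = aiplatform.Endpoint(name)
# print(endpoint.predict(instances=[[0.1, 0.2, 0.3, 0.4, 0.5,
#                                    0.6, 0.7, 0.8, 0.9, 1.0]]))
```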

To monitor the deployed model, navigate to 'Deploy and Use' and choose Monitoring. The following page appears, where the user can monitor the deployed model and configure the monitoring as needed.

After clicking on Monitoring, the monitoring page for the deployed model is displayed.

Additional features of Vertex AI

  • Users can customize dataset labels and other parameters and then deploy the model according to Google Cloud Vertex AI pricing.
  • Apart from this, other services are provided, such as Workbench and Colab Enterprise, for customization and collaborative work.

Applications

  • It can be used to classify skin conditions and a wide variety of other diseases, identifying them correctly.
  • It can be used to predict overall weather forecasts for upcoming years along with expected value ranges.
  • It can be used in self-driving cars: quality datasets help the model learn path trajectories so that driving can be performed smoothly.
