
Google Cloud Anthos

You can run CVEDIA-RT in a GKE cluster using Anthos, Google's hybrid cloud management platform.

Anthos can deploy CVEDIA-RT pods at scale with GPU acceleration. CVEDIA-RT runs headless, so you will need to use our REST API to interact with it.

CVEDIA-RT supports any NVIDIA GPU instance. You should use instances with a single GPU attached. Models for the specific platform you're using are downloaded on the fly.


AMD GPU acceleration is not supported. On AMD hardware, CVEDIA-RT will fall back to CPU inference.

Artifact Setup

Before you can run anything, you need to copy the CVEDIA-RT Docker image to an Artifact Registry repository. You can find the latest CVEDIA-RT Docker image on our release page or directly on Docker Hub.
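The copy can be sketched as a pull, tag, and push sequence. The snippet below only builds and prints the `docker` commands. The source image name, project, region, and repository are hypothetical placeholders, and it assumes Docker is already authenticated against Artifact Registry (for example with `gcloud auth configure-docker`).

```python
import shlex


def artifact_registry_path(region, project, repository, image, tag):
    """Build an image path of the form REGION-docker.pkg.dev/PROJECT/REPO/IMAGE:TAG."""
    return f"{region}-docker.pkg.dev/{project}/{repository}/{image}:{tag}"


def copy_commands(source_image, target_image):
    """Return the docker commands that copy source_image to target_image."""
    return [
        ["docker", "pull", source_image],
        ["docker", "tag", source_image, target_image],
        ["docker", "push", target_image],
    ]


# Hypothetical placeholders: use the image listed on the release page
# and your own project, region, and repository names.
SOURCE = "example/cvedia-rt:latest"
TARGET = artifact_registry_path(
    "us-central1", "my-project", "my-repo", "cvedia-rt", "latest"
)

for cmd in copy_commands(SOURCE, TARGET):
    print(shlex.join(cmd))
```

Run the printed commands in a shell once the placeholders are replaced with real values.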

Cluster Setup

If you don't yet have a GKE cluster, you need to create one. Anthos supports two types of cluster: autopilot and standard.

When running on autopilot, you don't have fine-grained control over what kind of machines execute your tasks, meaning they will all be CPU-based instances.

When running on standard, you control your nodes and node pools, which lets you create much faster GPU instances. However, there is less automatic cost management.

Creating autopilot GKE cluster

  1. Open Anthos
  2. Click Configure GKE Autopilot
  3. Set to public
  4. Click Create

Creating standard GKE cluster

This type of cluster is recommended if you want to run on GPUs, although prices are significantly higher. You might need to request GPU quota; read more in the how-to guide on GPUs.

  1. Open Anthos
  2. Click Configure GKE Standard
  3. Click Node Pools -> default-pool -> nodes
  4. In Machine Configuration, select GPU
  5. Select the GPU type that best suits you; CVEDIA-RT runs on any NVIDIA GPU
  6. Always set Number of GPUs to 1
  7. Click Create
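For scripted setups, the console steps above roughly correspond to a `gcloud container clusters create` call with one accelerator per node. The sketch below only builds and prints the command. The cluster name, zone, machine type, and GPU type are example values, not recommendations from CVEDIA. Depending on the GKE version, NVIDIA drivers may also need to be installed on the nodes separately.

```python
import shlex


def create_gpu_cluster_command(name, zone, machine_type, gpu_type):
    """Build a gcloud command for a standard GKE cluster with one GPU per node."""
    return [
        "gcloud", "container", "clusters", "create", name,
        "--zone", zone,
        "--machine-type", machine_type,
        # count=1 follows the single-GPU recommendation in the steps above.
        "--accelerator", f"type={gpu_type},count=1",
    ]


# Example values only (assumptions): adjust to your project and quota.
command = create_gpu_cluster_command(
    name="cvedia-rt-cluster",
    zone="us-central1-a",
    machine_type="n1-standard-4",
    gpu_type="nvidia-tesla-t4",
)
print(shlex.join(command))
```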

Creating a Workload

  1. Open Anthos
  2. Click Workloads in the sidebar
  3. Click + Deploy
  4. Select CVEDIA-RT image from your Artifact Registry
  5. Click + Add environment variable, then set name RUN_UI and value 0
  6. Click Done
  7. Click Continue
  8. Select a Cluster
  9. Click Deploy
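The same workload can also be described as a Kubernetes Deployment manifest. The following is a minimal sketch rather than the manifest the console generates: the workload name, label, and image path are hypothetical, and the `nvidia.com/gpu` limit assumes a GPU node pool. Only the `RUN_UI=0` environment variable comes from the steps above.

```python
import json


def cvedia_rt_deployment(image, name="cvedia-rt", replicas=1, gpu=True):
    """Build a Deployment manifest that runs CVEDIA-RT headless (RUN_UI=0)."""
    container = {
        "name": name,
        "image": image,
        # Environment variable values must be strings in Kubernetes.
        "env": [{"name": "RUN_UI", "value": "0"}],
    }
    if gpu:
        # Request a single NVIDIA GPU per pod (assumes GPU nodes exist).
        container["resources"] = {"limits": {"nvidia.com/gpu": 1}}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [container]},
            },
        },
    }


# Hypothetical image path: use the one from your Artifact Registry.
manifest = cvedia_rt_deployment(
    "us-central1-docker.pkg.dev/my-project/my-repo/cvedia-rt:latest"
)
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed output can be saved to a file and applied directly.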

Exposing a Workload

Once your workload is running, you need to expose it before you can access it.

  1. Open Anthos
  2. Click Workloads in the sidebar
  3. Click your running CVEDIA-RT workload instance
  4. In the top header, click on Actions -> Expose
  5. Set port 80, target port 80 and Service type Load Balancer
  6. Click Expose

Once the cluster applies your changes, you will be able to access CVEDIA-RT's REST API using the public IP reported by the load balancer in the workload deployment details screen.
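Expressed as a manifest, the Expose action corresponds roughly to a Service of type LoadBalancer. This sketch assumes the workload's pods carry a hypothetical `app: cvedia-rt` label; the service name is also illustrative.

```python
import json


def load_balancer_service(name, app_label, port=80, target_port=80):
    """Build a LoadBalancer Service that forwards port to target_port on matching pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            # Must match the labels on the CVEDIA-RT pods (assumed here).
            "selector": {"app": app_label},
            "ports": [
                {"protocol": "TCP", "port": port, "targetPort": target_port}
            ],
        },
    }


service = load_balancer_service("cvedia-rt-service", "cvedia-rt")
print(json.dumps(service, indent=2))
```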

Session Affinity

CVEDIA-RT solutions usually require a consistent time and frame order for their tracking systems to work. When you run CVEDIA-RT with multiple replicas, each replica has its own isolated state. Without session affinity, tracking will be poor and CVEDIA-RT's output may be nonsensical.

To avoid this problem, enable sessionAffinity. This ensures that a client always talks to the same workload instance. You can read more at ingress features.

Alternatively, for small deployments, you can run CVEDIA-RT with a single replica, so all traffic gets routed to the same instance.
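One way to get this behaviour with a LoadBalancer Service is the Kubernetes Service field `sessionAffinity: ClientIP`. This is a different mechanism from the Ingress-level configuration referenced above, and the sketch below is an assumption about how it could be applied here, not a CVEDIA-provided manifest. The service name and label are hypothetical.

```python
import copy


def with_client_ip_affinity(service, timeout_seconds=10800):
    """Return a copy of a Service manifest with ClientIP session affinity enabled."""
    updated = copy.deepcopy(service)
    updated["spec"]["sessionAffinity"] = "ClientIP"
    # How long a client stays pinned to the same pod after its last request.
    updated["spec"]["sessionAffinityConfig"] = {
        "clientIP": {"timeoutSeconds": timeout_seconds}
    }
    return updated


# Minimal hypothetical Service used only to demonstrate the helper.
base = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "cvedia-rt-service"},
    "spec": {
        "type": "LoadBalancer",
        "selector": {"app": "cvedia-rt"},
        "ports": [{"port": 80, "targetPort": 80}],
    },
}

sticky = with_client_ip_affinity(base)
print(sticky["spec"]["sessionAffinity"])
```

With ClientIP affinity, requests from the same client address keep reaching the same pod, so each camera or client feeding frames sees one consistent tracking state.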


The CVEDIA-RT container automatically reports health metrics back to the cluster. On top of that, you can use the /status API endpoint to query instance metrics, which makes scaling easier.
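A minimal way to poll the endpoint from Python is shown below. It assumes `/status` is served over plain HTTP at the root of the exposed port and makes no assumption about the response format, so the body is returned as raw text. The IP address is a placeholder from the documentation address range.

```python
import urllib.request


def status_url(host, port=80):
    """Build the URL of the /status endpoint (path location is an assumption)."""
    return f"http://{host}:{port}/status"


def fetch_status(host, port=80, timeout=5.0):
    """Return (HTTP status code, raw response body) from the /status endpoint."""
    with urllib.request.urlopen(status_url(host, port), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


# Placeholder address: replace with the public IP reported by the load balancer,
# then call fetch_status(...) with it.
print(status_url("203.0.113.10"))
```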