Google Cloud Anthos

You can run CVEDIA-RT in a GKE cluster using Anthos, Google's hybrid cloud management platform.

Anthos can deploy CVEDIA-RT pods at scale with GPU acceleration. CVEDIA-RT runs headless, so you will need to use our REST API to interact with it.

CVEDIA-RT supports any NVIDIA GPU instance; you should use instances with a single GPU attached. Models for the specific platform you're using will be downloaded on the fly.

AMD GPUs

AMD GPU acceleration is not supported; CVEDIA-RT will fall back to CPU inference.

Artifact Setup

Before you can run anything, you need to copy the CVEDIA-RT Docker image to an Artifact Registry repository. You can find the latest CVEDIA-RT Docker image on our release page or directly on Docker Hub.
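As a rough sketch of the copy step, the following Python snippet only builds and prints the `gcloud` and `docker` commands you would typically run; it does not execute them. The region, project, repository, and image names are placeholders (assumptions), and `SOURCE_IMAGE:TAG` stands for whatever image reference the release page lists.

```python
import shlex


def artifact_registry_image(location, project, repository, image, tag):
    """Build an image path in the Artifact Registry format:
    LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG
    """
    return f"{location}-docker.pkg.dev/{project}/{repository}/{image}:{tag}"


def copy_commands(source_image, target_image, location):
    """Return the CLI commands that copy source_image into Artifact Registry."""
    return [
        # Let Docker authenticate against the regional Artifact Registry host.
        ["gcloud", "auth", "configure-docker", f"{location}-docker.pkg.dev"],
        ["docker", "pull", source_image],
        ["docker", "tag", source_image, target_image],
        ["docker", "push", target_image],
    ]


if __name__ == "__main__":
    # Every value below is a placeholder (assumption); substitute your own.
    location = "us-central1"
    source = "SOURCE_IMAGE:TAG"
    target = artifact_registry_image(location, "my-project", "my-repo", "cvedia-rt", "TAG")
    for command in copy_commands(source, target, location):
        print(shlex.join(command))
```

Review the printed commands before running them, and make sure the target repository already exists in Artifact Registry.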

Cluster Setup

If you don't have a GKE cluster yet, you will need to create one. Anthos supports two types of cluster: autopilot and standard.

When running on autopilot you don't have fine-grained control over what kind of machines execute your tasks, meaning they will all be CPU-based instances.

When running on standard you control your nodes and node pools, which lets you create much faster GPU instances; however, there is less automatic cost management.

Creating `autopilot` GKE cluster

Open Anthos
Click Configure GKE Autopilot
Set to public
Click Create

Creating `standard` GKE cluster

This type of cluster is recommended if you want to run on GPUs, although prices are significantly higher. You might need to request GPU quota; read more in the GKE how-to guide on GPUs.

Open Anthos
Click Configure GKE Standard
Click Node Pools -> default-pool -> nodes
In Machine Configuration, select GPU
Select the GPU type that best suits you; CVEDIA-RT will run on any NVIDIA GPU
The number of GPUs should always be 1
Click Create
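If you prefer the command line, the console steps above correspond roughly to a single `gcloud container clusters create` call. The sketch below only builds and prints that command; the cluster name, zone, machine type, and GPU type are placeholders (assumptions), and the GPU count is fixed at 1 as recommended above.

```python
import shlex


def create_gpu_cluster_command(name, zone, machine_type, gpu_type):
    """Build a gcloud command for a standard GKE cluster with one GPU per node."""
    return [
        "gcloud", "container", "clusters", "create", name,
        "--zone", zone,
        "--machine-type", machine_type,
        # CVEDIA-RT expects a single GPU per instance, so count is always 1.
        "--accelerator", f"type={gpu_type},count=1",
        "--num-nodes", "1",
    ]


if __name__ == "__main__":
    # Placeholder values (assumptions); pick a zone and GPU type you have quota for.
    command = create_gpu_cluster_command(
        name="cvedia-rt-cluster",
        zone="us-central1-a",
        machine_type="n1-standard-4",
        gpu_type="nvidia-tesla-t4",
    )
    print(shlex.join(command))
```

GPU nodes also need NVIDIA drivers installed; how that happens depends on your GKE version, so check the GKE GPU guide before relying on this command as-is.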

Creating a Workload

Open Anthos
Click Workloads in the sidebar
Click + Deploy
Select CVEDIA-RT image from your Artifact Registry
Click + Add environment variable, then set name: `RUN_UI`, value: `0`
Click Done
Click Continue
Select a Cluster
Click Deploy
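The same workload can also be described as a Kubernetes Deployment manifest. The sketch below generates one as JSON, which `kubectl apply -f` accepts. The `cvedia-rt` names and labels, the image path, container port 80, and the `nvidia.com/gpu` limit are assumptions for illustration; only the `RUN_UI=0` environment variable comes from the steps above.

```python
import json


def cvedia_rt_deployment(image, replicas=1, gpus=1):
    """Build a Deployment manifest for a headless CVEDIA-RT container."""
    labels = {"app": "cvedia-rt"}
    container = {
        "name": "cvedia-rt",
        "image": image,
        # RUN_UI=0 runs CVEDIA-RT headless, as in the console steps.
        "env": [{"name": "RUN_UI", "value": "0"}],
        # Assumption: the REST API listens on port 80 inside the container.
        "ports": [{"containerPort": 80}],
    }
    if gpus:
        # Only valid on clusters whose nodes expose NVIDIA GPUs.
        container["resources"] = {"limits": {"nvidia.com/gpu": gpus}}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "cvedia-rt", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    # Placeholder image path (assumption); use the one in your Artifact Registry.
    image = "us-central1-docker.pkg.dev/my-project/my-repo/cvedia-rt:TAG"
    print(json.dumps(cvedia_rt_deployment(image), indent=2))
```

Save the output to a file such as `deployment.json` and apply it with `kubectl apply -f deployment.json`. On a CPU-only cluster, pass `gpus=0` so the pod does not request a GPU.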

Exposing a Workload

Once your workload is running, you need to expose it before you can access it.

Open Anthos
Click Workloads in the sidebar
Click your running CVEDIA-RT workload instance
In the top header, click on Actions -> Expose
Set Port to 80, Target port to 80, and Service type to Load Balancer
Click Expose

Once the cluster applies your changes, you will be able to access CVEDIA-RT's REST API using the public IP reported by the load balancer in the workload deployment details screen.
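The Expose action creates a Kubernetes Service of type `LoadBalancer`. As a minimal sketch, the equivalent manifest could be generated as shown below; the Service name and the `app: cvedia-rt` selector label are assumptions and must match the labels on your workload's pods.

```python
import json


def cvedia_rt_service(name="cvedia-rt", app_label="cvedia-rt"):
    """Build a LoadBalancer Service that forwards port 80 to pod port 80."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            # Assumption: the workload's pods carry the label app=<app_label>.
            "selector": {"app": app_label},
            "ports": [{"protocol": "TCP", "port": 80, "targetPort": 80}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(cvedia_rt_service(), indent=2))
```

After applying the manifest with `kubectl apply -f`, `kubectl get service cvedia-rt` lists the public address in the EXTERNAL-IP column once the load balancer is provisioned.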

Session Affinity

Most CVEDIA-RT solutions require a consistent time and frame order for their tracking systems to work. When you run CVEDIA-RT with multiple replicas, each replica has its own isolated state; without session affinity, tracking will be poor and CVEDIA-RT's output may be nonsensical.

To avoid this problem you should enable sessionAffinity, which ensures that each client always talks to the same workload instance. You can read more in the ingress features documentation.

Alternatively, for small deployments, you can run CVEDIA-RT with a single replica, so all traffic gets routed to the same instance.
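One way to enable session affinity, assuming the workload is exposed through a plain Kubernetes Service as described earlier, is to set the Service's `sessionAffinity` field to `ClientIP`. The sketch below builds a patch for that; the Service name `cvedia-rt` in the usage note is a placeholder. If you route traffic through a GKE Ingress instead, affinity is configured differently, so treat this as an illustration rather than the only option.

```python
import json


def session_affinity_patch(timeout_seconds=10800):
    """Build a Service patch that pins each client IP to one pod.

    10800 seconds (3 hours) is the Kubernetes default affinity timeout.
    """
    return {
        "spec": {
            "sessionAffinity": "ClientIP",
            "sessionAffinityConfig": {
                "clientIP": {"timeoutSeconds": timeout_seconds}
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(session_affinity_patch()))
```

The printed JSON can be passed to `kubectl patch service cvedia-rt -p '<json>'`. Note that `ClientIP` affinity keys on the source address, so several clients behind the same NAT will land on the same replica.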

Notes

The CVEDIA-RT container automatically provides health metrics back to the cluster. On top of that, you can query the /status API endpoint for instance metrics, which makes it easier to scale.
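As a minimal sketch of querying the status endpoint, the snippet below issues a plain HTTP GET using only the Python standard library. It assumes `/status` is served at the root of the load balancer address on port 80 and makes no assumption about the response format, so the body is returned as raw text.

```python
import sys
import urllib.request


def status_url(host, port=80):
    """Build the URL of the status endpoint (path taken from the note above)."""
    return f"http://{host}:{port}/status"


def fetch_status(host, port=80, timeout=5.0):
    """GET the status endpoint and return (HTTP status code, raw body text)."""
    with urllib.request.urlopen(status_url(host, port), timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # The argument is the public IP reported by your load balancer.
        code, body = fetch_status(sys.argv[1])
        print(code)
        print(body)
    else:
        print("usage: python status.py <load-balancer-ip>")
```

A script like this could feed a dashboard or an external autoscaling check; consult the REST API reference for the actual fields returned.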