Benchmarking¶
CVEDIA-RT comes with a benchmarking tool that measures raw model performance.
Raw benchmarks
This measures raw inference throughput; when running a full solution, other factors and CPU overhead come into play.
Running examples¶
In these examples we use openvino.CPU as the backend. You can use auto to select the best available device or, for example, tensorrt.1 to run on an NVIDIA GPU.
Linux Docker
./run.sh -b openvino.CPU://pva_det/rgb/medium_y6_mosaic_rot90_320x320_v1/230927 -- -- -i 1 -n 1000
Windows / Linux Native
rtcmd inference benchmark -u openvino.CPU://pva_det/rgb/medium_y6_mosaic_rot90_320x320_v1/230927 -i 1 -n 1000
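For instance, to let CVEDIA-RT pick the best available device for the same model, only the backend prefix changes. This is a sketch of the native invocation, with the same flags as above:

rtcmd inference benchmark -u auto://pva_det/rgb/medium_y6_mosaic_rot90_320x320_v1/230927 -i 1 -n 1000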
Benchmark outputs look like this:
Time         Preprocess   Inference    Inference    Postprocess
             cpu time     latency      throughput   cpu time
-----------  -----------  -----------  -----------  -----------
00:00:01     2.27659ms    12.7302ms    57.6541/s    2.19597ms
00:00:02     2.27185ms    17.026ms     45.7256/s    2.16367ms
00:00:03     2.27645ms    16.9761ms    46.7662/s    2.17962ms
00:00:04     2.27391ms    16.6448ms    46.7197/s    2.14666ms
00:00:05     2.27481ms    16.6908ms    47.7137/s    2.15048ms
00:00:06     2.27685ms    16.5005ms    47.7137/s    2.17683ms
00:00:07     2.27733ms    16.4393ms    47.7612/s    2.1931ms
00:00:08     2.29306ms    16.3252ms    47.7137/s    2.21609ms
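If you want a quick average of the reported throughput, the table can be post-processed with standard shell tools. This is only a rough sketch: it assumes the native rtcmd invocation, that the table is written to stdout in the exact column layout shown above (throughput in the fourth column), and that the log file name bench.log is your choice.

# Save the benchmark table to a log while watching it live.
rtcmd inference benchmark -u openvino.CPU://pva_det/rgb/medium_y6_mosaic_rot90_320x320_v1/230927 -i 1 -n 1000 | tee bench.log

# Average the throughput column, skipping the two header lines and the separator.
awk 'NR > 3 { gsub("/s", "", $4); sum += $4; n++ } END { if (n) print sum / n, "inferences/s (average)" }' bench.log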
Different backends¶
Depending on your hardware, different backends may be available. You can find out which ones are available by calling the listnndevices binary:
Linux Docker
./run.sh -B
Windows / Linux Native
rtcmd vpu
The output looks like this:
Found the following devices:
GUID                     Description
-----------------------  -----------------------
blaize.auto              Runs on best available device
mnn.auto                 Runs on best available device
onnx.cpu                 CPU
onnx.tensorrt            TensorRT
onnx.directml            DirectML
onnx.cuda                CUDA
openvino.CPU             CPU
openvino.GNA             GNA
tensorrt.1               NVIDIA GeForce GTX 1080 Ti (pci bus id: 1:0:0)
When running a benchmark, you can rely on the auto:// prefix or specify a GUID, such as tensorrt.1://, to benchmark the model on a specific device.
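For example, to benchmark the same model on the NVIDIA GPU listed above (assuming tensorrt.1 is available on your system), the native invocation would be:

rtcmd inference benchmark -u tensorrt.1://pva_det/rgb/medium_y6_mosaic_rot90_320x320_v1/230927 -i 1 -n 1000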
Different models¶
Depending on the solution you're running, CVEDIA-RT may use different models. You can inspect which models are being used by checking the base_config.json within the solution you're running, or by using modelforge to list models.
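As a quick way to spot model references, you can search the solution's base_config.json for URIs. This is only a sketch: the path below is a placeholder for wherever the solution you're running actually lives, and it assumes models are referenced by URI as in the benchmark commands above.

# Placeholder path: point this at the base_config.json of the solution you're running.
grep -n "://" ./solutions/my_solution/base_config.json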