Benchmarking¶
CVEDIA-RT comes with a benchmarking tool that measures raw model performance.
Raw benchmarks
This measures raw inference throughput; when running a full solution, other factors and CPU overhead come into play.
Running examples¶
In these examples we use openvino.CPU as the backend. You can use auto to select the best available device or, for example, tensorrt.1 to run on an NVIDIA GPU.
Linux Docker
./run.sh -b openvino.CPU://pva_det/rgb/medium_y6_mosaic_rot90_320x320_v1/230927 -- -- -i 1 -n 1000
Windows / Linux Native
rtcmd inference benchmark -u openvino.CPU://pva_det/rgb/medium_y6_mosaic_rot90_320x320_v1/230927 -i 1 -n 1000
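For instance, to let CVEDIA-RT pick the best available device for the same model, only the backend prefix changes. This is a sketch of the native invocation, with the same flags as above:

rtcmd inference benchmark -u auto://pva_det/rgb/medium_y6_mosaic_rot90_320x320_v1/230927 -i 1 -n 1000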
Benchmark outputs look like this:
Time         Preprocess   Inference    Inference    Postprocess
             cpu time     latency      throughput   cpu time
-----------  -----------  -----------  -----------  -----------
00:00:01     2.27659ms    12.7302ms    57.6541/s    2.19597ms
00:00:02     2.27185ms    17.026ms     45.7256/s    2.16367ms
00:00:03     2.27645ms    16.9761ms    46.7662/s    2.17962ms
00:00:04     2.27391ms    16.6448ms    46.7197/s    2.14666ms
00:00:05     2.27481ms    16.6908ms    47.7137/s    2.15048ms
00:00:06     2.27685ms    16.5005ms    47.7137/s    2.17683ms
00:00:07     2.27733ms    16.4393ms    47.7612/s    2.1931ms
00:00:08     2.29306ms    16.3252ms    47.7137/s    2.21609ms
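If you want a quick average of the reported throughput, the table can be post-processed with standard shell tools. This is only a rough sketch: it assumes the native rtcmd invocation, that the table is written to stdout in the exact column layout shown above (throughput in the fourth column), and that the log file name bench.log is your choice.

# Save the benchmark table to a log while watching it live.
rtcmd inference benchmark -u openvino.CPU://pva_det/rgb/medium_y6_mosaic_rot90_320x320_v1/230927 -i 1 -n 1000 | tee bench.log

# Average the throughput column, skipping the two header lines and the separator.
awk 'NR > 3 { gsub("/s", "", $4); sum += $4; n++ } END { if (n) print sum / n, "inferences/s (average)" }' bench.log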
Different backends¶
Depending on your hardware, different backends may be available. You can find out which ones are available by calling the listnndevices binary:
Linux Docker
./run.sh -B
Windows / Linux Native
rtcmd vpu
The output looks like this:
Found the following devices:
GUID                     Description
-----------------------  -----------------------
blaize.auto              Runs on best available device
mnn.auto                 Runs on best available device
onnx.cpu                 CPU
onnx.tensorrt            TensorRT
onnx.directml            DirectML
onnx.cuda                CUDA
openvino.CPU             CPU
openvino.GNA             GNA
tensorrt.1               NVIDIA GeForce GTX 1080 Ti (pci bus id: 1:0:0)
When running a benchmark, you can rely on the auto:// prefix or specify a GUID, such as tensorrt.1://, to benchmark the model on a specific device.
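For example, to benchmark the same model on the NVIDIA GPU listed above (assuming tensorrt.1 is available on your system), the native invocation would be:

rtcmd inference benchmark -u tensorrt.1://pva_det/rgb/medium_y6_mosaic_rot90_320x320_v1/230927 -i 1 -n 1000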
Different models¶
Depending on the solution you're running, CVEDIA-RT may use different models. You can inspect which models are being used by checking the base_config.json within the solution you're running, or by using modelforge to list models.
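As a quick way to spot model references, you can search the solution's base_config.json for URIs. This is only a sketch: the path below is a placeholder for wherever the solution you're running actually lives, and it assumes models are referenced by URI as in the benchmark commands above.

# Placeholder path: point this at the base_config.json of the solution you're running.
grep -n "://" ./solutions/my_solution/base_config.json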