Question

Reproducing the Model Zoo benchmarks

  • May 6, 2026
  • 2 replies
  • 13 views

AlbertaBeef
Cadet

Hello, 

I am having issues reproducing the Axelera Model Zoo benchmarks on an Axelera Metis M.2 module.

My issues seem to be related to pre/post processing, which is why I started with a classification model.

I was hoping to use “fakevideo” as input to bypass the pre-processing, but that seems to be fixed at 30 FPS.

Another unresolved issue is that OpenCL is not working on my AMD PC.

For ResNet-50 v1.5, here is what I am getting … 

(venv) voyager-sdk$ ./inference.py resnet50-imagenet data/coco --disable-opencl --no-display --show-stats
========================================================================           
Element                                         Time(𝜇s)   Effective FPS
========================================================================
axinplace-addstreamid0                                16        61,066.3
vaapipostproc0                                     1,927           518.9
videoconvert0                                         17        56,486.4
axinplace0                                             7       130,629.7
inference-task0:libtransform_resizeratiocropexcess_0
                                                     149         6,677.7
inference-task0:libtransform_totensor_0                7       142,241.8
inference-task0:libinplace_normalize_0                16        59,300.1
inference-task0:libtransform_padding_0                20        47,901.4
inference-task0:inference                          2,884           346.7
inference-task0:Inference latency                 42,056             n/a
inference-task0:libtransform_paddingdequantize_0
                                                       5       184,454.6
inference-task0:libdecode_classification_0             8       116,059.4
inference-task0:Postprocessing latency               726             n/a
inference-task0:Total latency                     45,652             n/a
========================================================================
End-to-end average measurement                                     351.1
========================================================================
Core Temp  : 37.0°C
CPU %      : 5.5%
End-to-end : 351.1fps
Latency    : 42.8ms (min:9.1 max:57.3 σ:3.9 x̄:42.7)ms
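
A quick way to read the table above: the "Effective FPS" column appears to be roughly one million divided by the per-element time in microseconds, so the slowest element caps the pipeline. The sketch below is only an illustration of that arithmetic, with the rows copied by hand from the table. Because the times are rounded to whole microseconds, the recomputed FPS for the very fast elements will not match the table exactly.

```python
# Rows copied by hand from the --show-stats table above:
# (element name, time per frame in microseconds).
STAGES = [
    ("axinplace-addstreamid0", 16),
    ("vaapipostproc0", 1927),
    ("videoconvert0", 17),
    ("axinplace0", 7),
    ("libtransform_resizeratiocropexcess_0", 149),
    ("libtransform_totensor_0", 7),
    ("libinplace_normalize_0", 16),
    ("libtransform_padding_0", 20),
    ("inference", 2884),
    ("libtransform_paddingdequantize_0", 5),
    ("libdecode_classification_0", 8),
]


def effective_fps(time_us):
    """Frames per second a stage can sustain if each frame takes time_us."""
    return 1_000_000 / time_us


def bottleneck(stages):
    """Return the (name, time_us) pair with the largest per-frame time."""
    return max(stages, key=lambda stage: stage[1])


name, time_us = bottleneck(STAGES)
print(f"slowest stage: {name} ({time_us} us, ~{effective_fps(time_us):.1f} FPS)")
```

With these rows the slowest stage is the inference element at about 346.7 FPS, close to the 351.1 FPS end-to-end figure, and vaapipostproc0 is the next slowest at about 518.9 FPS.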

 

 

2 replies

AlbertaBeef
Cadet
  • Author
  • May 7, 2026

After posting my question, I found the solution provided by @Steven Hunsche in another post:


This got me past my OpenCL issue on my AMD Ryzen AI MAX+ 395 PC.
It effectively doubled the throughput (from 350 FPS to 716 FPS).

That is still a long way from the 1756 FPS benchmark for ResNet-50 on the Metis M.2.

Here is where I am currently at:

(venv) abbeefai@AlbertaBeefAI:/media/abbeefai/TheExpanse/shared_with_docker/voyager-sdk$ ./inference.py resnet50-imagenet data/coco --enable-opencl --no-display --show-stats
========================================================================                                    
Element                                         Time(𝜇s)   Effective FPS
========================================================================
axinplace-addstreamid0                                10        94,033.6
axtransform-colorconvert-cl0                         323         3,092.8
inference-task0:libtransform_centrecropextra_0
                                                       0     1,357,590.2
inference-task0:libtransform_resize_cl_0              81        12,201.0
inference-task0:libtransform_padding_0                39        25,571.2
inference-task0:inference                          1,415           706.5
inference-task0:Inference latency                 22,083             n/a
inference-task0:libtransform_paddingdequantize_0
                                                       4       238,789.7
inference-task0:libdecode_classification_0             6       160,551.4
inference-task0:Postprocessing latency               614             n/a
inference-task0:Total latency                     25,152             n/a
========================================================================
End-to-end average measurement                                     716.8
========================================================================
Core Temp  : 37.0°C
CPU %      : 5.0%
End-to-end : 716.8fps
Latency    : 19.6ms (min:7.0 max:27.4 σ:1.9 x̄:19.7)ms
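
As a rough sanity check on the two runs, the summary figures can be compared against each other and against the quoted benchmark. The sketch below uses only numbers from the posts above. The "frames in flight" estimate applies Little's law (throughput times latency) and assumes the pipeline is in steady state and that the reported latency covers the same span as the end-to-end FPS, which is an assumption and not something the tool output confirms.

```python
# Figures copied from the two --show-stats summaries and the post above.
RUNS = {
    "OpenCL disabled": {"fps": 351.1, "latency_ms": 42.8},
    "OpenCL enabled": {"fps": 716.8, "latency_ms": 19.6},
}
BENCHMARK_FPS = 1756.0  # Model Zoo figure quoted in the post


def frames_in_flight(fps, latency_ms):
    """Little's law: average items in the system = throughput * latency."""
    return fps * latency_ms / 1000.0


for label, run in RUNS.items():
    depth = frames_in_flight(run["fps"], run["latency_ms"])
    share = run["fps"] / BENCHMARK_FPS
    print(f"{label}: ~{depth:.1f} frames in flight, {share:.0%} of benchmark")

speedup = RUNS["OpenCL enabled"]["fps"] / RUNS["OpenCL disabled"]["fps"]
print(f"speedup from enabling OpenCL: {speedup:.2f}x")
```

Under those assumptions both runs come out at roughly 14 to 15 frames in flight. That would be consistent with the reported latency being mostly queueing inside the pipeline, not per-frame compute time.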

 


Spanner
Axelera Team
  • May 7, 2026

Ah, nice work on ramping it up anyway, @AlbertaBeef!

As for the differences from the benchmarks, one thing that comes to mind is the host system. I think the benchmarks were run on an Intel i9. I am not sure how much that affects performance, but it is an architectural difference.

I also think the benchmark numbers are measured with multiple input streams running in parallel, to spread the load across all four cores. The inference reference may suggest some interesting things to test with multiple inputs: https://github.com/axelera-ai-hub/voyager-sdk/blob/release/v1.5/docs/reference/inference.md

Let me know how it goes!
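
One way to try the multi-stream suggestion above is a small sweep over the number of input streams. The sketch below only builds and prints the command lines; it does not run them. It reuses the model name, source, and flags from the runs earlier in the thread. It assumes, without confirmation from this thread, that inference.py accepts several input sources as additional positional arguments and that the same source can be repeated. Check the linked inference.md before relying on it. The helper name build_command is hypothetical.

```python
import shlex

# Model name and flags taken from the commands shown earlier in the thread.
BASE = ["./inference.py", "resnet50-imagenet"]
FLAGS = ["--enable-opencl", "--no-display", "--show-stats"]


def build_command(source, streams):
    """Build an argv list with `source` repeated once per stream.

    ASSUMPTION: multiple positional sources mean multiple parallel streams;
    this is not confirmed anywhere in the thread.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    return BASE + [source] * streams + FLAGS


for n in (1, 2, 4):
    print(shlex.join(build_command("data/coco", n)))
```

Comparing the end-to-end FPS reported for each stream count would show whether throughput scales with the number of streams, as the benchmark setup described above suggests.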