Regarding Performance

Question

I created a custom model based on yolo26n, 'yolo26n-aya'.
The performance results for two input sources (256x256 bitmap images and an mp4 video) are shown below.

(a) images

   (venv) ubuntu@antelao-3588:~/voyager-sdk$ AXELERA_USE_CL_DOUBLE_BUFFER=0 ./inference.py yolo26n-aya ./data/aya100/images --show-stats --no-display

   ========================================================================
   Element                                     Time(µs)   Effective FPS
   ========================================================================
   axinplace-addstreamid0                           141         7,076.4
   inference-task0:libtransform_resize_cl_0         371         2,692.7
   inference-task0:libtransform_padding_0           844         1,184.0
   inference-task0:inference                      2,961           337.7
   inference-task0:Inference latency            208,874             n/a
   inference-task0:libdecode_yolov8_0               653         1,531.2
   inference-task0:Postprocessing latency        25,097             n/a
   inference-task0:Total latency                308,970             n/a
   ========================================================================
   End-to-end average measurement                                   0.0
   ========================================================================

(b) mp4

   (venv) ubuntu@antelao-3588:~/voyager-sdk$ AXELERA_USE_CL_DOUBLE_BUFFER=0 ./inference.py yolo26n-aya ./media/traffic3_720p.mp4 --show-stats --no-display

   ========================================================================
   Element                                     Time(µs)   Effective FPS
   ========================================================================
   qtdemux0                                         117         8,546.7
   h264parse0                                       646         1,546.7
   capsfilter0                                      153         6,516.0
   mppvideodec0                                   6,368           157.0
   decodebin-link0                                  115         8,636.4
   inference-task0:libtransform_resize_cl_0         542         1,844.6
   inference-task0:libtransform_padding_0           778         1,285.0
   inference-task0:inference                        550         1,816.0
   inference-task0:Inference latency             24,772             n/a
   inference-task0:libdecode_yolov8_0               581         1,719.9
   inference-task0:Postprocessing latency         2,782             n/a
   inference-task0:Total latency                 35,448             n/a
   ========================================================================
   End-to-end average measurement                                 674.6
   ========================================================================

I have two questions:
(1) Why do the values of inference-task0:inference differ between the two runs (2,961 µs for the images vs. 550 µs for the mp4)?
(2) My goal is to measure the end-to-end time from the input (image) buffer in main memory to the output (result) buffer. Which value should I use for that?

Thank you in advance for your assistance.

Spanner · Answer

Hi @kmiura,

From the docs, it seems the Time(µs) column in --show-stats is the inter-frame period at that stage (1 / Effective FPS), not the actual time the AIPU spends on one frame. With the mp4, the pipeline streams continuously and inference reports ~550 µs because that's how often a frame leaves that stage. With an image folder, the source is bounded and the pipeline “drains” rather than streams, so the rate is lower (~2,961 µs). The actual per-frame work the AIPU does would be about the same in both runs, I think?

For your end-to-end measurement, the line you want is inference-task0:Total latency. That's the wall-clock per-frame latency from input through to output (35,448 µs for the mp4, 308,970 µs for the images). Note that End-to-end average measurement is reported in the Effective FPS column, not microseconds, so 674.6 there means 674.6 FPS.

This might have a bit more info on the subject (or explain it better than I did, anyway 😆):
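
As a quick sanity check on the Time(µs) vs. Effective FPS relationship described above, here is a small Python sketch that recomputes FPS as 1,000,000 / Time(µs) for a few rows copied from the two tables in the question. The helper name fps_from_period is just made up for illustration and is not part of the Voyager SDK; the numbers are the ones already posted in this thread.

```python
# Rows copied from the two --show-stats tables in the question:
# label -> (Time in microseconds, reported Effective FPS)
rows = {
    "images / axinplace-addstreamid0": (141, 7076.4),
    "images / inference-task0:inference": (2961, 337.7),
    "mp4 / mppvideodec0": (6368, 157.0),
    "mp4 / inference-task0:inference": (550, 1816.0),
}


def fps_from_period(time_us):
    """Hypothetical helper: convert an inter-frame period in µs to frames per second."""
    return 1_000_000 / time_us


for label, (time_us, reported_fps) in rows.items():
    derived_fps = fps_from_period(time_us)
    # Relative difference between the derived and the reported FPS
    rel_diff = abs(derived_fps - reported_fps) / reported_fps
    print(f"{label}: {time_us} us -> {derived_fps:.1f} FPS "
          f"(reported {reported_fps}, diff {rel_diff:.2%})")
```

The derived values land within a fraction of a percent of the reported Effective FPS for these rows (the small gap is presumably rounding of the displayed µs value), which fits the reading that the two columns are the same rate expressed two ways. The latency rows show n/a for FPS, so this check does not apply to them.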
