Hello,
I’m running inference with the model unet_fcn_512-cityscapes using the torch-aipu pipe on the Aetina eval board. It runs at 1.8 fps end-to-end with 500 ms of latency, although the reported device fps shows a capability of 11.5 fps, and the documentation says it should reach 18 fps. I initially suspected time wasted loading and decoding PNG images from the SD card, so I moved them into shared memory, but the results are identical.
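For reference, the shared-memory staging was roughly the following (the paths here are illustrative stand-ins, not my exact mount points):

```shell
# Sketch of the staging step: copy the PNG frames from the slow SD-card
# mount into tmpfs-backed /dev/shm so image reads come from RAM instead.
SRC=/tmp/demo_sdcard/cityscapes        # stand-in for the SD-card mount point
DST=/dev/shm/cityscapes
mkdir -p "$SRC" "$DST"
: > "$SRC/frame_000.png"               # dummy file so the sketch runs as-is
cp "$SRC"/*.png "$DST"/
ls "$DST"                              # inference is then pointed at $DST
```

After this, the pipeline reads from `$DST` instead of the SD card, yet throughput and latency were unchanged, which is why I suspect the bottleneck is elsewhere.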
I also tested yolov5s-v7-coco, which should reach 805 fps, but I can only achieve 214 fps. Here is the output of:
AXELERA_USE_CL_DOUBLE_BUFFER=0 ./inference.py yolov5s-v7-coco media/traffic3_720p.mp4 --show-stats --no-display
INFO : Deploying model yolov5s-v7-coco for 4 cores. This may take a while...
|████████████████████████████████████████| 12:41.1
arm_release_ver: g13p0-01eac0, rk_so_ver: 9
========================================================================
Element                                     Time(μs)   Effective FPS
========================================================================
qtdemux0                                         319         3,126.4
h264parse0                                     3,094           323.2
capsfilter0                                      259         3,851.4
mppvideodec0                                   9,563           104.6
decodebin-link0                                   91        10,922.0
axtransform-colorconvert0                      3,404           293.8
inference-task0:libtransform_resize_cl_0       4,090           244.4
inference-task0:libtransform_padding_0         1,816           550.5
inference-task0:inference                      4,405           227.0
inference-task0:Inference latency             94,835             n/a
inference-task0:libdecode_yolov5_0               991         1,008.3
inference-task0:libinplace_nms_0                 130         7,679.8
inference-task0:Postprocessing latency           952             n/a
inference-task0:Total latency                110,383             n/a
========================================================================
End-to-end average measurement                                   214.0
========================================================================

Is there anything I can tune to reduce the latency and increase the fps?
Voyager SDK release v1.3.3, Ubuntu 22.04
Thanks in advance

