Hi @sara!
Hmm, yeah, I think that number (2 seconds) is end-to-end, so it includes everything from network speed to buffering, to decoding on the host, and inference/post-processing time.
I wonder if we can run some tests to figure out exactly where the bottleneck is? Like, maybe (temporarily) replacing the RTSP feed with a local video file? That might give us a clue as to whether the latency is in the feed or in the processing.
(venv) ~/voyager-sdk$ ./inference.py yolov8s-coco-onnx media/traffic1_480p.mp4 --show-stream-timing --show-stats
=========================================================================
Element                                      Time(µs)    Effective FPS
=========================================================================
qtdemux0                                          122          8,133.9
h264parse0                                        236          4,228.0
capsfilter0                                        80         12,427.9
decodebin-link0                                   103          9,639.1
axtransform-colorconvert0                      54,198             18.5
inference-task0:libtransform_resize_cl_0        1,384            722.3
inference-task0:libtransform_padding_0          2,239            446.5
inference-task0:inference                      50,148             19.9
inference-task0:Inference latency           1,207,065              n/a
inference-task0:libdecode_yolov8_0              3,422            292.2
inference-task0:libinplace_nms_0                   27         35,793.8
inference-task0:Postprocessing latency          6,842              n/a
inference-task0:Total latency               1,497,217              n/a
=========================================================================
End-to-end average measurement                                    18.0
=========================================================================
(venv) ~/voyager-sdk$ ./inference.py yolov8s-coco-onnx media/traffic1_480p.mp4 --show-stream-timing
INFO : Core Temp : 39.0°C
INFO : CPU % : 14.3%
INFO : End-to-end : 20.8fps
INFO : Jitter : 87.0ms
INFO : Latency : 1867.4ms
Hmm, your fps seems incredibly low for yolov8s as well. What kind of host are you using? And which Metis product? Are you getting any warnings?
If I run yolov8s on a single M.2 device and an x86 i5, I get around 390 fps.
Hi @sara ,
If I remember correctly, you were using a Raspberry Pi 5, correct?
If so, please see the following info from https://support.axelera.ai/hc/en-us/articles/26362016484114-Bring-up-Voyager-SDK-in-Raspberry-Pi-5 :
If you see that the performance is lower than expected, this is something that can be solved.
The solution is setting AXELERA_USE_DOUBLE_BUFFER=0 at the beginning of the inference.py command. For example:
AXELERA_USE_DOUBLE_BUFFER=0 ./inference.py yolov8s-coco-onnx ./media/traffic1_480p.mp4
For reference, by setting AXELERA_USE_DOUBLE_BUFFER=0, yolov8s-coco-onnx improved from 5.2 FPS end-to-end (the buggy behaviour) to 80.3 FPS end-to-end, which is expected.
Note that this is a known bug we are investigating. With the workaround I shared, however, performance is barely affected.
Please, let us know if that solves the issue.
I also recommend checking, just in case, the section “Note about MVM utilisation limit” in https://support.axelera.ai/hc/en-us/articles/26362016484114-Bring-up-Voyager-SDK-in-Raspberry-Pi-5.
Best,
Victor
Hi @sara, this isn't precisely what you're asking for, but this latency can be reduced significantly if you want. The 2 s of latency is added “artificially” to allow some buffering for a smoother stream. If you are in a controlled environment (a private, dedicated network), 40 ms should be enough to keep the stream smooth.
Check the RTSP Latency section in the Voyager reference:
https://github.com/axelera-ai-hub/voyager-sdk/blob/release/v1.3/docs/tutorials/application.md#rtsp-latency
Internally, this configures the “latency” property in the rtspsrc GStreamer element.
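For instance, a minimal standalone sketch (the camera URL is a placeholder; latency is the jitter buffer in milliseconds):
gst-launch-1.0 \
rtspsrc location='rtsp://<camera-ip>/stream' latency=40 ! \
decodebin ! \
videoconvert ! \
autovideosink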
My two cents:
- Michael
https://ridgerun.ai
Hi! 
Yes, I had tested it on a Raspberry Pi, but I'm not using one anymore. I've been doing a lot of tests to see what features there are, and another question came up! Sorry
I think visualizing the stream with GStreamer looks good and runs in real time, and I've tried using the Axelera plugins, but I'd like to know how I can draw the inference results on the image.
For example, in this pipeline you gave:
GST_PLUGIN_PATH=`pwd`/operators/lib GST_DEBUG=4 gst-launch-1.0 \
filesrc location=media/traffic3_480p.mp4 ! \
decodebin force-sw-decoders=true caps="video/x-raw(ANY)" expose-all-streams=false ! \
axinplace lib=libinplace_addstreamid.so mode=meta options="stream_id:0" ! \
axtransform lib=libtransform_colorconvert.so options=format:rgba ! \
queue max-size-buffers=4 max-size-time=0 max-size-bytes=0 ! \
axinferencenet \
model=build/yolov7-coco/yolov7-coco/1/model.json \
double_buffer=true \
dmabuf_inputs=true \
dmabuf_outputs=true \
num_children=4 \
preprocess0_lib=libtransform_resize_cl.so \
preprocess0_options="width:640;height:640;padding:114;letterbox:1;scale_up:1;to_tensor:1;mean:0.,0.,0.;std:1.,1.,1.;quant_scale:0.003921568859368563;quant_zeropoint:-128.0" \
preprocess1_lib=libtransform_padding.so \
preprocess1_options="padding:0,0,1,1,1,15,0,0;fill:0" \
preprocess1_batch=1 \
postprocess0_lib=libdecode_yolov5.so \
postprocess0_options="meta_key:detections;anchors:1.5,2.0,2.375,4.5,5.0,3.5,2.25,4.6875,4.75,3.4375,4.5,9.125,4.4375,3.4375,6.0,7.59375,14.34375,12.53125;classes:80;confidence_threshold:0.25;scales:0.003937006928026676,0.003936995752155781,0.003936977591365576;zero_points:-128,-128,-128;topk:30000;multiclass:0;sigmoid_in_postprocess:0;transpose:1;classlabels_file:ax_datasets/labels/coco.names;model_width:640;model_height:640;scale_up:1;letterbox:1" \
postprocess0_mode=read \
postprocess1_lib=libinplace_nms.so \
postprocess1_options="meta_key:detections;max_boxes:300;nms_threshold:0.45;class_agnostic:0;location:CPU" ! \
videoconvert ! \
x264enc ! \
mp4mux ! \
filesink location=output_video.mp4
How do I use the libinplace_draw.so plugin to draw the bounding boxes? Is that what it's for?
Hi @sara,
In v1.3 we can modify the Gst pipeline [1] by adding libinplace_draw.so right after the axinferencenet plugin, in either of the following ways:
! axinplace lib="${AXELERA_FRAMEWORK}/operators/lib/libinplace_draw.so" mode="write" \
! videoconvert \
! ximagesink
Or we can write to an .mp4 file with the following:
! axinplace lib="${AXELERA_FRAMEWORK}/operators/lib/libinplace_draw.so" mode="write" \
! videoconvert \
! x264enc \
! mp4mux \
! filesink location=output_video.mp4
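In other words, applied to the pipeline you posted, the tail would look roughly like this (a sketch; everything before the final postprocess1_options line stays unchanged):
postprocess1_options="meta_key:detections;max_boxes:300;nms_threshold:0.45;class_agnostic:0;location:CPU" ! \
axinplace lib="${AXELERA_FRAMEWORK}/operators/lib/libinplace_draw.so" mode="write" ! \
videoconvert ! \
x264enc ! \
mp4mux ! \
filesink location=output_video.mp4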
Hope this helps!
Feel free to let us know if you have any more questions, comments or suggestions!
---
[1] https://community.axelera.ai/voyager-sdk-2/raspberry-pi-5-metis-inference-with-gstreamer-pipeline-221?postid=684#post684
It worked! :)
And for an RTSP stream? Is it something like this? I'm getting an error.
GST_PLUGIN_PATH=`pwd`/operators/lib GST_DEBUG=4 gst-launch-1.0 \
rtspsrc location='rtsp:/...' latency=200 ! \
decodebin ! \
axinferencenet \
model=build/yolov8s-coco-onnx/yolov8s-coco-onnx/1/model.json \
double_buffer=true \
dmabuf_inputs=true \
dmabuf_outputs=true \
num_children=4 \
preprocess0_lib=libtransform_resize_cl.so \
preprocess0_options="width:640;height:640;padding:114;letterbox:1;scale_up:1;to_tensor:1;mean:0.,0.,0.;std:1.,1.,1.;quant_scale:0.003921568859368563;quant_zeropoint:-128.0" \
preprocess1_lib=libtransform_padding.so \
preprocess1_options="padding:0,0,1,1,1,15,0,0;fill:0" \
preprocess1_batch=1 \
postprocess0_lib=libdecode_yolov8.so \
postprocess0_options="meta_key:detections;anchors:1.5,2.0,2.375,4.5,5.0,3.5,2.25,4.6875,4.75,3.4375,4.5,9.125,4.4375,3.4375,6.0,7.59375,14.34375,12.53125;classes:80;confidence_threshold:0.25;scales:0.003937006928026676,0.003936995752155781,0.003936977591365576;zero_points:-128,-128,-128;topk:30000;multiclass:0;sigmoid_in_postprocess:0;transpose:1;classlabels_file:ax_datasets/labels/coco.names;model_width:640;model_height:640;scale_up:1;letterbox:1" \
postprocess0_mode=read \
postprocess1_lib=libinplace_nms.so \
postprocess1_options="meta_key:detections;max_boxes:300;nms_threshold:0.45;class_agnostic:0;location:CPU" ! \
axinplace lib=libinplace_draw.so mode=write ! \
videoconvert ! \
x264enc ! \
mp4mux ! \
autovideosink
Hi @sara,
I think between decodebin and axinferencenet we need some other plugins to convert to a format that is compatible with the latter. Since inference.py supports reading from RTSP links out of the box, perhaps you should regenerate the low-level GStreamer pipeline first: run inference.py with an RTSP link as the source, then go to build/<model>/logs/gst_pipeline.yaml and convert that to gst-launch-1.0 syntax.
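For example, something like this (the model name and URL here are placeholders for your own):
# Run once with the RTSP source so the SDK generates the low-level pipeline
./inference.py yolov8s-coco-onnx 'rtsp://<camera-ip>/stream'
# Then inspect the generated file
cat build/yolov8s-coco-onnx/logs/gst_pipeline.yaml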
Please do let us know if you need further help converting the gst_pipeline.yaml file.
Many thanks!
Hi @Habib 
I managed to solve it with this pipeline:
GST_PLUGIN_PATH=`pwd`/operators/lib GST_DEBUG=2 gst-launch-1.0 \
rtspsrc location='rtsp://….' latency=200 ! \
capsfilter name=rtspcapsfilter0 caps="application/x-rtp, media=video" ! \
decodebin force-sw-decoders=false caps="video/x-raw(ANY)" expose-all-streams=false ! \
videoconvert ! \
axinplace lib=libinplace_addstreamid.so mode=meta options="stream_id:0" ! \
axtransform lib=libtransform_colorconvert.so options=format:rgba ! \
queue max-size-buffers=4 max-size-time=0 max-size-bytes=0 ! \
axinferencenet \
model=build/yolov7-coco/yolov7-coco/1/model.json \
double_buffer=true \
dmabuf_inputs=true \
dmabuf_outputs=true \
num_children=4 \
preprocess0_lib=libtransform_resize_cl.so \
preprocess0_options="width:640;height:640;padding:114;letterbox:1;scale_up:1;to_tensor:1;mean:0.,0.,0.;std:1.,1.,1.;quant_scale:0.003921568859368563;quant_zeropoint:-128.0" \
preprocess1_lib=libtransform_padding.so \
preprocess1_options="padding:0,0,1,1,1,15,0,0;fill:0" \
preprocess1_batch=1 \
postprocess0_lib=libdecode_yolov5.so \
postprocess0_options="meta_key:detections;anchors:1.5,2.0,2.375,4.5,5.0,3.5,2.25,4.6875,4.75,3.4375,4.5,9.125,4.4375,3.4375,6.0,7.59375,14.34375,12.53125;classes:80;confidence_threshold:0.25;scales:0.003937006928026676,0.003936995752155781,0.003936977591365576;zero_points:-128,-128,-128;topk:30000;multiclass:0;sigmoid_in_postprocess:0;transpose:1;classlabels_file:ax_datasets/labels/coco.names;model_width:640;model_height:640;scale_up:1;letterbox:1" \
postprocess0_mode=read \
postprocess1_lib=libinplace_nms.so \
postprocess1_options="meta_key:detections;max_boxes:300;nms_threshold:0.45;class_agnostic:0;location:CPU" ! \
axinplace lib=libinplace_draw.so mode=write ! \
videoconvert ! \
ximagesink
But I'm losing a lot of frames and it's slow :( I've tried increasing the latency and increasing max-size-buffers beyond 4, but it's still the same. Do you have any suggestions, or is there something I could be doing wrong?
I think it's due to the inference, because the stream on its own works perfectly:
GST_PLUGIN_PATH=`pwd`/operators/lib GST_DEBUG=2 gst-launch-1.0 \
rtspsrc location='rtsp://….' latency=200 ! \
capsfilter name=rtspcapsfilter0 caps="application/x-rtp, media=video" ! \
decodebin force-sw-decoders=false caps="video/x-raw(ANY)" expose-all-streams=false ! \
videoconvert ! \
axinplace lib=libinplace_addstreamid.so mode=meta options="stream_id:0" ! \
axtransform lib=libtransform_colorconvert.so options=format:rgba ! \
queue max-size-buffers=4 max-size-time=0 max-size-bytes=0 ! \
axinplace lib=libinplace_draw.so mode=write ! \
videoconvert ! \
ximagesink
Many thanks @sara for taking the time to post your solution here! This helps us and the community users who are facing the same issue.
To answer your question:
Have you tried ximagesink with sync=false:
! videoconvert \
! ximagesink sync=false
e.g. this pipeline [1] was working for me.
Feel free to let us know more about your project, and don't hesitate to get back to us with more questions or comments.
Thanks again!
---
[1]
GST_PLUGIN_PATH=${AXELERA_FRAMEWORK}/operators/lib \
gst-launch-1.0 \
rtspsrc \
location=rtsp://127.0.0.1:9550/test \
latency=0 \
! application/x-rtp,media=video \
! decodebin \
force-sw-decoders=true \
expose-all-streams=false \
! videoconvert \
! video/x-raw,format=RGBA \
! axinplace \
lib="${AXELERA_FRAMEWORK}/operators/lib/libinplace_addstreamid.so" \
mode="meta" \
options="stream_id:0" \
! axtransform \
lib="${AXELERA_FRAMEWORK}/operators/lib/libtransform_colorconvert.so" \
options="format:rgb" \
! queue \
max-size-buffers=4 \
max-size-time=0 \
max-size-bytes=0 \
! axinferencenet \
name="inference-task0" \
model="/home/ubuntu/spap_v0/v1/software-platform/host/application/framework/build/yolov8s-coco-onnx/yolov8s-coco-onnx/1/model.json" \
devices="metis-0:1:0" \
double_buffer=false \
dmabuf_inputs=true \
dmabuf_outputs=false \
num_children=1 \
preprocess0_lib="${AXELERA_FRAMEWORK}/operators/lib/libtransform_resize_cl.so" \
preprocess0_options="width:640;height:640;padding:114;letterbox:1;scale_up:1;to_tensor:1;mean:0.,0.,0.;std:1.,1.,1.;quant_scale:0.003921568859368563;quant_zeropoint:-128.0" \
preprocess1_lib="${AXELERA_FRAMEWORK}/operators/lib/libtransform_padding.so" \
preprocess1_options="padding:0,0,1,1,1,15,0,0;fill:0" \
preprocess1_batch=1 \
postprocess0_lib="${AXELERA_FRAMEWORK}/operators/lib/libdecode_yolov8.so" \
postprocess0_options="meta_key:detections;classes:80;confidence_threshold:0.25;scales:0.07061304897069931,0.0678478255867958,0.06895139068365097,0.10098495334386826,0.15165244042873383,0.16900309920310974;padding:0,0,0,0,0,0,0,0|0,0,0,0,0,0,0,0|0,0,0,0,0,0,0,0|0,0,0,0,0,0,0,48|0,0,0,0,0,0,0,48|0,0,0,0,0,0,0,48;zero_points:-68,-58,-44,134,106,110;topk:30000;multiclass:0;classlabels_file:/tmp/tmp8vzqe8_f;model_width:640;model_height:640;scale_up:1;letterbox:1" \
postprocess0_mode="read" \
postprocess1_lib="${AXELERA_FRAMEWORK}/operators/lib/libinplace_nms.so" \
postprocess1_options="meta_key:detections;max_boxes:300;nms_threshold:0.45;class_agnostic:1;location:CPU" \
! axinplace \
lib="${AXELERA_FRAMEWORK}/operators/lib/libinplace_draw.so" \
mode="write" \
! videoconvert \
! ximagesink sync=false