Hi again @Spanner! 😊

I need to run inference on an image obtained from byte data stored in a memory-mapped space (https://docs.python.org/3/library/mmap.html); after converting the bytes I get an image.

I've seen the sources available for create_inference_stream and I know that I can run inference on images (https://github.com/axelera-ai-hub/voyager-sdk/blob/release/v1.2.5/docs/tutorials/application.md), but I have to specify the /path/to/file. In this case I won't have that information.

In the code I'll have this line where I read the bytes in the memory space:

    frame_size = image_width * image_height * 3  # RGB
    frame_bytes = mm.read(frame_size)

And then I get the new image:

    frame = np.frombuffer(frame_bytes, dtype=np.uint8).reshape((image_height, image_width, 3))

Is there any way to run inference on “frame”?

Thank you!

Ah, great question @sara! This is a really interesting use case, and I’m not sure I’ve seen that come up before! Let me quickly ask around and see what we can find out about doing this 👍


Hi @sara! Sorry for the delay, but I was just chatting with @jaydeep.de and he says you’re on the right track with reading the image from memory. For your use case (inferring from a NumPy array without a file path), take a look at this:

https://github.com/axelera-ai-hub/voyager-sdk/tree/release/v1.2.5/examples/axruntime

These examples show how to load a compiled model and perform inference directly on in-memory data using the low-level AxRuntime API. So it sounds like a good match for your setup, and you shouldn’t need to save the image to disk at all.
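For the memory-mapped side (which is independent of the runtime API), a minimal self-contained sketch of the reading step might look like this; the file name and frame dimensions are placeholders, not something from the SDK:

    import mmap
    import numpy as np

    image_width, image_height = 640, 480         # placeholder dimensions
    frame_size = image_width * image_height * 3  # RGB, one byte per channel

    # Hypothetical shared-memory file; use whatever backs your mmap today.
    with open("/dev/shm/frames", "r+b") as f:
        mm = mmap.mmap(f.fileno(), frame_size)
        frame_bytes = mm.read(frame_size)
        frame = np.frombuffer(frame_bytes, dtype=np.uint8).reshape(
            (image_height, image_width, 3))
        mm.close()

    # `frame` can now be preprocessed and handed to the runtime exactly like
    # an image decoded from disk.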

Let me know how it goes!


I'm having some doubts trying to use this code base for object detection with YOLOv8.

do you have any examples?


Not besides the examples in that folder at the moment, unfortunately. That script is set up for classification rather than detection, really - is it detection you’re looking for?

That said, it should still be useful for showing how to load a model and run inference from memory, which is what you were looking to achieve. And if you’re working with a YOLOv8 model that’s already compiled with the Voyager SDK, you could reuse the same runtime approach.

Perhaps a few additional details about your setup (host, OS, project objective and such) might also help us find the right path 🙂


Hello @Spanner! 😊 Yes, I’m using object detection.

I'm using the YOLOv8n model that's already available for deployment (https://github.com/axelera-ai-hub/voyager-sdk/blob/release/v1.2.5/ax_models/zoo/yolo/object_detection/yolov8n-coco.yaml), with 80 classes.

I've managed to load and run the model. Now I have some doubts about working with the tensors.

I know that each model has its own folder inside the build/ with important information such as “input_shape”, “output_shape”, “dequantize_params”, “output_dtypes”, etc.

I have these tensors as output and I'm having trouble interpreting them:

First:

[TensorInfo(shape=(1, 80, 80, 64), dtype=<class 'numpy.int8'>, name='output_mmio_var', padding=[(0, 0), (0, 0), (0, 0), (0, 0)], scale=0.08608702570199966, zero_point=-65),
 TensorInfo(shape=(1, 40, 40, 64), dtype=<class 'numpy.int8'>, name='output_mmio_var_1', padding=[(0, 0), (0, 0), (0, 0), (0, 0)], scale=0.07383394986391068, zero_point=-54),
 TensorInfo(shape=(1, 20, 20, 64), dtype=<class 'numpy.int8'>, name='output_mmio_var_2', padding=[(0, 0), (0, 0), (0, 0), (0, 0)], scale=0.07019813358783722, zero_point=-43)]

Second:

[TensorInfo(shape=(1, 80, 80, 128), dtype=<class 'numpy.int8'>, name='output_mmio_var_3', padding=[(0, 0), (0, 0), (0, 0), (0, 48)], scale=0.10051003098487854, zero_point=141),
 TensorInfo(shape=(1, 40, 40, 128), dtype=<class 'numpy.int8'>, name='output_mmio_var_4', padding=[(0, 0), (0, 0), (0, 0), (0, 48)], scale=0.1407421976327896, zero_point=119),
 TensorInfo(shape=(1, 20, 20, 128), dtype=<class 'numpy.int8'>, name='output_mmio_var_5', padding=[(0, 0), (0, 0), (0, 0), (0, 48)], scale=0.16894204914569855, zero_point=112)]

 

I know that 20/40/80 are different feature map resolutions. 

In the second tensor, since the padding is [(0, 0), (0, 0), (0, 0), (0, 48)] → 128 - 48 = 80, is it supposed to ignore the last 48 values and deal with the 80?
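I assume the depadding is just a slice on the last axis, something like this (the tensor name is hypothetical):

    # one output from the second set, shape (1, 80, 80, 128), int8
    valid = tensor[..., :128 - 48]  # drop the 48 padded channels -> (1, 80, 80, 80)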

If possible, I could use some help interpreting what the tensors mean: whether they are the bounding boxes, scores, classes…

Thank you 😊


@sara 
It would be great if you could share a minimal working example that takes in a YAML file or just the compiled model.json file, along with some test images (in this case these should contain some valid objects for the model to detect), and then prints out or saves the tensors you are interested in.

With this example we would know exactly what you are working on, and it would also allow us to help you more efficiently.

Many thanks!    


Hello @Habib,

I'm trying to adapt your axruntime_example.py (https://github.com/axelera-ai-hub/voyager-sdk/blob/release/v1.2.5/examples/axruntime/axruntime_example.py) for object detection with YOLOv8, instead of classification with ImageNet (the use case of axruntime_example.py).

I’m using yolov8n-coco.yaml (yolov8n-coco-onnx.zip) (https://github.com/axelera-ai-hub/voyager-sdk/blob/release/v1.2.5/ax_models/zoo/yolo/object_detection/yolov8n-coco.yaml) to try to detect cars (car.jpg).

I'm developing the script in Python (script.txt); it has commented-out lines because it's not working yet, but you can already see the shape of the tensors received (output_tensors.png).

My problem is interpreting these tensors in order to get a bounding box and a score for each detection. It seems they are not yet in the final format common in YOLO, (1, 84, 8400), but at some earlier stage. However, this is what we get from the yolov8n-coco model you provided, since it is consistent with the YAML files (output_tensorsyaml.png).

 

Thank you!


Hi @sara,

 

Thanks for sharing the MWE. 

The compiler also generates a “postprocess_graph.onnx”, which is required to get the final output in the YOLO format, i.e. (1, 84, 8400). You can find this graph in the path “yolov8n-coco-onnx/yolov8n-coco-onnx/1”.

And the outputs you get from the “Worker” instance require depadding, then de-zeroing, and scaling; for this, please see [1]. Once we have the outputs in fp32 format, we can feed them to the “postprocess_graph.onnx” [2] and get the final YOLO output.
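Roughly, in code (a sketch only: the scale/zero-point values are copied from your dumps above, and the ONNX input wiring is an assumption, so check sess.get_inputs() against your build):

    import numpy as np
    import onnxruntime as ort

    def dequantize(raw, padding, scale, zero_point):
        # Depad each axis, subtract the zero point, and scale to fp32.
        slices = tuple(slice(before, raw.shape[i] - after)
                       for i, (before, after) in enumerate(padding))
        return (raw[slices].astype(np.float32) - zero_point) * scale

    # Illustrative: one of the 128-channel outputs from your dumps.
    raw = np.zeros((1, 80, 80, 128), dtype=np.int8)  # stand-in for output_mmio_var_3
    fp32 = dequantize(raw, [(0, 0), (0, 0), (0, 0), (0, 48)],
                      scale=0.10051003098487854, zero_point=141)  # -> (1, 80, 80, 80)

    # Feed all six fp32 tensors to the postprocess graph via ONNXRuntime [2].
    sess = ort.InferenceSession(
        "yolov8n-coco-onnx/yolov8n-coco-onnx/1/postprocess_graph.onnx")
    # The tensor-to-input mapping depends on the build; inspect it first:
    # print([i.name for i in sess.get_inputs()])
    # (out,) = sess.run(None, {name: tensor, ...})  # final (1, 84, 8400)

(For context: in YOLOv8 the 64-channel heads typically carry the DFL box-regression bins and the 80 valid channels the class scores; the postprocess graph combines them into the (1, 84, 8400) output.)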

Hope this helps! Feel free to let us know if you need any further assistance.
And thanks again for reaching out!

---
[1] https://github.com/axelera-ai-hub/voyager-sdk/blob/release/v1.3/examples/axruntime/axruntime_example.py#L85
[2] For this, perhaps we can use ONNXRuntime to load the postprocessing graph and then use it together with the AxRuntime.


Hi @Habib

It worked! Thank you 😊


That’s great news! Big cheer for @sara and @Habib for getting to the bottom of that one!

Looking forward to seeing your project in action, Sara - you’ll be back to give us a demo?


It would be awesome to see this as a simple example in the repository, since many people would be interested in doing the same :)


Excellent idea 👍 In fact, would you be opposed to posting this as an idea in the Launchpad, @saadtiwana? We can add some weight to it and see about getting it written up if there’s demand for it.

