YOLO11 with custom weights: conversion and deployment

Hi,

When trying to deploy YOLO11 with custom weights, I noticed a significant drop in accuracy, which I attribute to the conversion of the model to ONNX. Can I ask what method and parameters were used to convert the Axelera yolo11s-coco-onnx model?

Thank you!


Hey there @Giodst!

That’s a bit unexpected yeah, since YOLO11’s in the Axelera model zoo. Possibly your setup is using some unsupported operators? This might help to check on that.

In terms of an export to ONNX in the context of deploying YOLO models with custom datasets using Voyager SDK, I think this is the way to go:

yolo export model=runs/detect/<your training run>/weights/best.pt format=onnx opset=11

If this doesn’t help, let me know a bit more about your host system, OS and such, and let’s dig deeper! 👍

I performed the model conversion with:

from ultralytics import YOLO

model = YOLO("best.pt")
model.export(format="onnx", opset=11, simplify=True, dynamic=False)

The accuracy of the original and converted models is:

YOLO11S
Precision: 0.8566
Recall: 0.8456
mAP@0.5: 0.8862
mAP@0.5:0.95: 0.5588

YOLO11S ONNX
Precision: 0.8562
Recall: 0.8457
mAP@0.5: 0.8862
mAP@0.5:0.95: 0.5587

The accuracy of the model deployed on the Metis M.2 system with the Aetina RK3588 is:

mAP_box: 30.69%
mAP50_box: 42.47%
precision_box: 56.47%
recall_box: 28.54%

Deployed with the following yaml file:

axelera-model-format: 1.0.0

name: yolo11s-KFM

description: yolo11s-KFM

pipeline:
  - detections:
      model_name: yolo11s-KFM
      input:
        type: image
      preprocess:
        - letterbox:
            width: ${{input_width}}
            height: ${{input_height}}
            scaleup: True
        - torch-totensor:
      postprocess:
        - decodeyolo:
            max_nms_boxes: 30000
            conf_threshold: 0.35
            nms_iou_threshold: 0.01
            nms_class_agnostic: False
            nms_top_k: 300
            eval:
              conf_threshold: 0.35
              nms_iou_threshold: 0.01
              use_multi_label: True
              nms_class_agnostic: False
            box_format: xywh
            normalized_coord: False
            label_filter: ${{label_filter}}
            use_multi_label: False

models:
  yolo11s-KFM:
    class: AxONNXModel
    class_path: $AXELERA_FRAMEWORK/ax_models/base_onnx.py
    weight_path: $AXELERA_FRAMEWORK/customers/mymodels/best.onnx
    task_category: ObjectDetection
    input_tensor_layout: NCHW
    input_tensor_shape: [1, 3, 640, 640]
    input_color_format: RGB
    num_classes: 23
    dataset: AI-KFM
    # extra_kwargs:
    #   compilation_config:
    #     quantization_scheme: per_tensor_min_max
    #     ignore_weight_buffers: False

datasets:
  AI-KFM:
    class: ObjDataAdapter
    class_path: $AXELERA_FRAMEWORK/ax_datasets/objdataadapter.py
    data_dir_name: AI-KFM
    label_type: YOLOv8
    labels: data.yaml
    cal_data: test.txt
    val_data: test.txt

operators:
  decodeyolo:
    class: DecodeYolo
    class_path: $AXELERA_FRAMEWORK/ax_models/decoders/yolo.py
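
For reference, the letterbox step in the pipeline above resizes the image to fit the network input while keeping its aspect ratio, then pads the remainder. Below is a minimal sketch of that arithmetic, assuming the common Ultralytics-style convention (centred padding, enlarging allowed as with scaleup: True). The function names are made up for illustration, and the Voyager SDK operator may round or place the padding differently. If host evaluation and the deployed pipeline disagree on this step, boxes can shift enough to affect mAP.

```python
def letterbox_params(src_w, src_h, dst_w=640, dst_h=640, scaleup=True):
    """Return (scale, new_w, new_h, pad_left, pad_top) for an aspect-preserving resize."""
    scale = min(dst_w / src_w, dst_h / src_h)
    if not scaleup:
        scale = min(scale, 1.0)  # only shrink, never enlarge
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return scale, new_w, new_h, pad_left, pad_top


def point_to_source(x, y, scale, pad_left, pad_top):
    """Map a point from letterboxed coordinates back to the original image."""
    return (x - pad_left) / scale, (y - pad_top) / scale


# A 1920x1080 frame going into a 640x640 network input.
scale, new_w, new_h, pad_left, pad_top = letterbox_params(1920, 1080)
print(new_w, new_h, pad_left, pad_top)
print(point_to_source(320, 320, scale, pad_left, pad_top))
```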

Hi @Giodst!

I’m not the best at this, but the YAML looks pretty good. I had a read through the operators file on GitHub, and in there the defaults for things like nms_iou_threshold seem to be around 0.5, rather than the stricter settings you’re using.

Possibly right now it’s just filtering out a lot of correct detections which we’re then interpreting as lower performance. Worth a shot, maybe?
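
To make the effect of nms_iou_threshold concrete, here is a minimal sketch of greedy NMS in plain Python. It is not the SDK's decoder, just the textbook procedure. With a threshold of 0.01, almost any overlap between two boxes causes the lower-scoring one to be dropped, which matters when objects sit close together.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: visit boxes by descending score, keep one only if it
    overlaps every already-kept box by no more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two neighbouring objects whose boxes overlap slightly (IoU is about 0.05).
boxes = [(0, 0, 100, 100), (90, 0, 190, 100)]
scores = [0.9, 0.8]
strict = nms(boxes, scores, 0.01)
typical = nms(boxes, scores, 0.5)
print(strict, typical)
```

In this toy case the strict setting keeps only the first box, while 0.5 keeps both. Whether that is what hurts accuracy here depends on how densely packed the objects in the dataset are.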

I’ve already tried that and the result is the same. I also tested the model on my PC with the values from the YAML and got good accuracy, comparable to the figures above. The problem is specific to inference on the AIPU: with --pipe torch I get the good results I already mentioned.

Ah, okay. Just to double-check, when you say you’re using YOLO11 with custom weights, is that a model you trained yourself (like from a custom dataset)? As opposed to the YOLO11s model in the Axelera Model Zoo.

All I’m wondering is that, if it’s the former, maybe trying out the model from the Model Zoo might be worth a test as it’s already optimised for Metis M.2 deployment. Just to see if that performs as expected, and to narrow down the potential bottleneck, like.

Hi @Giodst

Generally, during model evaluation `conf_threshold` is set to 0.001 and `nms_iou_threshold` to 0.6 or 0.7. The settings in your YAML are different. You mentioned you tried other values, but it is not clear whether you tried `conf_threshold` set to 0.001.
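
A rough illustration of why the evaluation threshold matters: average precision integrates precision over recall, so discarding low-confidence detections cuts the curve short. The sketch below uses a simple non-interpolated AP (not the 101-point COCO variant) and made-up detections.

```python
def average_precision(detections, num_gt, conf_threshold=0.0):
    """Non-interpolated AP for one class.

    detections: list of (score, is_true_positive) pairs.
    num_gt: number of ground-truth objects.
    """
    kept = sorted((d for d in detections if d[0] >= conf_threshold), reverse=True)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    for _score, is_tp in kept:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recall = tp / num_gt
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision  # non-zero only at true positives
        prev_recall = recall
    return ap


# Hypothetical ranked detections for 4 ground-truth objects.
dets = [(0.9, True), (0.8, True), (0.6, False), (0.3, True), (0.2, True), (0.05, False)]
ap_low = average_precision(dets, num_gt=4, conf_threshold=0.001)
ap_high = average_precision(dets, num_gt=4, conf_threshold=0.35)
print(ap_low, ap_high)
```

Here the two correct detections scoring below 0.35 are never counted at the higher threshold, so recall stops at 50% and AP falls accordingly.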

Also, it is not clear what script you are using to evaluate the model, but in our experience, when OpenCV is used for reading images, there can be an RGB vs BGR discrepancy.
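
As a small illustration of the channel-order point: OpenCV's image readers return pixels in BGR order, while the YAML above declares input_color_format: RGB. The pure-Python sketch below only shows what the swap does to pixel values. With real OpenCV arrays the usual call is cv2.cvtColor(img, cv2.COLOR_BGR2RGB).

```python
def bgr_to_rgb(image):
    """Reverse the channel order of an image stored as rows of (b, g, r) tuples."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1x2 "image" in BGR order: one pure-blue pixel, one pure-red pixel.
bgr = [[(255, 0, 0), (0, 0, 255)]]
rgb = bgr_to_rgb(bgr)
print(rgb)  # [[(0, 0, 255), (255, 0, 0)]]
```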

At DeGirum, we have independently ported yolov8 and yolo11 models and are generally able to get mAP within 1% of original float model values. Please let us know if we can help in any way.

Hi,

Accuracy improves but is still not comparable to what I get with inference on the host. Also, after deleting the old builds and redeploying, I now get these errors:

INFO : Compile model: yolo11s-KFM
WARNING : Configuration of node '/model.10/m/m.0/attn/Reshape' may not be supported. 'Reshape' parameters:
WARNING :    data: Metadata(shape=(1, 512, 20, 20), is_constant=False)
WARNING :    shape: Metadata(shape=(4,), is_constant=True)
WARNING :    reshaped: Metadata(shape=(1, 4, 128, 400), is_constant=False)
WARNING : 'Reshape' parameters do not match supported configuration: np.array_equal(shape, data.shape)
WARNING : 'Reshape' parameters do not match supported configuration: len(data.shape) >= 2 and len(shape) == 2 and shape[0] == data.shape[0] and (shape[1] == data.shape[1] or shape[1] == -1)
WARNING : Configuration of node '/model.10/m/m.0/attn/Transpose' may not be supported. 'Transpose' parameters:
WARNING :    perm: [0, 1, 3, 2]
WARNING :    data: Metadata(shape=(1, 4, 32, 400), is_constant=False)
WARNING :    transposed: Metadata(shape=(1, 4, 400, 32), is_constant=False)
WARNING : Unsatisfied constraint: perm == [0, 1, 2, 3]
WARNING : Configuration of node '/model.10/m/m.0/attn/Reshape_2' may not be supported. 'Reshape' parameters:
WARNING :    data: Metadata(shape=(1, 4, 64, 400), is_constant=False)
WARNING :    shape: Metadata(shape=(4,), is_constant=True)
WARNING :    reshaped: Metadata(shape=(1, 256, 20, 20), is_constant=False)
WARNING : 'Reshape' parameters do not match supported configuration: np.array_equal(shape, data.shape)
WARNING : 'Reshape' parameters do not match supported configuration: len(data.shape) >= 2 and len(shape) == 2 and shape[0] == data.shape[0] and (shape[1] == data.shape[1] or shape[1] == -1)
WARNING : Node '/model.10/m/m.0/attn/MatMul' implements operator 'MatMul', which may not be supported.
WARNING : Node '/model.10/m/m.0/attn/Softmax' implements operator 'Softmax', which may not be supported.
WARNING : Configuration of node '/model.10/m/m.0/attn/Transpose_1' may not be supported. 'Transpose' parameters:
WARNING :    perm: [0, 1, 3, 2]
WARNING :    data: Metadata(shape=(1, 4, 400, 400), is_constant=False)
WARNING :    transposed: Metadata(shape=(1, 4, 400, 400), is_constant=False)
WARNING : Unsatisfied constraint: perm == [0, 1, 2, 3]
WARNING : Node '/model.10/m/m.0/attn/MatMul_1' implements operator 'MatMul', which may not be supported.
WARNING : Configuration of node '/model.10/m/m.0/attn/Reshape_1' may not be supported. 'Reshape' parameters:
WARNING :    data: Metadata(shape=(1, 4, 64, 400), is_constant=False)
WARNING :    shape: Metadata(shape=(4,), is_constant=True)
WARNING :    reshaped: Metadata(shape=(1, 256, 20, 20), is_constant=False)
WARNING : 'Reshape' parameters do not match supported configuration: np.array_equal(shape, data.shape)
WARNING : 'Reshape' parameters do not match supported configuration: len(data.shape) >= 2 and len(shape) == 2 and shape[0] == data.shape[0] and (shape[1] == data.shape[1] or shape[1] == -1)
WARNING : The operator compatibility warnings above suggest this model may not be supported by the Axelera Compiler.
Calibrating... ✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨ | 100% | 2.24s/it | 200it |
ERROR : Traceback (most recent call last):
ERROR : File "/home/aetina/Desktop/voyager-sdk/axelera/app/compile.py", line 456, in compile
ERROR : the_manifest = top_level.compile(model, compilation_cfg, output_path)
ERROR : File "<frozen compiler.utils.error_report>", line 65, in wrapper
ERROR : File "<frozen compiler.top_level>", line 831, in compile
ERROR : File "<frozen compiler.utils.error_report>", line 65, in wrapper
ERROR : File "<frozen compiler.top_level>", line 581, in quantize
ERROR : File "<frozen qtools_tvm_interface.graph_exporter_v2.graph_exporter>", line 120, in __init__
ERROR : RuntimeError: External op model_dot_10_m_m_dot_0_attn_mul_1_to_model_dot_10_m_m_dot_0_attn_softmax_qre_to_model_dot_10_m_m_dot_0_attn_softmax_dem_to_model_dot_10_m_m_dot_0_attn_softmax_dre found in the model (<class 'qtoolsv2.rewriter.operators.fqelements.Dequantize'> op). QTools may have issues quantizing this model.
ERROR : External op model_dot_10_m_m_dot_0_attn_mul_1_to_model_dot_10_m_m_dot_0_attn_softmax_qre_to_model_dot_10_m_m_dot_0_attn_softmax_dem_to_model_dot_10_m_m_dot_0_attn_softmax_dre found in the model (<class 'qtoolsv2.rewriter.operators.fqelements.Dequantize'> op). QTools may have issues quantizing this model.
INFO : Quantizing yolo11s-KFM: yolo11s-KFM took 492.695 seconds
ERROR : Failed to deploy network
ERROR : Failed to prequantize yolo11s-KFM: yolo11s-KFM
INFO : Compiling yolo11s-KFM took 507.450 seconds

Hi @Giodst

Here is how we can help while we are working on ways to make a cloud compiler available for everyone to use. If you are willing to provide the PyTorch checkpoint and a few calibration images, we can compile the model, test it, and share the resulting assets with you. Additionally, if you provide validation data, we can run our own eval script and share the mAP numbers.

As for the errors, they look like they come from unsupported operations, at least from a quick look through the opset support doc (double-check MatMul, Softmax and Transpose). The Reshape patterns in your model also don’t match the supported configurations.
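
One way to check an exported file before deploying is to count how many nodes use the operator types flagged in the log above. This is a rough sketch: the helper names are made up, and counting by type over-reports, because Reshape and Transpose are only rejected in particular configurations. Loading the file needs the onnx package.

```python
from collections import Counter
from types import SimpleNamespace

# Operator types that appear in the compiler warnings quoted in this thread.
FLAGGED_OPS = {"Reshape", "Transpose", "MatMul", "Softmax"}


def count_flagged_ops(nodes, flagged=FLAGGED_OPS):
    """Count graph nodes whose op_type is in the flagged set."""
    return Counter(n.op_type for n in nodes if n.op_type in flagged)


def load_onnx_nodes(path):
    """Return the graph nodes of an ONNX file (requires the `onnx` package)."""
    import onnx  # imported lazily so the counting helper works without it

    return onnx.load(path).graph.node


# Stand-in nodes so the counting logic can be tried without a model file.
demo_nodes = [SimpleNamespace(op_type=t) for t in ("Conv", "MatMul", "Softmax", "MatMul")]
print(count_flagged_ops(demo_nodes))
```

With a real export, something like count_flagged_ops(load_onnx_nodes("best.onnx")) gives the same summary for the actual graph.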

Hi,
I solved the error by exporting the model with the parameters {'batch': 1, 'half': False, 'dynamic': True, 'simplify': True, 'opset': 17}, which I found by downloading yolo11s-coco-onnx from https://media.axelera.ai/artifacts/model_cards/weights/yolo/object_detection/yolo11s.onnx and inspecting it with Netron.
But the accuracy problem remains. Here are the results with conf_threshold: 0.005 and nms_iou_threshold: 0.5:

--pipe torch
INFO : ==========================
INFO : | mAP_box | 59.71% |
INFO : | mAP50_box | 90.70% |
INFO : | precision_box | 84.69% |
INFO : | recall_box | 82.47% |
INFO : ==========================

Thanks for your help and availability.
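
For anyone wanting to reproduce the export settings reported above, a minimal sketch with the Ultralytics Python API might look like this. The format="onnx" entry and the best.pt default are assumptions added here. The other values are the ones from the post.

```python
# Export settings as reported in the post above; "format" is added here.
EXPORT_KWARGS = {
    "format": "onnx",
    "batch": 1,
    "half": False,
    "dynamic": True,
    "simplify": True,
    "opset": 17,
}


def export_onnx(weights_path="best.pt", **overrides):
    """Export Ultralytics weights to ONNX and return the exported file path."""
    from ultralytics import YOLO  # lazy import: only needed when exporting

    model = YOLO(weights_path)
    return model.export(**{**EXPORT_KWARGS, **overrides})
```

Calling export_onnx("runs/detect/<your training run>/weights/best.pt") runs the export. Any keyword passed in overrides the defaults, for example opset=11 to compare against the earlier attempt.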

Ah, excellent work! Good approach too, going in through Netron 👍

So the model’s running smoothly now. Maybe this goes back to something quantisation related? Calibration quality, perhaps? One of the team sent me a link to this example that shows how inference is handled across AIPU and CPU, and could help us spot any potential mismatches?

https://github.com/axelera-ai-hub/voyager-sdk/blob/release/v1.3/examples/application_tensor.py
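
Once raw output tensors are available from both paths, a simple numeric comparison can show whether the mismatch is already present before decoding and NMS. The sketch below is generic Python, not part of the linked example. The numbers are made up, and it assumes both outputs have been dequantized and flattened into plain lists of floats.

```python
import math


def compare_outputs(reference, candidate):
    """Return (max_abs_diff, cosine_similarity) for two equal-length float lists."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")
    max_abs_diff = max(abs(r - c) for r, c in zip(reference, candidate))
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm = math.sqrt(sum(r * r for r in reference)) * math.sqrt(sum(c * c for c in candidate))
    cosine = dot / norm if norm else float("nan")
    return max_abs_diff, cosine


# Made-up values standing in for one flattened output tensor from each path.
float_out = [0.10, 0.80, 0.05, 0.40]
aipu_out = [0.12, 0.78, 0.05, 0.41]
diff, cos = compare_outputs(float_out, aipu_out)
print(diff, cos)
```

A cosine similarity well below 1 on the raw tensors would point at quantization or calibration. Near-identical tensors would point at pre- or post-processing instead.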

Hi @Spanner

Environment

Host: Intel i7‑12700K, Ubuntu 22.04.5
Hardware: Metis PCIe (rev 02), Voyager SDK v1.3.x
Ultralytics: latest YOLO11 + YOLOv8
Goal: deploy custom Roboflow‑trained bottle‑cap detector on Metis

What works

YOLOv8n + custom weights compiles and runs end‑to‑end:
```
./deploy.py customers/eggdefect/yolo8n-eggdefect.yaml
./inference.py yolo8n-eggdefect media/egg.mp4 --enable-hardware-codec --verbose
```

Deployed successfully; E2E ~70+ fps on sample video (single stream).

What fails

YOLO11n + custom weights (PyTorch → ONNX) fails during quantization:

./deploy.py customers/bottlecap/yolo11n-bottlecap-onnx.yaml

Key warnings/errors (abridged):
WARNING: .../attn/Reshape|Transpose|MatMul|Softmax may not be supported
ERROR : External op ... Dequantize ... QTools may have issues quantizing this model.

So the compiler flags attention subgraphs (MatMul/Softmax/Transpose with non‑[0,1,2,3] perms).
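
To see which part of the network the warnings point at, the node names can be pulled out of the deploy log. This is a minimal sketch: the regular expression is written against the warning wording in this thread and may need adjusting for other SDK versions.

```python
import re

# Matches both "Configuration of node '<name>'" and "Node '<name>'" warnings.
NODE_RE = re.compile(r"(?:Configuration of node|Node) '([^']+)'")


def flagged_nodes(log_text):
    """Return the unique node names named in WARNING lines, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        if "WARNING" not in line:
            continue
        match = NODE_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample = """\
WARNING : Configuration of node '/model.10/m/m.0/attn/Reshape' may not be supported. 'Reshape' parameters:
WARNING : Node '/model.10/m/m.0/attn/MatMul' implements operator 'MatMul', which may not be supported.
WARNING : Node '/model.10/m/m.0/attn/Softmax' implements operator 'Softmax', which may not be supported.
"""
nodes = flagged_nodes(sample)
layers = {name.split("/")[1] for name in nodes}
print(nodes)
print(layers)
```

For these sample lines every flagged node sits under model.10, in its attn submodule. When deploying from a .pt file the names carry an extra /_internal_torch_model prefix, so the layer would be at index 2 instead of 1.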

Questions

Is custom YOLO11 currently supported in Voyager v1.3.x? Community posts suggest YOLOv11 is not yet supported, and that attention isn’t supported on Metis. Can you confirm the current support status?
If partial support exists, is there a recommended ONNX export recipe (opset, simplify, static shapes, disabling attention, etc.) to produce a graph that compiles on Metis?
Is the yolo11s‑coco‑onnx model in your zoo a special/modified export? If yes, what’s the conversion path we should follow to retrain that compatible variant with our custom data? (Pointer to docs/scripts appreciated.)
(Minor) I saw warnings on an unrelated YAML about axelera-model-format: 1.0.0. Am I right that I just need to add that header to custom YAMLs to meet the v1.3 schema?
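
On the last question, the deploy.py warning quoted later in this thread lists the top-level sections it expects, so the gap can be checked with a set difference. This sketch only tests for presence. Whether an extra section such as model-env also has to be removed or migrated is not something the warning settles.

```python
# Section names taken from the deploy.py warning quoted in this thread.
EXPECTED_SECTIONS = {
    "axelera-model-format",
    "description",
    "models",
    "pipeline",
    "datasets",
    "name",
}


def missing_sections(top_level_keys):
    """Return the expected top-level sections that a model YAML lacks."""
    return EXPECTED_SECTIONS - set(top_level_keys)


# Keys the warning reported for the YAML it rejected.
found = {"model-env", "description", "models", "pipeline", "datasets", "name"}
print(missing_sections(found))  # {'axelera-model-format'}
```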

Thanks in advance — happy to test any suggested export flags or a patched decoder.

@Spanner
I posted a reply with my findings and questions, but your spam filter caught it. Could you approve it?

Hi @Spanner,

I recently tried deploying a YOLO11 model with custom weights in PyTorch format, which I built with Roboflow. I followed the tutorial and I am running into this issue:

(venv) wgtech@wgtech-server:~/axelera/voyager-sdk$ ./deploy.py /home/wgtech/axelera/voyager-sdk/customers/bottlecap/yolo11n-bottlecap-onnx.yaml
WARNING: /home/wgtech/axelera/voyager-sdk/customers/mymodels/yolov8n-licenseplate-onnx.yaml: Expected axelera-model-format: 1.0.0 but found None
Unusable model: '/home/wgtech/axelera/voyager-sdk/customers/mymodels/yolov8n-licenseplate-onnx.yaml' since it does not contain the expected sections {'axelera-model-format', 'description', 'models', 'pipeline', 'datasets', 'name'}
(it has {'model-env', 'description', 'models', 'pipeline', 'datasets', 'name'})
INFO    : Detected Metis type as pcie
INFO    : Compiling network yolo11n-bottlecap-onnx /home/wgtech/axelera/voyager-sdk/customers/bottlecap/yolo11n-bottlecap-onnx.yaml
INFO    : Compile model: yolo11n-bottlecap-onnx
Creating new label cache: /home/wgtech/axelera/voyager-sdk/data/bottlecap/labels/train_bottlecap_objdet.cache
Creating label cache:  82%|█████████████████████████████████████████████████████████████████████▍               | 397/486 [00:00<00:00, 834.64label/s]Warning: /home/wgtech/axelera/voyager-sdk/data/bottlecap/labels/frame_8930.txt: 0 duplicate labels removed
Labels found: 486, corrupt images: 0                                                                                                                  
Background images: 0, missing label files: 0, empty label files: 0
INFO    : Prequantizing yolo11n-bottlecap-onnx: yolo11n-bottlecap-onnx
WARNING: /home/wgtech/axelera/voyager-sdk/customers/mymodels/yolov8n-licenseplate-onnx.yaml: Expected axelera-model-format: 1.0.0 but found None
Unusable model: '/home/wgtech/axelera/voyager-sdk/customers/mymodels/yolov8n-licenseplate-onnx.yaml' since it does not contain the expected sections {'pipeline', 'name', 'models', 'axelera-model-format', 'description', 'datasets'}
(it has {'pipeline', 'name', 'models', 'model-env', 'description', 'datasets'})
INFO    : Quantizing network yolo11n-bottlecap-onnx /home/wgtech/axelera/voyager-sdk/customers/bottlecap/yolo11n-bottlecap-onnx.yaml yolo11n-bottlecap-onnx
INFO    : Compile model: yolo11n-bottlecap-onnx
/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/ultralytics/nn/modules/head.py:163: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if self.format != "imx" and (self.dynamic or self.shape != shape):
/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/ultralytics/utils/tal.py:372: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  for i, stride in enumerate(strides):
WARNING : Configuration of node '/_internal_torch_model/model.10/m/m.0/attn/Reshape' may not be supported. 'Reshape' parameters:
WARNING :     data: Metadata(shape=(1, 256, 20, 20), is_constant=False)
WARNING :     shape: Metadata(shape=(4,), is_constant=True)
WARNING :     reshaped: Metadata(shape=(1, 2, 128, 400), is_constant=False)
WARNING : 'Reshape' parameters do not match supported configuration: np.array_equal(shape, data.shape)
WARNING : 'Reshape' parameters do not match supported configuration: len(data.shape) >= 2 and len(shape) == 2 and shape[0] == data.shape[0] and (shape[1] == data.shape[1] or shape[1] == -1)
WARNING : Configuration of node '/_internal_torch_model/model.10/m/m.0/attn/Transpose' may not be supported. 'Transpose' parameters:
WARNING :     perm: [0, 1, 3, 2]
WARNING :     data: Metadata(shape=(1, 2, 32, 400), is_constant=False)
WARNING :     transposed: Metadata(shape=(1, 2, 400, 32), is_constant=False)
WARNING : Unsatisfied constraint: perm == [0, 1, 2, 3]
WARNING : Node '/_internal_torch_model/model.10/m/m.0/attn/MatMul' implements operator 'MatMul', which may not be supported.
WARNING : Node '/_internal_torch_model/model.10/m/m.0/attn/Softmax' implements operator 'Softmax', which may not be supported.
WARNING : Configuration of node '/_internal_torch_model/model.10/m/m.0/attn/Transpose_1' may not be supported. 'Transpose' parameters:
WARNING :     perm: [0, 1, 3, 2]
WARNING :     data: Metadata(shape=(1, 2, 400, 400), is_constant=False)
WARNING :     transposed: Metadata(shape=(1, 2, 400, 400), is_constant=False)
WARNING : Unsatisfied constraint: perm == [0, 1, 2, 3]
WARNING : Node '/_internal_torch_model/model.10/m/m.0/attn/MatMul_1' implements operator 'MatMul', which may not be supported.
WARNING : Configuration of node '/_internal_torch_model/model.10/m/m.0/attn/Reshape_1' may not be supported. 'Reshape' parameters:
WARNING :     data: Metadata(shape=(1, 2, 64, 400), is_constant=False)
WARNING :     shape: Metadata(shape=(4,), is_constant=True)
WARNING :     reshaped: Metadata(shape=(1, 128, 20, 20), is_constant=False)
WARNING : 'Reshape' parameters do not match supported configuration: np.array_equal(shape, data.shape)
WARNING : 'Reshape' parameters do not match supported configuration: len(data.shape) >= 2 and len(shape) == 2 and shape[0] == data.shape[0] and (shape[1] == data.shape[1] or shape[1] == -1)
WARNING : Configuration of node '/_internal_torch_model/model.10/m/m.0/attn/Reshape_2' may not be supported. 'Reshape' parameters:
WARNING :     data: Metadata(shape=(1, 2, 64, 400), is_constant=False)
WARNING :     shape: Metadata(shape=(4,), is_constant=True)
WARNING :     reshaped: Metadata(shape=(1, 128, 20, 20), is_constant=False)
WARNING : 'Reshape' parameters do not match supported configuration: np.array_equal(shape, data.shape)
WARNING : 'Reshape' parameters do not match supported configuration: len(data.shape) >= 2 and len(shape) == 2 and shape[0] == data.shape[0] and (shape[1] == data.shape[1] or shape[1] == -1)
WARNING : The operator compatibility warnings above suggest this model may not be supported by the Axelera Compiler.
Calibrating... ✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨ | 100% |  5.70it/s | 200it |
ERROR   : Traceback (most recent call last):
ERROR   :   File "/home/wgtech/axelera/voyager-sdk/axelera/app/compile.py", line 456, in compile
ERROR   :     the_manifest = top_level.compile(model, compilation_cfg, output_path)
ERROR   :   File "/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/axelera/compiler/utils/error_report.py", line 65, in wrapper
ERROR   :     return func(*args, **kwargs)
ERROR   :   File "/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/axelera/compiler/top_level.py", line 831, in compile
ERROR   :     quantized_model = quantize(model, config)
ERROR   :   File "/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/axelera/compiler/utils/error_report.py", line 65, in wrapper
ERROR   :     return func(*args, **kwargs)
ERROR   :   File "/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/axelera/compiler/top_level.py", line 581, in quantize
ERROR   :     exporter = qti.GraphExporterV2(qtools_quantized_mod, input_shape=core_graph_input_shape)
ERROR   :   File "<frozen qtools_tvm_interface.graph_exporter_v2.graph_exporter>", line 120, in __init__
ERROR   : RuntimeError: External op _internal_torch_model_model_dot_10_m_m_dot_0_attn_mul_to__internal_torch_model_model_dot_10_m_m_dot_0_attn_softmax_qre_to__internal_torch_model_model_dot_10_m_m_dot_0_attn_softmax_dem_to__internal_torch_model_model_dot_10_m_m_dot_0_attn_softmax_dre found in the model (<class 'qtoolsv2.rewriter.operators.fqelements.Dequantize'> op). QTools may have issues quantizing this model.
ERROR   : External op _internal_torch_model_model_dot_10_m_m_dot_0_attn_mul_to__internal_torch_model_model_dot_10_m_m_dot_0_attn_softmax_qre_to__internal_torch_model_model_dot_10_m_m_dot_0_attn_softmax_dem_to__internal_torch_model_model_dot_10_m_m_dot_0_attn_softmax_dre found in the model (<class 'qtoolsv2.rewriter.operators.fqelements.Dequantize'> op). QTools may have issues quantizing this model.
INFO    : Quantizing yolo11n-bottlecap-onnx: yolo11n-bottlecap-onnx took 45.845 seconds
ERROR   : Failed to deploy network
ERROR   : Failed to prequantize yolo11n-bottlecap-onnx: yolo11n-bottlecap-onnx
INFO    : Compiling yolo11n-bottlecap-onnx took 51.242 seconds

However, when I tried a YOLOv8n base model with custom weights in the pipeline, this is the result:

(venv) wgtech@wgtech-server:~/axelera/voyager-sdk$ ./deploy.py /home/wgtech/axelera/voyager-sdk/customers/eggdefect/yolo8n-eggdefect.yaml
INFO    : Detected Metis type as pcie
INFO    : Compiling network yolo8n-eggdefect /home/wgtech/axelera/voyager-sdk/customers/eggdefect/yolo8n-eggdefect.yaml
INFO    : Compile model: yolo8n-eggdefect
LowerTIR failed to fit buffers into memory after iteration 0/4.
  Pool usage: {L1: alloc:4,310,528B avail:4,194,304B over:116,224B util:102.77%, L2: alloc:7,226,880B avail:8,077,312B over:0B util:89.47%, DDR: alloc:20,843,648B avail:260,046,848B over:0B util:8.02%}
  Overflowing buffer IDs: {99, 101, 102, 1646, 1565, 1567}
WARNING : LowerTIR failed to fit buffers into memory after iteration 0/4.
WARNING :   Pool usage: {L1: alloc:4,310,528B avail:4,194,304B over:116,224B util:102.77%, L2: alloc:7,226,880B avail:8,077,312B over:0B util:89.47%, DDR: alloc:20,843,648B avail:260,046,848B over:0B util:8.02%}
WARNING :   Overflowing buffer IDs: {99, 101, 102, 1646, 1565, 1567}
|████████████████████████████████████████| 2:05.3 
INFO    : Compile yolo8n-eggdefect.yaml:pipeline
INFO    : Compiling yolo8n-eggdefect took 128.722 seconds
INFO    : Successfully deployed network
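
The pool figures in the LowerTIR warning above can be checked by hand: utilisation is allocated bytes over available bytes, and the overflow is the difference when that is positive. A short sketch with the numbers copied from the log:

```python
# (allocated, available) in bytes, copied from the LowerTIR warning above.
pools = {
    "L1": (4_310_528, 4_194_304),
    "L2": (7_226_880, 8_077_312),
    "DDR": (20_843_648, 260_046_848),
}


def pool_report(alloc, avail):
    """Return (overflow_bytes, utilisation_percent) for one memory pool."""
    return max(0, alloc - avail), round(100 * alloc / avail, 2)


for name, (alloc, avail) in pools.items():
    print(name, pool_report(alloc, avail))
```

Only L1 is over budget, by 116,224 bytes at iteration 0/4. Since the log then reports a successful deployment, the compiler apparently found a fitting allocation on a later pass.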

Inference also works:

(venv) wgtech@wgtech-server:~/axelera/voyager-sdk$ ./inference.py yolo8n-eggdefect media/egg.mp4   --save-output /home/wgtech/Downloads/egg_defect_annotated.mp4   --enable-hardware-codec   --verbose
DEBUG   :axelera.app.device_manager: Using device metis-0:6:0
DEBUG   :axelera.app.network: Create network from /home/wgtech/axelera/voyager-sdk/customers/eggdefect/yolo8n-eggdefect.yaml
DEBUG   :axelera.app.network: Register custom operator 'decodeyolo' with class DecodeYolo from /home/wgtech/axelera/voyager-sdk/ax_models/decoders/yolo.py
DEBUG   :axelera.app.network: ~<any-user>/.cache/axelera/ not found in path /home/wgtech/axelera/voyager-sdk/customers/eggdefect/eggdefect.pt
DEBUG   :axelera.app.network: Deploying for 1 cores instead of 4 due to max_compiler_cores setting (for metis: pcie)
DEBUG   :axelera.app.device_manager: Reconfiguring devices with device_firmware=1, mvm_utilisation_core_0=100%, clock_profile_core_0=800MHz, mvm_utilisation_core_1=100%, clock_profile_core_1=800MHz, mvm_utilisation_core_2=100%, clock_profile_core_2=800MHz, mvm_utilisation_core_3=100%, clock_profile_core_3=800MHz
DEBUG   :axelera.app.utils: $ vainfo
DEBUG   :axelera.app.utils: Found OpenCL GPU devices for platform Intel(R) OpenCL HD Graphics: Intel(R) UHD Graphics 770 [0x4680]
DEBUG   :axelera.app.pipe.manager: 
DEBUG   :axelera.app.pipe.manager: --- EXECUTION VIEW ---
DEBUG   :axelera.app.pipe.manager: Input
DEBUG   :axelera.app.pipe.manager:   └─yolo8n-eggdefect
DEBUG   :axelera.app.pipe.manager: 
DEBUG   :axelera.app.pipe.manager: --- RESULT VIEW ---
DEBUG   :axelera.app.pipe.manager: Input
DEBUG   :axelera.app.pipe.manager:   └─yolo8n-eggdefect
DEBUG   :axelera.app.pipe.manager: Network type: NetworkType.SINGLE_MODEL
DEBUG   :yolo: Model Type: YoloFamily.YOLOv8 (YOLOv8 pattern:
DEBUG   :yolo: - 6 output tensors (anchor-free)
DEBUG   :yolo: - 3 regression branches (64 channels)
DEBUG   :yolo: - 3 classification branches (8 channels)
DEBUG   :yolo: - Channel pattern: [64, 64, 64, 8, 8, 8]
DEBUG   :yolo: - Shapes: [[1, 80, 80, 64], [1, 40, 40, 64], [1, 20, 20, 64], [1, 80, 80, 8], [1, 40, 40, 8], [1, 20, 20, 8]])
DEBUG   :axelera.app.pipe.io: FPS of /home/wgtech/.cache/axelera/media/egg.mp4: 25
DEBUG   :axelera.app.operators.inference: Enabled 4x1 inference queues for yolo8n-eggdefect because model_cores=1 and num_cores=4
DEBUG   :axelera.app.operators.inference: Using inferencenet name=inference-task0 model=/home/wgtech/axelera/voyager-sdk/build/yolo8n-eggdefect/yolo8n-eggdefect/1/model.json devices=metis-0:6:0 double_buffer=True dmabuf_inputs=True dmabuf_outputs=True num_children=4
DEBUG   :axelera.app.pipe.gst: GST representation written to build/yolo8n-eggdefect/logs/gst_pipeline.yaml
DEBUG   :axelera.app.pipe.gst: Started building gst pipeline
DEBUG   :axelera.app.pipe.gst: Received first frame from gstreamer
DEBUG   :axelera.app.pipe.gst: Finished building gst pipeline - build time = 0.636
DEBUG   :axelera.app.display: System memory: 3371.52 MB axelera: 722.41 MB, vms = 7828.27 MB    display queue size: 1                                 
DEBUG   :axelera.app.display: System memory: 3459.28 MB axelera: 757.01 MB, vms = 7942.93 MB    display queue size: 3                                 
DEBUG   :axelera.app.display: System memory: 3440.97 MB axelera: 757.64 MB, vms = 7942.91 MB    display queue size: 2                                 
DEBUG   :axelera.app.display: System memory: 3479.00 MB axelera: 756.77 MB, vms = 7942.96 MB    display queue size: 1                                 
DEBUG   :axelera.app.display: System memory: 3469.56 MB axelera: 756.92 MB, vms = 7942.97 MB    display queue size: 5                                 
DEBUG   :axelera.app.display: System memory: 3457.73 MB axelera: 756.71 MB, vms = 7941.03 MB    display queue size: 2                                 
DEBUG   :axelera.app.pipe.gst_helper: End of stream                                                                                                   
INFO    : Core Temp : 33.0°C                                                                                                                          
INFO    : CPU % : 7.5%
INFO    : End-to-end : 71.5fps
DEBUG   :axelera.app.meta.object_detection: Total number of detections: 2440
(venv) wgtech@wgtech-server:~/axelera/voyager-sdk$

I would like to know whether YOLO11n with custom weights is supported or not.

I also tried the ONNX approach for the YOLO11 model:

(venv) wgtech@wgtech-server:~/axelera/voyager-sdk$ ./deploy.py /home/wgtech/axelera/voyager-sdk/customers/bottlecap/yolo11n-bottlecap-onnx.yaml
INFO    : Detected Metis type as pcie
INFO    : Compiling network yolo11n-bottlecap-onnx /home/wgtech/axelera/voyager-sdk/customers/bottlecap/yolo11n-bottlecap-onnx.yaml
INFO    : Compile model: yolo11n-bottlecap-onnx
Creating new label cache: /home/wgtech/axelera/voyager-sdk/data/bottlecap/labels/train_bottlecap_objdet.cache
Creating label cache:  72%|████████████████████████████████████████████████████████████▊                       | 352/486 [00:00<00:00, 1028.07label/s]Warning: /home/wgtech/axelera/voyager-sdk/data/bottlecap/labels/frame_8930.txt: 0 duplicate labels removed
Labels found: 486, corrupt images: 0                                                                                                                  
Background images: 0, missing label files: 0, empty label files: 0
INFO    : Prequantizing yolo11n-bottlecap-onnx: yolo11n-bottlecap-onnx
INFO    : Quantizing network yolo11n-bottlecap-onnx /home/wgtech/axelera/voyager-sdk/customers/bottlecap/yolo11n-bottlecap-onnx.yaml yolo11n-bottlecap-onnx
INFO    : Compile model: yolo11n-bottlecap-onnx
WARNING : Configuration of node '/model.10/m/m.0/attn/Reshape' may not be supported. 'Reshape' parameters:
WARNING : 	allowzero: 0
WARNING : 	data: Metadata(shape=(1, 256, 20, 20), is_constant=False)
WARNING : 	shape: Metadata(shape=(4,), is_constant=True)
WARNING : 	reshaped: Metadata(shape=(1, 2, 128, 400), is_constant=False)
WARNING : 'Reshape' parameters do not match supported configuration: allowzero == 0 and np.array_equal(shape, data.shape)
WARNING : 'Reshape' parameters do not match supported configuration: allowzero == 0 and len(data.shape) >= 2 and len(shape) == 2 and shape[0] == data.shape[0] and (shape[1] == data.shape[1] or shape[1] == -1)
WARNING : Configuration of node '/model.10/m/m.0/attn/Transpose' may not be supported. 'Transpose' parameters:
WARNING : 	perm: [0, 1, 3, 2]
WARNING : 	data: Metadata(shape=(1, 2, 32, 400), is_constant=False)
WARNING : 	transposed: Metadata(shape=(1, 2, 400, 32), is_constant=False)
WARNING : Unsatisfied constraint: perm == [0, 1, 2, 3]
WARNING : Configuration of node '/model.10/m/m.0/attn/Reshape_2' may not be supported. 'Reshape' parameters:
WARNING : 	allowzero: 0
WARNING : 	data: Metadata(shape=(1, 2, 64, 400), is_constant=False)
WARNING : 	shape: Metadata(shape=(4,), is_constant=True)
WARNING : 	reshaped: Metadata(shape=(1, 128, 20, 20), is_constant=False)
WARNING : 'Reshape' parameters do not match supported configuration: allowzero == 0 and np.array_equal(shape, data.shape)
WARNING : 'Reshape' parameters do not match supported configuration: allowzero == 0 and len(data.shape) >= 2 and len(shape) == 2 and shape[0] == data.shape[0] and (shape[1] == data.shape[1] or shape[1] == -1)
WARNING : Node '/model.10/m/m.0/attn/MatMul' implements operator 'MatMul', which may not be supported.
WARNING : Node '/model.10/m/m.0/attn/Softmax' implements operator 'Softmax', which may not be supported.
WARNING : Configuration of node '/model.10/m/m.0/attn/Transpose_1' may not be supported. 'Transpose' parameters:
WARNING : 	perm: [0, 1, 3, 2]
WARNING : 	data: Metadata(shape=(1, 2, 400, 400), is_constant=False)
WARNING : 	transposed: Metadata(shape=(1, 2, 400, 400), is_constant=False)
WARNING : Unsatisfied constraint: perm == [0, 1, 2, 3]
WARNING : Node '/model.10/m/m.0/attn/MatMul_1' implements operator 'MatMul', which may not be supported.
WARNING : Configuration of node '/model.10/m/m.0/attn/Reshape_1' may not be supported. 'Reshape' parameters:
WARNING : 	allowzero: 0
WARNING : 	data: Metadata(shape=(1, 2, 64, 400), is_constant=False)
WARNING : 	shape: Metadata(shape=(4,), is_constant=True)
WARNING : 	reshaped: Metadata(shape=(1, 128, 20, 20), is_constant=False)
WARNING : 'Reshape' parameters do not match supported configuration: allowzero == 0 and np.array_equal(shape, data.shape)
WARNING : 'Reshape' parameters do not match supported configuration: allowzero == 0 and len(data.shape) >= 2 and len(shape) == 2 and shape[0] == data.shape[0] and (shape[1] == data.shape[1] or shape[1] == -1)
WARNING : The operator compatibility warnings above suggest this model may not be supported by the Axelera Compiler.
Calibrating... ✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨ | 100% | 11.26it/s | 200it |
ERROR   : Traceback (most recent call last):
ERROR   :   File "/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/axelera/compiler/top_level.py", line 584, in quantize
ERROR   :     simplified_ts_model = exporter.export()
ERROR   :   File "<frozen qtools_tvm_interface.graph_exporter_v2.graph_exporter>", line 254, in export
ERROR   :   File "<frozen qtools_tvm_interface.graph_exporter_v2.graph_exporter>", line 209, in _convert_operators
ERROR   :   File "<frozen qtools_tvm_interface.graph_exporter_v2.replacement_functions>", line 573, in _get_simpl_matmul
ERROR   : NotImplementedError: Only pinned zero (zero_point=0) is supported for inputs to MatMul ops.
ERROR   : 
ERROR   : The above exception was the direct cause of the following exception:
ERROR   : 
ERROR   : Traceback (most recent call last):
ERROR   :   File "/home/wgtech/axelera/voyager-sdk/axelera/app/compile.py", line 456, in compile
ERROR   :     the_manifest = top_level.compile(model, compilation_cfg, output_path)
ERROR   :   File "/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/axelera/compiler/utils/error_report.py", line 65, in wrapper
ERROR   :     return func(*args, **kwargs)
ERROR   :   File "/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/axelera/compiler/top_level.py", line 831, in compile
ERROR   :     quantized_model = quantize(model, config)
ERROR   :   File "/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/axelera/compiler/utils/error_report.py", line 65, in wrapper
ERROR   :     return func(*args, **kwargs)
ERROR   :   File "/home/wgtech/.cache/axelera/venvs/83d579fa/lib/python3.10/site-packages/axelera/compiler/top_level.py", line 586, in quantize
ERROR   :     raise GraphExporterError(f"Quantized graph simplification failed:\n{e}") from e
ERROR   : axelera.compiler.exceptions.GraphExporterError: Quantized graph simplification failed:
ERROR   : Only pinned zero (zero_point=0) is supported for inputs to MatMul ops.
ERROR   : Quantized graph simplification failed:
ERROR   : Only pinned zero (zero_point=0) is supported for inputs to MatMul ops.
INFO    : Quantizing yolo11n-bottlecap-onnx: yolo11n-bottlecap-onnx took 26.908 seconds
ERROR   : Failed to deploy network
ERROR   : Failed to prequantize yolo11n-bottlecap-onnx: yolo11n-bottlecap-onnx
INFO    : Compiling yolo11n-bottlecap-onnx took 32.454 seconds
(venv) wgtech@wgtech-server:~/axelera/voyager-sdk$

I'm facing the same errors. These are the configurations I used:

yolo11n-bottlecap-onnx.yaml (pt version)

axelera-model-format: 1.0.0

name: yolo11n-bottlecap-onnx
description: YOLOv11n, 640x640, 5-class bottlecap QA (custom ONNX)

model-env:
  dependencies:
    - ultralytics

pipeline:
  - yolo11n-bottlecap-onnx:
      template_path: $AXELERA_FRAMEWORK/pipeline-template/yolo-letterbox.yaml
      postprocess:
        - decodeyolo:                  # fine-tune decoder settings
            max_nms_boxes: 30000
            conf_threshold: 0.25
            nms_iou_threshold: 0.45
            nms_class_agnostic: False
            nms_top_k: 300
            eval:
               conf_threshold: 0.001     # overwrites above parameter during accuracy measurements

models:
  yolo11n-bottlecap-onnx:
    class: AxUltralyticsYOLO
    class_path: $AXELERA_FRAMEWORK/ax_models/yolo/ax_ultralytics.py
    weight_path: $AXELERA_FRAMEWORK/customers/bottlecap/models/bottlecapfit_yolov11n.pt
    task_category: ObjectDetection
    input_tensor_layout: NCHW
    input_tensor_shape: [1, 3, 640, 640]
    input_color_format: RGB
    num_classes: 5
    dataset: BottlecapCalibration

datasets:
  BottlecapCalibration:
    class: ObjDataAdapter
    class_path: $AXELERA_FRAMEWORK/ax_datasets/objdataadapter.py
    data_dir_name: bottlecap
    label_type: YOLOv8
    labels: data.yaml
    cal_data: train.txt     # Text file with image paths or directory like `valid`
    val_data: val.txt    # Text file with image paths or directory like `test`

yolo11n-bottlecap-onnx.yaml (onnx version)

axelera-model-format: 1.0.0

name: yolo11n-bottlecap-onnx
description: YOLOv11n, 640x640, 5-class bottlecap QA (custom ONNX)

model-env:
  dependencies:
    - ultralytics

pipeline:
  - yolo11n-bottlecap-onnx:
      template_path: $AXELERA_FRAMEWORK/pipeline-template/yolo-letterbox.yaml
      postprocess:
        - decodeyolo:                  # fine-tune decoder settings
            max_nms_boxes: 30000
            conf_threshold: 0.25
            nms_iou_threshold: 0.45
            nms_class_agnostic: False
            nms_top_k: 300
            eval:
               conf_threshold: 0.001     # overwrites above parameter during accuracy measurements

models:
  yolo11n-bottlecap-onnx:
    class: AxONNXModel
    class_path: $AXELERA_FRAMEWORK/ax_models/base_onnx.py
    weight_path: $AXELERA_FRAMEWORK/customers/bottlecap/models/bottlecapfit_yolov11n.onnx
    task_category: ObjectDetection
    input_tensor_layout: NCHW
    input_tensor_shape: [1, 3, 640, 640]
    input_color_format: RGB
    num_classes: 5
    dataset: BottlecapCalibration

datasets:
  BottlecapCalibration:
    class: ObjDataAdapter
    class_path: $AXELERA_FRAMEWORK/ax_datasets/objdataadapter.py
    data_dir_name: bottlecap
    label_type: YOLOv8
    labels: data.yaml
    cal_data: train.txt     # Text file with image paths or directory like `valid`
    val_data: val.txt    # Text file with image paths or directory like `test`

yolo8n-eggdefect.yaml (working fine)

axelera-model-format: 1.0.0

name: yolo8n-eggdefect
description: YOLOv8n, 640x640, 8-class egg custom weights pt model

model-env:
  dependencies:
    - ultralytics

pipeline:
  - yolo8n-eggdefect:
      template_path: $AXELERA_FRAMEWORK/pipeline-template/yolo-letterbox.yaml
      postprocess:
        - decodeyolo:                  # fine-tune decoder settings
            max_nms_boxes: 30000
            conf_threshold: 0.25
            nms_iou_threshold: 0.45
            nms_class_agnostic: False
            nms_top_k: 300
            eval:
               conf_threshold: 0.001     # overwrites above parameter during accuracy measurements

models:
  yolo8n-eggdefect:
    class: AxUltralyticsYOLO
    class_path: $AXELERA_FRAMEWORK/ax_models/yolo/ax_ultralytics.py
    weight_path: $AXELERA_FRAMEWORK/customers/eggdefect/eggdefect.pt
    task_category: ObjectDetection
    input_tensor_layout: NCHW
    input_tensor_shape: [1, 3, 640, 640]
    input_color_format: RGB
    num_classes: 8
    dataset: eggdefect

datasets:
  eggdefect:
    class: ObjDataAdapter
    class_path: $AXELERA_FRAMEWORK/ax_datasets/objdataadapter.py
    data_dir_name: eggdefect
    label_type: YOLOv8
    labels: data.yaml
    cal_data: train.txt     # Text file with image paths or directory like `valid`
    val_data: val.txt    # Text file with image paths or directory like `test`
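
To see at a glance which operators the compiler flagged in a log like the one above, the warning lines can be grouped by operator. This is only a minimal sketch: the regular expressions are inferred from the warning wording shown in this thread, and `flagged_operators` is a hypothetical helper, not part of the Voyager SDK.

```python
import re

# Patterns inferred from the warning text in the log above (an assumption;
# the wording may differ between SDK versions).
NODE_OP = re.compile(
    r"Node '([^']+)' implements operator '([^']+)', which may not be supported"
)
NODE_CFG = re.compile(
    r"Configuration of node '([^']+)' may not be supported\. '([^']+)' parameters"
)


def flagged_operators(log_text):
    """Return {operator: [node names]} for every node the compiler warned about."""
    flagged = {}
    for line in log_text.splitlines():
        match = NODE_OP.search(line) or NODE_CFG.search(line)
        if match:
            node, op = match.groups()
            flagged.setdefault(op, []).append(node)
    return flagged


# A few warning lines copied from the log in this thread.
sample = """\
WARNING : Node '/model.10/m/m.0/attn/MatMul' implements operator 'MatMul', which may not be supported.
WARNING : Node '/model.10/m/m.0/attn/Softmax' implements operator 'Softmax', which may not be supported.
WARNING : Configuration of node '/model.10/m/m.0/attn/Transpose_1' may not be supported. 'Transpose' parameters:
WARNING : Node '/model.10/m/m.0/attn/MatMul_1' implements operator 'MatMul', which may not be supported.
WARNING : Configuration of node '/model.10/m/m.0/attn/Reshape_1' may not be supported. 'Reshape' parameters:
"""

for op, nodes in flagged_operators(sample).items():
    print(f"{op}: {len(nodes)} node(s)")
    for node in nodes:
        print(f"  {node}")
```

In the log above, every flagged node sits under `/model.10/m/m.0/attn/`, and the final error also concerns `MatMul` inputs, so a summary like this helps narrow down where the compile fails.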

For yolo11 onnx, the problem may be caused by the model conversion parameters. Try these:
{'batch': 1, 'half': False, 'dynamic': True, 'simplify': True, 'opset': 17}.
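
With the Ultralytics Python API, those parameters could be passed roughly as follows. This is a sketch only: `export_to_onnx` is a hypothetical wrapper and `best.pt` is a placeholder for your own weights file.

```python
# Export settings quoted from the suggestion above.
EXPORT_ARGS = {
    "format": "onnx",
    "batch": 1,
    "half": False,
    "dynamic": True,
    "simplify": True,
    "opset": 17,
}


def export_to_onnx(weights_path, **overrides):
    """Export Ultralytics weights to ONNX and return the exported file path."""
    # Imported lazily so the settings can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights_path)
    # Keyword overrides take precedence over the defaults in EXPORT_ARGS.
    return model.export(**{**EXPORT_ARGS, **overrides})


# Example usage (placeholder file name):
# onnx_path = export_to_onnx("best.pt")
```

Keeping the settings in one dictionary makes it easy to try a variant, for example `export_to_onnx("best.pt", dynamic=False)`, and compare results.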

Let me know if it works and whether performance on the AIPU is comparable to performance on the host.

Hi @WGPravin

You can try our cloud compiler and see if it helps. See our post on this topic: Axelera Now Supported in DeGirum Cloud Compiler | Community. We ported multiple models successfully and with minimal mAP loss. We can help you if you are interested.

Hi,

How can I run inference on the Axelera Metis with the model compiled by DeGirum?

Hi @Giodst

Glad to see you were able to compile a model for Axelera Metis with our cloud compiler. Our team is currently working on detailed user guides for Axelera, and they should be available mid next week.

However, to get started, you can see a basic example here: PySDKExamples/examples/google_colab/pysdk_axelera_hello_world.ipynb at main · DeGirum/PySDKExamples.

For advanced use cases, you can see other examples in the repo: DeGirum/PySDKExamples: DeGirum PySDK Usage Examples. You just need to change the model name to the appropriate model.

Please let us know if you encounter any issues.

@shashi.chilappagari I will try it out and let you know.