
I am encountering issues when trying to deploy and run the YOLOv5m6 ONNX model on the Metis M.2. Below are the details of the steps I followed and the issues that arose:

1. Deployment Process:

I used the following command to deploy the model:

./deploy.py customers/mymodels/yolov5m6-object-detection.yaml

The deployment was successful, but I encountered warnings about memory overflow during the buffer fitting process:

LowerTIR failed to fit buffers into memory after iteration 0/4.
Pool usage: {L1: alloc:4,195,840B avail:4,194,304B over:1,536B util:100.04%, L2: alloc:6,864,384B avail:8,077,312B over:0B util:84.98%, DDR: alloc:54,275,328B avail:260,046,848B over:0B util:20.87%}
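The pool figures in that log can be sanity-checked directly; a quick sketch (numbers copied from the log above):

```python
# Pool numbers taken verbatim from the LowerTIR log line.
pools = {
    "L1":  {"alloc": 4_195_840,  "avail": 4_194_304},
    "L2":  {"alloc": 6_864_384,  "avail": 8_077_312},
    "DDR": {"alloc": 54_275_328, "avail": 260_046_848},
}

for name, p in pools.items():
    over = max(0, p["alloc"] - p["avail"])          # bytes over capacity
    util = p["alloc"] / p["avail"] * 100            # utilisation in %
    print(f"{name}: over={over:,}B util={util:.2f}%")
```

This confirms that only L1 is over budget, by 1,536 B (100.04% utilisation); L2 and DDR have headroom.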

2. Inference Process:

I then attempted to run inference with the following command:

./inference.py c-yolov5m6-object-detection chinmay/Vehicle-CV-ADAS/assets/VID-20250208-WA0008.mp4 --save-output ~/Desktop/voyager-sdk/chinmay/saved_videos/output.mp4

The model failed to run with a ShapeInferenceError (incompatible dimensions) while loading the postprocess ONNX graph:

terminate called after throwing an instance of 'std::runtime_error'
what(): Failed to create ONNX Runtime session: Load model from /home/aravind/Desktop/voyager-sdk/build/yolov5m6-object-detection/yolov5m6-object-detection/1/postprocess_graph.onnx failed:Node (Mul_526) Op (Mul) [ShapeInferenceError] Incompatible dimensions
Aborted (core dumped)

3. Model Inputs and Outputs:

  • The input shape for the model is [1, 3, 512, 640] (batch size 1, 3 channels, 512x640 image size).

  • The output shape from the model is [1, 20400, 85], which corresponds to 20400 possible bounding boxes with 85 values per box (4 coordinates, 1 objectness score, 80 class scores).
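For what it's worth, 20,400 is exactly the box count a YOLOv5 P6 head produces at 512x640, assuming the standard strides 8/16/32/64 and 3 anchors per grid cell, so the exported output shape looks internally consistent:

```python
# YOLOv5m6 is a P6 model: detection heads at strides 8, 16, 32 and 64,
# each predicting 3 anchor boxes per grid cell.
h, w = 512, 640
strides = (8, 16, 32, 64)
anchors_per_cell = 3

boxes = sum((h // s) * (w // s) * anchors_per_cell for s in strides)
print(boxes)  # 20400, matching the [1, 20400, 85] output
```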

4. Steps Taken So Far:

  • I have validated the model with ONNX Runtime and confirmed that it loads correctly and that its input shape is consistent.
  • I have also tried adjusting the YAML file to match the model's input/output requirements.

5. Links:

Looking forward to your guidance on resolving this issue.  

I expect someone more experienced than me will weigh in, but I notice the model is slightly too large to fit into L1 memory: only about 1.5 KB over, which is almost nothing, but it might be enough to cause performance issues or instability.

Would reducing the input size by a small amount ease the memory pressure?
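As a rough back-of-the-envelope check (assuming L1 usage scales roughly with input area, which the compiler's buffer packing may not do exactly), even the smallest legal reduction in input height gives far more slack than the 1,536 B overflow:

```python
# L1 is over by 1,536 B on a 4 MiB pool -- about 0.04%.
over = 4_195_840 - 4_194_304
print(over, f"{over / 4_194_304:.4%}")

# A P6 model's input dimensions must be multiples of the largest stride
# (64), so the next height down from 512 is 448. If L1 pressure tracks
# input area (an assumption), that is plenty:
full = 512 * 640
reduced = 448 * 640
print(f"{1 - reduced / full:.1%} fewer pixels")  # 12.5%
```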

