Question

How to use Batch Size > 1 for input during model compilation

  • April 28, 2026
  • 5 replies
  • 41 views

I am trying to run some filtering models for my workflow on Axelera Metis. However, the `axcompile` command rejects inputs whose batch dimension (dim 0) is greater than 1. At inference time, that means I have to pass inputs one after another, which adds latency. The model is a custom model (not in the model zoo) and not a vision model.

I have tried creating 4 instances of the same model running on each of the cores of Metis AIPU, but that isn’t possible due to SRAM memory limitations. 

Is there a way to use Batch Size > 1 for Metis? 

Spanner
Axelera Team
  • Axelera Team
  • April 29, 2026

Hi there ​@reck

Ah, possibly Voyager is at cross purposes with you here in its terminology, if I understand right 😄

In Voyager, "batch size" at the input shape level isn't how the SDK feeds multiple samples in parallel, which is why `axcompile` rejects dim[0] > 1. The SDK's documented mechanism for parallelism is to compile at batch=1 and then replicate the model across the four AIPU cores at runtime via `num_children` (or `aipu_cores` in the YAML), so each core processes a different frame.
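To illustrate the idea (not the SDK itself), here is a minimal Python sketch of treating a "batch" as several batch=1 inputs handled by parallel workers. Everything in it is an assumption for illustration: `run_single` is a hypothetical stand-in for one batch=1 inference call, and `NUM_WORKERS = 4` simply mirrors the four cores. In Voyager the replication is configured through `num_children` or `aipu_cores` rather than written by hand like this.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: one worker per AIPU core (four cores on Metis).
NUM_WORKERS = 4


def run_single(sample):
    """Hypothetical stand-in for a single batch=1 inference call.

    The placeholder "model" just sums the input features; in a real
    pipeline this would be replaced by the actual inference call.
    """
    return sum(sample)


def run_batch(batch, num_workers=NUM_WORKERS):
    """Split a batch into batch=1 inputs and process them concurrently."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(run_single, batch))


# Four samples, each handled as its own batch=1 input.
batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
outputs = run_batch(batch)
```

The results come back in input order, so the caller can treat `outputs` like the output of a batched call. This pattern only helps when every worker can hold its own copy of the model.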

I may not have explained that brilliantly, but there’s more (and better) info on it here: axinferencenet.md.

Let me know if that helps!

 


  • Author
  • Cadet
  • April 29, 2026

Hi ​@Spanner

Thanks for your reply. I get what you’re saying, and I believe the restriction comes from the constrained ONNX operations listed in onnx-opset-17-support.md. However, in my case it seems that a single model instance is already being distributed evenly across the 4 cores. On Metis M.2, there is not enough memory available to create more than one instance of my model.

I am planning to use this for cloud detection models on Sentinel-2 multispectral data.


  • Author
  • Cadet
  • April 30, 2026

Hi @Spanner, 

This might be unrelated to the original question, but is it possible to order an Axelera Metis M.2 MAX (16 GB RAM)? I wasn’t able to find it anywhere, although it’s been released for quite some time.


Spanner
Axelera Team
  • Axelera Team
  • April 30, 2026

Hi ​@reck!

Looks to me like you've already got the right setup for a model this size: one nicely compiled instance with all four cores working on it together, sharing the weights. The reason `axcompile` won't take a batch dimension on top of that is that the SDK splits work across cores rather than across batches, so I think you're effectively at the ceiling for this model on a single Metis.

That being said, let’s see if ​@Habib has any suggestions here - he’s a ninja with this stuff!

 


Spanner
Axelera Team
  • Axelera Team
  • April 30, 2026

Ah, the M.2 MAX isn’t actually released yet. It was announced a little while ago, but it has yet to hit the store. It’s imminent though! I can almost taste it, it’s so close...