Hey everyone,
I am evaluating the Metis lineup for an edge AI project and have a specific question about multi-card architecture that I haven't found a definitive answer to in the docs or SDK references.
What I understand is supported today:
- The Voyager SDK detects multiple Metis cards in a single host.
- Each card runs its own independent inference pipeline.
- A real-world surveillance example shows three 4-chip PCIe cards running five primary models + one secondary model in parallel across 48 AIPU cores.
My question: Is there any current or planned SDK support for splitting a single neural network across multiple Metis cards (e.g., layer sharding or tensor parallelism across two M.2 cards)? Or is the architecture strictly "one model must fit entirely within a single card's memory and AIPU"?
Context for my use case: I am looking at a host with dual M.2 slots (e.g., Radxa Orion O6N) and weighing whether two standard Metis M.2 cards could act as a unified 2-GB / 428-TOPS accelerator for a single large model, or if I should instead plan for a single M.2 Max (16 GB) and treat the dual-M.2 path as strictly for agent-swarming / multi-model parallelism.
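To make the sizing question concrete, here is a rough sketch of the check I am trying to do. The per-card 1 GB figure is just my 2 GB total divided by two cards, the 1.5 GB model size is a made-up placeholder, and none of the names below come from the Voyager SDK; this only compares memory footprints and says nothing about whether pooling is actually possible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    """Hypothetical description of an accelerator card (not an SDK type)."""
    name: str
    memory_gb: float


def fits_on_single_card(model_gb: float, card: Card) -> bool:
    """True if the whole model fits in one card's memory."""
    return model_gb <= card.memory_gb


def pooled_memory_gb(card: Card, count: int) -> float:
    """Memory that would be available IF cards could be pooled (unconfirmed)."""
    return card.memory_gb * count


# Assumption: 1 GB per standard M.2 card (2 GB across two cards).
METIS_M2 = Card("Metis M.2", 1.0)
# 16 GB as stated for the M.2 Max above.
METIS_M2_MAX = Card("Metis M.2 Max", 16.0)

model_gb = 1.5  # hypothetical model footprint, for illustration only

for card in (METIS_M2, METIS_M2_MAX):
    verdict = "fits" if fits_on_single_card(model_gb, card) else "does not fit"
    print(f"{model_gb} GB model {verdict} on one {card.name} ({card.memory_gb} GB)")

# Only meaningful if a single network can be split across cards.
print(f"Two {METIS_M2.name} cards pooled: {pooled_memory_gb(METIS_M2, 2)} GB")
```

If the answer is "one model per card", only the first check matters and the pooled figure is irrelevant for a single large model.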
If unified multi-card execution is not on the roadmap, a clear confirmation would help me (and likely others) size the right SKU upfront rather than over-provisioning hardware.
Thanks for any insight you can share.
