
In case anyone on here hasn’t spotted it in the wild yet, the Metis M.2 Max was just announced. 

It basically packs PCIe card-level performance into an M.2 stick. That means better performance for LLMs and vision transformers on much smaller host devices (RPi, Orange Pi, etc.), all while keeping power draw super low (around 6.5W). It’s been bumped up to 16GB of memory, with slimmer design options and added security features for tougher environments.

Feels like a big step forward for anyone wanting to run GenAI, computer vision, or LLM workloads locally, on devices with an impressively small footprint, without needing a data centre to do it 😆.

Shipping starts later this year.

https://axelera.ai/news/axelera-ai-boosts-llms-at-the-edge-by-2x-with-metis-m.2-max-introduction

Exciting news. I have a few questions:

1. From what I understood from the link you shared, it's using the same Metis chip as the existing M.2 device, but with up to 16GB of RAM and higher transfer bandwidth, right?

2. Other than running AI models, is it possible to also use the Metis devices as general-purpose compute devices? For example, if I have an FFT algorithm or some machine vision algorithm, is it possible to write a kernel that runs those algorithms on the Metis instead of the CPU?


Hey man!

1. That’s correct, yep. Same AIPU, but it’s been scaled up in terms of design:
   - Memory options: 1, 4, 8, or 16 GB.
   - Double the memory bandwidth (even the 1 GB variant benefits from this).
   - Better thermals (new slimmer card profile, with passive cooling where feasible, otherwise active).
   - Security upgrades: secure boot, secure upgrade, secure PCIe, secure debug; all transparent to the user.

   So from a dev perspective, you’re programming the same AIPU, but now with more headroom, which is what unlocks LLMs, VLMs, cascaded models, multi-stream vision, etc.
2. That part hasn’t really changed. The Metis AIPU isn’t a general-purpose processor, so “non-neural” tasks would still be handled on the host CPU/GPU. Voyager helps you slot these steps into the pipeline, but they won’t execute on the Metis silicon. 👍
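To make that split concrete, here's an illustrative sketch (not Voyager SDK code; the function name and structure are my own invention): a non-neural step like an FFT is just ordinary host-side code, here written in pure Python for self-containment, though in practice you'd reach for `numpy.fft` or FFTW. Only the neural-network stages would be compiled for the AIPU.

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT (len(x) must be a power of 2).

    This kind of routine runs on the host CPU; the Metis AIPU only
    executes compiled neural-network operations.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])                     # FFT of even-indexed samples
    odd = fft(x[1::2])                      # FFT of odd-indexed samples
    # Combine halves with twiddle factors e^(-2*pi*i*k/n)
    t = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return [even[k] + t[k] for k in range(n // 2)] + \
           [even[k] - t[k] for k in range(n // 2)]

# Example: the spectrum of an 8-sample impulse is flat (all ones)
spectrum = fft([1, 0, 0, 0, 0, 0, 0, 0])
```

In a real pipeline this FFT output would be handed to a model stage for inference on the AIPU; the FFT itself stays on the CPU.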

