
I just built a quick demo showing the Llama 3.2 3B chatbot running on our Metis platform, totally offline. The model packs 3 billion parameters and runs smoothly both on a standard Lenovo P360 with our PCIe card and on an Arduino-based dev board (the Portenta X8).

We hit 6+ tokens/sec per core, which is enough for real-time chat. That makes it a good fit for smart customer-support bots, digital concierge systems, and really any edge AI assistant application, all running fully on-device. No cloud needed.
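As a rough sanity check on the "real-time chat" claim, here is a quick back-of-the-envelope conversion of that throughput into words per minute. The 0.75 words-per-token figure is a common rule of thumb for English text, not a number from the demo:

```python
# Rough sanity check: what does 6 tokens/sec feel like in a chat?
# Assumption (rule of thumb, not measured): ~0.75 English words per token.

TOKENS_PER_SEC = 6
WORDS_PER_TOKEN = 0.75

words_per_minute = TOKENS_PER_SEC * WORDS_PER_TOKEN * 60
print(f"~{words_per_minute:.0f} words/minute")  # ~270 words/minute
```

Since typical silent reading speed is around 200-250 words per minute, generation at this rate stays ahead of the reader, which is what makes it feel like real-time conversation.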

Check out the video and let me know what you think. Any projects you can think of where you could use a self-contained, power-efficient, offline AI chatbot like this?

 

//EDIT: I am aware that the YouTube link is currently broken. I will reupload it soon.

 

For me right now, running a local LLM would be the main use case for buying the Metis AI hardware, and this demo convinced me that this could indeed be a viable application!

I’m running Home Assistant locally and it already has all the infrastructure in place for AI integration (openwakeword, text-to-speech, speech-to-text). Nextcloud is also starting to include (local) AI options, which I'd like to integrate into my setup.
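For anyone curious what wiring this up might look like: Home Assistant's conversation integrations can point at a local, OpenAI-compatible chat endpoint, which is the kind of API most local LLM servers expose. Below is a minimal sketch of that request shape. The endpoint URL and model id are placeholders for whatever the local server actually serves, not anything confirmed in this thread:

```python
# Sketch: talking to a local, OpenAI-compatible LLM endpoint of the kind
# Home Assistant's conversation integrations can be pointed at.
# LOCAL_ENDPOINT and MODEL_NAME are hypothetical placeholders.

import json
import urllib.request

LOCAL_ENDPOINT = "http://localhost:8080/v1/chat/completions"  # hypothetical
MODEL_NAME = "llama-3.2-3b"  # hypothetical model id on the local server


def build_chat_payload(user_text, system_prompt="You are a helpful home assistant."):
    """Assemble an OpenAI-style chat completion request body."""
    return {
        "model": MODEL_NAME,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        "stream": False,
    }


def ask(user_text):
    """POST the payload to the local server and return the reply text."""
    data = json.dumps(build_chat_payload(user_text)).encode()
    req = urllib.request.Request(
        LOCAL_ENDPOINT,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Inspect the request body without needing a running server:
    print(json.dumps(build_chat_payload("Turn off the living room lights?"), indent=2))
```

The nice part of this shape is that the wake-word, speech-to-text, and text-to-speech pieces already in Home Assistant sit in front of and behind this single HTTP call, so swapping the backing hardware doesn't change the rest of the pipeline.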

The common route in the homelab community is to go for an energy-intensive (and pricey) GPU with a lot of VRAM, but a dedicated device like the Arduino or the SBC sounds like an excellent alternative, especially once you factor in energy usage, price, and TOPS.

Question: the key to LLM performance seems to be VRAM. The way I understand it, your chip design takes a different approach to how memory and processing power are integrated. Does the raw power (214 TOPS) balance out the lower memory availability? What impact does loading a larger model have on token generation, for example?


Hi there @Eis-T! Yeah, you’ve nailed it there - Metis balances out the lower available memory by relying on smarter memory usage, reduced data movement, and highly parallel compute.

And I’m right with you regarding how amazing it’d be to see this coupled with Home Assistant! I’m a big HA user as well, and having a local LLM integrated into it would be incredible.

Is that something you’re actively working on?


Thanks for the quick response @Spanner!

It's not something I'm actively working on, but more something I want to sink my teeth into and learn more about. I don't plan on developing a product or service, but I do plan on sharing my findings with the Home Assistant community. Curious to hear if you are already in contact with Nabu Casa / the Open Home Foundation, as they are based in the Netherlands too. Building an ethical, local alternative to Amazon's Alexa/Echo is something they are dreaming about, just saying :)

Follow-up question: which of these two systems would you advise for this purpose, the Arduino or the ARM SBC? Spoiler: the one you recommend is the one I'll be ordering.

