For me right now, running a local LLM would be the main use case for buying the Metis AI hardware, and this demo convinced me that this could indeed be a viable application!
I’m running Home Assistant locally, and it already has all the infrastructure in place for AI integration (openwakeword, text-to-speech, speech-to-text). Nextcloud is also starting to include (local) AI options, which I'd like to integrate into my setup.
The common route in the homelab community is to go for an energy-intensive (and pricey) GPU with a lot of VRAM, but a dedicated device like the Arduino or the SBC sounds like an excellent alternative, especially once you factor in energy usage, price, and TOPS.
Question: The key to LLM performance seems to be VRAM. The way I understand it, your chip design takes a different approach to how memory and processing power are integrated. Does the raw power (214 TOPS) balance out the lower memory availability? What impact does loading a larger model, for example, have on token generation?
Hi there @Eis-T! Yeah, you’ve nailed it there - Metis balances out the lower available memory by relying on smarter memory usage, reduced data movement, and high parallel compute.
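To get a rough feel for why model size (and memory) matters so much for token generation on any accelerator, here is a back-of-envelope sketch. It assumes autoregressive decoding is memory-bandwidth bound, which is the common rule of thumb; the bandwidth and quantization figures below are illustrative assumptions, not Metis specs.

```python
# Rule of thumb: during decoding, every generated token streams all model
# weights through the chip once, so decode speed is roughly bounded by
# memory bandwidth divided by model size in bytes.

def max_tokens_per_sec(model_params_billions: float,
                       bytes_per_param: float,
                       bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed for a memory-bound LLM."""
    model_size_gb = model_params_billions * bytes_per_param
    return bandwidth_gb_s / model_size_gb

# Illustrative numbers (assumptions, not device specs):
# a 7B model quantized to 4 bits (~0.5 bytes/param) on a ~34 GB/s
# LPDDR4X memory bus.
print(round(max_tokens_per_sec(7, 0.5, 34), 1))  # ~9.7 tokens/sec ceiling
```

Doubling the model size roughly halves this ceiling, which is why quantization and smart memory usage matter more than raw TOPS once a model fits at all.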
And I’m right with you regarding how amazing it’d be to see this coupled with Home Assistant! I’m a big HA user as well, and having a local LLM integrated into it would be incredible.
Is that something you’re actively working on?
Thanks for the quick response @Spanner!
It's not something I'm actively working on, but more something I want to sink my teeth into and learn more about. I don't plan on developing a product or service, but I do plan on sharing my findings with the Home Assistant community. Curious to hear if you are already in contact with Nabu Casa / Open Home Foundation, as they are based in the Netherlands too. Building an ethical, local alternative to Amazon's Alexa/Echo is something they are dreaming about, just saying :)
Follow-up question: which of these two systems would you advise for this purpose? The Arduino or the ARM SBC? Spoiler: the one you recommend will be the one I'm ordering.
How amazing would it be if we could put some of this in Frenck’s hands and get him excited about it too?!
Honestly, I don’t know which I’d go for either! For what we’ve talked about, either does the job beautifully. If you wanted to experiment more and try out some alternative applications, the Compute board is probably more flexible?
I went for the Arduino one, because when looking at the product brief I saw that it has 16 GB of LPDDR4X while the Compute board has 4 GB. Excited to tinker around with this!
I think the 16 GB you saw on the Arduino is the storage, not the memory.
Really looking forward to seeing what you build with it @Eis-T! Great choice of board.