
This may not be possible because of how the software and hardware work together. I'm new here, and not traditionally a software engineer, but I have a question/suggestion.

I noticed that when I attempted to run the example local LLM inference scripts in the SDK, they can't run with only 1 GB of DDR memory on the card. I think the example should be updated to use a model that actually fits in 1 GB of DDR (something like TinyLlama or similar). A rough footprint estimate is shown below.
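For reference, here's a back-of-envelope check of why a small quantized model should fit (the KV-cache allowance and quantization level are my assumptions, not anything from the SDK docs):

```python
# Rough footprint check for a model that has to fit in 1 GB of DDR.
# Assumes a 4-bit quantization of TinyLlama-1.1B; the KV-cache number
# is a coarse allowance for a ~2k-token context, not a measured value.
PARAMS = 1.1e9            # TinyLlama-1.1B parameter count
BITS_PER_WEIGHT = 4       # e.g. a Q4-style quantization (assumption)
KV_CACHE_BYTES = 60e6     # rough KV-cache allowance (assumption)

weights_bytes = PARAMS * BITS_PER_WEIGHT / 8
total_gb = (weights_bytes + KV_CACHE_BYTES) / 1e9
print(f"Estimated footprint: {total_gb:.2f} GB")  # ~0.6 GB, under the 1 GB budget
```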

After thinking about it some more: I do have an Intel Raptor Lake CPU with an integrated GPU, and an external NVIDIA A500 from Adlink Technologies connected over Thunderbolt. Is there any way to run the LLM inference demos by borrowing memory from those other devices (the iGPU's shared system memory or the A500's VRAM) instead of relying only on the card's 1 GB of DDR? That would be very useful! Roughly what I'm imagining is sketched below.
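To be clear about what I mean, here's the kind of thing Hugging Face Transformers/Accelerate does on an ordinary CUDA GPU + host RAM setup, splitting a model across devices with per-device memory caps. I have no idea whether the SDK's runtime exposes anything comparable for the card plus the iGPU or the A500; the model name and the memory caps below are just placeholders:

```python
# Hedged sketch: spreading a model across devices with Transformers/Accelerate.
# This is NOT the SDK's API; it's just an illustration of the kind of
# memory-splitting I'm asking about. Values in max_memory are examples.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",                      # let Accelerate place layers per device
    max_memory={0: "3GiB", "cpu": "8GiB"},  # cap per-device usage (example values)
)

inputs = tok("Hello", return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```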
