
How to configure custom LLM models in YAML?

  • December 18, 2025
  • 6 replies
  • 48 views


@Spanner, what do I get from this code when I launch “axllm.py my.yaml”?

try:
    network_yaml_info = yaml_parser.get_network_yaml_info(
        include_collections=['llm_local', 'llm_cards', 'llm_zoo']
    )
    parser = config.create_llm_argparser(
        network_yaml_info, description='Perform LLM inference on an Axelera platform'
    )

How do I configure the YAML file, and what goes inside it?

Can I specify something like

 

llm_local = ip:port
llm_cards = /dev/metis0
llm_zoo = https://huggingface.com/my/llm.zip

…………

Or

llm_zoo:
- url: ${HUG_RELEASE_URL}/llm/myhillarius.zip
- url: ${HUG_RELEASE_URL}/llm/codellama:8B.zip
- url: ${HUG_RELEASE_URL}/llm/r1-1776:70B.zip

…by duct-taping models onto my YAML?

6 replies

Spanner
  • Axelera Team
  • December 19, 2025

Hi @florit! (I moved this message out into its own topic, as I thought it’d make it easier to focus the conversation 😀 Hope that’s cool!)

The collections you're seeing (llm_local, llm_cards, llm_zoo) in the yaml_parser.get_network_yaml_info() code are internal SDK parameters that control where the parser looks for LLM YAML files, but they're not user-configurable as far as I know.

So you can't point to arbitrary HuggingFace URLs directly. LLM models must be pre-compiled for Metis hardware and use a specific YAML format with precompiled_url pointing to Axelera-hosted model packages.

To use an LLM, you can copy an existing YAML from ax_models/zoo/llm/ to ax_models/llm/ and run with ./inference_llm.py <model-name> or axllm <model-name>. LLM support is still marked as experimental at the moment, but there is the LLM tutorial that has some really good insights into getting started. 👍
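
To give a rough idea of the shape of those files, here’s a minimal sketch of what a YAML copied into ax_models/llm/ might look like. Only the precompiled_url field and the directory names come from the notes above; the other field names (name, description, models) are assumptions modelled on the general style of the zoo YAMLs, so please compare against a real file from ax_models/zoo/llm/ rather than copying this verbatim.

# Hypothetical sketch only: field names other than precompiled_url are assumptions,
# based on the style of the existing zoo YAMLs.
name: my-llm
description: Example LLM entry for Metis (illustrative, not a real model package)

models:
  my-llm:
    # Must point at an Axelera-hosted, Metis-precompiled model package,
    # not an arbitrary HuggingFace URL.
    precompiled_url: https://example.com/path/to/precompiled-llm.zip

Once a file like that is sitting in ax_models/llm/, running axllm <model-name> (or ./inference_llm.py <model-name>) should pick it up.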

Technically it’s all very possible, but currently it needs a more hands-on, experimental approach 😅 But keep us posted! I really want to see what you’re able to do with all this! What’s the project you’re working on?


  • Author
  • Ensign
  • December 19, 2025

That helps a lot 😁

I’d like the following models:

r1-1776:70B
codellama
codegemma

Is there any chance of getting these precompiled this year?


Spanner
  • Axelera Team
  • December 19, 2025

Awesome, glad that helped!

Honestly, probably not this year, given we’re already well into December… but do add it to the Launchpad, as that’s a great way to draw attention to new features, models, etc. that could go on the roadmap 👍


  • Author
  • Ensign
  • December 20, 2025

@Spanner, how do I pre-compile?

Is there a special tool, like the ones used with graphics cards or AIPUs?

I want access to it ...


Spanner
  • Axelera Team
  • December 21, 2025

Unfortunately I don’t think there is one yet for LLMs (or SLMs), as it’s all still quite experimental. But there’s a LOT of work going on internally around this, so I will ask the team if there’s anything that’s in a condition to be shared yet 👍


  • Author
  • Ensign
  • December 21, 2025

I read the model parameters like this: llama-3-1-8b-1024-4core-static.

Is it possible to make llama-3-1-8b-2048-8core-static, llama-3-1-8b-3072-12core-static, and llama-3-1-8b-4096-16core-static work on the Metis 4×4-core 64GB card?

And what about larger models, like llama-3-3-70b-4096-16core-static?

Big question!