
How to configure custom LLM models in YAML?

  • December 18, 2025
  • 6 replies
  • 48 views


@Spanner, what do I get from this code when I launch “axllm.py my.yaml”?

try:
    network_yaml_info = yaml_parser.get_network_yaml_info(
        include_collections=['llm_local', 'llm_cards', 'llm_zoo']
    )
    parser = config.create_llm_argparser(
        network_yaml_info, description='Perform LLM inference on an Axelera platform'
    )

How do I configure the YAML file, and what goes inside it?

Can I specify something like

 

llm_local = ip:port
llm_cards = /dev/metis0
llm_zoo = https://huggingface.com/my/llm.zip

…………

Or

llm_zoo:
- url: ${HUG_RELEASE_URL}/llm/myhillarius.zip
- url: ${HUG_RELEASE_URL}/llm/codellama:8B.zip
- url: ${HUG_RELEASE_URL}/llm/r1-1776:70B.zip

…by duct-taping models onto my YAML?

6 replies

Spanner
  • Axelera Team
  • December 19, 2025

Hi @florit! (I moved this message out into its own topic, as I thought it’d make it easier to focus the conversation 😀 Hope that’s cool!)

The collections you're seeing (llm_local, llm_cards, llm_zoo) in the yaml_parser.get_network_yaml_info() code are internal SDK parameters that control where the parser looks for LLM YAML files, but they're not user-configurable as far as I know.

So you can't point to arbitrary HuggingFace URLs directly. LLM models must be pre-compiled for Metis hardware and use a specific YAML format with precompiled_url pointing to Axelera-hosted model packages.

To use an LLM, you can copy an existing YAML from ax_models/zoo/llm/ to ax_models/llm/ and run with ./inference_llm.py <model-name> or axllm <model-name>. LLM support is still marked as experimental at the moment, but there is the LLM tutorial that has some really good insights into getting started. 👍
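
To give a rough idea of the shape of those files, here’s a minimal sketch of what a YAML copied into ax_models/llm/ might look like. Only the precompiled_url field and the directory names come from the notes above; the other field names (name, description, models) are assumptions modelled on the general style of the zoo YAMLs, so please compare against a real file from ax_models/zoo/llm/ rather than copying this verbatim.

# Hypothetical sketch only: field names other than precompiled_url are assumptions,
# based on the style of the existing zoo YAMLs.
name: my-llm
description: Example LLM entry for Metis (illustrative, not a real model package)

models:
  my-llm:
    # Must point at an Axelera-hosted, Metis-precompiled model package,
    # not an arbitrary HuggingFace URL.
    precompiled_url: https://example.com/path/to/precompiled-llm.zip

Once a file like that is sitting in ax_models/llm/, running axllm <model-name> (or ./inference_llm.py <model-name>) should pick it up.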

Technically it’s all very possible, but currently it needs a more hands-on, experimental approach 😅 But keep us posted! I really want to see what you’re able to do with all this! What’s the project you’re working on?


  • Author
  • Ensign
  • December 19, 2025

That helps a lot 😁

I’d like the following models:

r1-1776:70B
codellama
codegemma

Is there any chance of getting these precompiled this year?


Spanner
  • Axelera Team
  • December 19, 2025

Awesome, glad that helped!

Honestly, probably not this year, given we’re already well into December… but do add it to the Launchpad, as that’s a great way to draw attention to new features, models, etc. that could go on the roadmap 👍


  • Author
  • Ensign
  • December 20, 2025

@Spanner, how do I pre-compile?

Is there a special tool, like the ones used with graphics cards or AIPUs?

I want access to it ...


Spanner
  • Axelera Team
  • December 21, 2025

Unfortunately I don’t think there is one yet for LLMs (or SLMs), as it’s all still quite experimental. But there’s a LOT of work going on internally around this, so I will ask the team if there’s anything that’s in a condition to be shared yet 👍


  • Author
  • Ensign
  • December 21, 2025

I read the model parameters like this: llama-3-1-8b-1024-4core-static.

Is it possible to make llama-3-1-8b-2048-8core-static, llama-3-1-8b-3072-12core-static, and llama-3-1-8b-4096-16core-static work on the Metis 4×4-core 64GB card?

And what about larger models, like llama-3-3-70b-4096-16core-static?

Big question!