Dear Axelera Support,

I am trying to install the Voyager SDK on a Radxa ROCK 5T board with the Axelera AI Accelerator M.2. The SDK recommends Ubuntu 22.04, but I could not find an Ubuntu image for the 5T (https://github.com/DHDAXCW/ubuntu-rockchip-rk3588?tab=readme-ov-file).

I have reviewed previous threads, but either they are unresolved (https://community.axelera.ai/support-central-47/subject-metis-aipu-detected-on-pcie-but-not-responding-stage0-load-failure-on-rk3588-compute-board-voyager-sdk-1-5-3-1280) or they rely on an Ubuntu image being available for the board (https://community.axelera.ai/metis-pcie-7/pcie-connection-doesn-t-work-with-radxa-rock-5b-1101).

I am also not sure whether the Ubuntu image meant for the Radxa ROCK 5B+ can be used on the ROCK 5T.

My current setup:
radxa@rock-5t:~$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian

radxa@rock-5t:~$ cat /sys/class/metis/version
1.4.16

Installation of SDK for Debian
This is how I managed to get the SDK installed on Debian, after cloning the repository (install guide: https://github.com/axelera-ai-hub/voyager-sdk/blob/0b25b098ca5fa591d7b5aa2fe71ad017089d64b3/docs/tutorials/install.md):

Install base dependencies:
sudo apt update
sudo apt install -y \
  gstreamer1.0-tools \
  gstreamer1.0-plugins-base \
  gstreamer1.0-plugins-good \
  gstreamer1.0-plugins-bad \
  gstreamer1.0-libav \
  libgstreamer1.0-dev \
  python3-gi \
  python3-pip \
  pciutils \
  dkms

Create Debian config (copy from Ubuntu):
cd ~/Documents/voyager-sdk

cp cfg/config-ubuntu-2204-arm64.yaml \
   cfg/config-debian-12-arm64.yaml

sed -i 's/python3\.10-dev/python3.11-dev/g' \
  cfg/config-debian-12-arm64.yaml

Fix Python version mismatch (Debian 12 ships Python 3.11):
sudo apt install -y \
  python3.11-dev \
  libpython3-dev \
  python3-venv \
  build-essential

Remove the Ubuntu-only GStreamer package from the config:
python3 - <<'PY'
from pathlib import Path
p = Path("cfg/config-debian-12-arm64.yaml")
text = p.read_text().replace(
    "  - libgstreamer-plugins-good1.0-dev\n", ""
)
p.write_text(text)
print("patched")
PY
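
The two config edits above (the sed substitution and the heredoc) can also be written as one function that reports which edits actually matched, since the heredoc prints "patched" even when the package line was not found. This is only a sketch: the sample YAML fragment is made up for illustration, and the real layout of cfg/config-ubuntu-2204-arm64.yaml may differ.

```python
import re


def patch_config(text):
    """Apply the two Debian 12 edits to the Ubuntu config text.

    Returns (patched_text, applied), where applied lists the edits that matched.
    """
    applied = []

    # Same substitution as the sed command: python3.10-dev -> python3.11-dev.
    patched, count = re.subn(r"python3\.10-dev", "python3.11-dev", text)
    if count:
        applied.append("python3.11-dev substituted")

    # Drop any list item naming libgstreamer-plugins-good1.0-dev,
    # regardless of how far the item is indented.
    patched, count = re.subn(
        r"^[ \t]*-[ \t]*libgstreamer-plugins-good1\.0-dev[ \t]*\n",
        "",
        patched,
        flags=re.MULTILINE,
    )
    if count:
        applied.append("libgstreamer-plugins-good1.0-dev removed")

    return patched, applied


# Hypothetical config fragment, for illustration only.
SAMPLE = (
    "packages:\n"
    "  - python3.10-dev\n"
    "  - libgstreamer-plugins-good1.0-dev\n"
    "  - libgstreamer-plugins-base1.0-dev\n"
)

patched, applied = patch_config(SAMPLE)
print(patched)
print(applied)
```

An empty `applied` list would mean the config did not contain either string, which is worth knowing before running the installer.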

Install the GStreamer packages that are available on Debian:
sudo apt install -y \
  gstreamer1.0-plugins-good \
  libgstreamer-plugins-base1.0-dev \
  libgstreamer-plugins-bad1.0-dev \
  libgstrtspserver-1.0-dev

Install the remaining libraries and tools:
sudo apt install -y \
  librga-dev \
  librga2 \
  libasio-dev \
  libboost-log1.74.0 \
  libboost-program-options1.74.0 \
  libboost-regex1.74.0 \
  libeigen3-dev \
  graphviz \
  unzip

Operator build failure - Werror / arch workaround
Problem: the operator build fails with: Unsupported architecture: unknown ""

export ARCH=arm64
export AARCH64=1
export WERROR=0
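
The error above shows an empty architecture string, so it can help to see what the host itself reports before overriding it. The sketch below maps a `uname -m` style string to the Debian architecture name. The mapping table and the `debian_arch` helper are illustrative, and how the operator build actually determines the architecture is not established here.

```python
import platform

# Debian architecture names for common `uname -m` values.
DEBIAN_ARCH = {
    "aarch64": "arm64",
    "arm64": "arm64",
    "x86_64": "amd64",
    "amd64": "amd64",
}


def debian_arch(machine):
    """Map a machine string (as printed by `uname -m`) to a Debian arch name."""
    try:
        return DEBIAN_ARCH[machine.lower()]
    except KeyError:
        raise ValueError(f"Unsupported architecture: {machine!r}") from None


print("this host reports:", platform.machine())
print("aarch64 maps to:", debian_arch("aarch64"))
```

Calling `debian_arch("")` raises a `ValueError`, which mirrors the empty string seen in the build error.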

Run the installer:
./install.sh --runtime --no-development

After installation, the card is visible on PCIe and the metis driver is loaded:
(venv) radxa@rock-5t:~/Documents/voyager-sdk$ lspci | grep -i axelera
0000:01:00.0 Processing accelerators: Axelera AI Metis AIPU (rev 02)
(venv) radxa@rock-5t:~/Documents/voyager-sdk$ lsmod | grep metis
metis         118784 0
(venv) radxa@rock-5t:~/Documents/voyager-sdk$ triton_multi_ctx
usage: triton_multi_ctx

Issues
The installation completes, but the following issues remain.

1st issue: axdevice reports a device count mismatch:
(venv) radxa@rock-5t:~/Documents/voyager-sdk$ axdevice
WARNING: 4PCI device count mismatch: lspci=1, triton=0

I then ran:
sudo dmesg | grep -iE 'metis|axelera|aipu|triton|firmware|pci'
The output is in the attached axdevice.txt.
I could not diagnose this properly myself, so I gave the output to ChatGPT, which produced this diagnosis (still to be verified):
BAR 2: no space for [mem size 0x02000000]
followed by BAR 0 / BAR 6 also failing to assign. That means the ROCK 5T host can enumerate the Metis card on PCIe, but it is not giving the device enough MMIO/BAR address space, so the Axelera runtime cannot fully map the card. That matches why you get lspci=1, triton=0: Linux sees the device, but the runtime cannot use it.
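
To check that diagnosis against the full log instead of a single quoted line, the failed BAR assignments can be pulled out of the dmesg text directly. This is a sketch: the sample line is reconstructed from the fragment quoted above plus the device address shown by lspci, and the regular expression assumes the usual "BAR n: no space for" / "BAR n: failed to assign" wording.

```python
import re

# Matches kernel messages such as
#   "pci 0000:01:00.0: BAR 2: no space for [mem size 0x02000000]"
BAR_FAIL = re.compile(
    r"BAR (?P<bar>\d+): (?:no space for|failed to assign) "
    r"\[(?P<kind>mem|io) size (?P<size>0x[0-9a-fA-F]+)"
)


def failed_bars(dmesg_text):
    """Return {BAR index: requested size in bytes} for BARs the kernel could not place."""
    result = {}
    for match in BAR_FAIL.finditer(dmesg_text):
        result[int(match.group("bar"))] = int(match.group("size"), 16)
    return result


# Reconstructed sample line; the real log is in axdevice.txt.
SAMPLE = "pci 0000:01:00.0: BAR 2: no space for [mem size 0x02000000]\n"

for bar, size in failed_bars(SAMPLE).items():
    print(f"BAR {bar}: needs {size // (1024 * 1024)} MiB")
```

Feeding it the contents of axdevice.txt would list every BAR the kernel gave up on, together with the size each one requested.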

2nd issue: axelera-multi-device.service fails to start. Its output is in the attached axelera-multi-device.service.txt.

Root problem
The Metis needs 32MB for BAR2 (non-prefetchable). The original pcie@fe150000 DT node only has a 14MB MEM window (0xf0200000–0xf0ffffff), so the kernel can never assign BAR2.
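
The sizes quoted here can be checked with a few lines of arithmetic. The addresses are the ones given above; `window_size` is just an illustrative helper.

```python
MIB = 1024 * 1024


def window_size(start, end):
    """Size in bytes of the inclusive address range [start, end]."""
    return end - start + 1


original = window_size(0xF0200000, 0xF0FFFFFF)  # pcie@fe150000 MEM window
bar2 = 0x02000000                               # BAR2 size from the dmesg line

print(f"window: {original // MIB} MiB, BAR2: {bar2 // MIB} MiB")
print("fits" if bar2 <= original else "does not fit")
```

The window works out to 0xe00000 bytes (14 MiB), which is the value edited in the DTS, against 32 MiB requested for BAR2.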

Attempted Fixes
All attempted fixes target the PCIe memory allocation in the device tree, rk3588-rock-5t.dtb, located on the board at /usr/lib/linux-image-6.1.84-8-rk2410/rockchip/rk3588-rock-5t.dtb.

Fallback, Recovery & Serial Debugging
I made a backup copy of the DTB before any edits. The usual edit pipeline is:
vi ~/rock5t-safe.dts  # change only 0xe00000 → 0x04000000
dtc -I dts -O dtb -o ~/rock5t-safe.dtb ~/rock5t-safe.dts
sudo cp ~/rock5t-safe.dtb /usr/lib/linux-image-$(uname -r)/rockchip/rk3588-rock-5t.dtb
sudo reboot

If there is a soft crash (the board no longer boots), I fix it by moving the SD card to my laptop and restoring the backup DTB:
sudo cp \
/media/<usr-name>/rootfs/home/radxa/dtb-backup/linux-image-6.1.84-8-rk2410/rockchip/rk3588-rock-5t.dtb \
/media/<usr-name>/rootfs/usr/lib/linux-image-6.1.84-8-rk2410/rockchip/rk3588-rock-5t.dtb
sync
sudo umount /media/<usr-name>/rootfs
sudo umount /media/<usr-name>/config

The cause of each crash can be found by serial debugging over a USB-to-TTL cable.
GPIO pins: pin 6 → GND, pin 8 → TX, pin 10 → RX
I view the serial logs in Tabby on /dev/ttyUSB0 at 1500000 baud, with 8 data bits, 1 stop bit and no parity. All other settings are left at their defaults.

I am not an expert, so I relied on Claude and ChatGPT to help with the ideation and diagnosis of the fixes. I can provide the logs where necessary.

Attempt Logs
Attempt 1 — Expand fe150000 MEM to 64MB
Changed ranges size from 0x00e00000 to 0x04000000.
Failed — this caused fe150000 MEM end (0xf41fffff) to overlap with fe180000 config (0xf3000000) and fe190000 config (0xf4000000), triggering a resource collision and kernel NULL pointer dereference crash in rk_pcie_remove.
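
The collision in Attempt 1 follows directly from the addresses. A minimal check, using only the values reported above (the sizes of the config regions are not given, so only their base addresses are tested):

```python
def window_end(start, size):
    """Last address covered by a window of `size` bytes starting at `start`."""
    return start + size - 1


def bases_inside(start, size, bases):
    """Names of regions whose base address falls inside the window."""
    end = window_end(start, size)
    return [name for name, base in bases.items() if start <= base <= end]


# Config-space base addresses as reported in Attempt 1.
CONFIG_BASES = {"fe180000": 0xF3000000, "fe190000": 0xF4000000}

print(hex(window_end(0xF0200000, 0x04000000)))                 # 64MB window
print(bases_inside(0xF0200000, 0x04000000, CONFIG_BASES))      # expanded
print(bases_inside(0xF0200000, 0x00E00000, CONFIG_BASES))      # original 14MB
```

With the 64MB size the window ends at 0xf41fffff and swallows both config bases. With the original 0xe00000 size it reaches neither.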
Attempt 2 — Expand fe150000 + shift fe160000/fe170000/fe180000/fe190000
Moved all five controllers to new address ranges to avoid collision.
Failed — fe180000 and fe190000 were moved to 0xf7/0xf8 ranges which appear to be outside what the RK3588 hardware actually supports, causing a boot hang. The system never reached SSH.
Attempt 3 — Expand fe150000 to 46MB + shift only fe160000/fe170000, leave fe180000/fe190000 original
fe150000 MEM: 0xf0200000–0xf2ffffff (46MB, enough for Metis).
fe160000 config: 0xf5000000, fe170000 config: 0xf6000000.
fe180000/fe190000 unchanged at original 0xf3/0xf4 addresses.
Partially successful: no collision and no crash, and fe150000 linked up at Gen.3 x2. However, the system still hung, apparently in a readl spin on CPU7 during PCIe enumeration.
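
One point that may be worth double-checking here: a PCI memory BAR has to sit at an address that is a multiple of its own size, so a window larger than 32MB is not automatically enough for a 32MB BAR. The sketch below looks for a naturally aligned slot in each window. It assumes the MEM window maps PCI bus addresses 1:1 onto the CPU addresses quoted above, which should be confirmed against the `ranges` entry in the DTS, and it ignores the space the other BARs need.

```python
def first_aligned_fit(start, end, bar_size):
    """Lowest naturally aligned base for a BAR of `bar_size` bytes inside the
    inclusive window [start, end], or None if no aligned slot fits.

    bar_size must be a power of two, as PCI BAR sizes are.
    """
    base = (start + bar_size - 1) & ~(bar_size - 1)  # round start up to alignment
    return base if base + bar_size - 1 <= end else None


BAR2 = 0x02000000  # 32 MiB, from the dmesg output

attempt3 = first_aligned_fit(0xF0200000, 0xF2FFFFFF, BAR2)  # 46MB window
attempt1 = first_aligned_fit(0xF0200000, 0xF41FFFFF, BAR2)  # 64MB window

print("46MB window:", hex(attempt3) if attempt3 is not None else "no aligned slot")
print("64MB window:", hex(attempt1) if attempt1 is not None else "no aligned slot")
```

Under that 1:1 assumption, the first 32MB-aligned address at or after 0xf0200000 is 0xf2000000, and a BAR placed there would run to 0xf3ffffff, past the end of the 46MB window. If the PCI-side base of the window is different, the result changes accordingly.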
Attempt 4 — Identify source of hang
Disabled fe150000 entirely to isolate whether it was causing the hang.
Result: the system still hung at the same point ([14.77x], imx415 sensor), which showed that the hang was caused by something other than fe150000.
Attempt 5 — Disable camera DTBO overlays
Removed rock-5t-cam0-radxa-camera-4k.dtbo and rock-5t-cam1-radxa-camera-4k.dtbo from extlinux.conf.
Result: this exposed a different crash, a kernel stack overflow in pci_do_find_bus with runaway recursion (~80+ levels deep). The camera overlays had been masking this crash by hanging the boot earlier.
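
For repeatability, the overlay removal can be scripted instead of edited by hand. The sketch below strips named .dtbo files from `fdtoverlays` lines in extlinux.conf text. The sample file content and the /boot/dtbo/ paths are assumptions for illustration, not copied from the board.

```python
def drop_overlays(conf_text, unwanted):
    """Remove the named .dtbo files from every `fdtoverlays` line.

    A line left with no overlays is dropped entirely.
    """
    out = []
    for line in conf_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "fdtoverlays":
            # Compare on the file name only, ignoring the directory part.
            kept = [p for p in parts[1:] if p.rsplit("/", 1)[-1] not in unwanted]
            if not kept:
                continue
            indent = line[: len(line) - len(line.lstrip())]
            line = indent + "fdtoverlays " + " ".join(kept)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical extlinux.conf fragment.
SAMPLE = (
    "label l0\n"
    "\tlinux /boot/vmlinuz-6.1.84-8-rk2410\n"
    "\tfdtoverlays /boot/dtbo/rock-5t-cam0-radxa-camera-4k.dtbo"
    " /boot/dtbo/rock-5t-cam1-radxa-camera-4k.dtbo\n"
    "\tappend root=LABEL=rootfs\n"
)

CAMERA_OVERLAYS = {
    "rock-5t-cam0-radxa-camera-4k.dtbo",
    "rock-5t-cam1-radxa-camera-4k.dtbo",
}

print(drop_overlays(SAMPLE, CAMERA_OVERLAYS))
```

Keeping the original file alongside the edited one makes it easy to restore the overlays once the PCIe issue is resolved.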
Root cause of stack overflow identified
The Metis internal bridge appears to advertise subordinate bus = 0xff, causing pci_scan_bridge_extend → pci_scan_child_bus_extend → pci_scan_bridge_extend to recurse until the kernel stack is exhausted.
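
On a boot that does come up (for example with the original DTB), `lspci -vv` prints a `Bus:` line for each bridge, and those numbers can be compared with the DT bus-range. The sketch below flags bridges whose bus numbers fall outside a given range. The sample line is constructed from the subordinate=0xff value described above, not captured from the board.

```python
import re

# Bridge line as printed by `lspci -v`, e.g.
#   "Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0"
BUS_LINE = re.compile(
    r"Bus: primary=(?P<pri>[0-9a-f]{2}), "
    r"secondary=(?P<sec>[0-9a-f]{2}), "
    r"subordinate=(?P<sub>[0-9a-f]{2})"
)


def bridges_outside_range(lspci_text, first_bus, last_bus):
    """Yield (primary, secondary, subordinate) for bridges whose bus numbers
    fall outside the inclusive range [first_bus, last_bus]."""
    for match in BUS_LINE.finditer(lspci_text):
        pri, sec, sub = (int(match.group(k), 16) for k in ("pri", "sec", "sub"))
        if sec < first_bus or sub > last_bus:
            yield pri, sec, sub


# Constructed sample; bus-range <0x00 0x0f> is the value from the DT node.
SAMPLE = "\tBus: primary=00, secondary=01, subordinate=ff, sec-latency=0\n"

for pri, sec, sub in bridges_outside_range(SAMPLE, 0x00, 0x0F):
    print(f"bridge {pri:02x} -> {sec:02x}..{sub:02x} exceeds bus-range 00..0f")
```

If the real `lspci -vv` output shows no bridge beyond the configured bus-range, the subordinate=0xff explanation would need revisiting.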
Attempt 6 — Change bus-range from <0x00 0x0f> to <0x01 0x0f>
Hypothesis was that bus 0 conflict caused the recursion.
Failed — same stack overflow, same recursion depth. Bus range starting at 1 did not prevent the subordinate bus scan recursion.
Attempt 7 — Add pci=noaer pcie_aspm.policy=performance kernel parameters
Failed — same stack overflow unchanged. These parameters do not affect bridge subordinate bus scanning behavior.
Current state
The DT memory allocation appears to be working (46MB window, no collisions, Gen.3 x2 link confirmed). The remaining problem appears to be a kernel-level PCIe bridge scanning issue triggered by the Metis advertising subordinate=0xff. My next attempt is to boot with pci=noaer,noexpand pcie_aspm=off pcie_port_pm=off, which is intended to prevent subordinate bus range expansion during enumeration. With this bug I still cannot complete the boot sequence.
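
When testing kernel parameters, it helps to confirm which ones the kernel actually received, either from the "Kernel command line:" line in the serial log or from /proc/cmdline on a boot that completes. A small parser sketch follows. The sample string is illustrative and only reuses the parameters from Attempt 7.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}.

    Bare flags map to None. If a name is repeated, the last value wins.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Illustrative command line; root= and quiet are placeholders.
SAMPLE = "root=LABEL=rootfs quiet pci=noaer pcie_aspm.policy=performance"

params = parse_cmdline(SAMPLE)
for name in ("pci", "pcie_aspm.policy", "pcie_port_pm"):
    print(name, "=", params.get(name, "<not set>"))
```

On the board, `parse_cmdline(open("/proc/cmdline").read())` gives the same view of the running kernel.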
I have attached the dtb in .txt format for your perusal. Thank you!

