Dear Axelera Support,
I am attempting to install the Voyager SDK on the Radxa Rock 5T board with the Axelera AI Accelerator M.2. While the SDK suggests using Ubuntu 22.04, I could not find an Ubuntu image for the 5T (https://github.com/DHDAXCW/ubuntu-rockchip-rk3588?tab=readme-ov-file).
I have reviewed previous threads, but either the issue was never resolved (https://community.axelera.ai/support-central-47/subject-metis-aipu-detected-on-pcie-but-not-responding-stage0-load-failure-on-rk3588-compute-board-voyager-sdk-1-5-3-1280) or the board in question has an Ubuntu image available (https://community.axelera.ai/metis-pcie-7/pcie-connection-doesn-t-work-with-radxa-rock-5b-1101).
I am not sure whether the Ubuntu image meant for the Radxa Rock 5B+ can be used on the Rock 5T.
radxa@rock-5t:~$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
radxa@rock-5t:~$ cat /sys/class/metis/version
1.4.16
Installing the SDK on Debian
This is how I managed to install the SDK on Debian.
After cloning the repository (install guide: https://github.com/axelera-ai-hub/voyager-sdk/blob/0b25b098ca5fa591d7b5aa2fe71ad017089d64b3/docs/tutorials/install.md):
Install base dependencies:
sudo apt update
sudo apt install -y \
gstreamer1.0-tools \
gstreamer1.0-plugins-base \
gstreamer1.0-plugins-good \
gstreamer1.0-plugins-bad \
gstreamer1.0-libav \
libgstreamer1.0-dev \
python3-gi \
python3-pip \
pciutils \
dkms
Create Debian config (copy from Ubuntu):
cd ~/Documents/voyager-sdk
cp cfg/config-ubuntu-2204-arm64.yaml \
cfg/config-debian-12-arm64.yaml
sed -i 's/python3\.10-dev/python3.11-dev/g' \
cfg/config-debian-12-arm64.yaml
Fix Python version mismatch:
sudo apt install -y \
python3.11-dev \
libpython3-dev \
python3-venv \
build-essential
Remove invalid Ubuntu-only GStreamer package:
python3 - <<'PY'
from pathlib import Path
p = Path("cfg/config-debian-12-arm64.yaml")
text = p.read_text().replace(
" - libgstreamer-plugins-good1.0-dev\n", ""
)
p.write_text(text)
print("patched")
PY
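The exact-string replace above only matches if the indentation in the YAML file is exactly as written. A minimal sketch of a more tolerant patch, assuming the package appears as an ordinary YAML list item on its own line (the sample text below is hypothetical, not the real contents of the SDK config):

```python
import re

# Matches the list entry at any indentation, including its trailing newline.
GOOD_DEV = re.compile(
    r"^[ \t]*-[ \t]*libgstreamer-plugins-good1\.0-dev[ \t]*\n", re.MULTILINE
)

def patch_config(text: str) -> str:
    """Return config text adapted for Debian 12 (illustrative helper)."""
    # Same substitution as the sed command: Debian 12 ships Python 3.11.
    text = text.replace("python3.10-dev", "python3.11-dev")
    # Drop the package entry that Debian does not provide.
    return GOOD_DEV.sub("", text)

# Hypothetical sample, only to show the behaviour.
sample = (
    "packages:\n"
    "    - python3.10-dev\n"
    "    - libgstreamer-plugins-good1.0-dev\n"
    "    - libgstreamer-plugins-base1.0-dev\n"
)
patched = patch_config(sample)
print(patched)
```

To apply it to the real file, the same function could be called on Path("cfg/config-debian-12-arm64.yaml").read_text() and the result written back.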
sudo apt install -y \
gstreamer1.0-plugins-good \
libgstreamer-plugins-base1.0-dev \
libgstreamer-plugins-bad1.0-dev \
libgstrtspserver-1.0-dev
sudo apt install -y \
librga-dev \
librga2 \
libasio-dev \
libboost-log1.74.0 \
libboost-program-options1.74.0 \
libboost-regex1.74.0 \
libeigen3-dev \
graphviz \
unzip
Operator build failure - Fixing Werror / arch workaround
Problem: Unsupported architecture: unknown ""
export ARCH=arm64
export AARCH64=1
export WERROR=0
Run installer:
./install.sh --runtime --no-development
(venv) radxa@rock-5t:~/Documents/voyager-sdk$ lspci | grep -i axelera
0000:01:00.0 Processing accelerators: Axelera AI Metis AIPU (rev 02)
(venv) radxa@rock-5t:~/Documents/voyager-sdk$ lsmod | grep metis
metis 118784 0
(venv) radxa@rock-5t:~/Documents/voyager-sdk$ triton_multi_ctx
usage: triton_multi_ctx
Issues
However, the following issues remain:
1st issue: axdevice reports a PCI device count mismatch:
(venv) radxa@rock-5t:~/Documents/voyager-sdk$ axdevice
WARNING: 4PCI device count mismatch: lspci=1, triton=0
I then ran sudo dmesg | grep -iE 'metis|axelera|aipu|triton|firmware|pci'
The output is in axdevice.txt.
I could not diagnose this properly myself, so I gave the output to ChatGPT, which produced the following diagnosis (it still needs to be verified):
BAR 2: no space for [mem size 0x02000000]
followed by BAR 0 / BAR 6 also failing to assign. That means the ROCK 5T host can enumerate the Metis card on PCIe, but it is not giving the device enough MMIO/BAR address space, so the Axelera runtime cannot fully map the card. That matches why you get lspci=1, triton=0: Linux sees the device, but the runtime cannot use it.
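To pull every failed BAR assignment out of a saved dmesg log, a small sketch like the following may help. It assumes the failure lines contain the same "BAR n: no space for [mem size 0x...]" wording as the line quoted above; the function name is illustrative.

```python
import re

# Captures the BAR index and the requested size from a "no space" line.
BAR_FAIL = re.compile(r"BAR (\d+): no space for \[mem size (0x[0-9a-fA-F]+)")

def failed_bars(log_text: str) -> dict:
    """Return {bar_index: size_in_bytes} for every 'no space' BAR line."""
    return {int(m.group(1)): int(m.group(2), 16) for m in BAR_FAIL.finditer(log_text)}

# The line quoted in the diagnosis above.
sample_log = "BAR 2: no space for [mem size 0x02000000]\n"
for bar, size in failed_bars(sample_log).items():
    print(f"BAR {bar}: needs {size // (1024 * 1024)} MiB")
```

Running it over the full contents of axdevice.txt would list the sizes requested by BAR 0 and BAR 6 as well, if their lines follow the same format.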
2nd issue: axelera-multi-device.service failed to start. Its output is in axelera-multi-device.service.txt.
Root problem
The Metis needs 32MB for BAR2 (non-prefetchable). The original pcie@fe150000 DT node only had a 14MB MEM window (0xf0200000–0xf0ffffff), so the kernel could never assign BAR2.
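The arithmetic behind this can be checked in a few lines. This sketch uses only the addresses and the 32MB BAR2 size stated above; the variable names are illustrative.

```python
MIB = 1024 * 1024

window_start = 0xF0200000
window_end = 0xF0FFFFFF      # inclusive end of the original MEM window
bar2_size = 32 * MIB         # BAR2 size reported in dmesg (0x02000000)

# Inclusive range, so add 1 to get the size in bytes.
window_size = window_end - window_start + 1

print(f"window: {window_size // MIB} MiB, BAR2: {bar2_size // MIB} MiB")
print("BAR2 fits" if bar2_size <= window_size else "BAR2 does not fit")
```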
Attempted Fixes
The attempted fixes target the PCIe memory allocation in the device tree rk3588-rock-5t.dtb, found on the board at /usr/lib/linux-image-6.1.84-8-rk2410/rockchip/rk3588-rock-5t.dtb:
Fallback, Recovery & Serial Debugging
I made a backup copy before any edits. The usual edit cycle is:
vi ~/rock5t-safe.dts # change only 0xe00000 → 0x04000000
dtc -I dts -O dtb -o ~/rock5t-safe.dtb ~/rock5t-safe.dts
sudo cp ~/rock5t-safe.dtb /usr/lib/linux-image-$(uname -r)/rockchip/rk3588-rock-5t.dtb
sudo reboot
If the board crashes during boot, I recover by moving the SD card to my laptop and restoring the backed-up dtb:
sudo cp \
/media/<usr-name>/rootfs/home/radxa/dtb-backup/linux-image-6.1.84-8-rk2410/rockchip/rk3588-rock-5t.dtb \
/media/<usr-name>/rootfs/usr/lib/linux-image-6.1.84-8-rk2410/rockchip/rk3588-rock-5t.dtb
sync
sudo umount /media/<usr-name>/rootfs
sudo umount /media/<usr-name>/config
The cause of each crash can be found by serial debugging over a USB-to-TTL cable.
GPIO pins: pin 6 → GND, pin 8 → TX, pin 10 → RX
I view the serial logs in Tabby on device /dev/ttyUSB0 at baud rate 1500000, with 8 data bits, 1 stop bit and no parity. All other settings are default.
I'm not an expert, so I relied on Claude and ChatGPT to help with the ideation and diagnosis of the fixes. I can provide the logs where necessary.
Attempt Logs
Attempt 1 — Expand fe150000 MEM to 64MB
Changed ranges size from 0x00e00000 to 0x04000000.
Failed — this caused fe150000 MEM end (0xf41fffff) to overlap with fe180000 config (0xf3000000) and fe190000 config (0xf4000000), triggering a resource collision and kernel NULL pointer dereference crash in rk_pcie_remove.
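The collision can be reproduced with simple range arithmetic. This is a sketch using only the addresses quoted above; it tests whether each config base address falls inside the enlarged MEM window, and the helper names are illustrative.

```python
def window_end(start: int, size: int) -> int:
    """Inclusive end address of a window."""
    return start + size - 1

def bases_inside(start: int, size: int, bases: dict) -> list:
    """Names of the regions whose base address lies inside the window."""
    end = window_end(start, size)
    return [name for name, base in bases.items() if start <= base <= end]

mem_start = 0xF0200000
config_bases = {"fe180000": 0xF3000000, "fe190000": 0xF4000000}

original = bases_inside(mem_start, 0x00E00000, config_bases)
attempt1 = bases_inside(mem_start, 0x04000000, config_bases)

print("original window ends at", hex(window_end(mem_start, 0x00E00000)), original)
print("attempt 1 window ends at", hex(window_end(mem_start, 0x04000000)), attempt1)
```

The original 0x00e00000 window contains neither config base, while the 0x04000000 window contains both.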
Attempt 2 — Expand fe150000 + shift fe160000/fe170000/fe180000/fe190000
Moved all five controllers to new address ranges to avoid collision.
Failed — fe180000 and fe190000 were moved to 0xf7/0xf8 ranges which appear to be outside what the RK3588 hardware actually supports, causing a boot hang. The system never reached SSH.
Attempt 3 — Expand fe150000 to 46MB + shift only fe160000/fe170000, leave fe180000/fe190000 original
fe150000 MEM: 0xf0200000–0xf2ffffff (46MB, enough for Metis).
fe160000 config: 0xf5000000, fe170000 config: 0xf6000000.
fe180000/fe190000 unchanged at original 0xf3/0xf4 addresses.
Partially successful: no collision and no crash, and fe150000 linked up at Gen.3 x2. However, the system still hung, apparently in a readl spin on CPU7 during PCIe enumeration.
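One point that may be worth double-checking for this window is alignment. PCI BARs are normally placed at an address that is a multiple of their own size, so a 32MB BAR needs a 32MB-aligned start. The sketch below assumes that rule applies here and that the window is identity-mapped (PCI bus address equals CPU address), which is an assumption about the ranges entry rather than something confirmed in this post.

```python
MIB = 1024 * 1024

def first_aligned_fit(win_start: int, win_end: int, bar_size: int):
    """Lowest naturally aligned address where the BAR fits, or None."""
    # Round the window start up to the next multiple of the BAR size.
    addr = (win_start + bar_size - 1) // bar_size * bar_size
    return addr if addr + bar_size - 1 <= win_end else None

win_start, win_end = 0xF0200000, 0xF2FFFFFF  # Attempt 3 MEM window (inclusive)
fit = first_aligned_fit(win_start, win_end, 32 * MIB)

print((win_end - win_start + 1) // MIB, "MiB window")
print("BAR2 placement:", hex(fit) if fit is not None else "none")
```

Under those assumptions the first 32MB-aligned address in the window is 0xf2000000, and a 32MB BAR starting there would end at 0xf3ffffff, past the end of the window. So once the system boots, it may be worth confirming in dmesg that BAR 2 is actually assigned.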
Attempt 4 — Identify source of hang
Disabled fe150000 entirely to isolate whether it was causing the hang.
Result: the system still hung at the same point ([14.77x] imx415 sensor), which showed that the hang was caused by something other than fe150000.
Attempt 5 — Disable camera DTBO overlays
Removed rock-5t-cam0-radxa-camera-4k.dtbo and rock-5t-cam1-radxa-camera-4k.dtbo from extlinux.conf.
Result — revealed a new and different crash: kernel stack overflow in pci_do_find_bus with infinite recursion (~80+ levels deep). The camera overlays were masking this crash by hanging earlier.
Root cause of stack overflow identified
The Metis internal bridge advertises subordinate bus = 0xff, causing pci_scan_bridge_extend → pci_scan_child_bus_extend → pci_scan_bridge_extend to recurse indefinitely until the kernel stack is exhausted.
Attempt 6 — Change bus-range from <0x00 0x0f> to <0x01 0x0f>
Hypothesis was that bus 0 conflict caused the recursion.
Failed — same stack overflow, same recursion depth. Bus range starting at 1 did not prevent the subordinate bus scan recursion.
Attempt 7 — Add pci=noaer pcie_aspm.policy=performance kernel parameters
Failed — same stack overflow unchanged. These parameters do not affect bridge subordinate bus scanning behavior.
Current state
The DT memory allocation appears to be working (46MB window, no collisions, Gen.3 x2 link confirmed). The remaining problem seems to be purely a kernel-level PCIe bridge scanning issue triggered by the Metis advertising subordinate=0xff. The next attempt will be the kernel parameters pci=noaer,noexpand pcie_aspm=off pcie_port_pm=off, which are meant to prevent subordinate bus range expansion during enumeration (I have not yet verified that they do). With this bug I still cannot complete the boot sequence.
I have attached the dtb in .txt format for your perusal. Thank you!

