Question

Subject: Metis AIPU detected on PCIe but not responding – stage0 load failure on RK3588 Compute Board (Voyager SDK 1.5.3)

  • March 22, 2026
  • 3 replies
  • 178 views

Dear Axelera Support,

I am currently following the Voyager SDK quick start guide (YOLO inference demo) on an RK3588-based Compute Board (received out-of-the-box), using Voyager SDK version 1.5.3. However, I am unable to successfully run inference due to persistent device communication issues with the Metis AIPU.

--------------------------------------------------

System Overview

- Platform: RK3588 (Compute Board, out-of-the-box setup)
- Voyager SDK: 1.5.3 (and also tested with 1.4.0)
- Connection: PCIe (Metis AIPU detected as 01:00.0 [1f9d:1100])

--------------------------------------------------

Initial Issue

Running the quick start inference:

./inference.py yolov5s-v7-coco usb:/dev/video1

Result:

ERROR: No devices found

At that moment:
- lspci returned no devices, indicating a PCIe enumeration issue.
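
When lspci shows nothing, the same check can be made against sysfs directly, which helps rule out a problem with lspci itself. The following is a minimal sketch, assuming the standard Linux sysfs layout (a `vendor` and a `device` file per entry under /sys/bus/pci/devices) and the 1f9d:1100 ID listed in the system overview above. The function name `find_pci_devices` is hypothetical and is not part of the Voyager SDK.

```python
from pathlib import Path

# Vendor/device ID reported for the Metis AIPU in this post (1f9d:1100).
METIS_VENDOR = 0x1F9D
METIS_DEVICE = 0x1100


def find_pci_devices(vendor, device, root="/sys/bus/pci/devices"):
    """Return the PCI addresses under `root` that match vendor:device."""
    matches = []
    base = Path(root)
    if not base.is_dir():
        return matches  # no PCI bus exposed here (or not a Linux system)
    for dev in sorted(base.iterdir()):
        try:
            # sysfs stores the IDs as hex strings such as "0x1f9d\n"
            found_vendor = int((dev / "vendor").read_text().strip(), 16)
            found_device = int((dev / "device").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # skip entries that cannot be read or parsed
        if (found_vendor, found_device) == (vendor, device):
            matches.append(dev.name)
    return matches


if __name__ == "__main__":
    hits = find_pci_devices(METIS_VENDOR, METIS_DEVICE)
    print("Metis enumerated at:", ", ".join(hits) if hits else "not found")
```

An empty result here would agree with the empty lspci output and point at enumeration (power, seating, or boot timing) rather than at the driver or SDK.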

--------------------------------------------------

After Cold Boot

After performing a full power cycle:

- lspci correctly shows:
  01:00.0 Processing accelerators: Axelera AI Metis AIPU (rev 02) [1f9d:1100]

However, the device still does not respond:

Device communication timed out
Failed to get valid board type (got 8)
board_type=unknown (not responding)

--------------------------------------------------

Runtime / Inference Error

When the Metis device is detected and I attempt to run the model again, I consistently receive the following error:

[libtriton_linux.c:1313] Device communication timed out: device did not respond within 1 seconds. (0)
Failed to get valid board type for device metis-0:1:0 got 8
Unknown board type unknown from device metis-0:1:0, assuming pcie
arm_release_ver: g13p0-01eac0, rk_so_ver: 11
WARNING : Unknown board type unknown from device metis-0:1:0, assuming pcie
[libtriton_linux.c:1313] Device communication timed out: device did not respond within 1 seconds. (0)
ERROR   : Failed to load runtime stage0: /opt/axelera/device-1.5.3-1/omega/bin/start_axelera_runtime_stage0.bin
ERROR   : AXR_ERROR_CONNECTION_ERROR: Failed to load runtime stage0: /opt/axelera/device-1.5.3-1/omega/bin/start_axelera_runtime_stage0.bin
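
To make logs like the one above easier to compare between runs, the failure signatures can be pulled out programmatically. This is a minimal sketch, assuming only the message formats visible in this log; `summarize_log` and its patterns are hypothetical helpers, not part of the Voyager SDK.

```python
import re

# Patterns based solely on the messages visible in the log above.
TIMEOUT_RE = re.compile(r"Device communication timed out")
BOARD_TYPE_RE = re.compile(
    r"Failed to get valid board type for device (\S+) got (\d+)"
)
AXR_ERROR_RE = re.compile(r"\b(AXR_ERROR_[A-Z_]+)\b")


def summarize_log(text):
    """Count timeouts and collect board-type failures and AXR error codes."""
    summary = {"timeouts": 0, "board_type_failures": [], "axr_errors": []}
    for line in text.splitlines():
        if TIMEOUT_RE.search(line):
            summary["timeouts"] += 1
        match = BOARD_TYPE_RE.search(line)
        if match:
            # (device name, numeric board type the runtime reported)
            summary["board_type_failures"].append(
                (match.group(1), int(match.group(2)))
            )
        for code in AXR_ERROR_RE.findall(line):
            if code not in summary["axr_errors"]:
                summary["axr_errors"].append(code)  # keep first-seen order
    return summary
```

Passing the captured output of a failed run to `summarize_log` gives a small dictionary that can be diffed against a later run, for example after a cold boot or an SDK change.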

--------------------------------------------------

axdevice Diagnostics

axdevice -v
- PCI device detected
- Driver detected
- Device not responding

axdevice --refresh -v
Failed to load runtime stage0:
start_axelera_runtime_stage0.bin
AXR_ERROR_CONNECTION_ERROR

axdevice --reboot -v
Failed to execute reboot
Failed to execute a cold_boot
AXR_ERROR_CONNECTION_ERROR

axdevice --report
Report generated; see the attachment.

--------------------------------------------------

Summary

- PCIe enumeration works after a cold boot
- Metis device is visible and the driver is loaded
- Communication with the device consistently times out
- Runtime stage0 cannot be loaded
- Device reboot via axdevice fails
- The same stage0 failure occurs when running inference

--------------------------------------------------

Could you please help me resolve this issue or advise on how to proceed?

I have attached the generated report file for further analysis.

Thank you in advance for your support.

3 replies

  • Axelera Team
  • March 24, 2026

Hello ​@Nathan

Thank you for sharing the detailed observations and logs. I would like to verify two additional details: the Yocto BSP version and the Metis driver version. Could you please run the following commands and share their output?

  • cat /etc/os-release
  • cat /sys/class/metis/version

  • Author
  • Cadet
  • March 24, 2026

Hi snehakondur,

Here is our output:

antelao@antelao-3588:~$ cat /etc/os-release
ID=voyager
NAME="Voyager Linux"
VERSION="1.3.1"
BUILD_ID="jenkins_235"
PRETTY_NAME="Voyager Linux 1.3.1"
BOARD_TYPE=antelao-3588
GIT_HASH="6db6ebfcfc0214e7e554b91081232698900ccc8d"

antelao@antelao-3588:~$ cat /sys/class/metis/version
1.4.4
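
Because the question here is whether the BSP, driver, and SDK versions fit together, it can help to collect all three in one place. This is a sketch, assuming the two files named in the reply above exist on the board; `parse_os_release` and `collect_versions` are hypothetical names, and the SDK version is passed in by hand because this thread does not show where it is stored. The parser handles only simple KEY=value lines, not the full os-release quoting rules.

```python
from pathlib import Path


def parse_os_release(text):
    """Parse simple KEY=value lines into a dict, stripping surrounding quotes."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key.strip()] = value.strip().strip('"').strip("'")
    return info


def collect_versions(sdk_version, os_release="/etc/os-release",
                     driver_version="/sys/class/metis/version"):
    """Gather BSP, driver, and SDK versions; unreadable files become None."""
    def read(path):
        try:
            return Path(path).read_text()
        except OSError:
            return None

    os_text = read(os_release)
    drv_text = read(driver_version)
    return {
        "bsp": parse_os_release(os_text).get("VERSION") if os_text else None,
        "driver": drv_text.strip() if drv_text else None,
        "sdk": sdk_version,  # supplied by the caller, not detected
    }
```

Calling `collect_versions("1.5.3")` on the board would return one dictionary that can be pasted into a support request alongside the logs.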

  • May 21, 2026

Hi @Nathan, @snehakondur,

Is there any resolution for this?

Until recently I had been using my Metis Compute Board without problems on kernel version 1.4.4 and SDK version 1.5.3. Yesterday, however, I updated the kernel version to 1.4.16 and the SDK to 1.6.0.

It worked well at first (running inference inside containers and in applications outside them), but I believe something went wrong with the build after I power cycled the board. I now get the same "no devices found" error: Failed to get valid board type for device metis-0:1:0 got 8

I am running BSP v1.3.1.

I have tried power cycling a few times and running axdevice --refresh, but the problem persists. Any ideas?

Thanks!