Subject: Metis AIPU detected on PCIe but not responding – stage0 load failure on RK3588 Compute Board (Voyager SDK 1.5.3)

  • March 22, 2026
  • 7 replies
  • 238 views

Dear Axelera Support,

I am currently following the Voyager SDK quick start guide (YOLO inference demo) on an RK3588-based Compute Board (received out-of-the-box), using Voyager SDK version 1.5.3. However, I am unable to successfully run inference due to persistent device communication issues with the Metis AIPU.

--------------------------------------------------

System Overview

- Platform: RK3588 (Compute Board, out-of-the-box setup)
- Voyager SDK: 1.5.3 (and also tested with 1.4.0)
- Connection: PCIe (Metis AIPU detected as 01:00.0 [1f9d:1100])

--------------------------------------------------

Initial Issue

Running the quick start inference:

./inference.py yolov5s-v7-coco usb:/dev/video1

Result:

ERROR: No devices found

At that moment:
- lspci returned no devices, indicating a PCIe enumeration issue.
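
One way to separate a PCIe enumeration problem from a driver or runtime problem is to look for the Metis vendor:device ID directly. The sketch below assumes the ID 1f9d:1100 quoted in the System Overview; the function name metis_enumerated is made up for illustration and is not part of the Voyager SDK.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin (expected to be
# `lspci -nn` output) mentions the Metis vendor:device ID from this post.
metis_enumerated() {
    grep -qi '1f9d:1100'
}

# Usage on the board (not run here):
#   if lspci -nn | metis_enumerated; then
#       echo "Metis is enumerated on PCIe"
#   else
#       echo "Metis is missing from lspci; the problem sits below the driver and SDK"
#   fi
```

If the ID is absent, SDK-level tools such as axdevice have nothing to talk to, so there is little point debugging the runtime until enumeration works.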

--------------------------------------------------

After Cold Boot

After performing a full power cycle:

- lspci correctly shows:
  01:00.0 Processing accelerators: Axelera AI Metis AIPU (rev 02) [1f9d:1100]

However, the device still does not respond:

Device communication timed out
Failed to get valid board type (got 8)
board_type=unknown (not responding)

--------------------------------------------------

Runtime / Inference Error

When the Metis device is detected and I attempt to run the model again, I consistently receive the following error:

[libtriton_linux.c:1313] Device communication timed out: device did not respond within 1 seconds. (0)
Failed to get valid board type for device metis-0:1:0 got 8
Unknown board type unknown from device metis-0:1:0, assuming pcie
arm_release_ver: g13p0-01eac0, rk_so_ver: 11
WARNING : Unknown board type unknown from device metis-0:1:0, assuming pcie
[libtriton_linux.c:1313] Device communication timed out: device did not respond within 1 seconds. (0)
ERROR   : Failed to load runtime stage0: /opt/axelera/device-1.5.3-1/omega/bin/start_axelera_runtime_stage0.bin
ERROR   : AXR_ERROR_CONNECTION_ERROR: Failed to load runtime stage0: /opt/axelera/device-1.5.3-1/omega/bin/start_axelera_runtime_stage0.bin

--------------------------------------------------

axdevice Diagnostics

axdevice -v
- PCI device detected
- Driver detected
- Device not responding

axdevice --refresh -v
Failed to load runtime stage0:
start_axelera_runtime_stage0.bin
AXR_ERROR_CONNECTION_ERROR

axdevice --reboot -v
Failed to execute reboot
Failed to execute a cold_boot
AXR_ERROR_CONNECTION_ERROR

axdevice --report
Generated and in attachments

--------------------------------------------------

Summary

- PCIe enumeration works after a cold boot
- Metis device is visible and the driver is loaded
- Communication with the device consistently times out
- Runtime stage0 cannot be loaded
- Device reboot via axdevice fails
- The same stage0 failure occurs when running inference

--------------------------------------------------

Could you please help me resolve this issue or advise on how to proceed?

I have attached the generated report file for further analysis.

Thank you in advance for your support.

  • Axelera Team
  • March 24, 2026

Hello ​@Nathan

Thank you for sharing the detailed observations and logs. I would like to verify two additional details: the Yocto BSP version and the Metis driver version. Could you please run the following commands and share their outputs?

  • cat /etc/os-release
  • cat /sys/class/metis/version
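
Both values can also be collected in one step. This is a minimal sketch, assuming /etc/os-release follows the usual shell-compatible KEY=value format and that /sys/class/metis/version is readable once the driver is loaded; bsp_version is an illustrative name, not an SDK command.

```shell
#!/bin/sh
# Hypothetical helper: print the VERSION field of an os-release style file.
# The file is sourced in a subshell so the caller's variables stay untouched.
bsp_version() {
    ( . "$1" && printf '%s\n' "$VERSION" )
}

# Usage on the board (not run here):
#   echo "BSP:    $(bsp_version /etc/os-release)"
#   echo "Driver: $(cat /sys/class/metis/version)"
```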

  • Author
  • Cadet
  • March 24, 2026

Hi snehakondur,

Here is our output:

antelao@antelao-3588:~$ cat /etc/os-release
ID=voyager
NAME="Voyager Linux"
VERSION="1.3.1"
BUILD_ID="jenkins_235"
PRETTY_NAME="Voyager Linux 1.3.1"
BOARD_TYPE=antelao-3588
GIT_HASH="6db6ebfcfc0214e7e554b91081232698900ccc8d"

antelao@antelao-3588:~$ cat /sys/class/metis/version
1.4.4

  • Cadet
  • May 21, 2026

Hi ​@Nathan , ​@snehakondur ,

 

Any resolution for this?

 

Until recently my Metis Compute Board was working perfectly with kernel driver version 1.4.4 and SDK version 1.5.3. Yesterday, however, I updated the kernel driver to 1.4.16 and the SDK to 1.6.0.

It was working well (running inference inside containers and applications outside them), but I believe something has gone wrong with the build since I power cycled the board. I now get the same "no devices found" error: Failed to get valid board type for device metis-0:1:0 got 8

I am running BSP v1.3.1.

I have tried power cycling a few times and running axdevice --refresh, but the problem persists. Any ideas?

Thanks!


Spanner
  • Axelera Team
  • May 22, 2026

Hi ​@cking233! Hmm, that’s an odd one. Since this only started after the power cycle, maybe the first thing to rule out is whether the updated driver is actually being loaded on boot? Could you run lsmod | grep metis and share the output? That'll show us what kernel module is in place. From there we can work out the next step. 👍
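
Note that lsmod lists only the module name, size, and use count, not its version. A rough sketch of a version check follows, assuming that /sys/class/metis/version (used earlier in this thread) reflects the driver that is currently loaded; driver_version_is is a hypothetical helper, and 1.4.16 is the version the update was meant to install.

```shell
#!/bin/sh
# Hypothetical helper: compare the version string in a file (for example
# /sys/class/metis/version) with the version expected after the update.
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be read.
driver_version_is() {
    expected=$1
    version_file=$2
    [ -r "$version_file" ] || { echo "cannot read $version_file"; return 2; }
    actual=$(cat "$version_file")
    if [ "$actual" = "$expected" ]; then
        echo "driver version $actual matches"
    else
        echo "driver version $actual, expected $expected"
        return 1
    fi
}

# Usage on the board (not run here):
#   lsmod | grep metis
#   driver_version_is 1.4.16 /sys/class/metis/version
```

A mismatch would suggest the old module is still being picked up at boot. A match would point away from the driver and toward the device itself.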


  • Cadet
  • May 26, 2026

Hi ​@Spanner , 

 

Output of lsmod | grep metis is:

metis                 126976  0

The output is the same inside and outside the Docker container on my Metis Compute Board.


  • Cadet
  • May 26, 2026

Also, some more information.

The output of axdevice -v is:

antelao@antelao-3588:~$ python3 start_axelera.py start --container-name "may_con" --version "1.6.0"
[INFO] Initializing Voyager SDK container 'may_con' (Version 1.6.0)
[INFO] X11 display server detected on host system (:0).
[INFO] SSH connection detected - container configured for host's physical display.
[INFO] GUI applications will appear on the display connected to the host machine.
[INFO] Existing container 'may_con' detected.
[INFO] Container has exited. Restarting container session...
ubuntu@antelao-3588:~/voyager-sdk$ axdevice -v
INFO: Found AIPU driver: metis 126976 0
[libaxldev_linux.c:1515] Device communication timed out: device did not respond within 1 seconds. (0)
WARNING: Failed to get valid board type for device metis-0:1:0 got 8
INFO: Ignoring device 0(metis-0:1:0) as it has not returned a valid board type: 8
INFO: Using device metis-0:1:0
Device 0: metis-0:1:0 board_type=unknown (not responding)

 

I am not quite sure what is happening. I have also noticed that after a soft reboot (running reboot from a bash terminal), the Metis device does not appear as a connected PCIe device in lspci until the board has been power cycled at the wall.

 

This only started happening after I attempted to run the scripts here: https://github.com/jde-axelera/yocto_voyager

I believe they have messed with the configuration of my device.

 

Thanks, 

Cameron 


  • Cadet
  • May 26, 2026

Hi ​@Spanner , 

 

No luck debugging it any further today; the errors continue. One thing I did find is that the output of “lspci -vv -s 01:00.0” shows “DLActive-”, which apparently suggests the device is physically connected but has a firmware error?
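
When reading the link state, the negotiated speed and width on the LnkSta line may be more informative than DLActive alone. Going by the PCIe specification, the Data Link Layer Link Active bit only has to be implemented on downstream ports that advertise link-active reporting, so DLActive- on an endpoint such as 01:00.0 may not by itself indicate a firmware fault. The sketch below pulls the speed and width out of the lspci output; lnksta_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -vv` output on stdin and print the
# negotiated link speed and width found on the LnkSta line.
lnksta_summary() {
    sed -n 's/.*LnkSta:[[:space:]]*Speed \([^,]*\), Width \(x[0-9]*\).*/speed=\1 width=\2/p'
}

# Usage on the board (not run here; -vv usually needs root to show
# the capability registers):
#   sudo lspci -vv -s 01:00.0 | lnksta_summary
#   sudo lspci -vv -s 01:00.0 | grep -E 'LnkCap:|LnkSta:'
```

Comparing the LnkSta values against LnkCap shows whether the link trained at a lower speed or width than the device supports.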

See the attached .txt file for the output of some of the common Axelera debug commands.

 

Hopefully someone has dealt with this issue before!

 

Thanks, 

Cameron