Dear Axelera Support,
I am following the Voyager SDK quick start guide (YOLOv5 inference demo) on an RK3588-based Compute Board in its out-of-the-box configuration, using Voyager SDK version 1.5.3. Inference fails every time because the host cannot communicate with the Metis AIPU.
--------------------------------------------------
System Overview
- Platform: RK3588 (Compute Board, out-of-the-box setup)
- Voyager SDK: 1.5.3 (also tested with 1.4.0)
- Connection: PCIe (Metis AIPU detected as 01:00.0 [1f9d:1100])
--------------------------------------------------
Initial Issue
Running the quick start inference:
./inference.py yolov5s-v7-coco usb:/dev/video1
Result:
ERROR: No devices found
At that point:
- lspci returned no devices, which indicates a PCIe enumeration problem.
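The enumeration check can be repeated with a small helper like the sketch below. It reads `lspci -nn` output on stdin and reports whether a device with the ID 1f9d:1100 (the ID listed under System Overview) appears. The function name `metis_pci_state` is hypothetical and not part of the Voyager SDK.

```shell
#!/bin/sh
# Sketch: classify `lspci -nn` output read from stdin.
# METIS_PCI_ID is the vendor:device pair reported for the Metis AIPU in this report.
# metis_pci_state is a hypothetical helper name, not an SDK command.
METIS_PCI_ID="1f9d:1100"

metis_pci_state() {
    # -F: match the bracketed ID literally
    # -i: ignore the case of the hex digits
    # -q: report through the exit status only
    if grep -qiF "[$METIS_PCI_ID]"; then
        echo "enumerated"
    else
        echo "missing"
    fi
}
```

Usage: `lspci -nn | metis_pci_state`. A result of "missing" corresponds to the state seen before the cold boot.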
--------------------------------------------------
After Cold Boot
After performing a full power cycle:
- lspci correctly shows:
01:00.0 Processing accelerators: Axelera AI Metis AIPU (rev 02) [1f9d:1100]
However, the device still does not respond:
Device communication timed out
Failed to get valid board type (got 8)
board_type=unknown (not responding)
--------------------------------------------------
Runtime / Inference Error
With the Metis device detected, running the model again consistently produces the following errors:
[libtriton_linux.c:1313] Device communication timed out: device did not respond within 1 seconds. (0)
Failed to get valid board type for device metis-0:1:0 got 8
Unknown board type unknown from device metis-0:1:0, assuming pcie
arm_release_ver: g13p0-01eac0, rk_so_ver: 11
WARNING : Unknown board type unknown from device metis-0:1:0, assuming pcie
[libtriton_linux.c:1313] Device communication timed out: device did not respond within 1 seconds. (0)
ERROR : Failed to load runtime stage0: /opt/axelera/device-1.5.3-1/omega/bin/start_axelera_runtime_stage0.bin
ERROR : AXR_ERROR_CONNECTION_ERROR: Failed to load runtime stage0: /opt/axelera/device-1.5.3-1/omega/bin/start_axelera_runtime_stage0.bin
--------------------------------------------------
axdevice Diagnostics
axdevice -v
- PCI device detected
- Driver detected
- Device not responding
axdevice --refresh -v
Failed to load runtime stage0:
start_axelera_runtime_stage0.bin
AXR_ERROR_CONNECTION_ERROR
axdevice --reboot -v
Failed to execute reboot
Failed to execute a cold_boot
AXR_ERROR_CONNECTION_ERROR
axdevice --report
Report generated and attached.
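To reproduce the diagnostic sequence above in one pass, a wrapper like the sketch below can record each command, its combined output, and its exit status in a single log file. The names `run_and_log` and `DIAG_LOG` and the default file name are hypothetical. The axdevice invocations in the comment are the ones listed above.

```shell
#!/bin/sh
# Sketch: run a command and append its output and exit status to a log file.
# run_and_log and DIAG_LOG are hypothetical names chosen for this example.
DIAG_LOG="${DIAG_LOG:-metis-diag.log}"

run_and_log() {
    # Record the command line being executed.
    printf '$ %s\n' "$*" >>"$DIAG_LOG"
    # Capture stdout and stderr together; keep the exit status without aborting.
    status=0
    "$@" >>"$DIAG_LOG" 2>&1 || status=$?
    printf '[exit status: %s]\n' "$status" >>"$DIAG_LOG"
    return "$status"
}

# Intended use, with the commands from this report:
#   run_and_log axdevice -v
#   run_and_log axdevice --refresh -v
#   run_and_log axdevice --reboot -v
```

Because each entry ends with the exit status, the log shows which step fails first even when the error text is identical across steps.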
--------------------------------------------------
Summary
- PCIe enumeration works after a cold boot
- Metis device is visible and the driver is loaded
- Communication with the device consistently times out
- Runtime stage0 cannot be loaded
- Device reboot via axdevice fails
- The same stage0 failure occurs when running inference
--------------------------------------------------
Could you please help me resolve this issue or advise on how to proceed?
I have attached the generated report file for further analysis.
Thank you in advance for your support.
