Question

lspci does not find M.2 card, reports Non-VGA unclassified device: Synopsys, Inc. DWC_usb3 / PCIe bridge

  • September 3, 2025
  • 7 replies
  • 122 views

shabaz
Ensign

Hi,

Bit of a weird issue with the M.2 Evaluation System (SBC model AIB-MR1B-A1), and I’m worried it could be a hardware fault. I somehow got to a state where I would see:

AXR_ERROR_CONNECTION_ERROR: No target device found in lspci output

lspci didn’t show the accelerator card, but instead listed “Non-VGA unclassified device”:

00:00.0 PCI bridge: Rockchip Electronics Co., Ltd RK3588 (rev 01)
01:00.0 Non-VGA unclassified device: Synopsys, Inc. DWC_usb3 / PCIe bridge

dmesg showed:

[Tue Sep 2 19:51:14 2025] rk-pcie fe170000.pcie: PCIe Linking... LTSSM is 0x3
[Tue Sep 2 19:51:16 2025] rk-pcie fe170000.pcie: PCIe Link Fail
[Tue Sep 2 19:51:16 2025] rk-pcie fe170000.pcie: failed to initialize host

After trying a few things, I resorted to re-imaging the SBC. Then, on the first log-in from adb shell, I still saw Non-VGA unclassified device, but after a reboot, SSH’ing into the Eval System, I saw it as:

00:00.0 PCI bridge: Rockchip Electronics Co., Ltd Device 3588 (rev 01)
01:00.0 Processing accelerators: Device 1f9d:1100

And then after installing the Voyager SDK:

00:00.0 PCI bridge: Rockchip Electronics Co., Ltd RK3588 (rev 01)
01:00.0 Processing accelerators: Axelera AI Metis AIPU (rev 02)

Incidentally I always shut down using the sudo poweroff command, i.e. never an uncontrolled shutdown. All fine, but again today when I came to use it, I saw Non-VGA unclassified device. I rebooted, and lspci showed nothing at all.

root@aetina:~# lspci
root@aetina:~# sudo sh -c 'echo 1 > /sys/bus/pci/rescan'
root@aetina:~# lspci -nn
root@aetina:~# lspci
root@aetina:~# dmesg | grep -iE 'rk-pcie|pcie link|nvme'
[ 1.810292] rk-pcie fe150000.pcie: invalid prsnt-gpios property in node
[ 1.810321] rk-pcie fe170000.pcie: invalid prsnt-gpios property in node
[ 1.815781] rk-pcie fe170000.pcie: missing legacy IRQ resource
[ 1.815800] rk-pcie fe170000.pcie: IRQ msi not found
[ 1.815811] rk-pcie fe170000.pcie: use outband MSI support
[ 1.815819] rk-pcie fe170000.pcie: Missing *config* reg space
[ 1.815832] rk-pcie fe170000.pcie: host bridge /pcie@fe170000 ranges:
[ 1.815854] rk-pcie fe170000.pcie: err 0x00f2000000..0x00f20fffff -> 0x00f2000000
[ 1.815870] rk-pcie fe170000.pcie: IO 0x00f2100000..0x00f21fffff -> 0x00f2100000
[ 1.815886] rk-pcie fe170000.pcie: MEM 0x00f2200000..0x00f2ffffff -> 0x00f2200000
[ 1.815898] rk-pcie fe170000.pcie: MEM 0x0980000000..0x09bfffffff -> 0x0980000000
[ 1.815931] rk-pcie fe170000.pcie: Missing *config* reg space
[ 1.815959] rk-pcie fe170000.pcie: invalid resource
[ 1.826810] rk-pcie fe150000.pcie: missing legacy IRQ resource
[ 1.826836] rk-pcie fe150000.pcie: IRQ msi not found
[ 1.826845] rk-pcie fe150000.pcie: use outband MSI support
[ 1.826865] rk-pcie fe150000.pcie: host bridge /pcie@fe150000 ranges:
[ 1.826905] rk-pcie fe150000.pcie: IO 0x00f0100000..0x00f01fffff -> 0x00f0100000
[ 1.826939] rk-pcie fe150000.pcie: MEM 0x0900000000..0x091fffffff -> 0x0040000000
[ 1.826957] rk-pcie fe150000.pcie: MEM 0x0920000000..0x093fffffff -> 0x0060000000
[ 1.827026] rk-pcie fe150000.pcie: invalid resource
[ 2.021962] rk-pcie fe170000.pcie: PCIe Linking... LTSSM is 0x3
[ 2.031961] rk-pcie fe150000.pcie: PCIe Linking... LTSSM is 0x0
[ 2.047510] rk-pcie fe170000.pcie: PCIe Linking... LTSSM is 0x3
[ 2.057516] rk-pcie fe150000.pcie: PCIe Linking... LTSSM is 0x0
<truncated similar lines>
[ 27.187539] rk-pcie fe150000.pcie: PCIe Linking... LTSSM is 0x1
[ 27.207577] rk-pcie fe170000.pcie: PCIe Linking... LTSSM is 0x3
[ 27.214288] rk-pcie fe150000.pcie: PCIe Linking... LTSSM is 0x0
[ 29.164223] rk-pcie fe170000.pcie: PCIe Link Fail
[ 29.164294] rk-pcie fe170000.pcie: failed to initialize host
[ 29.170893] rk-pcie fe150000.pcie: PCIe Link Fail
[ 29.170958] rk-pcie fe150000.pcie: failed to initialize host
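To pull just the failing controllers out of a long log like the one above, a small filter may help. This is only a sketch: it assumes the message wording shown above ("rk-pcie <controller>: PCIe Link Fail"), and the function name failed_pcie_controllers is made up for illustration.

```shell
#!/bin/sh
# Read dmesg text on stdin and print each rk-pcie controller that
# reported "PCIe Link Fail", one per line, without duplicates.
# Assumption: the log lines look like the ones quoted in this post.
failed_pcie_controllers() {
    sed -n 's/.*rk-pcie \([^:]*\): PCIe Link Fail.*/\1/p' | sort -u
}

# Usage on the board (not run here):
#   dmesg | failed_pcie_controllers
```

For the log above this would list both fe150000.pcie and fe170000.pcie, since each of them ends with a "PCIe Link Fail" line.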

After another reboot, I’m back to the Non-VGA unclassified device.

I can try re-imaging the system again, but I’m concerned the same thing will happen again, unless I try to figure out what could have gone wrong. 

I also tried retrieving the live device-tree using sudo dtc -I dtb -O dts -o live.dts /sys/firmware/fdt and it is attached.

Has anyone seen a similar issue? Any debugging steps I should take? If it is indeed an SBC hardware issue, or likely to be, then I’ll try to purchase another SBC or Eval System, but I would like to be fairly sure that it is hardware-related before I try that option.

Incidentally I have tried to re-seat the accelerator card, but it made no difference. I was fairly sure that couldn’t have been an issue anyway, since the board is protected in a cover with just fan holes (no dust or knocks possible to unseat or affect the connections), but figured it was worth a try.

Many thanks!

 

7 replies

shabaz
Ensign
  • Author
  • Ensign
  • September 3, 2025

I narrowed it down; I tried re-imaging again, and at the same stage (just after the ./flash.sh and sudo adb shell), I again saw “Non-VGA unclassified device” followed by the correct “Processing accelerators: Device 1f9d:1100” after a reboot.

This was too much of a coincidence! 

Test 1: After a sudo poweroff, I unplugged and re-inserted the DC power connector, and again saw “Non-VGA unclassified device”. I repeated this test 10 times, and saw “Non-VGA unclassified device” each time.

Test 2: After a sudo poweroff, I pressed the little black power button, and still saw “Non-VGA unclassified device”. Repeated about 5 times, no difference.

Test 3: After a sudo poweroff, I unplugged and re-inserted the DC power connector, and again saw “Non-VGA unclassified device” (the same as Test 1 so far), but this time I then typed sudo reboot and saw “Processing accelerators: Device 1f9d:1100”! I repeated this 10 times, and it’s consistent: I see the correct output after a subsequent reboot, but not immediately after a power cycle.

I think this confirms that it’s an SBC issue, most likely power-related. I’m speculating a lot, but perhaps the power consumption at power-up is just high enough to cause a slight voltage dip that makes the link fail, whereas the power may be more stable during a reboot. I’ll try to reach out to the SBC manufacturer to see if they have something to suggest based on these symptoms. At least I now know that a reboot will (or may! I have yet to plug other things into the SBC which will consume some current) get things working if the board has been power-cycled. I think I may need to replace this SBC, however, since there may be knock-on effects if it is indeed a hardware fault.
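For anyone repeating these tests many times, the check after each boot could be scripted. The sketch below only matches the three lspci states quoted in this thread; the function name classify_lspci and the labels it prints are made up for illustration.

```shell
#!/bin/sh
# Classify the accelerator slot from `lspci` text on stdin.
# Assumption: the strings below are the ones seen in this thread;
# other boards or lspci versions may word them differently.
classify_lspci() {
    lspci_text=$(cat)
    case "$lspci_text" in
        *"Processing accelerators"*)     echo "detected" ;;
        *"Non-VGA unclassified device"*) echo "misdetected" ;;
        *)                               echo "absent" ;;
    esac
}

# Usage on the board (not run here):
#   lspci | classify_lspci
```

Logging the result after every cold boot and every warm reboot would give a simple pass/fail record to hand to the SBC manufacturer.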

 


  • Axelera Team
  • September 5, 2025

Hi ​@shabaz,

 

thanks for your extensive investigation and description!  Before we continue, can you clarify which version of the Voyager SDK you were using during your tests?  And also which firmware versions were installed on your Metis card and which kernel driver version was installed on the host?

You can find out the versions as follows:

  • Voyager SDK version: Starting from the latest 1.4.0 release, the command axversion is available.  Otherwise, inspect the existing directory names under /opt/axelera/ and derive the version number from those (i.e., /opt/axelera/runtime-1.3.3-1 would mean SDK 1.3.3).
  • Firmware versions on Metis card: Run axdevice -v.  This only works if the device was recognized by the host, but from your description it seems you now know a way to get it into that state reliably.
  • Kernel driver version: cat /sys/class/metis/version
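The three checks above could be gathered in one go along these lines. This is a sketch, not an official tool: the helper names sdk_version_from_dir and report_versions are made up, and the directory parsing assumes every install follows the runtime-<version>-<n> pattern of the single example given above.

```shell
#!/bin/sh
# Derive the SDK version from a directory name such as
# /opt/axelera/runtime-1.3.3-1 (assumed pattern: runtime-<version>-<n>).
sdk_version_from_dir() {
    dir_name=$(basename "$1")       # runtime-1.3.3-1
    dir_name=${dir_name#runtime-}   # 1.3.3-1
    echo "${dir_name%%-*}"          # 1.3.3
}

# Print SDK, firmware, and kernel driver versions using the commands
# listed above. Only meaningful on a host with the SDK installed.
report_versions() {
    if command -v axversion >/dev/null 2>&1; then
        axversion
    else
        for d in /opt/axelera/runtime-*; do
            if [ -d "$d" ]; then
                echo "SDK (from $d): $(sdk_version_from_dir "$d")"
            fi
        done
    fi
    axdevice -v
    cat /sys/class/metis/version
}
```

Calling report_versions after the device has become visible should produce everything asked for here in a single paste.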

shabaz
Ensign
  • Author
  • Ensign
  • September 5, 2025

Hi Manuel,

Thanks for the response!

Here is the info:

Voyager SDK version: 1.4.0

Metis Firmware versions (obtained after doing the sudo reboot to get it visible):

INFO: Found PCI device: 01:00.0 Processing accelerators: Axelera AI Metis AIPU (rev 02)
INFO: Found AIPU driver: metis 57344 0
INFO: Current firmware version v1.2.0-rc2+bl1-stage0 != required version v1.4.0
INFO: Device firmware version is not compatible, loading now
INFO: Using device metis-0:1:0
Device 0: metis-0:1:0 1GiB m2 flver=1.2.0-rc2 bcver=1.0 clock=800MHz(0-3:800MHz) mvm=0-3:100%
device_runtime_firmware=v1.4.0
board_controller_board_type=ortles
sw_throttling: 200°C, hysteresis 5°C, throttle rate:12%
hw_throttling: 105°C, hysteresis 10°C
pvt_warning_threshold: 95°C

Kernel driver version: 1.2.2

 


  • Axelera Team
  • September 9, 2025


Hi ​@shabaz,

 

thanks for the quick follow-up.  Can you please update your device firmware to the latest versions as described in this tutorial?  Before you can perform the update, you need to apply your “reboot trick” again to make the device visible.  Then you can just follow the instructions.

After a successful update, you should see flver=1.4.0 bcver=7.0 in the output of axdevice.
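A quick way to check those two fields could look like the sketch below. It assumes the "Device 0: ..." line format shown earlier in this thread; the function names axdevice_field and firmware_is_current are made up for illustration.

```shell
#!/bin/sh
# Extract a key=value field (e.g. flver or bcver) from axdevice
# output on stdin. Assumption: fields are separated by whitespace,
# as in the "Device 0: ..." line quoted in this thread.
axdevice_field() {
    sed -n "s/.*[[:space:]]$1=\([^[:space:]]*\).*/\1/p" | head -n 1
}

# Succeed only if the output reports flver=1.4.0 and bcver=7.0.
firmware_is_current() {
    axdevice_text=$(cat)
    [ "$(printf '%s\n' "$axdevice_text" | axdevice_field flver)" = "1.4.0" ] &&
    [ "$(printf '%s\n' "$axdevice_text" | axdevice_field bcver)" = "7.0" ]
}

# Usage on the board (not run here):
#   axdevice | firmware_is_current && echo "firmware up to date"
```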

At this point, please rerun your tests, e.g., check whether the device is now properly picked up after a reboot, a full power-off, etc.


shabaz
Ensign
  • Author
  • Ensign
  • September 10, 2025

Hi Manuel,

Thank you for this feedback!

If it’s OK, ideally I would prefer to hold off from updating the firmware (just in case it’s high-risk, I’d rather only run it if it’s essential), and will continue with the sudo reboot.

If you think it’s essential, I will proceed with the update.

Many thanks,

Shabaz.

 

EDIT: Incidentally I reached out to Aetina, they are currently investigating, in case they can diagnose from the symptom if there is something wrong with the board.


  • Ensign
  • September 10, 2025

For info: I have upgraded the firmware and the SDK to v1.4 on my Aetina board, and after a power cycle axdevice showed:

ERROR: No target device found in lspci output
ERROR: AXR_ERROR_CONNECTION_ERROR: No target device found in lspci output

I tried several arguments (--refresh, --pcie-rescan, --reboot, --reload-firmware), but all gave the same error.

I also had to sudo reboot for the device to show up, and axdevice -v now shows:

INFO: Found PCI device: 01:00.0 Processing accelerators: Axelera AI Metis AIPU (rev 02)
INFO: Found AIPU driver: metis 57344 0
INFO: Firmware version matches: v1.4.0
INFO: Using device metis-0:1:0
Device 0: metis-0:1:0 1GiB m2 flver=1.4.0 bcver=7.0 clock=800MHz(0-3:800MHz) mvm=0-3:100%
device_runtime_firmware=v1.4.0
board_controller_board_type=ortles
sw_throttling: 200°C, hysteresis 5°C, throttle rate:12%
hw_throttling: 105°C, hysteresis 10°C
pvt_warning_threshold: 95°C

cat /sys/class/metis/version
1.2.3


  • Axelera Team
  • September 23, 2025

Hi,

Please find below the link to “How to Solve: Metis Driver Failure to Persist After Host Reboot” for your reference. We hope this information will help resolve the issue you encountered.

https://support.axelera.ai/hc/en-us/articles/29308064843794-How-to-solve-Metis-driver-failure-to-persist-after-host-reboot

 

Ed