Step by step toward an embedded system for GI (gastrointestinal) disease diagnostics support; how best to debug it? By watching hundreds of hours of dance videos!
Hardware side
embedded system composed of a beautiful Orange Pi 5 Plus 8GB and an excellent Axelera Metis (at 50% of MVVM)
SW side
OS
Orange Pi 1.2.0, Ubuntu 22.04, kernel 5.10.160
GUI app
- implementation in C++ based on SDL2 / NanoVG over DRM and OpenGL ES 3, to minimize the system footprint and avoid X/Wayland
- using EGL to take advantage of zero-copy features and maintain full FPS even at UHD/30
- video capture from the HDMI input board using V4L2
- internal video stream management using MPP, to take advantage of SoC hardware features (for example, encoding the video stream for recording)
- video frame scaling using librga to minimize dataflow, since it is handled internally by the RK3588
Inference app
- independent C++ inference server based on axruntime (Voyager SDK 1.6.0):
- receives 640x640 scaled frames from the GUI over shared memory (SHM)
- runs inference with the yolo11lseg-coco-onnx model to obtain instance segmentation ROIs
- extracts contours from the ROIs, then smooths them parametrically to give a visually compliant result
- computes the centroid and some additional features over each ROI/contour for reliable shape tracking
- sends the results back to the GUI app over Unix sockets
Next Steps to have a minimal POC:
- fix performance glitches to maintain a solid full FPS while inference is running
- train a new YOLOv11(n/m/l/x?)-seg model on the Kvasir-SEG dataset to validate the training procedure and start identifying the right objects…
Side effect of POC completion: no more dance videos!

