
A.B.C. ( A.I. Book Cataloguer ) - Stage 3

  • February 23, 2026
  • 4 replies
  • 68 views

Denovo
Ensign

We're in the home stretch for the ABC project.

The pipeline is complete and we're currently debugging the final part (handling discards with cross-referenced indices).

The good news is that we've set aside EasyOCR in favor of a hybrid solution based on the PP-OCR model available in the Voyager SDK. The flow runs that model on Metis to identify all blocks on the covers, while a more advanced model running on the CPU reads the text within the identified blocks with greater accuracy.
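To make the split concrete, here is a rough sketch of the detect-then-read flow. It is not the Voyager SDK or PaddleOCR API: `detect_blocks` and `recognize` are hypothetical callables standing in for the detector on Metis and the recognizer on the CPU, and the image is just a list of pixel rows so the example runs without any dependencies.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1) in pixels, x1/y1 exclusive
Image = Sequence[Sequence[int]]   # rows of pixel values; stand-in for a real frame


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the rectangle described by box out of the image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def read_cover(
    image: Image,
    detect_blocks: Callable[[Image], List[Box]],               # hypothetical: stage 1 (accelerator)
    recognize: Callable[[List[List[int]]], Tuple[str, float]],  # hypothetical: stage 2 (CPU)
    min_conf: float = 0.5,                                      # assumed cut-off, not from the project
) -> List[Tuple[Box, str, float]]:
    """Run stage 1 once per image, then stage 2 once per detected block."""
    results = []
    # Visit blocks top-to-bottom, then left-to-right, so text comes out in reading order.
    for box in sorted(detect_blocks(image), key=lambda b: (b[1], b[0])):
        text, conf = recognize(crop(image, box))
        if text and conf >= min_conf:
            results.append((box, text, conf))
    return results
```

Keeping the two stages behind plain callables means either model can be swapped without touching the loop that connects them.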

The pipeline starts with YOLOv8, which detects when a book is placed in the loading area. That triggers the OCR acquisition, which uses various tricks to detect and read text with increasing precision, and the result is finally matched against the local database we've built.
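For anyone curious what the last step (matching OCR output against a local database) can look like, here is a minimal sketch using fuzzy string matching from Python's standard library. It is an illustration only, not the matcher used in the project: the `best_match` helper, the 0.6 threshold, and the plain list of titles standing in for the database are all assumptions.

```python
import difflib
import re
from typing import Iterable, Optional, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse runs of non-alphanumeric characters into single spaces."""
    return re.sub(r"[\W_]+", " ", text.lower()).strip()


def best_match(
    ocr_text: str,
    catalogue: Iterable[str],   # hypothetical stand-in for the local database
    threshold: float = 0.6,     # assumed cut-off; would need tuning on real covers
) -> Tuple[Optional[str], float]:
    """Return the most similar title and its score, or (None, score) if below threshold."""
    query = normalize(ocr_text)
    best_title, best_score = None, 0.0
    for title in catalogue:
        # ratio() is 2 * matched_chars / total_chars, a value between 0 and 1.
        score = difflib.SequenceMatcher(None, query, normalize(title)).ratio()
        if score > best_score:
            best_title, best_score = title, score
    if best_score >= threshold:
        return best_title, best_score
    return None, best_score


catalogue = ["The Name of the Rose", "Foucault's Pendulum", "Invisible Cities"]
print(best_match("THE NAME 0F THE R0SE", catalogue))  # OCR confused 'O' with '0'
```

A similarity score tolerates typical OCR slips such as swapped characters, and returning `None` below the threshold gives the caller a clear signal to treat the book as unrecognized instead of forcing a match.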

After recognition, the system waits to determine whether the acquisition was correct. If it was, you just move the book to confirm that everything is fine, and the system waits for the next one. If it wasn't... fingers crossed. We're still working on this last part, which unfortunately generates false positives, but we're confident we'll be able to resolve it in the next few hours!

All updates are in the official repo!

4 replies

Why did you opt to move from PaddleOCR to EasyOCR?

Is VoyagerSDK able to convert the ONNX version of the PaddleOCR pipeline to its own format?

 


Denovo
Ensign
  • Author
  • February 24, 2026

The opposite: we moved from EasyOCR (and Tesseract) to PaddleOCR v3 Latin. Both EasyOCR and Tesseract were removed.

As for the second question: the SDK can compile single models to run on Metis, but PaddleOCR is a multi-stage pipeline, so I don't think converting the whole pipeline is directly achievable.


Denovo
Ensign
  • Author
  • February 25, 2026

We have a working prototype 🤗! Details to follow.


Denovo
Ensign
  • Author
  • February 28, 2026