The OCR engine you use for inference should be the same one you used for training. By default, the Transformers AutoProcessor uses PyTesseract under the hood when the apply_ocr attribute is set to True. So the solution is to use the same OCR engine during inference, set apply_ocr to False, and manually pass the words and bboxes to the processor.
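Here is a minimal sketch of what passing your own OCR output could look like. It assumes a LayoutLM-style checkpoint (the `microsoft/layoutlmv3-base` default is only a placeholder, since the thread does not say which model is used) and that your OCR engine returns pixel boxes as `(x0, y0, x1, y1)`. The helper names `normalize_box` and `encode_with_own_ocr` are made up for illustration. As far as I know, LayoutLM-style processors expect boxes scaled to a 0-1000 range when apply_ocr is False, so the sketch does that first.

```python
def normalize_box(box, width, height):
    """Scale a pixel box (x0, y0, x1, y1) to the 0-1000 range, clamped."""
    x0, y0, x1, y1 = box
    scaled = [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]
    return [min(max(v, 0), 1000) for v in scaled]


def encode_with_own_ocr(image, words, pixel_boxes,
                        checkpoint="microsoft/layoutlmv3-base"):
    """Encode a PIL image using words/boxes from your own OCR engine.

    `checkpoint` is an assumed placeholder; use the one you fine-tuned.
    """
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoProcessor

    # apply_ocr=False turns off the built-in PyTesseract step.
    processor = AutoProcessor.from_pretrained(checkpoint, apply_ocr=False)

    width, height = image.size
    boxes = [normalize_box(b, width, height) for b in pixel_boxes]

    # words and boxes must be aligned: one box per word.
    return processor(image, words, boxes=boxes, return_tensors="pt")
```

The important part is that `words` and `pixel_boxes` come from the same OCR engine (and the same settings) that produced your training data.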
Please correct me if I’m wrong but I feel like this is the case.