How to get bounding boxes for TrOCR?

Hi everyone. I’ve been playing around with the TrOCR model and was wondering how I can get bounding boxes for each character in a line image. I’d appreciate any pointers. Thanks!

Hi. I think you need a separate text detector such as the CRAFT model, which detects text regions in a scene image. Its inference step has a parameter named `link_threshold` that controls how readily neighboring detections get linked together. IMO raising it will split the output into more standalone characters and words. Please try it yourself.
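To give a feel for why that threshold matters, here is a rough toy sketch in plain Python. It is not CRAFT's actual post-processing, just a simplified illustration under my own assumptions: the detector is assumed to output a per-pixel text score and a per-pixel link score, and the function name `boxes_from_scores` and the tiny score maps are made up for the example. Pixels that pass either threshold are grouped into connected components, and each component becomes one box.

```python
from collections import deque


def boxes_from_scores(text_score, link_score,
                      text_threshold=0.7, link_threshold=0.4):
    """Toy grouping of pixels into boxes (hypothetical helper, not CRAFT's code).

    Returns a list of (x_min, y_min, x_max, y_max) tuples in scan order.
    """
    h, w = len(text_score), len(text_score[0])
    # A pixel is foreground if it is confident text OR a confident link.
    mask = [[text_score[y][x] >= text_threshold
             or link_score[y][x] >= link_threshold
             for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over 4-connected foreground pixels.
            queue = deque([(y, x)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            has_text = False
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                if text_score[cy][cx] >= text_threshold:
                    has_text = True
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Drop components made only of link pixels.
            if has_text:
                boxes.append((x0, y0, x1, y1))
    return boxes


# Made-up one-row example: two characters joined by a weak link region.
text = [[0.9, 0.9, 0.0, 0.0, 0.9, 0.9, 0.0]]
link = [[0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0]]

print(boxes_from_scores(text, link, link_threshold=0.4))  # [(0, 0, 5, 0)]
print(boxes_from_scores(text, link, link_threshold=0.6))  # [(0, 0, 1, 0), (4, 0, 5, 0)]
```

With the lower threshold the link pixels count as foreground and the two characters merge into one box; with the higher one they fall away and you get two separate boxes. The real model works on learned heatmaps and does more cleanup, so treat this only as intuition for which direction to tune.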