How to translate text in a PDF in place?

ModernNoob · December 15, 2024, 9:12pm

I think extracting the text from a PDF and translating it should be pretty easy using OpenCV and some translator LLM. However, I am not sure how to put the translated text back in the right place. Here’s an example of what I want to achieve:

(image: original page) to (image: translated page)

I believe Gradio did something very similar today: they can translate a research paper from English to Chinese without altering the style (here). I know I could probably just use their API, but out of curiosity I would like to learn how to do it myself.

EDIT:
The images are stored in a PDF
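
For the "put it back in the right place" part, here is a minimal sketch of one possible approach, not necessarily what the Gradio demo does. The idea is to keep each text block's bounding box, shrink the font until the translated text fits that box, cover the original, and draw the translation in the same rectangle. `fit_text_to_box` and `translate_pdf_in_place` are hypothetical names, the width and line-height ratios are rough assumptions, and `translate` stands for whatever translation function you plug in. The second function assumes PyMuPDF is installed and that the PDF has a text layer; for scanned pages like mine the boxes would have to come from OCR instead.

```python
import textwrap


def fit_text_to_box(text, box_width, box_height, max_size=12.0, min_size=4.0,
                    char_width_ratio=0.5, line_height_ratio=1.2):
    """Return (font_size, lines) for the largest size at which text fits the box.

    char_width_ratio is an assumed average glyph width as a fraction of the
    font size (about 0.5 for Latin text, closer to 1.0 for CJK). It is only an
    estimate; real font metrics would be more accurate.
    """
    size = max_size
    while size > min_size:
        chars_per_line = max(1, int(box_width / (size * char_width_ratio)))
        lines = textwrap.wrap(text, width=chars_per_line) or [""]
        if len(lines) * size * line_height_ratio <= box_height:
            return size, lines
        size -= 0.5
    # Nothing fit: fall back to the smallest size and accept possible overflow.
    chars_per_line = max(1, int(box_width / (min_size * char_width_ratio)))
    return min_size, textwrap.wrap(text, width=chars_per_line) or [""]


def translate_pdf_in_place(src_path, dst_path, translate):
    """Sketch: overwrite each text block with its translation (needs PyMuPDF)."""
    import fitz  # PyMuPDF, imported lazily so the helper above has no dependencies

    doc = fitz.open(src_path)
    for page in doc:
        # Each block is (x0, y0, x1, y1, text, block_no, block_type); type 0 is text.
        for x0, y0, x1, y1, text, _no, block_type in page.get_text("blocks"):
            if block_type != 0 or not text.strip():
                continue
            translated = translate(text)
            size, lines = fit_text_to_box(translated, x1 - x0, y1 - y0)
            rect = fitz.Rect(x0, y0, x1, y1)
            # Paint over the original text with a white rectangle.
            page.draw_rect(rect, color=(1, 1, 1), fill=(1, 1, 1))
            # insert_textbox draws nothing if the text still does not fit,
            # so a real version should check its return value and retry smaller.
            page.insert_textbox(rect, "\n".join(lines), fontsize=size)
    doc.save(dst_path)
```

Usage would be something like `translate_pdf_in_place("paper.pdf", "paper_zh.pdf", my_translate)`, where the file names and `my_translate` are placeholders. For Chinese output you would also need to pass a font that covers the target script to `insert_textbox`, and matching the original style (bold, colors, columns) would take per-span font information rather than whole blocks.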
