Fine-tuning TrOCR for digit recognition in another language

I’m a beginner, so I’m sorry in advance if I’m making any wrong assumptions.

Synthetic Dataset Generation

  • If I were building my own model, I would have full control over the dimensions of the images I use for training. But since I’m fine-tuning a pre-trained model, I think there might be constraints on image dimensions and other aspects. If so, what restrictions/constraints do I need to be aware of before generating my synthetic dataset?
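To make the question concrete: my current understanding (which may be wrong) is that TrOCR’s ViT encoder expects fixed-size RGB inputs, around 384×384, and that the processor resizes images anyway. Here is a sketch of how I would normalize my synthetic digit images up front so I control the padding myself — the 384 value and the pad-to-square approach are my assumptions, not something I’ve confirmed:

```python
from PIL import Image, ImageOps

# Assumption: the TrOCR image processor wants 384x384 RGB inputs.
TARGET_SIZE = (384, 384)

def prepare_digit_image(img: Image.Image) -> Image.Image:
    """Pad a small synthetic digit image to square, then resize,
    so the glyph isn't stretched out of shape."""
    img = img.convert("RGB")
    side = max(img.size)
    # Pad with white (my synthetic backgrounds are white) to a square.
    padded = ImageOps.pad(img, (side, side), color="white")
    return padded.resize(TARGET_SIZE)

# Example: a fake 28x28 "digit" image, the size used in Afro-MNIST.
digit = Image.new("L", (28, 28), color=255)
print(prepare_digit_image(digit).size)  # (384, 384)
```

If the processor handles all of this internally, then maybe none of it matters and I only need to worry about aspect ratio.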

TrOCR Model

  • Are there TrOCR model components (encoder, decoder, …) or other components like the tokenizer that I would have to replace in order to work on my specific problem of Ge’ez digit recognition?
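For context, these are the glyphs I mean: the Ge’ez numerals ፩–፱ at Unicode codepoints U+1369–U+1371 (Ge’ez also has symbols for 10, 100, etc., which I’m leaving out here). Below is a toy sketch of the coverage check I have in mind — whether a vocabulary already contains these characters, or whether I’d need something like `tokenizer.add_tokens` plus resizing the decoder embeddings (those names are from the transformers API, but I haven’t verified that exact workflow):

```python
# Ge'ez numerals 1-9 occupy Unicode codepoints U+1369..U+1371.
GEEZ_DIGITS = [chr(0x1368 + i) for i in range(1, 10)]

def missing_glyphs(vocab: set[str], glyphs: list[str]) -> list[str]:
    """Toy stand-in for a tokenizer coverage check:
    which target glyphs are absent from a character vocabulary?"""
    return [g for g in glyphs if g not in vocab]

# A hypothetical English-centric character vocab lacks all of them:
english_vocab = set("0123456789abcdefghijklmnopqrstuvwxyz")
print(len(missing_glyphs(english_vocab, GEEZ_DIGITS)))  # 9
```

A real tokenizer works on subword tokens rather than a flat character set, so the actual check would be different — this is just to show what I mean by "coverage".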
  • The pretrained model (one of the ones trained on English text) presumably has some notion of what each digit looks like. Won’t that create confusion and cause it to perform terribly when I try to fine-tune it on glyphs from another script?
  • All the TrOCR examples I’ve seen (specifically the ones @nielsr created) are done on single-line text images. But there’s no reason it shouldn’t work on single-digit images too, right? (e.g. the Afro-MNIST dataset)
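In other words, I’m picturing training examples that look just like the IAM ones in those notebooks — an image path plus a ground-truth string — except each string is a single glyph. A sketch of the manifest I’d build (the file names and the `make_split` helper are made up for illustration):

```python
import random

GEEZ_DIGITS = [chr(0x1368 + i) for i in range(1, 10)]  # U+1369..U+1371

def make_split(n: int, seed: int = 0) -> list[dict]:
    """Hypothetical dataset manifest: one synthetic image per record,
    labelled with a single Ge'ez digit as its 'text'."""
    rng = random.Random(seed)
    records = []
    for i in range(n):
        records.append({"file_name": f"synthetic/{i:05d}.png",
                        "text": rng.choice(GEEZ_DIGITS)})
    return records

train = make_split(8)
print(train[0]["file_name"])  # synthetic/00000.png
```

If TrOCR is fine with one-character targets, I assume the rest of the fine-tuning loop from the single-line notebooks carries over unchanged.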

Computational Time & Resources

  • I was going through one of the tutorial notebooks @nielsr created (thank you so much for those, btw) on Google Colab. Using the T4 GPU, fine-tuning the TrOCR model on the IAM test set takes a long time; it got to the point where Colab told me I had run out of my usage limits. Especially given that the dataset is relatively small (about 3K images), I’m worried whether I have the resources to fine-tune the model on a relatively bigger dataset (e.g. Afro-MNIST). Are there any strategies I should explore to use the resources Google Colab gives me efficiently? Or are there any other free options available?
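The levers I’ve read about so far are smaller per-device batches combined with gradient accumulation (and fp16, which I understand roughly halves activation memory). Here’s my back-of-envelope sketch of the arithmetic — these are my assumptions about how accumulation works, not measured numbers:

```python
def effective_batch_size(per_device: int, accum_steps: int,
                         n_gpus: int = 1) -> int:
    """Gradient accumulation: run `accum_steps` small forward/backward
    passes before each optimizer step, so the optimizer effectively
    sees per_device * accum_steps * n_gpus examples per update."""
    return per_device * accum_steps * n_gpus

# e.g. batch 4 on the GPU, accumulate 8 steps -> optimizer-step batch 32
print(effective_batch_size(4, 8))  # 32

def steps_per_epoch(dataset_size: int, eff_batch: int) -> int:
    # Ceil division: leftover examples still form one (smaller) step.
    return -(-dataset_size // eff_batch)

# An IAM-sized dataset (~3K) at effective batch 32:
print(steps_per_epoch(3000, 32))  # 94
```

My understanding is that in the transformers Trainer these map to the `per_device_train_batch_size`, `gradient_accumulation_steps`, and `fp16` arguments, but I’d appreciate confirmation that that’s the right set of knobs for a T4.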

Thank you.