TrOCR training from scratch


Has anyone succeeded in training TrOCR from scratch with the Hugging Face library?
I am seeing odd behavior: the model does not really learn when I train on synthetic data generated with this repository: GitHub - clovaai/synthtiger: Official implementation of SynthTIGER (Synthetic Text Image GEneratoR) ICDAR 2021

I suspect the hyperparameters, but so far I have not managed to improve the model. (Fine-tuning instead of pretraining from scratch did not give good results either.)

So my question is: has anyone managed to fine-tune TrOCR on a synthetic dataset, and if so, with which parameters? Fine-tuning on the IAM dataset works well, but as soon as I switch to a synthetic dataset it stops working, whatever the length of the sequence to predict.