[DONUT] Typo errors - Document parsing

WaterKnight · March 14, 2023, 11:20am

I have trained a Donut model on my custom dataset, following this tutorial from @nielsr.

I have managed to get a decent accuracy (0.65%) when I require the predicted values to match the expected values exactly (that is, without using edit distance to compare the expected value and the prediction). While evaluating the model's results, I have seen that the errors holding the accuracy back are related to typos.

They are small errors, such as a missing digit in a value, a wrong character, or a missing quote. Do you know why this could be happening? Too much training?
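To put a number on how close these near-miss predictions are, exact-match accuracy can be reported next to a normalized edit-distance score. Below is a minimal sketch in plain Python; the helper names (`levenshtein`, `similarity`, `score`) and the sample pairs are made up for illustration and do not come from the tutorial or from my dataset.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions or substitutions
    needed to turn a into b (classic dynamic-programming version)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(expected: str, predicted: str) -> float:
    """1.0 for identical strings, falling toward 0.0 as they diverge."""
    longest = max(len(expected), len(predicted))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(expected, predicted) / longest


def score(pairs):
    """Return (exact-match accuracy, mean similarity) over (expected, predicted) pairs."""
    exact = sum(expected == predicted for expected, predicted in pairs) / len(pairs)
    soft = sum(similarity(expected, predicted) for expected, predicted in pairs) / len(pairs)
    return exact, soft


# Hypothetical examples of the error types described above:
# a missing digit, a correct value, and a missing closing quote.
pairs = [("12345", "1245"), ("INV-001", "INV-001"), ('"Acme"', '"Acme')]
exact, soft = score(pairs)
print(f"exact match: {exact:.2f}, mean similarity: {soft:.2f}")
```

A large gap between the two numbers would suggest that most failures are one or two characters off rather than completely wrong fields.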

When fine-tuning this model for your custom extraction task, is it necessary to fine-tune both the encoder and the decoder? Or can I freeze the encoder and train only the decoder?
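For reference, this is roughly what I mean by freezing the encoder. It is only a sketch: `freeze_encoder` is a hypothetical helper, and it assumes the model exposes `encoder` and `decoder` submodules with a PyTorch-style `parameters()` method, as `VisionEncoderDecoderModel` in Transformers does. Whether freezing actually helps with the typo errors is exactly what I am unsure about.

```python
def freeze_encoder(model):
    """Disable gradients for every encoder parameter and return the
    decoder parameters that remain trainable.

    Assumes model.encoder and model.decoder both provide parameters(),
    which holds for PyTorch modules.
    """
    for param in model.encoder.parameters():
        param.requires_grad = False
    return [param for param in model.decoder.parameters() if param.requires_grad]


# Possible usage (assumption: transformers and torch are installed and the
# checkpoint name is the one being fine-tuned):
#
#   import torch
#   from transformers import VisionEncoderDecoderModel
#
#   model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")
#   trainable = freeze_encoder(model)
#   optimizer = torch.optim.AdamW(trainable, lr=3e-5)  # learning rate is a placeholder
```

Only the returned parameters are passed to the optimizer, so the encoder weights stay at their pre-trained values during training.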

DeepeshAlwani · September 10, 2024, 5:42am

Were you able to find a solution for this? I am stuck in a similar situation. Thank you.
