@nielsr I’m currently going through your Donut VQA fine-tuning experiment, and I’m confused about the gt_parses column.
What is the correct format for multiple questions for each answer? Your tutorial shows the formatting for multiple answers for each question but not the converse.
I noticed that in your docvqa_1200_examples_donut dataset, most rows have only one ground truth in the gt_parses list. Moreover, rows that have more than one ground truth simply repeat the same question and answer. Is this by design, or is there something wrong?
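To make the observation concrete, a check along these lines would surface the pattern. This is only a minimal sketch: it assumes each row stores its annotations as a JSON string under a `ground_truth` key, with `gt_parses` holding a list of question/answer dicts. The sample rows and the `summarize` helper are made up for illustration and are not taken from the real dataset.

```python
import json
from collections import Counter

# Hypothetical rows, invented for illustration; the real dataset may differ.
rows = [
    {"ground_truth": json.dumps({"gt_parses": [
        {"question": "What is the date?", "answer": "1/8/93"},
    ]})},
    {"ground_truth": json.dumps({"gt_parses": [
        {"question": "Who is the sender?", "answer": "J. Smith"},
        {"question": "Who is the sender?", "answer": "J. Smith"},
    ]})},
]


def summarize(rows):
    """Count rows by gt_parses length and list rows containing exact repeats."""
    sizes = Counter()
    repeated = []
    for idx, row in enumerate(rows):
        # Assumption: ground_truth is a JSON string with a gt_parses list.
        parses = json.loads(row["ground_truth"])["gt_parses"]
        sizes[len(parses)] += 1
        unique_pairs = {(p["question"], p["answer"]) for p in parses}
        if len(unique_pairs) < len(parses):
            repeated.append(idx)
    return sizes, repeated


sizes, repeated = summarize(rows)
print(dict(sizes))  # how many rows have 1, 2, ... ground truths
print(repeated)     # indices of rows whose question/answer pairs repeat
```

On the real data, `rows` would be replaced by the loaded dataset split, provided its schema matches the assumption above.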
Thanks!