Creating a docvqa dataset - gt_parses

@nielsr I’m currently going through your Donut DocVQA fine-tuning experiment, and I’m confused about the gt_parses column.

  1. What is the correct format when multiple questions share the same answer? Your tutorial shows the format for multiple answers to a single question, but not the converse.

  2. I noticed that in your docvqa_1200_examples_donut dataset, most rows have only one ground truth in the gt_parses list. Moreover, the rows that do have more than one ground truth simply repeat the same question and answer. Is this by design, or is something wrong with the dataset?
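
To make the two cases concrete, here is a minimal sketch of the layout in question. It assumes the format described in the Donut README, where gt_parses is a list of question/answer dicts serialized as JSON and a question with several accepted answers is repeated once per answer. The helper name build_ground_truth and the sample strings are hypothetical, and the encoding shown for the second case is a guess, not something the tutorial confirms.

```python
import json


def build_ground_truth(qa_pairs):
    """Hypothetical helper: serialize (question, answer) pairs as a JSON string.

    Assumes the Donut-style layout {"gt_parses": [{"question": ..., "answer": ...}]}.
    """
    gt_parses = [{"question": q, "answer": a} for q, a in qa_pairs]
    return json.dumps({"gt_parses": gt_parses})


# Case shown in the tutorial: one question, several accepted answers.
# The question is repeated once per answer candidate.
multi_answer = build_ground_truth([
    ("what is the model name?", "donut"),
    ("what is the model name?", "document understanding transformer"),
])

# Converse case asked about above: several questions sharing one answer.
# ASSUMPTION: one entry per question; whether training code treats this
# as intended is exactly what is unclear.
multi_question = build_ground_truth([
    ("what is the model name?", "donut"),
    ("which model is used?", "donut"),
])

print(multi_answer)
print(multi_question)
```

If the first layout is the intended one, a row with the same question repeated would be expected whenever there are several answer candidates, but a row where both the question and the answer are identical would carry no extra information.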

Thanks!