For google/deplot, what should I input as header text for fine-tuning?

I am trying to fine-tune google/deplot by following the link and notebook below.

link: DePlot
Notebook: https://github.com/huggingface/notebooks/blob/main/examples/image_captioning_pix2struct.ipynb

However, I get the error “ValueError: A header text must be provided for VQA models.” from the following code:

encoding = self.processor(images=item["image"], return_tensors="pt", add_special_tokens=True, max_patches=MAX_PATCHES)

For deplot, what should I input as header text?


cc @ybelkada


Hi @sinchir0
DePlot is a VQA model, so you need to render a question or a specific task directly on the image, as in the snippet here: google/deplot · Hugging Face
This is different from an image captioning task, where the input is only an image and you are trying to predict a caption for that image.
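
To illustrate the difference, here is a minimal sketch of a helper that passes a header only when the checkpoint is a VQA one. It assumes `processor` behaves like `Pix2StructProcessor.from_pretrained("google/deplot")`, whose image processor exposes an `is_vqa` flag. The name `encode_image` and the default `max_patches=1024` are hypothetical choices for this example, not something from the notebook.

```python
def encode_image(processor, image, header_text=None, max_patches=1024):
    """Encode one image for a Pix2Struct-style processor.

    Assumption: `processor` is callable like Pix2StructProcessor and has an
    `image_processor.is_vqa` attribute. `encode_image` is a hypothetical helper.
    """
    is_vqa = getattr(processor.image_processor, "is_vqa", False)

    # VQA checkpoints (such as DePlot) render the text on top of the image,
    # so calling the processor without it raises the ValueError from the question.
    if is_vqa and header_text is None:
        raise ValueError("This checkpoint is a VQA model: pass a header text.")

    kwargs = {"images": image, "return_tensors": "pt", "max_patches": max_patches}
    if is_vqa:
        # For VQA checkpoints the `text` argument is used as the rendered header.
        kwargs["text"] = header_text
    # For captioning checkpoints the image alone is the encoder input.
    return processor(**kwargs)
```

With a real processor this would be called as `encode_image(processor, image, header_text="Generate underlying data table of the figure below:")`.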


@ybelkada
Thank you!
I would like to fine-tune DePlot for the task of generating the underlying data table (plot-to-table).

In that case, should I always give the text “Generate underlying data table of the figure below:”?

Example code:

encoding = self.processor(images=image, text="Generate underlying data table of the figure below:", return_tensors="pt")

Hi @sinchir0
I think that would work, yes! To double-check, I would also ask the authors directly by opening an issue in the Hub repo of DePlot.
I will also ask the authors on my side and let you know!
Thanks!


Hey @sinchir0, thanks for looking into deplot! I’m an author.
Yes, indeed, we always rendered the header “Generate underlying data table of the figure below:” during DePlot training. That said, I imagine sending just an empty string would also work in this case.
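
For readers following along, a minimal sketch of how one training example could be built with that fixed header. It assumes `processor` behaves like `Pix2StructProcessor.from_pretrained("google/deplot")` and that the target table is already available as a plain string; the serialization of that string is not covered here. `build_example`, `max_patches=1024` and `max_label_length=512` are hypothetical names and values chosen for illustration.

```python
HEADER = "Generate underlying data table of the figure below:"

def build_example(processor, image, table_text, max_patches=1024, max_label_length=512):
    """Build one (image, labels) training example for a DePlot-style model.

    Assumption: `processor` is callable like Pix2StructProcessor and exposes
    its text tokenizer as `processor.tokenizer`. This is a sketch, not the
    authors' training code.
    """
    # Encoder input: the chart with the header rendered on top of it.
    enc = processor(
        images=image,
        text=HEADER,
        return_tensors="pt",
        max_patches=max_patches,
    )

    # Decoder target: the table string, tokenized separately, because for VQA
    # checkpoints the processor's `text` argument is used as the header.
    labels = processor.tokenizer(
        table_text,
        return_tensors="pt",
        truncation=True,
        max_length=max_label_length,
    )["input_ids"]

    # Drop the leading batch dimension so a DataLoader can re-batch the items.
    return {
        "flattened_patches": enc["flattened_patches"][0],
        "attention_mask": enc["attention_mask"][0],
        "labels": labels[0],
    }
```

The returned dict uses the key names that the Pix2Struct model's forward pass accepts (`flattened_patches`, `attention_mask`, `labels`); labels of different lengths still need padding in a collate function before batching.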


@ybelkada @fl399
Thank you! My understanding is now very clear. Now I can proceed with the fine-tuning of deplot!

“To double check I would also ask the authors directly by opening an issue in the Hub repo of deplot”

You’re right, I should have asked in the Hub issue.

I have posted an additional question on the Hub issue about the format of the input data table for fine-tuning deplot, if you would like to check it out.


Awesome! Thanks @fl399 for replying!