For google/deplot, what should I input as header text for fine-tuning?

sinchir0 · April 24, 2023, 11:35pm

I am trying to fine-tune google/deplot following the link and notebook below.

link: DePlot
Notebook: https://github.com/huggingface/notebooks/blob/main/examples/image_captioning_pix2struct.ipynb

However, I got the error "ValueError: A header text must be provided for VQA models." from the following code.

encoding = self.processor(images=item["image"], return_tensors="pt", add_special_tokens=True, max_patches=MAX_PATCHES)

For deplot, what should I input as header text?

nielsr · April 25, 2023, 4:50pm

cc @ybelkada

ybelkada · May 10, 2023, 4:03pm

Hi @sinchir0
DePlot is a VQA model, so you need to render a question or a specific task directly on the image, as in the snippet here: google/deplot · Hugging Face
This is different from the image captioning task, where the input is the image only and you're trying to predict a caption for that image.
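
A minimal sketch of that distinction, assuming the processor exposes an `is_vqa` flag on its image processor (as the Pix2Struct image processor in transformers does). The helper name `encode_image` is hypothetical, and the `max_patches` default is only a placeholder:

```python
def encode_image(processor, image, header_text=None, max_patches=1024):
    """Encode one image, passing header text only for VQA-style checkpoints."""
    kwargs = {
        "images": image,
        "return_tensors": "pt",
        "max_patches": max_patches,  # placeholder value, tune for your data
    }
    # VQA-style checkpoints such as google/deplot render `text` onto the
    # image as a header, so the processor refuses to run without it.
    if getattr(processor.image_processor, "is_vqa", False):
        if header_text is None:
            raise ValueError("VQA-style checkpoint: pass header_text.")
        kwargs["text"] = header_text
    # Captioning-style checkpoints take the image only.
    return processor(**kwargs)
```

With a processor loaded via `AutoProcessor.from_pretrained("google/deplot")`, calling `encode_image(processor, image, header_text="...")` avoids the ValueError quoted in the first post.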

sinchir0 · May 10, 2023, 9:27pm

@ybelkada
Thank you!
I would like to fine-tune DePlot for the task of generating the underlying data table (plot-to-table).

In that case, should I always give the text “Generate underlying data table of the figure below:”?

Example code:

encoding = self.processor(images=image, text="Generate underlying data table of the figure below:", return_tensors="pt")

ybelkada · May 11, 2023, 9:24am

Hi @sinchir0
I think that would work, yes! To double-check, I would also ask the authors directly by opening an issue in the Hub repo of deplot.
I will also ask the authors on my side and let you know!
Thanks!

fl399 · May 11, 2023, 9:35am

Hey @sinchir0, thanks for looking into deplot! I’m an author.
Yes, indeed, we always rendered the header “Generate underlying data table of the figure below:” during deplot training. That said, I imagine sending just an empty string would also work in this case.
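
For readers following along, here is a rough sketch of how one fine-tuning example could be assembled with that header. It is not taken from the DePlot authors' training code. The function name `build_training_example`, the `max_patches` and `max_target_length` defaults, and the use of `processor.tokenizer` for the labels are all assumptions, and the format of `target_table` is not covered in this thread:

```python
DEPLOT_HEADER = "Generate underlying data table of the figure below:"


def build_training_example(processor, image, target_table,
                           max_patches=1024, max_target_length=512):
    """Return one training example as a dict (hypothetical helper)."""
    # The processor renders DEPLOT_HEADER on top of the image and then
    # extracts the flattened patches the encoder consumes.
    encoding = processor(
        images=image,
        text=DEPLOT_HEADER,
        return_tensors="pt",
        max_patches=max_patches,
    )
    # Assumption: the target table is tokenized separately with the
    # underlying tokenizer. Masking of padding in the loss is left out.
    labels = processor.tokenizer(
        target_table,
        return_tensors="pt",
        padding="max_length",
        truncation=True,
        max_length=max_target_length,
    )["input_ids"]
    # Drop the batch dimension so a collate function can stack examples.
    return {
        "flattened_patches": encoding["flattened_patches"][0],
        "attention_mask": encoding["attention_mask"][0],
        "labels": labels[0],
    }
```

To try the empty-string variant mentioned above, replace `DEPLOT_HEADER` with `""` and compare validation results.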

sinchir0 · May 11, 2023, 1:18pm

@ybelkada @fl399
Thank you! My understanding is now very clear. Now I can proceed with the fine-tuning of deplot!

ybelkada wrote: “To double check I would also ask the authors directly by opening an issue in the Hub repo of deplot”

You’re right, I should have asked in the Hub issue.

I have posted an additional question on the Hub issue about the format of the input data table for fine-tuning deplot, if you would like to check it out.

ybelkada · May 11, 2023, 1:50pm

Awesome! Thanks @fl399 for replying!
