How can I retrain VisualBERT on another dataset?

I am looking to retrain the Vision-Language model VisualBERT on a specific dataset (images + text). How can I do this?

Out of the box, VisualBERT is pre-trained on the COCO dataset, and I’d like to fine-tune it on my own data so that it retains the originally learned parameters whilst being updated with the new data.

I’m unsure which section of the code I need to change.

VisualBERT GitHub repo: GitHub - uclanlp/visualbert: Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"

Thanks in advance.

Hi,

VisualBERT is a rather outdated (and slightly complicated) model. I’d recommend taking a look at ViLT, which is simpler and performs on par with it. See my demo notebook on fine-tuning here: Transformers-Tutorials/ViLT/Fine_tuning_ViLT_for_VQA.ipynb at master · NielsRogge/Transformers-Tutorials · GitHub.
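
To give a rough idea of what the fine-tuning in that notebook boils down to, here’s a minimal single-step sketch. The image path, question, and answer ("cat") are placeholders for your own data, it starts from the `dandelin/vilt-b32-finetuned-vqa` checkpoint, and it assumes your answer exists in that checkpoint’s answer vocabulary:

```python
import torch
from PIL import Image
from transformers import ViltProcessor, ViltForQuestionAnswering

# Start from a pretrained checkpoint so fine-tuning updates, rather than
# replaces, the already-learned parameters
processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
model = ViltForQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Placeholders for one (image, question, answer) example from your own dataset
image = Image.open("my_image.jpg").convert("RGB")
question = "What is in the picture?"

encoding = processor(image, question, return_tensors="pt")

# ViLT's VQA head expects soft labels: one score per answer class.
# This assumes "cat" is present in the checkpoint's label vocabulary.
labels = torch.zeros(1, model.config.num_labels)
labels[0, model.config.label2id["cat"]] = 1.0

outputs = model(**encoding, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In practice you’d loop this over batches from a DataLoader; the notebook shows the full training loop.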

So I would set the ‘config’ variable to the dataset I want to fine-tune on? @nielsr

No, a configuration (or model) is typically loaded from a Hugging Face Hub repository, like dandelin/vilt-b32-finetuned-vqa · Hugging Face in this case. The dataset needs to be prepared separately, as shown in the notebook.
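
For the dataset side, a hypothetical PyTorch `Dataset` that wraps your own (image, question, answer) records and applies the processor could look like the sketch below. The record fields and the `my_image.jpg` path are made up for illustration, and 3129 is the number of answer classes in that checkpoint’s config:

```python
import torch
from PIL import Image
from transformers import ViltProcessor

class MyVQADataset(torch.utils.data.Dataset):
    """Hypothetical wrapper around your own (image, question, answer) records."""

    def __init__(self, records, processor, num_labels):
        self.records = records        # list of dicts with "image", "question", "label_id", "score"
        self.processor = processor
        self.num_labels = num_labels

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        r = self.records[idx]
        image = Image.open(r["image"]).convert("RGB")
        encoding = self.processor(image, r["question"], padding="max_length",
                                  truncation=True, return_tensors="pt")
        # drop the batch dimension the processor adds to each tensor
        encoding = {k: v.squeeze(0) for k, v in encoding.items()}
        # soft VQA target: one score per answer class
        labels = torch.zeros(self.num_labels)
        labels[r["label_id"]] = r["score"]
        encoding["labels"] = labels
        return encoding

processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
# made-up record for illustration; "my_image.jpg" is a placeholder file
records = [{"image": "my_image.jpg", "question": "What is in the picture?",
            "label_id": 0, "score": 1.0}]
dataset = MyVQADataset(records, processor, num_labels=3129)
```

When batching this with a DataLoader, you’ll likely need a custom `collate_fn` to pad `pixel_values`, since ViLT keeps images at variable sizes; the notebook shows one.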
