Questions regarding the multi-modal setup on the MM-IMDb dataset

abhay · June 14, 2021, 10:49am

Hi all, thank you so much for the wonderful service.

I have a few questions about the training details for the MM-IMDb dataset.

Are the image encoder and the tokenizer embeddings fine-tuned during training on the MM-IMDb dataset? If not, can you suggest a way to do that, or point me to any material that would help?
Is there a way to modify the code so that the model’s pre-trained weights can be used for sequence-to-sequence generation tasks instead of classification?

Any suggestions or comments will be of great help.

Thank you
