Finetuning Llama-3.2-Vision-Instruct

Hello everyone!

I’m trying to finetune the Llama-3.2-Vision-Instruct model. I want to train only the multi_modal_projector, which is a linear layer. But when I set requires_grad = True on only those parameters, loss.requires_grad comes out False, which means that for some reason the loss is not getting back-propagated to the multimodal projector weights.
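
Roughly what I’m doing is below (a minimal sketch; the checkpoint id and the `multi_modal_projector` attribute name are my assumptions based on the transformers Mllama implementation, so adjust if yours differ):

```python
import torch
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # assumed checkpoint
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
)
processor = AutoProcessor.from_pretrained(model_id)

# Freeze everything, then unfreeze only the multimodal projector.
for param in model.parameters():
    param.requires_grad = False
for param in model.multi_modal_projector.parameters():
    param.requires_grad = True

# Sanity check: these should be the only trainable parameters.
print([name for name, p in model.named_parameters() if p.requires_grad])

# After a forward pass with labels (batch built with the processor),
# outputs.loss.requires_grad comes out False for me:
# outputs = model(**batch)
# print(outputs.loss.requires_grad)
```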

I can forcefully set loss.requires_grad = True and train, but that seems very hacky.
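
Concretely, the hack looks something like this, continuing from the sketch above (just illustrating what I mean, not recommending it):

```python
outputs = model(**batch)
loss = outputs.loss

# Hacky workaround: force the loss tensor to track gradients.
# If nothing in the forward graph required grad, though, this only
# silences the error; backward() then has no graph to propagate through.
loss.requires_grad_(True)
loss.backward()
```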

Has anyone else faced this?

Thank you for your time.


I wonder if that’s it?

Yes, but I’m wondering why I have to set loss.requires_grad manually. That’s typically not the case. For example, it isn’t needed in the tutorial here: Fine-tune a pretrained model.
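
To illustrate, the native PyTorch loop in that tutorial is essentially this pattern, with no need to touch requires_grad on the loss (a sketch from memory, not a verbatim copy of the tutorial):

```python
from torch.optim import AdamW

optimizer = AdamW(model.parameters(), lr=5e-5)

model.train()
for batch in train_dataloader:
    outputs = model(**batch)
    loss = outputs.loss
    loss.backward()  # loss.requires_grad is already True here
    optimizer.step()
    optimizer.zero_grad()
```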

Thank you!
