Hello everyone!
I’m trying to fine-tune the Llama-3.2-Vision-Instruct model. I want to train only the multi_modal_projector, which is a linear layer. But when I set requires_grad = True on only those parameters, I see that loss.requires_grad is False, which means that for some reason the loss is not being back-propagated to the multi-modal projector weights.
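Here is roughly what I’m doing (a simplified sketch — the model id and the `multi_modal_projector` attribute name are based on the Mllama classes in transformers, so adjust if your setup differs):

```python
import torch
from transformers import MllamaForConditionalGeneration

# Simplified sketch of my setup (model id / attribute names may differ on your side)
model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
)

# Freeze everything first
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the multi-modal projector
for param in model.multi_modal_projector.parameters():
    param.requires_grad = True

model.train()

# Sanity check: only the projector parameters should show up here
print([name for name, p in model.named_parameters() if p.requires_grad])

# After a forward pass with labels, the loss no longer requires grad:
# outputs = model(**batch)            # batch includes input_ids, pixel_values, labels, ...
# print(outputs.loss.requires_grad)   # -> False in my case
```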
I can forcefully set loss.requires_grad = True and then train, but that seems very hacky.
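In other words, something like this right before the backward pass (it runs, but it doesn’t feel like the right fix):

```python
# Hacky workaround: force the loss to require grad before calling backward
loss = outputs.loss
loss.requires_grad = True
loss.backward()
```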
Has anyone else faced this?
Thank you for your time.