Replace and fine-tune masked LM head

I recently read the WARP paper (https://aclanthology.org/2021.acl-long.381.pdf), which introduces a soft verbalizer (I believe the term was coined in other papers).

As I understand it, they remove the decoder in the LM head of the masked language model and replace it with a linear layer whose output dimension equals the number of verbalizers. Conceptually, this seems similar to replacing the LM head with a classification head for sentence classification.
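For concreteness, here is how I picture that replacement (a minimal sketch; the hidden size of 768 and the 2 classes are just example values for a base-sized encoder on a binary task):

```python
import torch

hidden_size = 768  # e.g. a base-sized encoder
num_labels = 2     # number of verbalizers; binary task as an example

# The original LM head decoder projects hidden states to the full vocabulary:
#   torch.nn.Linear(hidden_size, vocab_size)
# A soft verbalizer projects the same hidden states to the label space instead:
soft_verbalizer = torch.nn.Linear(hidden_size, num_labels, bias=True)

# Given the hidden state at the masked position, we get class logits directly.
mask_hidden = torch.randn(1, hidden_size)    # stand-in for a real hidden state
class_logits = soft_verbalizer(mask_hidden)  # shape: (1, num_labels)
```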

I want to try this scheme and train only this soft verbalizer. However, I cannot find a similar tutorial out there. Can I do that in HF by changing only the LM head, like this:

```python
model.lm_head.decoder = torch.nn.Linear(768, 2, bias=True).to("cuda")
```

Are there any caveats to swapping out the decoder like this for the training process?
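For reference, here is the fuller training setup I have in mind. This is only a sketch: it assumes a RoBERTa-style checkpoint whose masked-LM head exposes an `lm_head.decoder` attribute (BERT-style models use `cls.predictions.decoder` instead), and the model name, prompt, label, and learning rate are placeholders:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "roberta-base"  # placeholder checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).to(device)

# Swap the vocab-sized decoder for a 2-class soft verbalizer.
model.lm_head.decoder = torch.nn.Linear(model.config.hidden_size, 2, bias=True).to(device)

# Freeze everything except the new head, so only the soft verbalizer trains.
for param in model.parameters():
    param.requires_grad = False
for param in model.lm_head.decoder.parameters():
    param.requires_grad = True

optimizer = torch.optim.AdamW(model.lm_head.decoder.parameters(), lr=1e-3)

# Toy step: one prompt with a single <mask> token and a binary label.
inputs = tokenizer("This movie was great. Overall it was <mask>.", return_tensors="pt").to(device)
label = torch.tensor([1], device=device)

outputs = model(**inputs)  # logits now have shape (batch, seq_len, 2)
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)
mask_logits = outputs.logits[mask_pos]  # (num_mask_tokens, 2)

loss = F.cross_entropy(mask_logits, label)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

One thing the sketch already makes visible: once the decoder outputs 2 logits instead of vocab-sized logits, the model's built-in masked-LM loss no longer applies, so I compute the cross-entropy at the mask position manually rather than passing `labels=` to the model.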