Training causal LM from scratch - forcing prompt during training

mohotmoz · February 11, 2022, 5:58pm

Hi everyone,

Thank you in advance for your time as always - very much appreciate your help.

We are training a causal LM for a problem we are working on. In this case, the initial part of the text (about a third of it) is determined beforehand. It is not the same for every data point; it’s just that we will always know it beforehand in the inference use-case. Using the HF Trainer, is there an “easy” way to feed this information in? I feel like setting the attention mask to -100 or something is the right course of action, but I’m not sure…

Thank you!!!
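
For reference, in Hugging Face causal LM models the -100 convention applies to the `labels` tensor rather than the attention mask: positions whose label is -100 are skipped by the loss, while the attention mask stays 0/1 so the model can still attend to the prompt. Below is a minimal sketch of that idea in plain Python. The helper name `build_example`, the token ids, and `pad_id=0` are illustrative assumptions, not part of any library.

```python
IGNORE_INDEX = -100  # label value that the cross-entropy loss ignores


def build_example(prompt_ids, completion_ids, max_length, pad_id=0):
    """Hypothetical helper: join prompt and completion, mask the prompt in labels.

    The prompt tokens stay in input_ids (the model conditions on them),
    but their labels are -100 so they contribute nothing to the loss.
    """
    input_ids = list(prompt_ids) + list(completion_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    attention_mask = [1] * len(input_ids)

    # Truncate to max_length, then right-pad up to max_length.
    input_ids = input_ids[:max_length]
    labels = labels[:max_length]
    attention_mask = attention_mask[:max_length]

    pad = max_length - len(input_ids)
    input_ids += [pad_id] * pad           # assumed pad token id
    labels += [IGNORE_INDEX] * pad        # padding is also excluded from the loss
    attention_mask += [0] * pad           # padding is not attended to

    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels,
    }


# Made-up token ids: a 3-token known prompt followed by a 2-token continuation.
example = build_example([11, 12, 13], [21, 22], max_length=8)
```

A dataset of such dictionaries could then be passed to the Trainer, assuming the data collator leaves the `labels` field untouched; a collator that rebuilds labels from `input_ids` would undo the masking. The shift between inputs and targets does not need to be done by hand, since the causal LM heads shift the labels internally.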
