Thank you in advance for your time as always - very much appreciate your help.
We are training a causal LM for a problem we are working on. In this case, the initial part of the text (about a third of it) is determined beforehand. It is not the same for every data point; it's just that we will always know it beforehand at inference time. Using the HF Trainer, is there an “easy” way to feed this information in? I feel like setting the attention mask to -100 or something is the right course of action, but I'm not sure…
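
To make the idea concrete, here is a rough sketch of what I have in mind, as far as I understand it. I believe -100 belongs in `labels` rather than in the attention mask: it is the default ignore index of PyTorch's cross-entropy loss, so positions labelled -100 should contribute nothing to the loss, while the attention mask stays 1 so the model can still attend to the known prefix. The helper `build_example` and the token ids are made up for illustration, and I am assuming the prefix and the continuation are tokenized separately.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def build_example(prefix_ids, continuation_ids):
    """Hypothetical helper: join prefix and continuation, excluding the prefix from the loss."""
    input_ids = list(prefix_ids) + list(continuation_ids)
    # Prefix positions get IGNORE_INDEX; continuation positions keep their token ids.
    # The labels are NOT shifted here (assumption: the causal LM shifts them internally).
    labels = [IGNORE_INDEX] * len(prefix_ids) + list(continuation_ids)
    # The prefix stays visible to the model, so its attention mask is 1, not 0.
    attention_mask = [1] * len(input_ids)
    return {"input_ids": input_ids, "attention_mask": attention_mask, "labels": labels}


# Made-up token ids: 3 known prefix tokens followed by 6 tokens to be learned.
example = build_example([101, 7, 8], [9, 10, 11, 12, 13, 14])
print(example["labels"])
```

If this is right, each training row would carry its own `labels` column, and the data collator would need to pad `labels` with -100 rather than rebuild them from `input_ids`. At inference, the known prefix would simply be passed as the prompt.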