I am not sure I understand what it means to prime a language model (LM). I came across this concept in several blog posts and papers (sometimes also referred to as exploring the model's meta-learning capabilities, or as in-context learning).
From the OpenAI GPT-2 paper, Section 3.7 (Translation):
We test whether GPT-2 has begun to learn how to translate from one language to another. In order to help it infer that this is the desired task, we condition the language model on a context of example pairs of the format english sentence = french sentence and then after a final prompt of english sentence =
This, I believe, is an example of priming? Since with transformers there is no hidden state being passed from one step to the next, we provide the model with an input sequence of up to 1024 tokens, and the model outputs up to 1024 x vocab_size softmax activations, where each position encodes the probability distribution over the next token (the one following the token at that position). So priming would just be constructing the input sequence in a specific manner?
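To check my understanding, here is roughly what I imagine this looks like in code (a minimal sketch using the Transformers library; the example pairs and decoding settings are my own guesses, not anything from the paper, and GPT-2 small will probably translate poorly anyway):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# "Priming" is nothing but tokens placed in the input sequence:
# example pairs in the "english = french" format, then a final
# incomplete pair that the model is expected to continue.
prompt = (
    "the cat sat on the mat = le chat s'est assis sur le tapis\n"
    "good morning = bonjour\n"
    "how are you? ="
)

inputs = tokenizer(prompt, return_tensors="pt")

# The model simply continues the sequence; the hope is that the
# pattern in the context nudges it toward a French translation.
output_ids = model.generate(
    inputs.input_ids,
    max_length=inputs.input_ids.shape[1] + 20,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:]))
```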
If I am reading this correctly, priming would refer to the act of passing a sequence into the model, expecting that the model's meta-learning capability will affect its output?
In this sense, for priming we are always limited to a sequence of at most 1024 tokens (where those 1024 tokens have to accommodate both the priming sequence and the output)?
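(If it helps clarify what I mean, this is how I have been thinking about the budget; I am assuming `n_positions` is the right config attribute for GPT-2's context size:)

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

print(model.config.n_positions)  # 1024: the full context window

# Whatever the priming sequence consumes is no longer available
# for the generated output.
prompt = "the cat sat on the mat = le chat s'est assis sur le tapis\n..."
prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
print("tokens left for generation:", model.config.n_positions - prompt_len)
```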
And the `past` parameter just saves on compute? It provides the model with the key/value pairs calculated at earlier steps of text generation, but there is nothing else magical happening there?
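Here is the experiment I had in mind to convince myself of that (a sketch; in recent versions of the library the argument is called `past_key_values` rather than `past`, if I understand the changelog correctly). A full forward pass and an incremental pass with the cache should give the same logits for the last position:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

ids = tokenizer("The quick brown fox", return_tensors="pt").input_ids

with torch.no_grad():
    # Full forward pass over the whole sequence.
    full = model(ids)

    # Incremental: run all but the last token, cache the key/value
    # pairs, then feed only the last token together with the cache.
    prefix = model(ids[:, :-1], use_cache=True)
    incremental = model(ids[:, -1:], past_key_values=prefix.past_key_values)

# The logits for the final position should match up to float noise,
# i.e. the cache is purely a compute saver, nothing magical.
print(torch.allclose(full.logits[:, -1], incremental.logits[:, -1], atol=1e-5))
```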
And last but not least: are such questions okay to ask here? This would certainly qualify as a beginner question, and it doesn't relate directly to the library, I suppose. I really appreciate the amazing resource you have put out there, the Transformers library along with its wonderful documentation; in fact, I am blown away by how good it is. I just want to make sure I am not bothering you with my questions and am using the forums the way they were intended.
Thank you very much!