How to prime GPT-2 with input-output pairs

Hi, first post here! Let me know if I’m in the wrong subforum.

It looks like it’s possible to prime GPT-3 with an input and output (see, e.g. github.com/shreyashankar/gpt3-sandbox). I’m wondering how to do this for GPT-2.


Further details:

My use case is to try to replicate the results of this demo, whose author primes GPT-3 with the following text:

gpt.add_example(Example('apple', 'slice, eat, mash, cook, bake, juice'))
gpt.add_example(Example('book', 'read, open, close, write on'))
gpt.add_example(Example('spoon', 'lift, grasp, scoop, slice'))
gpt.add_example(Example('apple', 'pound, grasp, lift'))

I only have access to GPT-2, via the Hugging Face Transformers library. How can I prime GPT-2 large on Hugging Face to replicate the above examples? The issue is that, with the online Hugging Face demo, one doesn’t get to prime with the input and corresponding output separately (as the author of the GPT-3 demo did above).

Similarly, I can’t find anything in the Hugging Face documentation describing how to prime with examples of input-output pairs, like Example('apple', 'slice, eat, mash, cook, bake, juice').

Does anyone know how to do this?


Desired output:
Use GPT-2 to return, for the input “potato”, something like “peel, slice, cook, mash, bake” (as in the GPT-3 demo above). Obviously the exact list of output verbs won’t be the same, as GPT-2 and GPT-3 are not identical models.

Hi @DGhose, I’ve found the following prompt format to be reasonably good at getting GPT-2 to complete the pattern for the last input_n:

input_1 => output_1 \n input_2 => output_2 \n ... input_n =>

So for your use case, you could try feeding something like the following:

apple => slice, eat, mash, cook, bake, juice \n book => read, open, close, write on \n spoon => lift, grasp, scoop, slice \n banana =>
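If you want to script this rather than paste prompts into the web demo, here’s a minimal sketch using the Transformers text-generation pipeline. The `build_prompt` helper and the sampling parameters are my own choices, not anything official, and I’m assuming the “\n” separators in the format above are meant as real newlines:

```python
from transformers import pipeline

def build_prompt(examples, query):
    """Join (input, output) pairs into the 'input => output' prompt format."""
    lines = [f"{inp} => {out}" for inp, out in examples]
    lines.append(f"{query} =>")  # leave the last output blank for GPT-2 to fill
    return "\n".join(lines)

examples = [
    ("apple", "slice, eat, mash, cook, bake, juice"),
    ("book", "read, open, close, write on"),
    ("spoon", "lift, grasp, scoop, slice"),
]
prompt = build_prompt(examples, "banana")

if __name__ == "__main__":
    # "gpt2-large" is a multi-GB download; "gpt2" is a quicker smoke test
    generator = pipeline("text-generation", model="gpt2-large")
    result = generator(prompt, max_new_tokens=20, do_sample=True, top_k=50)
    # keep only the completion for "banana", up to the next line break
    completion = result[0]["generated_text"][len(prompt):].split("\n")[0]
    print(completion.strip())
```

The pipeline returns the prompt plus the continuation, so the slice at the end strips the prompt and truncates at the first newline to get just the verb list.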

which in the Hugging Face inference API for gpt2-xl produces a semi-coherent output for “banana”:

(screenshot of the gpt2-xl inference API output for the “banana” prompt)

You’ll probably need more examples if you’re doing more complex mappings (e.g. language translation), and it takes a few tries to “cherry pick” the desired output because the text generation is not deterministic in the API (I think they use sampling).