Difference between pipeline and model.generate?

I tried the following two things and found a significant difference between pipeline and model.generate when completing sequences.

(1) Using model.generate

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model_pr = AutoModelForCausalLM.from_pretrained("gpt2")
input_tok = tokenizer("My name is Merve and my favorite", return_tensors="pt")
tokenizer.decode(model_pr.generate(**input_tok)[0])
'My name is Merve and my favorite+ and my CR+ and my CR+ and my CR'

(2) Using pipeline to do the same thing

from transformers import pipeline

generator = pipeline('text-generation', model='gpt2')
generator("My name is Merve and my favorite")
[{'generated_text': 'My name is Merve and my favorite brand is Baskin-Robbins.\n\n"They bring a whole lot of stuff to the table and we have to come up with a new way of making a big deal," explains Jeff.'}]

For some reason I get much more sensible output from the pipeline. My understanding was that both should give similar responses.

Has anyone figured out a solution to this? I'm experiencing the same problem.

Hi,

After digging a bit into the code base, I found that the text-generation pipeline for GPT-2 defaults to do_sample=True and max_length=50, which it picks up from the task_specific_params in the model's config, as seen here: config.json · openai-community/gpt2 at main.
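
You can see where the pipeline gets these values by inspecting the model config yourself (a minimal sketch; assumes the gpt2 checkpoint is reachable):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("gpt2")
# The pipeline merges these task-specific defaults into its generation kwargs
print(config.task_specific_params)
# {'text-generation': {'do_sample': True, 'max_length': 50}}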

Hence to get the equivalent behaviour, one can do the following:

from transformers import pipeline, set_seed, AutoModelForCausalLM, AutoTokenizer

set_seed(42)

pipe = pipeline(model="gpt2")

prompt = "hello world, my name is"

result = pipe(prompt, do_sample=False)[0]["generated_text"]
print(result)

# equivalent to:
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer(prompt, return_tensors="pt")

generated_ids = model.generate(**inputs, do_sample=False, max_length=50)
generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(generated_text)

The pipeline uses the generate() method behind the scenes, but applies some default generation keyword arguments, which are documented here. I didn't get equivalent results when using sampling (despite using set_seed).
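
If you want to probe the sampling mismatch yourself, one thing to try is re-seeding immediately before each call so both start from the same RNG state. A sketch, continuing from the snippet above (whether the outputs then match depends on both paths building identical inputs and generation kwargs):

from transformers import set_seed

set_seed(42)
pipe_text = pipe(prompt, do_sample=True, max_length=50)[0]["generated_text"]

set_seed(42)
sampled_ids = model.generate(**inputs, do_sample=True, max_length=50)
generate_text = tokenizer.decode(sampled_ids[0], skip_special_tokens=True)

# May still print False, as observed above
print(pipe_text == generate_text)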
