Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation

The first bit of code runs, but I get the eos_token_id error. The second bit fails. So adding the token to the input_ids doesn’t work. :frowning: At least the model persists for the second query, so I’m making progress. :slight_smile:
Anyway, big thanks to Huggingface for posting these great models. You guys rock!

from transformers import GPTNeoForCausalLM, GPT2Tokenizer
from macos_speech import Synthesizer  # not actually used in this snippet

model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")

prompt = (
    "In a shocking finding, "
)

input_ids = tokenizer(prompt, return_tensors="pt").input_ids

gen_tokens = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.9,
    max_length=100,
)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)

prompt = (
    "Albert Einstein was "
)

input_ids_pre = tokenizer(prompt, return_tensors="pt").input_ids
nput_ids = input_ids_pre + tokenizer.eos_token

gen_tokens = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.9,
    max_length=100,
)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)

Oops, I had a typo. I misspelled input_ids. I fixed that, and it still failed. :frowning: Anyway, this token error seems to be one that a lot of people have encountered. I’ve read a lot of posts about it. Maybe a better question is, how can I find out about this sort of thing in the docs? Or would the tutorial answer this question?
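
The remaining failure makes sense: `tokenizer.eos_token` is the *string* `"<|endoftext|>"`, not an id, so it can't be added to the `input_ids` tensor. A minimal sketch of appending the token by id instead (the prompt ids below are made-up placeholders, not real GPT-Neo ids):

```python
# tokenizer.eos_token is the string "<|endoftext|>"; adding it to the
# input_ids tensor raises a TypeError. Append the token *id* instead.
eos_token_id = 50256  # GPT-2/GPT-Neo end-of-text id

prompt_ids = [818, 257, 14702, 4917, 11]   # stand-in for tokenizer output
input_ids = prompt_ids + [eos_token_id]    # with tensors: torch.cat(..., dim=-1)

print(input_ids)  # [818, 257, 14702, 4917, 11, 50256]
```

With real tensors from `tokenizer(prompt, return_tensors="pt")`, the same idea is `torch.cat([input_ids, torch.tensor([[eos_token_id]])], dim=-1)`.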

Here’s the code that suppresses the error:

from transformers import pipeline
import time

start = time.time()
print("Time elapsed on working...")
#generator = pipeline('text-generation', model='bigscience/bloom-560m')
#generator = pipeline('text-generation', model='gpt2')
generator = pipeline('text-generation', model='EleutherAI/gpt-neo-1.3B')
#generator = pipeline('text-generation', model='EleutherAI/gpt-j-6B')

# pad_token_id=50256 (the eos token id) suppresses the warning
text = generator("Albert Einstein was:", max_length=10, pad_token_id=50256, num_return_sequences=1)
print(text)
time.sleep(0.9)
end = time.time()
print("Time consumed in working: ", end - start)

# reset the timer, otherwise the second measurement includes the first run
start = time.time()
text = generator("Albert Einstein was:", max_length=10, num_return_sequences=1)
print(text)
time.sleep(0.9)
end = time.time()
print("Time consumed in working: ", end - start)

I added pad_token_id=50256 to my pipeline.

But what about when I don't want open-ended generation? The model now seems to ignore the eos token and keeps going until the max_length limit is reached. I would rather have it stop when it is done.
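
If the output does run past the end-of-text marker, one workaround (a minimal sketch with a hypothetical helper, not from the thread; note that pipelines usually strip special tokens from the returned text already) is to trim at the first eos string:

```python
# Hypothetical post-processing helper: cut generated text at the first
# end-of-text marker, if one survived into the output.
EOS_TOKEN = "<|endoftext|>"  # GPT-2/GPT-Neo eos token string, id 50256

def trim_at_eos(text, eos=EOS_TOKEN):
    """Return the text up to (and excluding) the first eos marker."""
    return text.split(eos, 1)[0]

print(trim_at_eos("Albert Einstein was a physicist.<|endoftext|>garbage"))
# Albert Einstein was a physicist.
```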

Why 50256 and not 50258?
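
For what it's worth, the GPT-2 BPE vocabulary that GPT-Neo reuses has 50257 entries, ids 0 through 50256, and `<|endoftext|>` is the very last one; ids like 50257 or 50258 would only exist if extra special tokens had been added to the tokenizer. A quick sanity check:

```python
# Why 50256: <|endoftext|> is the last entry of the GPT-2 BPE vocabulary.
vocab_size = 50257            # size of the GPT-2 vocabulary GPT-Neo reuses
eos_token_id = vocab_size - 1  # ids run from 0 to vocab_size - 1

print(eos_token_id)  # 50256
```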