How does the text-generation pipeline know the special stop token?

I use the Llama2 model currently which has the stop token </s>.

I tried to change the stop token so that the pipeline would continue generating even when the model predicts the </s> token.

I tried to change:

model.config.eos_token_id = <someRandomNumber>
# or
tokenizer.eos_token = "somethingelse"

but the pipeline still seems to know that it should stop at </s>.

So I’m asking myself:
“How does the text-generation pipeline know what the correct stop token is, and where does it get it from?”

Hey! You have to change it in the model’s generation config as well. Try the code below:

model.generation_config.eos_token_id = None
model.config.eos_token_id = None
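To the original question of where the stop token comes from: my understanding (an assumption, not verified against the transformers source) is that per-call generation kwargs win over `model.generation_config`, which in turn wins over `model.config`. A pure-Python sketch of that precedence, with a hypothetical `resolve_eos` helper:

```python
# Hypothetical helper mirroring (as an assumption) how the effective EOS id
# is chosen: per-call kwarg > model.generation_config > model.config.
def resolve_eos(call_kwarg=None, generation_config_eos=None, model_config_eos=None):
    for candidate in (call_kwarg, generation_config_eos, model_config_eos):
        if candidate is not None:
            return candidate
    return None  # no EOS at all -> generation only stops at the length limit

print(resolve_eos(generation_config_eos=2, model_config_eos=2))  # 2 (Llama-2's </s>)
print(resolve_eos())  # None, i.e. both configs cleared as in the snippet above
```

This is why setting only `model.config.eos_token_id` has no visible effect: the generation config shadows it.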

Thank you for your response! I had no idea about the generation config.
The thing is, my pipeline actually does not stop generating.

The following print statements all print </s> (the correct stop token):

print("1. ", tokenizer.decode(tokenizer.eos_token_id))
print("2. ", tokenizer.decode(model.config.eos_token_id))
print("3. ", tokenizer.decode(model.generation_config.eos_token_id))

However, when I create and call the pipeline:

pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)

# Run the text-generation pipeline with our fine-tuned model
while True:
    user_input = "write an sql statement"
    prompt = f"<s>[INTS]{user_input}[/INST]"
    result = pipe(prompt)

It outputs:

<s>[INTS]write an sql statement[/INST] SELECT * FROM table_name WHERE name = "mario"</s>\n\nWhat does the SQL statement mean?\n\nIn this SQL...

You can see that the correct stop token is set in every single config, yet the pipeline still ignores </s> and keeps generating.

Ah, sorry, I thought you wanted to keep generating even after the stop token. In that case you shouldn’t set them to None as I showed above.

If the model is still generating beyond the EOS token set in model.generation_config.eos_token_id, can you check whether the generation config has forced_eos_token_id or min_length set? Also, it would be easier if you provide reproducible code with the model checkpoint you’re using 🙂


No, I’m sorry, this was on me!
The script is about a fine-tuned LLM (with LoRA). It tries to predict the SQL statement from a user question and the database table.

Here is the entire script: Google Colab

Let me give you a quick overview:

  1. Loading the base model and merging it with my trained LoRA adapter.
  2. Showing all model configs
  3. Showing all eos token ids (decoded)
  4. Trying out the pipeline with the config files
  5. Trying out the pipeline while explicitly setting the </s> stop token (also had no effect)

Thanks a lot for sharing the notebook. Unfortunately I still can’t reproduce it on main. I am using an arbitrary adapter since I don’t have access to the one you had in Colab, so my wild guess is that there’s something in the adapter config, as the version in Colab is the latest one.

Let me know if you can provide the whole script with the adapter, and I will try to find the bug!


No, thank you so much for even trying to help me.
Please don’t spend too much time on it, but if you can’t reproduce it with this script, I will go crazy.

My trained lora adapter: SQLFormerLlama2_Adapter - Google Drive

The entire training script: Google Colab
(If you want to try out the training script you just have to let it run for maybe 3-5min and then you can stop the cell. By then the model will have learned the rough desired structure and will already fail to stop at the stop token)


@VitalContribution hey, I ran the notebook with your adapter and found the error. The generated “</s>” tokens are not the actual EOS token: the model is generating the ids [829, 29879, 29958], which together decode to the string “</s>”.
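To make the diagnosis concrete, here is a minimal sketch using the three ids from above and Llama-2’s real EOS id (2); the rest of the `generated_tail` is hypothetical:

```python
# Llama-2's actual EOS token has id 2; the fine-tuned model instead emitted
# the three ids below, which merely *decode* to the text "</s>".
EOS_TOKEN_ID = 2
look_alike_ids = [829, 29879, 29958]  # together decode to the string "</s>"

generated_tail = [3087, 29871] + look_alike_ids  # hypothetical end of an output
print(EOS_TOKEN_ID in generated_tail)  # False -> no EOS stopping criterion fires
```

Since id 2 never appears in the output, the stopping criterion comparing generated ids against eos_token_id never triggers, no matter which config it was read from.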

I can’t say right now what went wrong with training, or how to train it properly, since I’m prioritizing other tasks. A workaround here is to pass stop_strings as a kwarg to the pipeline, which stops generation when a custom string is produced. See the docs for more details 🙂


This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.