Stopping criteria BLOOM

Hi, I’ve spent a couple of days reading forum topics about model stopping criteria, but I haven’t found a solution. If this question is a duplicate, sorry in advance!

I’m using the BLOOM model and I want to stop text generation as soon as a specific character sequence, such as '###', is generated, but I can’t get it to work. I know I could post-process the generated text and extract the expected result, but it would be better to stop generation as soon as the criterion is met and save some tokens.
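For reference, the post-processing fallback I mention is simple enough; a minimal sketch (the function name `truncate_at_stop` is just my own illustration, not from any library):

```python
def truncate_at_stop(text: str, stop: str = "###") -> str:
    """Cut the generated text at the first occurrence of the stop sequence."""
    idx = text.find(stop)
    return text if idx == -1 else text[:idx]

print(truncate_at_stop("It is sunny today.### Human: next question"))
# -> It is sunny today.
```

This works, but the model still generates (and I still pay for) all the tokens after '###'.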

I’m passing the eos_token_id parameter for this, but it has no effect. I thought I was doing something wrong, but today I tried the Incoder model with the same eos_token_id parameter and it works! (When testing Incoder, I changed both API_URL and the tokenizer in the code below to point to the Incoder model instead of BLOOM.)

Does anyone know whether this is a BLOOM-specific problem, or am I doing something wrong?

import requests
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom")
end_sequence = '###'

# Generation parameters and API options go in their own nested dictionaries
payload = {
    "inputs": f"{context} \nHuman: {nl_query} \nAI: ",
    "parameters": {
        "top_p": 0.9,
        "temperature": 0.2,
        "max_new_tokens": 40,
        "eos_token_id": int(tokenizer.convert_tokens_to_ids(end_sequence)),
        "return_full_text": False,
    },
    "options": {
        "use_cache": True,
        "wait_for_model": True,
    },
}

# API_URL and headers (with the API token) are defined earlier
response = requests.post(API_URL, headers=headers, json=payload)
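In case it helps anyone debugging the same thing: as far as I can tell, eos_token_id can only match a single token id, so the stop string has to exist as exactly one entry in the model’s vocabulary, and convert_tokens_to_ids silently falls back to the unknown-token id when it doesn’t. A minimal sketch with a toy vocabulary (hypothetical, not the real BLOOM vocab):

```python
# Hypothetical toy vocabulary for illustration only
toy_vocab = {"<unk>": 0, "Hello": 1, "#": 2, "##": 3}

def stop_token_id(vocab: dict, stop_string: str, unk_id: int = 0):
    """Return the id for stop_string if it is a single vocabulary entry, else None."""
    token_id = vocab.get(stop_string, unk_id)
    return None if token_id == unk_id else token_id

print(stop_token_id(toy_vocab, "###"))  # not a single token in this vocab -> None
print(stop_token_id(toy_vocab, "##"))   # a single token -> 3
```

So if '###' is split into several tokens by the BLOOM tokenizer, that could explain why eos_token_id never triggers, even though the equivalent string is a single token for Incoder.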