BLOOM outputs only a few tokens

Hi, I am trying to run inference on BLOOM using the Inference API. It only produces a few tokens (at most 1-3 sentences), even if I set "min_length" very high, for instance "min_length": 1024. How can I generate more tokens with BLOOM? Is there a limitation around this, or am I passing the parameters incorrectly?

This is what my entry looks like:


```python
import requests

# Inference API endpoint for BLOOM
API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"
HF_TOKEN = "api_org_XXXXXXX"

headers = {"Authorization": f"Bearer {HF_TOKEN}"}

prompt_text = """Text:
Some prompt text is here because"""

json_ = {
    "inputs": prompt_text,
    "parameters": {
        "top_p": 1.0,
        "temperature": 0.72,
        "min_length": 1024,
        "return_full_text": False,
    },
    "options": {"use_cache": False},
}

response = requests.post(API_URL, headers=headers, json=json_)
output = response.json()
```
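For reference, the text-generation task of the Inference API nests sampling controls under a "parameters" key, and the amount of generated text is controlled by "max_new_tokens" rather than "min_length". Below is a minimal payload sketch; the 250-token value is an assumption about the cap the hosted BLOOM endpoint enforces per call, so treat it (and the endpoint URL) as unverified:

```python
# Sketch: build a text-generation payload that asks for more output tokens.
# ASSUMPTIONS: the endpoint URL and the 250-token value are not confirmed
# here; check the Inference API docs for the current per-request limit.

API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"

def build_payload(prompt, new_tokens=250):
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": new_tokens,  # governs output length
            "top_p": 1.0,
            "temperature": 0.72,
            "return_full_text": False,
        },
        "options": {"use_cache": False},
    }

# For outputs longer than one call allows, loop: append each response's
# "generated_text" to the prompt and request again, e.g.
#   out = requests.post(API_URL, headers=headers, json=build_payload(prompt)).json()
#   prompt += out[0]["generated_text"]
```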