BLOOM outputs only a few tokens

Hi, I am trying to run inference on BLOOM using the Inference API. It only produces a few tokens (at most 1-3 sentences), even if I set "min_length" very high, for instance "min_length": 1024. How can I generate more tokens with BLOOM? Is there a limitation around this, or am I passing the parameters incorrectly?

This is what my entry looks like:


```python
import requests

# Inference API endpoint for BLOOM
API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"
HF_TOKEN = "api_org_XXXXXXX"

headers = {"Authorization": f"Bearer {HF_TOKEN}"}

prompt_text = """Text:
Some prompt text is here because"""

json_ = {
    "inputs": prompt_text,
    "parameters": {
        "top_p": 1.0,
        "temperature": 0.72,
        "min_length": 1024,
        "return_full_text": False,
    },
    "options": {"use_cache": False},
}

response = requests.post(API_URL, headers=headers, json=json_)
output = response.json()
```
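For reference, the text-generation task of the Inference API nests sampling controls under a "parameters" key, and the amount of generated text is controlled by "max_new_tokens" rather than "min_length". Below is a minimal payload sketch; the 250-token value is an assumption about the cap the hosted BLOOM endpoint enforces per call, so treat it (and the endpoint URL) as unverified:

```python
# Sketch: build a text-generation payload that asks for more output tokens.
# ASSUMPTIONS: the endpoint URL and the 250-token value are not confirmed
# here; check the Inference API docs for the current per-request limit.

API_URL = "https://api-inference.huggingface.co/models/bigscience/bloom"

def build_payload(prompt, new_tokens=250):
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": new_tokens,  # governs output length
            "top_p": 1.0,
            "temperature": 0.72,
            "return_full_text": False,
        },
        "options": {"use_cache": False},
    }

# For outputs longer than one call allows, loop: append each response's
# "generated_text" to the prompt and request again, e.g.
#   out = requests.post(API_URL, headers=headers, json=build_payload(prompt)).json()
#   prompt += out[0]["generated_text"]
```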