Different parameters between JSON inference and Inference API


I’m running inference with a T5 model (text2text-generation task) in two ways: with raw JSON requests and with the Inference API client.

I noticed that I can use the parameters num_beams and max_length with raw JSON requests, but not with the Inference API client (here, the Inference API parameters for the text-generation task, which are the same as for the text2text-generation task). Strange…

Am I right? If so, will the Inference API parameters be updated?

JSON inference

(I can use the parameters num_beams and max_length)

import json
import requests

API_URL = "https://api-inference.huggingface.co/models/" + model_name
headers = {"Authorization": f"Bearer {API_TOKEN}"}

def query(payload):
    data = json.dumps(payload)
    response = requests.request("POST", API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8")), response.headers.get("x-compute-type")

# get inference
data, x_compute_type = query(
    {
        "inputs": input_text,
        "parameters": {
            "num_beams": num_beams,
            "num_return_sequences": num_return_sequences,
            "max_length": max_target_length,
        },
    }
)

Inference API

(I cannot use the parameters num_beams and max_length)

!pip install huggingface_hub

from huggingface_hub.inference_api import InferenceApi
inference = InferenceApi(repo_id=model_name, token=API_TOKEN)

params = {
    # "num_beams": num_beams,            # not among the documented parameters
    "num_return_sequences": num_return_sequences,
    # "max_length": max_target_length,   # not among the documented parameters
}

inference(input_text, params)
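For what it’s worth, the raw-requests path above suggests these parameters are just fields inside the request body. Here is a minimal sketch of the payload that path serializes (the variable values are made-up placeholders; whether the hosted endpoint actually honors num_beams and max_length for a given task is exactly my question):

```python
import json

# Hypothetical stand-ins for the variables used in the snippets above.
input_text = "translate English to German: Hello"
num_beams = 4
num_return_sequences = 2
max_target_length = 64

# The body the raw-requests path sends; if the InferenceApi client forwards
# its `params` argument verbatim, the same "parameters" dict should reach
# the server either way.
payload = {
    "inputs": input_text,
    "parameters": {
        "num_beams": num_beams,
        "num_return_sequences": num_return_sequences,
        "max_length": max_target_length,
    },
}

body = json.dumps(payload)
print(body)
```

So the question reduces to whether the server-side pipeline accepts these extra generation parameters, not to anything the client could change about the payload shape.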