Question about the temperature parameter in the Hugging Face Inference API

Hi everyone,

I have a question regarding the temperature parameter in the Hugging Face Inference API, particularly in the context of chat models. According to the documentation, the default value for temperature is 1. However, I noticed that some models seem to have a different default, such as 0.6, as specified in their generation_config.json file.

Here are my questions:
1. When using the Inference API, if I don’t explicitly set the temperature parameter, does the API always use the model’s default value from the generation_config.json? Or does it fall back to a global default of 1 as mentioned in the docs?
2. If I don’t pass in any additional parameters (like max_length, top_p, etc.), does the API automatically use all the defaults specified in the model’s generation_config.json file, or are there other fallback defaults on the API side? (A minimal sketch of the kind of call I mean follows below.)
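
For reference, here is roughly how I am calling the API. This is only a minimal sketch using huggingface_hub's InferenceClient; the model ID is just an example of a model whose generation_config.json specifies temperature 0.6:

```python
from huggingface_hub import InferenceClient

client = InferenceClient()  # assumes a token is configured (e.g. via HF_TOKEN)

messages = [{"role": "user", "content": "Explain sampling temperature in one sentence."}]

# Case 1: no temperature passed -- which default applies here,
# the model's generation_config.json value or the documented global default of 1?
response = client.chat_completion(
    messages,
    model="meta-llama/Llama-3.1-8B-Instruct",  # example model ID
    max_tokens=100,
)

# Case 2: temperature passed explicitly, which should override any default.
response = client.chat_completion(
    messages,
    model="meta-llama/Llama-3.1-8B-Instruct",  # example model ID
    max_tokens=100,
    temperature=0.6,
)
print(response.choices[0].message.content)
```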

Thank you in advance for your help!

I think the default temperature differs between model APIs because their intended use cases differ.
If you use a chatbot or a creative-writing model, a temperature around 1 makes sense.
But if you use a summarization or paraphrasing model, a temperature close to 0 is better, since you want focused, near-deterministic output.
There is also the case of higher temperatures (>1): values above 1 introduce even more randomness, leading to highly creative but potentially less coherent outputs. The sketch below shows what temperature actually does to the next-token distribution.
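
To make that concrete, here is a toy sketch (plain Python, not tied to any particular API) of how temperature rescales the model's next-token probabilities before sampling:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide the logits by the temperature, then apply softmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for three candidate tokens.
logits = [2.0, 1.0, 0.1]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])

# t=0.2 -> sharply peaked, near-deterministic choice
# t=1.0 -> the model's raw distribution
# t=2.0 -> flatter, more random choice
```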

Then what are the default values of a model's generation parameters?
They are the values the model authors found to suit the model's intended use case best, and they are published in its generation_config.json, so you can inspect them directly.
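
For example, here is a minimal sketch for checking a model's published defaults with transformers (the model ID is just an example; gated models additionally require an access token):

```python
from transformers import GenerationConfig

# Downloads and parses the model's generation_config.json from the Hub.
config = GenerationConfig.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Fields the model does not override fall back to the library defaults
# (e.g. temperature=1.0, top_p=1.0).
print(config.temperature, config.top_p)
```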
