I am using the `AsyncClient` from Text Generation Inference (TGI) to query a number of models, including Llama 3, Mixtral, Llama 2, Vicuna, and Command-R, via its `generate()` function. I see that unless I pass them explicitly, `top_p` and `temperature` are set to `None`.

However, I assume there must be some default `temperature` and `top_p` in effect. Where do I find this information? Is it model-specific, or set by TGI?
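For reference, here is a minimal sketch of the setup I am describing. It assumes the `text_generation` client package is installed and a TGI endpoint is running at `http://127.0.0.1:8080` (the URL, prompt, and parameter values are placeholders); passing `temperature` and `top_p` explicitly sidesteps whatever server-side defaults apply when they are left as `None`:

```python
import asyncio

try:
    from text_generation import AsyncClient
except ImportError:
    # Client library not installed; the sketch still illustrates the call.
    AsyncClient = None


async def query_model(prompt: str, url: str = "http://127.0.0.1:8080") -> str:
    client = AsyncClient(url)
    # Passing temperature and top_p explicitly, rather than leaving them
    # as None and relying on whatever default the server applies.
    response = await client.generate(
        prompt,
        max_new_tokens=64,
        temperature=0.7,
        top_p=0.9,
    )
    return response.generated_text


if __name__ == "__main__":
    print(asyncio.run(query_model("What is deep learning?")))
```

When these parameters are omitted, my understanding is that the behavior I observe (both reported as `None`) comes from the client simply not sending them, so whatever happens next is decided server-side.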