Using nucleus sampling and temperature at the same time

Hi,

I have a fine-tuned Flan-T5 model and I’m trying to use it for inference with the model.generate method. I’m inspecting the behaviour of decoding methods that alter the next-token probability distribution, specifically the top_p parameter (for nucleus sampling) and the temperature parameter. What happens if I specify both top_p and temperature? Will it first rescale the distribution with the temperature (flattening it, if the temperature is high) and then take the nucleus of the rescaled distribution (i.e., temperature, then nucleus)? Or will it take the nucleus of the original distribution and then apply the temperature to the tokens that remain (i.e., nucleus, then temperature)? Or something else (e.g., use only the nucleus and ignore the temperature, or vice versa)?
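To make the two orderings concrete, here is a minimal sketch of what I mean. It is plain Python on a made-up logit vector, not the transformers internals, and the function names (`temperature_then_nucleus`, `nucleus_then_temperature`) are hypothetical ones I chose for illustration. It only shows that the two orderings can keep different sets of tokens; it does not say which one model.generate uses.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus(probs, top_p):
    """Indices of the smallest set of most likely tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return sorted(kept)


def temperature_then_nucleus(logits, temperature, top_p):
    """Rescale first, then cut the nucleus out of the rescaled distribution."""
    probs = softmax(logits, temperature)
    kept = nucleus(probs, top_p)
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def nucleus_then_temperature(logits, temperature, top_p):
    """Cut the nucleus at temperature 1, then rescale only the surviving tokens."""
    kept = nucleus(softmax(logits), top_p)
    probs = softmax([logits[i] for i in kept], temperature)
    return dict(zip(kept, probs))


# Made-up logits for a four-token vocabulary (illustration only).
logits = [4.0, 2.0, 1.0, 0.0]
a = temperature_then_nucleus(logits, temperature=2.0, top_p=0.9)
b = nucleus_then_temperature(logits, temperature=2.0, top_p=0.9)
print("temperature, then nucleus:", a)
print("nucleus, then temperature:", b)
```

With these example logits, temperature 2.0 and top_p 0.9, the first ordering keeps three candidate tokens while the second keeps only two, so the order does matter for what can be sampled.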

Thank you!
