Order of execution of Top-K, Top-P sampling along with temperature

Let’s say I use:

sample_outputs = model.generate(
    **model_inputs, max_new_tokens=40, do_sample=True,
    top_k=3, top_p=0.51, temperature=0.6, num_return_sequences=3,
)

What is the order of execution in this one?
I looked at the code for the labml.ai sampling example and it doesn't make sense to me, because when temperature is combined with top-k or top-p there, it first applies softmax, then selects the tokens, and only then uses the sampler.
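As I read it, that ordering looks roughly like the sketch below. This is my own paraphrase in PyTorch, not labml.ai's actual code; the function name and the default top_p are made up for illustration:

```python
import torch

def sample_softmax_first(logits: torch.Tensor, top_p: float = 0.51) -> torch.Tensor:
    probs = torch.softmax(logits, dim=-1)                     # softmax before any filtering
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    keep = cumulative - sorted_probs < top_p                  # smallest set reaching top_p
    filtered = torch.where(keep, sorted_probs, torch.zeros_like(sorted_probs))
    filtered = filtered / filtered.sum(dim=-1, keepdim=True)  # renormalize the survivors
    choice = torch.multinomial(filtered, num_samples=1)       # sample last
    return sorted_idx.gather(-1, choice)

# example: sample_softmax_first(torch.log(torch.tensor([0.4, 0.2, 0.2, 0.15, 0.05])))
```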

The Google Cloud documentation says that top-k is applied first, and the result is then filtered with top-p, along with temperature.

Let’s say your probabilities are t0 → 0.4, t1 → 0.2, t2 → 0.2, t3 → 0.15, t4 → 0.05.

You use Top-K = 3 and now you have t0, t1, t2. Now you have two choices (see the sketch after this list):

  1. Apply Top-P = 0.51 directly and then normalize, or
  2. Normalize first, then apply Top-P = 0.51, and then normalize again.
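For concreteness, here is a small sketch of the two choices with the numbers above (my own illustration, not any library's implementation):

```python
import torch

probs = torch.tensor([0.40, 0.20, 0.20, 0.15, 0.05])  # t0..t4, already sorted
top_k, top_p = 3, 0.51

# Top-K = 3 keeps t0, t1, t2
kept = probs[:top_k]                                   # [0.40, 0.20, 0.20]

# Choice 1: apply Top-P on the raw values, then normalize
cum1 = torch.cumsum(kept, dim=0)                       # 0.40, 0.60, 0.80
nucleus1 = kept[cum1 - kept < top_p]                   # keeps 0.40, 0.20 -> t0, t1
choice1 = nucleus1 / nucleus1.sum()                    # ~0.667, ~0.333

# Choice 2: renormalize after Top-K first, then apply Top-P, then normalize again
renorm = kept / kept.sum()                             # 0.50, 0.25, 0.25
cum2 = torch.cumsum(renorm, dim=0)                     # 0.50, 0.75, 1.00
nucleus2 = renorm[cum2 - renorm < top_p]               # keeps 0.50, 0.25 -> t0, t1
choice2 = nucleus2 / nucleus2.sum()                    # ~0.667, ~0.333

print(choice1, choice2)  # with these numbers both orders keep the same tokens,
                         # but the cutoff can differ for other distributions
```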

On top of that, if we use temperature = 0.6, do we apply it at the beginning? If yes, then the result differs from using top-p alone, because the distribution has already been reshaped before the cutoff is applied.
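For what it's worth, this is roughly what "temperature first, then top-k, then top-p" looks like when written with the public logits warper classes from transformers. It is a sketch only; the actual order inside generate() may differ between versions, so please verify against the version you use:

```python
import torch
from transformers import (TemperatureLogitsWarper, TopKLogitsWarper,
                          TopPLogitsWarper)

logits = torch.log(torch.tensor([[0.40, 0.20, 0.20, 0.15, 0.05]]))  # t0..t4 as logits
input_ids = torch.zeros((1, 1), dtype=torch.long)  # dummy; unused by these warpers

warpers = [
    TemperatureLogitsWarper(0.6),   # reshape the distribution first
    TopKLogitsWarper(top_k=3),      # keep the 3 highest-scoring tokens
    TopPLogitsWarper(top_p=0.51),   # then keep the smallest nucleus reaching 0.51
]
scores = logits
for warper in warpers:
    scores = warper(input_ids, scores)

probs = torch.softmax(scores, dim=-1)  # distribution that multinomial would sample from
print(probs)
```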

How does this work? Can someone please explain the actual order of execution?


From the source here, in the function top_k_top_p_filtering, I believe it applies top-k filtering first and then top-p filtering.
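For anyone reading along, the logic of that function is roughly the sketch below (my own paraphrase, not the library's exact code): top-k masks the logits first, then top-p masks what remains, and sampling happens afterwards on the softmax of the filtered logits.

```python
import torch

def top_k_top_p_filter(logits, top_k=0, top_p=1.0, filter_value=-float("inf")):
    if top_k > 0:
        # remove everything below the k-th largest logit
        threshold = torch.topk(logits, top_k)[0][..., -1, None]
        logits = logits.masked_fill(logits < threshold, filter_value)
    if top_p < 1.0:
        sorted_logits, sorted_idx = torch.sort(logits, descending=True)
        cumulative = torch.cumsum(torch.softmax(sorted_logits, dim=-1), dim=-1)
        # drop tokens once the cumulative probability exceeds top_p,
        # always keeping at least the single most likely token
        remove = cumulative > top_p
        remove[..., 1:] = remove[..., :-1].clone()
        remove[..., 0] = False
        remove = remove.scatter(-1, sorted_idx, remove)
        logits = logits.masked_fill(remove, filter_value)
    return logits
```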
