Let's say I use:

```
sample_outputs = model.generate(
    **model_inputs, max_new_tokens=40, do_sample=True,
    top_k=3, top_p=0.51, temperature=0.6, num_return_sequences=3,
)
```

What is the order of execution in this one?

I looked at the code for the labml.ai sampling example and it doesn't make sense to me: when using temperature with Top-K or Top-P, it first applies softmax and selects, and then uses the `Sampler`.

The Google Cloud documentation says that Top-K is applied first, and the result is then filtered with Top-P, along with temperature.

Let's say your **probabilities** are `t0→0.4, t1→0.2, t2→0.2, t3→0.15, t4→0.05`.

You use `Top-K = 3` and now you have `t0, t1, t2`. Now you have 2 choices:

- Apply `Top-P = 0.51` and then normalize again
- Or normalize first, then apply `Top-P = 0.51`, and then normalize again
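To make the two choices concrete, here is a minimal NumPy sketch using the probabilities from this post. The helper names (`top_k_filter`, `top_p_filter`, `normalize`) are hypothetical and written only for illustration; this is not the code `generate` runs. It assumes Top-P keeps the smallest set of highest-probability tokens whose cumulative probability reaches the threshold.

```python
import numpy as np

def top_k_filter(probs, k):
    """Keep the k largest probabilities, zero the rest (no renormalization)."""
    order = np.argsort(-probs, kind="stable")  # descending, ties keep original order
    out = np.zeros_like(probs)
    out[order[:k]] = probs[order[:k]]
    return out

def top_p_filter(probs, p):
    """Keep the smallest prefix of tokens (by descending probability)
    whose cumulative probability reaches p; zero the rest."""
    order = np.argsort(-probs, kind="stable")
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cumulative, p)) + 1
    out = np.zeros_like(probs)
    out[order[:cutoff]] = probs[order[:cutoff]]
    return out

def normalize(probs):
    return probs / probs.sum()

probs = np.array([0.4, 0.2, 0.2, 0.15, 0.05])  # t0..t4 from the post
kept = top_k_filter(probs, k=3)                # t0, t1, t2 survive

# Choice 1: Top-P on the un-normalized Top-K result, then normalize.
option_a = normalize(top_p_filter(kept, p=0.51))
# Choice 2: normalize the Top-K result, then Top-P, then normalize.
option_b = normalize(top_p_filter(normalize(kept), p=0.51))

print(option_a)
print(option_b)

# The same comparison with a different (made-up) threshold of 0.45.
alt_a = normalize(top_p_filter(kept, p=0.45))
alt_b = normalize(top_p_filter(normalize(kept), p=0.45))
print(np.count_nonzero(alt_a), np.count_nonzero(alt_b))
```

With these particular numbers both choices end up keeping `t0` and `t1`, so the order of normalization is not visible at `Top-P = 0.51`. It does matter in general: with the made-up threshold of 0.45, the first choice keeps two tokens and the second keeps only one.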

On top of that, if we use `temperature = 0.6`, do we apply it at the beginning? If yes, then the result is different from what we would get with just `Top-P`, because the distribution has already been shifted.
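A small sketch of that concern, under two stated assumptions: temperature divides the logits before the softmax, and the log of the probabilities above can stand in for the logits. The helper `nucleus_size` is hypothetical, and the order used here (temperature first) is only the scenario being asked about, not a claim about what `generate` does.

```python
import numpy as np

probs = np.array([0.4, 0.2, 0.2, 0.15, 0.05])  # t0..t4 from the post
temperature = 0.6
top_p = 0.51

# Assumption: log-probabilities are used as logits, then scaled by 1/temperature.
logits = np.log(probs)
scaled = logits / temperature
tempered = np.exp(scaled - scaled.max())  # numerically stable softmax
tempered /= tempered.sum()

def nucleus_size(sorted_probs, threshold):
    """Number of top tokens needed for the cumulative probability to reach threshold."""
    return int(np.searchsorted(np.cumsum(sorted_probs), threshold)) + 1

size_before = nucleus_size(np.sort(probs)[::-1], top_p)
size_after = nucleus_size(np.sort(tempered)[::-1], top_p)

print(tempered.round(3))
print(size_before, size_after)
```

A temperature below 1 sharpens the distribution, so `t0` alone already exceeds 0.51 after scaling, while on the original probabilities `Top-P = 0.51` needs both `t0` and `t1`. So whether temperature runs before or after the filtering steps changes which tokens survive.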

How does that work? Can someone please explain in terms of execution?