How can I fuse the logits from different models and then convert them to tokens?


Suppose I have two logits tensors x, y from GPT2-base and GPT2-large, with the same shape: (8, 50, 32000).

Here 8 is the batch size, 50 is the length of the input tokens, and 32000 is the vocab size.

Then I fuse x and y using: Z = x + y

So how can I convert Z to tokens with a generation function?

I have tried [argmax(z) for z in Z], but the results are bad. It seems this is because I didn't apply any stopping criteria, but how should I handle that?
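For reference, what I think a generation loop for fused logits would have to look like (a minimal sketch with toy stand-in models, not real GPT-2; `VOCAB_SIZE`, `EOS_ID`, and the model functions are all placeholders I made up): each step fuses the two models' logits, takes the argmax of the *last* position only (not the whole Z), appends that token, and stops at an end-of-sequence token or a maximum length.

```python
import numpy as np

VOCAB_SIZE = 10  # toy vocabulary for illustration (GPT-2's is much larger)
EOS_ID = 9       # hypothetical end-of-sequence token id

def model_a(tokens):
    """Stand-in for GPT2-base: logits of shape (len(tokens), VOCAB_SIZE).

    Toy behavior: at each position, put weight on (token + 1) mod VOCAB_SIZE.
    """
    logits = np.zeros((len(tokens), VOCAB_SIZE))
    for i, t in enumerate(tokens):
        logits[i, (t + 1) % VOCAB_SIZE] = 1.0
    return logits

def model_b(tokens):
    """Stand-in for GPT2-large: same shape, smaller weights."""
    logits = np.zeros((len(tokens), VOCAB_SIZE))
    for i, t in enumerate(tokens):
        logits[i, (t + 1) % VOCAB_SIZE] = 0.5
    return logits

def fused_greedy_decode(prompt, max_new_tokens=20):
    """Greedy decoding over fused logits Z = x + y, one token per step."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        z = model_a(tokens) + model_b(tokens)  # fuse the two models' logits
        next_id = int(np.argmax(z[-1]))        # argmax of the LAST position only
        tokens.append(next_id)
        if next_id == EOS_ID:                  # stopping criterion: EOS emitted
            break
    return tokens

print(fused_greedy_decode([0]))  # → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The key difference from `[argmax(z) for z in Z]` is that only `z[-1]` (the prediction for the next token) is used at each step, the chosen token is fed back into both models, and the loop terminates on EOS or `max_new_tokens`.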