I have the same concern. However, when I use greedy decoding, the logits closely resemble each other. Have you noticed this as well? Perhaps we need to set the seeds properly?
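To illustrate what I mean about seeds, here is a minimal sketch using only the Python standard library, not the actual model or framework from this thread. The logits are made-up values and `greedy_pick` / `sample_pick` are hypothetical helper names. It just shows that greedy decoding involves no randomness, while sampling only repeats when the generator is seeded the same way.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    # Greedy decoding: always take the highest-scoring index; no RNG involved.
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_pick(logits, rng):
    # Sampling: draw an index according to the softmax probabilities.
    return rng.choices(range(len(logits)), weights=softmax(logits), k=1)[0]


# Made-up logits for a 4-token vocabulary (assumption, for illustration only).
logits = [2.0, 1.5, 0.3, -1.0]

# Greedy gives the same token every time, regardless of any seed.
greedy_runs = [greedy_pick(logits) for _ in range(5)]

# Sampling is only repeatable when both generators start from the same seed.
rng_a = random.Random(1234)
rng_b = random.Random(1234)
seq_a = [sample_pick(logits, rng_a) for _ in range(20)]
seq_b = [sample_pick(logits, rng_b) for _ in range(20)]

print("greedy:", greedy_runs)
print("same seed, same samples:", seq_a == seq_b)
```

If the real setup shows differences only when sampling is enabled, seeding would be the first thing to check. Small differences under greedy decoding would more likely come from something else, such as numerical precision.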