What is the difference between logits and scores?

The documentation in the link above makes me believe that scores are just a “processed” version of logits. This raises the question: how exactly are these logits processed?

I took a sample of these scores myself, and they look no different in meaning from logits. They don’t look like probabilities, since some of them are clearly negative or greater than 1.0.

tensor([[-7.5898, -5.9922, 18.5625,  ..., -8.4844, -4.7539, -4.6758]],
                             device='cuda:0'))

Can someone please explain to me what these scores really are?

The linked documentation only describes the data class. I think the processing method would vary depending on where this class is used. I would guess that in many cases the logits would just be softmaxed into a probability distribution.
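To illustrate what that softmax step would look like, here is a minimal sketch in plain Python. It uses only the six values visible in the sample above (the elided middle entries are left out), and the `softmax` helper is written here for illustration rather than taken from any library. Whether the scores in question are actually meant to be softmaxed depends on where the class is used, so treat this as an assumption.

```python
import math

def softmax(logits):
    """Turn raw logits into probabilities that sum to 1 (illustrative helper)."""
    # Subtract the maximum first so math.exp never overflows.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Only the values visible in the printed tensor; the "..." entries are omitted.
sample_logits = [-7.5898, -5.9922, 18.5625, -8.4844, -4.7539, -4.6758]
probs = softmax(sample_logits)

for logit, p in zip(sample_logits, probs):
    print(f"{logit:9.4f} -> {p:.6f}")
```

After this step every value lies between 0 and 1 and the values sum to 1, which is not true of the raw scores shown in the question. With real tensors the same thing is normally done with `torch.softmax(scores, dim=-1)`.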