Softmax vs logits

Why do we need to apply softmax after getting the logit values? I know it is said to normalise the scores and give them a probabilistic interpretation. But isn't the point of the logits/softmax scores simply to determine which value is larger and then infer the label from that?
For example, suppose I get logit scores of [-4.2095, 4.6053], where -4.2095 corresponds to label0 and 4.6053 to label1. Since 4.6053 > -4.2095, I would take label1 as my prediction. If I instead apply softmax to the logits, I get [1.4850e-04, 9.9985e-01], and with these softmax scores I would still infer that the predicted label is label1.


If you just want the predicted class, you don't need the softmax: as you pointed out, you can simply take the index of the maximum logit. Softmax is a monotonic transformation, so it never changes which index is largest. It converts the logits into probabilities, so use it when you want a confidence score for each prediction, not just the winning label.
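A minimal sketch of both points, using only the standard library (the numbers are the ones from the question; the `softmax` helper here is an illustrative implementation, not a specific framework API):

```python
import math

def softmax(logits):
    # Subtract the max logit before exponentiating for numerical stability;
    # this shift does not change the resulting probabilities.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [-4.2095, 4.6053]
probs = softmax(logits)
print(probs)  # roughly [1.485e-04, 9.9985e-01], as in the question

# Softmax is monotonic, so the argmax over logits and probabilities agrees:
# both point at label1.
assert probs.index(max(probs)) == logits.index(max(logits)) == 1
```

The probabilities sum to 1, so unlike raw logits they can be read as the model's confidence (here, about 99.99% for label1), which matters when you want to threshold predictions or compare confidences across examples.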
