Hello, I am fine-tuning a Hugging Face model on a multi-class classification task with 3 labels, encoded as 0, 1, and 2. I use the cross-entropy loss function to compute the loss.
When training, I tried to get the probabilities, but I observe that they do not correspond to the final label predicted by the classification model. For industrial purposes, I need to set a probability threshold so that not every text given to the model is returned after classification. But since the probabilities do not correspond to the labels, how can I interpret them? In short, I need the right probabilities in order to introduce a threshold on what is returned after the classification is done.
For the probabilities I used this line of code: `proba = nn.functional.softmax(logits, dim=1)`
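For what it is worth, my understanding is that softmax is monotonic within each row, so the argmax of `proba` should always match the argmax of the raw logits. A quick sanity check of that assumption (dummy logits, nothing from my model):

```python
import torch
import torch.nn as nn

# dummy batch of 3-class logits, just to check the monotonicity assumption
logits = torch.randn(4, 3)
proba = nn.functional.softmax(logits, dim=1)

# softmax preserves the per-row ordering, so the argmax must be identical
assert torch.equal(logits.argmax(dim=1), proba.argmax(dim=1))
```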
Probabilities => predicted label:

```
[0.1701, 0.4728, 0.3571] => 1
[0.2768, 0.4665, 0.2567] => 1
[0.2286, 0.5702, 0.2012] => 1
[0.2479, 0.5934, 0.1587] => 2   **
[0.2212, 0.5519, 0.2270] => 2   **
[0.2169, 0.5404, 0.2428] => 1
[0.1706, 0.6370, 0.1924] => 1
[0.1836, 0.6960, 0.1203] => 1
```
As seen above, the predicted label for the lines marked with ** is 2, but I do not get why; looking at the probabilities, I thought it would be 1. Maybe it is me who does not understand. Below are the original logits that I converted to probabilities. For the classification model I used the FlaubertForSequenceClassification class.
```
[-0.67542565  0.34714806  0.06658715]
[-0.1786863   0.3430867  -0.25426903]
[-0.2919644   0.6223039  -0.41944826]
[-0.25066078  0.62209827 -0.69668627]   **
[-0.5443676   0.37007216 -0.51845074]   **
[-0.5634354   0.34945157 -0.45065987]
[-0.7058248   0.6116817  -0.58579236]
[-0.7987261   0.5336867  -1.2213029 ]
```
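To double-check my reading of these numbers, I recomputed the softmax and the argmax from the logits above in isolation (the tensor below is just the values pasted from my run):

```python
import torch
import torch.nn as nn

# logits copied from the output above (the ** rows included)
logits = torch.tensor([
    [-0.67542565,  0.34714806,  0.06658715],
    [-0.1786863,   0.3430867,  -0.25426903],
    [-0.2919644,   0.6223039,  -0.41944826],
    [-0.25066078,  0.62209827, -0.69668627],  # **
    [-0.5443676,   0.37007216, -0.51845074],  # **
    [-0.5634354,   0.34945157, -0.45065987],
    [-0.7058248,   0.6116817,  -0.58579236],
    [-0.7987261,   0.5336867,  -1.2213029],
])

proba = nn.functional.softmax(logits, dim=1)
preds = torch.argmax(proba, dim=1)
print(preds)  # tensor([1, 1, 1, 1, 1, 1, 1, 1]) -- every row argmaxes to index 1
```

So from the logits alone I would expect label 1 everywhere, including for the ** rows, which is exactly what confuses me.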
If you have any ideas, please share!
A snippet of the model class:
```python
# extract the hidden representations from the encoder output
hidden_state = encoder_output                 # (bs, seq_len, dim)
pooled_output = hidden_state[:, 0]            # (bs, dim)
# apply dropout
pooled_output = self.dropout(pooled_output)   # (bs, dim)
# feed into the classifier
logits = self.classifier(pooled_output)       # (bs, num_labels)
proba = nn.functional.softmax(logits, dim=1)
# print(type(proba))
print(proba)

# outputs = (probabilities,) + encoder_output[1:]
outputs = (logits,) + encoder_output[1:]      # logits first

if labels is not None:
    # multi-class classification
    loss_fct = torch.nn.CrossEntropyLoss()
    loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
    # aggregate outputs
    outputs = (loss,) + outputs

return outputs  # (loss), logits, (hidden_states), (attentions)
```
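For completeness, this is roughly how I plan to consume these outputs at inference time once I trust the probabilities; a minimal sketch, assuming `model` is the fine-tuned classifier, `batch` is an already-tokenized input batch, and 0.5 is just a placeholder threshold:

```python
import torch
import torch.nn as nn

THRESHOLD = 0.5  # placeholder value, to be tuned on a validation set

# `model` and `batch` are assumed to already exist
model.eval()
with torch.no_grad():
    outputs = model(**batch)   # no labels passed, so outputs[0] is the logits
    logits = outputs[0]
    proba = nn.functional.softmax(logits, dim=1)
    confidences, preds = torch.max(proba, dim=1)

# only return predictions whose top probability clears the threshold
for confidence, pred in zip(confidences.tolist(), preds.tolist()):
    if confidence >= THRESHOLD:
        print(f"predicted label {pred} (p={confidence:.3f})")
    else:
        print("below threshold, not returned")
```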