How to determine the best prediction threshold when running inference with a fine-tuned model?

Hello, I fine-tuned a model, but the F-score is not very good for certain classes. To avoid a lot of false positives, I decided to set a threshold on the predicted probabilities, and I would like to know how to determine the best threshold.

Should I use the mean, the median, or just look at the accuracy of the model on the test data?


The best way to determine the threshold is to compute the true positive rate (TPR) and false positive rate (FPR) at different thresholds, and then plot the so-called ROC curve, which shows the TPR and FPR obtained at each threshold.

Then, selecting the point (i.e. threshold) closest to the top-left corner of the curve will yield the best balance between the two rates.
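As a minimal sketch of that idea for the binary case (the labels and scores below are made-up placeholders), you can get the TPR/FPR at every threshold from `sklearn.metrics.roc_curve` and pick the threshold that maximizes TPR − FPR (Youden's J statistic), which corresponds to the point nearest the top-left corner:

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical binary ground truth and predicted probabilities
# for the positive class (replace with your model's outputs)
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9])

# roc_curve returns the FPR, TPR, and the threshold used for each point
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Youden's J statistic: the threshold maximizing TPR - FPR is the
# point on the ROC curve closest to the top-left corner
j = tpr - fpr
best_threshold = thresholds[np.argmax(j)]
print(best_threshold)  # here, 0.7
```

At inference time you would then predict "positive" only when the model's probability exceeds `best_threshold`, instead of using the default 0.5 cutoff.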

Scikit-learn provides an implementation of this; however, it is for binary classification only. Note that there are extensions for multiclass classification, such as the one-vs-rest approach.
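One common way to extend this to multiclass (a sketch, with invented example data): binarize the labels one-vs-rest and pick a separate threshold per class, each from its own binary ROC curve.

```python
import numpy as np
from sklearn.preprocessing import label_binarize
from sklearn.metrics import roc_curve

# Hypothetical 3-class example: one probability per class per sample
# (replace with your model's softmax outputs)
classes = [0, 1, 2]
y_true = np.array([0, 1, 2, 1, 0, 2, 1, 0])
y_prob = np.array([
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.4, 0.4, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.5, 0.2],
    [0.5, 0.3, 0.2],
])

# One-vs-rest: turn the label vector into one binary column per class
y_bin = label_binarize(y_true, classes=classes)

# For each class, compute its binary ROC curve and keep the threshold
# maximizing TPR - FPR (Youden's J) for that class
best_thresholds = {}
for i, cls in enumerate(classes):
    fpr, tpr, thr = roc_curve(y_bin[:, i], y_prob[:, i])
    best_thresholds[cls] = thr[np.argmax(tpr - fpr)]

print(best_thresholds)
```

You then end up with one cutoff per class, which is useful precisely when only certain classes produce too many false positives.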


Thank you! Actually, I am doing multiclass classification; I forgot to mention that.