Hi, I’m training a simple classification model and I’m experiencing an unexpected behaviour:
When training finishes, I get predictions from the model the trainer holds at that point with:
predictions = trainer.predict(tokenized_test_dataset)
list(np.argmax(predictions.predictions, axis=-1))
and I obtain predictions that match the accuracy reported during training
(I'm using load_best_model_at_end=True, so the model loaded at the end of training is the best one).
However, if I load the model from the checkpoint (the best one) and get predictions with:
outputs = model(model_inputs)
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
predictions = torch.argmax(probabilities, dim=-1)
I get predictions that are slightly different from the previous ones and that do not match the accuracy obtained during training.
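To make the comparison concrete: softmax is monotonic within each row, so taking the argmax of the probabilities should give the same class as taking the argmax of the raw logits, which suggests the extra softmax step by itself is not what changes the result. The sketch below is a framework-free illustration of that check, plus a small helper for listing which examples differ between two prediction lists. The names `softmax`, `argmax`, `mismatches`, and the `example_logits` values are hypothetical and made up for illustration; they are not taken from the actual model or dataset.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(row):
    """Index of the largest value in a row (first one wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)


def mismatches(preds_a, preds_b):
    """Indices at which two equally long prediction lists disagree."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    return [i for i, (a, b) in enumerate(zip(preds_a, preds_b)) if a != b]


# Hypothetical logits for three examples and three classes (illustration only).
example_logits = [
    [2.0, -1.0, 0.5],
    [0.1, 0.3, 0.2],
    [-3.0, -2.5, -4.0],
]

# Class predictions taken directly from the logits, and after a softmax.
from_logits = [argmax(row) for row in example_logits]
from_probs = [argmax(softmax(row)) for row in example_logits]

# An empty list here means the softmax step did not change any prediction.
print(mismatches(from_logits, from_probs))
```

Applied to the two real prediction lists, the indices returned by a helper like `mismatches` would show exactly which test examples differ, which makes it easier to check whether those examples reach the model with the same inputs in both code paths.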
So, is there anything I'm missing? Shouldn't these predictions be exactly equal? Any help would be appreciated!