Different BERT results

Hello,

I have trained several BERT models to evaluate which of them performs best on my dataset.
My problem is that I get different results every time I run a model again (e.g. in a new Colab session).
For a reproducible evaluation I want to always get the same results (e.g. the same F1 score), but I don’t know how to do this.
I have ensured that my train/test/validation split is always the same.
Here is an example notebook: Google Colab

Thanks in advance!

Use a fixed seed to make your results reproducible:

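import random
import numpy as np
from transformers import is_torch_available
seed = 42  # any fixed integer; use the same value in every run
random.seed(seed)         # Python's built-in RNG
np.random.seed(seed)      # NumPy's global RNG
if is_torch_available():
    import torch
    torch.manual_seed(seed)           # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)  # PyTorch RNGs on all GPUs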
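
If it helps, here is a small sketch of how the seeding calls above could be wrapped in one function and checked. The names seed_everything and draw_sample are made up for this example and are not part of any library. The two cuDNN flags are an addition that was not in the snippet above: they ask PyTorch for deterministic GPU kernels, which may still not cover every operation. It is probably also worth calling the function before the model is created, because the classification head on top of BERT is initialized randomly.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed every RNG that is installed (hypothetical helper name)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed, nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Assumption: you also want deterministic cuDNN kernels and no
        # autotuning, which can otherwise pick different algorithms per run.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed, nothing to seed


def draw_sample() -> list:
    # Stand-in for any step that uses randomness (shuffling, init, dropout).
    return [random.random() for _ in range(3)]


seed_everything(42)
first = draw_sample()
seed_everything(42)
second = draw_sample()
print(first == second)  # prints True: same seed, same random numbers
```

If the two lists match but your F1 score still changes between sessions, the remaining difference most likely comes from somewhere the seed does not reach, for example non-deterministic GPU operations or a different GPU type in the new Colab session.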