@Whenzi This is great, thanks! I got my code working with your two lines and the following within the compute_metrics function
# load_accuracy = load_metric("accuracy")
# load_f1 = load_metric("f1")
load_accuracy = evaluate.load("accuracy")
load_f1 = evaluate.load("f1")