Pre-training a BERT model from scratch with custom tokenizer

It looks like your evaluation DataLoader does not contain labels, since compute_metrics is never called. If it were called, you would get an error, because you are not taking the argmax of your predictions before passing them to the accuracy metric.
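
For reference, here is a minimal sketch of a compute_metrics function that takes the argmax first. It assumes a masked language modeling setup where the predictions are logits of shape (batch, sequence, vocab) and unmasked positions are labeled -100, as DataCollatorForLanguageModeling does. It also computes accuracy directly with NumPy rather than with whichever metric object you are using, so adapt it to your own code. The toy arrays at the bottom are made up for illustration.

```python
import numpy as np

def compute_metrics(eval_pred):
    # The Trainer passes (predictions, labels); for MLM the predictions are logits.
    logits, labels = eval_pred
    # Turn logits into predicted token ids before comparing with the labels.
    predictions = np.argmax(logits, axis=-1)
    # Assumption: positions labeled -100 are not masked tokens and are ignored.
    mask = labels != -100
    if not mask.any():
        # No scored positions in this batch; avoid dividing by zero.
        return {"accuracy": 0.0}
    accuracy = float((predictions[mask] == labels[mask]).mean())
    return {"accuracy": accuracy}

# Toy check: one sequence, three positions, vocabulary of size 3.
logits = np.array([[[0.1, 0.9, 0.0], [0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]])
labels = np.array([[1, -100, 0]])
print(compute_metrics((logits, labels)))  # {'accuracy': 0.5}
```

Note that this only runs during evaluation if your evaluation dataset actually yields a labels field, which is the first thing to check.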