Hi all, I'm fine-tuning BERT for a downstream classification task. Is plotting the training loss against the eval loss and checking accuracy or F1 score enough of an evaluation framework before putting the model into production?
I also plan to test the model once it is tuned properly. Any thoughts?
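For concreteness, here is a minimal pure-Python sketch of the checks I mean. The function names, the toy labels, and the loss values are all made up for illustration; in practice the predictions and losses would come from the fine-tuned model on the eval set.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def binary_f1(y_true, y_pred, positive=1):
    """F1 score for the class labelled `positive` (binary case only)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives, so precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_epoch(eval_losses):
    """Index of the epoch with the lowest eval loss."""
    return min(range(len(eval_losses)), key=lambda i: eval_losses[i])


# Hypothetical eval-set labels and predictions (not real results).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Hypothetical per-epoch losses: eval loss rising while training loss
# keeps falling is the overfitting pattern the loss chart should reveal.
train_losses = [0.70, 0.50, 0.38, 0.29, 0.21]
eval_losses = [0.62, 0.48, 0.41, 0.44, 0.52]

acc = accuracy(y_true, y_pred)
f1 = binary_f1(y_true, y_pred)
stop_at = best_epoch(eval_losses)
print(acc, f1, stop_at)
```

With these toy numbers the eval loss bottoms out at the third epoch (index 2), which is where I would pick the checkpoint.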
Cheers, DB