I have a multi-class model that I want to evaluate during training, on the eval set, using additional data contained in the dataset. The extra data is a column that provides a grouping: a document id.
These are the options I am considering:
- The `compute_metrics` function you can pass into the `Trainer` is only passed an `EvalPrediction`, which contains the predictions and labels but not the extra data, so this is not possible.
- I can subclass the `Trainer` and override `evaluation_loop`. This would be largely a copy-paste effort, changing the call to `compute_metrics` to pass in the extra data. I also don't know whether this would affect the training time.
- I can utilise a callback at the end of each epoch and calculate the metrics I want outside of the evaluation loop. I'm not sure whether I will have access to all the data there, though.
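Whichever hook ends up supplying the predictions, the per-document aggregation itself is simple once predictions, labels, and document ids are lined up. A minimal sketch of what I mean (plain Python; `grouped_accuracy` is a placeholder name and per-document accuracy stands in for my actual metrics):

```python
from collections import defaultdict

def grouped_accuracy(predictions, labels, doc_ids):
    """Compute accuracy per document id, plus a macro average over documents.

    Assumes the three sequences are aligned, i.e. entry i of each refers to
    the same eval example.
    """
    buckets = defaultdict(list)
    for pred, label, doc in zip(predictions, labels, doc_ids):
        buckets[doc].append(pred == label)
    # Per-document accuracy, then an unweighted mean over documents.
    per_doc = {doc: sum(hits) / len(hits) for doc, hits in buckets.items()}
    macro = sum(per_doc.values()) / len(per_doc)
    return per_doc, macro

per_doc, macro = grouped_accuracy(
    predictions=[0, 1, 1, 2, 2],
    labels=[0, 1, 0, 2, 1],
    doc_ids=["a", "a", "a", "b", "b"],
)
```

The open question for me is only where to get the aligned `doc_ids` from, not how to compute the grouped metric.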
Any other options that I am missing? What is the best way of accessing extra data in the evaluation metrics?