I have a multi-class model that I want to evaluate during training, on the eval set, using additional data contained in the dataset. The extra data is a column that provides a grouping: a document id.
These are the options I am considering:
- The `compute_metrics` function you can pass into the `Trainer` is only passed an `EvalPrediction`, which contains the predictions and labels but not the extra data, so this is not possible.
- I can subclass the `Trainer` and override `evaluation_loop`. This would be largely a copy-paste effort, changing the call to `compute_metrics` to pass in the extra data. I also don't know whether this would affect the training time.
- I can utilise a callback at the end of each epoch and calculate the metrics I want outside of the evaluation loop. I'm not sure whether I will have access to all the data there, though.
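Whichever hook ends up supplying the predictions, the per-document aggregation itself is simple once predictions, labels, and document ids are lined up. A minimal sketch of what I mean (plain Python; `grouped_accuracy` is a placeholder name and per-document accuracy stands in for my actual metrics):

```python
from collections import defaultdict

def grouped_accuracy(predictions, labels, doc_ids):
    """Compute accuracy per document id, plus a macro average over documents.

    Assumes the three sequences are aligned, i.e. entry i of each refers to
    the same eval example.
    """
    buckets = defaultdict(list)
    for pred, label, doc in zip(predictions, labels, doc_ids):
        buckets[doc].append(pred == label)
    # Per-document accuracy, then an unweighted mean over documents.
    per_doc = {doc: sum(hits) / len(hits) for doc, hits in buckets.items()}
    macro = sum(per_doc.values()) / len(per_doc)
    return per_doc, macro

per_doc, macro = grouped_accuracy(
    predictions=[0, 1, 1, 2, 2],
    labels=[0, 1, 0, 2, 1],
    doc_ids=["a", "a", "a", "b", "b"],
)
```

The open question for me is only where to get the aligned `doc_ids` from, not how to compute the grouped metric.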
Any other options that I am missing? What is the best way of accessing extra data in the evaluation metrics?