Hi,
I’m training a NER model (i.e. a token-level classification task) on a custom dataset, using the `transformers.Trainer` class. I want to compute some evaluation metrics (such as F1, precision, and recall) using `seqeval.classification_report`.

The problem is that the `Trainer` takes a `compute_metrics` argument, which should be a callable (i.e. a function) that in turn takes an `EvalPrediction` object as its argument. Such an object only seems to have two attributes, namely `predictions` and `label_ids`, so in the `Trainer` I would set `compute_metrics=get_metrics`, where `get_metrics` is something like this:
```python
from seqeval.metrics import classification_report
from transformers import EvalPrediction

def get_metrics(p: EvalPrediction):
    predictions = p.predictions
    label_ids = p.label_ids
    ## DO SOME CUSTOM PROCESSING HERE to build y_true and y_pred ##
    report = classification_report(y_true, y_pred, output_dict=True)
    return report
```
The problem I’m having with a token-level classification task is that many sentences will be shorter than the maximum sequence length, so there will be a lot of `[PAD]` tokens, and I don’t want to take predictions on `[PAD]` tokens into account when calculating my model’s metrics, as they are meaningless and will give a wrong (possibly worse) picture of the model’s performance. Hence I would like to include information about the mask tokens within the `get_metrics` function, so that I can use something like `torch.masked_select` to remove labels and predictions coming from padded tokens. Is there any easy way to do this, short of giving up on `transformers.Trainer` and using my own custom training loop?
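For reference, this is the kind of processing I have in mind — a sketch that assumes padded positions carry the usual ignore label `-100` (as set, for instance, by `DataCollatorForTokenClassification`), and uses a hypothetical `label_list` in place of my real label set:

```python
import numpy as np

# Hypothetical label list for illustration; in practice this would
# come from the dataset's features.
label_list = ["O", "B-PER", "I-PER"]

def align_predictions(predictions, label_ids, label_list):
    """Convert logits and label ids to string labels, dropping every
    position whose label is -100 (padding / special tokens)."""
    preds = np.argmax(predictions, axis=2)  # (batch, seq_len)
    y_true, y_pred = [], []
    for pred_row, label_row in zip(preds, label_ids):
        y_true.append([label_list[l] for l in label_row if l != -100])
        y_pred.append(
            [label_list[p] for p, l in zip(pred_row, label_row) if l != -100]
        )
    return y_true, y_pred

def get_metrics(p):
    y_true, y_pred = align_predictions(p.predictions, p.label_ids, label_list)
    from seqeval.metrics import classification_report
    return classification_report(y_true, y_pred, output_dict=True)
```

This only works if the padded labels really are `-100` in `label_ids`, which is why I’m not sure it is the right approach.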
Many thanks