How to use attention mask information in EvalPrediction for token classification tasks?

Hi,
I’m training a NER model (i.e. a token-level classification task) on a custom dataset, using the transformers.Trainer class. I want to compute some evaluation metrics (such as f1, precision, recall) using seqeval.classification_report.
The problem is that the Trainer takes a compute_metrics argument, which should be a callable (i.e. a function) that takes an EvalPrediction object as its argument. Such an object seems to have only two attributes, namely predictions and label_ids, so in the Trainer I would set compute_metrics=get_metrics, where get_metrics is something like this:

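from seqeval.metrics import classification_report
from transformers import EvalPrediction

def get_metrics(p: EvalPrediction):
    predictions = p.predictions  # model outputs for every token position
    label_ids = p.label_ids      # gold label ids for every token position
    ## DO SOME CUSTOM PROCESSING HERE TO BUILD y_true AND y_pred ##
    report = classification_report(y_true, y_pred, output_dict=True)
    return report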

The problem I’m having with a token-level classification task is that many sentences are shorter than the maximum sequence length, so there are a lot of [PAD] tokens. I don’t want to take predictions on [PAD] tokens into account when calculating my model’s metrics, as they are meaningless and will give a wrong (possibly worse) picture of the model’s performance. Hence I would like to have the attention mask (i.e. which positions are padding) available inside the get_metrics function, so that I can use something like torch.masked_select to remove the labels and predictions that come from padded tokens. Is there an easy way to do this, short of giving up on transformers.Trainer and writing my own custom training loop?
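
To make the filtering step concrete, here is a minimal sketch of the kind of processing described above. It rests on an assumption that may not hold for every dataset: that padded positions are marked in label_ids with a sentinel value of -100, so the padding can be recovered from the labels alone without the attention mask. The helper name strip_padding and the ID2LABEL mapping are hypothetical, for illustration only. It uses NumPy boolean indexing rather than torch.masked_select, since the arrays in EvalPrediction are, as far as I know, NumPy arrays.

```python
import numpy as np

# Assumption: padded (or otherwise ignored) positions carry this label id.
IGNORE_INDEX = -100

# Hypothetical label map, for illustration only.
ID2LABEL = {0: "O", 1: "B-PER", 2: "I-PER"}


def strip_padding(predictions, label_ids, id2label=ID2LABEL, ignore_index=IGNORE_INDEX):
    """Return (y_true, y_pred) as lists of tag lists, skipping ignored positions."""
    pred_ids = np.argmax(predictions, axis=-1)  # shape: (batch, seq_len)
    y_true, y_pred = [], []
    for pred_row, label_row in zip(pred_ids, label_ids):
        keep = label_row != ignore_index  # boolean mask, True for real tokens
        y_true.append([id2label[int(i)] for i in label_row[keep]])
        y_pred.append([id2label[int(i)] for i in pred_row[keep]])
    return y_true, y_pred


# Toy batch: 2 sentences, maximum length 4, 3 labels.
logits = np.array([
    [[2.0, 0.1, 0.1], [0.1, 3.0, 0.2], [0.1, 0.2, 3.0], [9.0, 0.0, 0.0]],
    [[0.1, 2.0, 0.1], [2.0, 0.1, 0.1], [9.0, 0.0, 0.0], [9.0, 0.0, 0.0]],
])
labels = np.array([
    [0, 1, 2, -100],
    [1, 0, -100, -100],
])
y_true, y_pred = strip_padding(logits, labels)
```

The resulting y_true and y_pred are lists of per-sentence tag lists, which is the input shape seqeval's classification_report expects, so something like strip_padding(p.predictions, p.label_ids) could sit in the custom processing step of get_metrics.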

Many thanks