I’m working on a sentence regression task, where each sample consists of a sentence paired with a numerical scalar. However, each sample also includes metadata (e.g., project name, task type). I want to compute and log accuracy broken down by this metadata during evaluation.
What I’ve Tried
So far, I’ve successfully:
- Modified the dataset class to return a `data_dict` containing:
  - `input_ids`
  - `labels` (for loss calculation, just like in a language modeling task)
  - `numerical_score` (the scalar target for regression)
  - Metadata (a string field like `Project` or `Task`)
- Updated the collate function to pass along all these elements correctly to the model’s `forward` function.
- Confirmed that the metadata is passed along in the model outputs as an additional metadata attribute.
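The collate step described above can be sketched roughly as follows. This is a minimal illustration rather than the actual code from my project: the function name `collate_with_metadata`, the `metadata` key, and the pad id `0` are assumptions, and the conversion of the numeric fields to tensors is left out so the snippet runs without any dependencies.

```python
from typing import Any

PAD_TOKEN_ID = 0      # assumption: the tokenizer's pad token id
IGNORE_INDEX = -100   # label value ignored by PyTorch's cross-entropy loss


def collate_with_metadata(samples: list[dict[str, Any]]) -> dict[str, Any]:
    """Pad token fields to a common length and carry the other fields along."""
    max_len = max(len(s["input_ids"]) for s in samples)

    def pad(seq, value):
        # Right-pad a sequence up to the longest sequence in the batch.
        return list(seq) + [value] * (max_len - len(seq))

    return {
        "input_ids": [pad(s["input_ids"], PAD_TOKEN_ID) for s in samples],
        "labels": [pad(s["labels"], IGNORE_INDEX) for s in samples],
        "numerical_score": [float(s["numerical_score"]) for s in samples],
        # Strings cannot be turned into tensors, so metadata stays a plain list.
        "metadata": [s["metadata"] for s in samples],
    }


batch = collate_with_metadata([
    {"input_ids": [5, 6, 7], "labels": [5, 6, 7],
     "numerical_score": 0.8, "metadata": "ProjectA"},
    {"input_ids": [8, 9], "labels": [8, 9],
     "numerical_score": 0.2, "metadata": "ProjectB"},
])
```

The point of the sketch is that the metadata is a list of strings rather than a tensor, which is part of why it does not travel through the evaluation loop the way predictions and labels do.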
Problem: Metadata Gets Lost Before Reaching `compute_metrics`
The issue arises during the `evaluation_loop`. I need access to the metadata inside the `compute_metrics` function, but the `Trainer` class doesn’t seem to provide a clean way to pass it along. For example, this is how the evaluation loop calls `compute_metrics`:
if args.include_inputs_for_metrics:
    metrics = self.compute_metrics(
        EvalPrediction(predictions=all_preds, label_ids=all_labels, inputs=all_inputs)
    )
else:
    metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
I enabled `include_inputs_for_metrics=True` hoping the metadata would be passed as part of `all_inputs`, but it gets stripped by this line in the evaluation loop:
inputs_decode = self._prepare_input(inputs[main_input_name]) if args.include_inputs_for_metrics else None
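To show why the metadata disappears here, the following simplified sketch mimics that line with a plain dictionary. It assumes `main_input_name` is `"input_ids"` and leaves out the `_prepare_input` call, which does not change which key is selected.

```python
# Stand-in for one evaluation batch as produced by the collate function.
inputs = {
    "input_ids": [[5, 6, 7], [8, 9, 0]],
    "labels": [[5, 6, 7], [8, 9, -100]],
    "numerical_score": [0.8, 0.2],
    "metadata": ["ProjectA", "ProjectB"],
}
main_input_name = "input_ids"  # assumption: the model's main input name
include_inputs_for_metrics = True

# Only the single entry named by main_input_name is kept for metrics;
# every other key in the batch, including "metadata", is left behind.
inputs_decode = inputs[main_input_name] if include_inputs_for_metrics else None
```

So even with the flag enabled, what ends up in `all_inputs` is only the token ids, never the full batch dictionary.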
What I Want to Achieve
I need a way to log the accuracy or any custom metric with respect to the metadata. Ideally, I don’t want to override the entire evaluation loop or write my own Trainer class, as that feels cumbersome and difficult to maintain, especially with parallelism.
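A minimal sketch of what such a per-metadata breakdown could look like, assuming the predictions, targets, and metadata strings were already available as aligned per-sample lists (getting the metadata to that point is exactly the open question). The helper `metrics_by_group` and its `tolerance` parameter are hypothetical, and treating a prediction as "accurate" when it falls within `tolerance` of the target is an assumption made because the target is a scalar.

```python
from collections import defaultdict


def metrics_by_group(predictions, targets, groups, tolerance=0.1):
    """Return per-group count, mean absolute error, and within-tolerance accuracy."""
    errors = defaultdict(list)
    for pred, target, group in zip(predictions, targets, groups):
        errors[group].append(abs(pred - target))

    report = {}
    for group, errs in errors.items():
        report[group] = {
            "count": len(errs),
            "mae": sum(errs) / len(errs),
            # Fraction of samples whose absolute error is within the tolerance.
            "accuracy": sum(e <= tolerance for e in errs) / len(errs),
        }
    return report


report = metrics_by_group(
    predictions=[0.75, 0.25, 0.5, 1.0],
    targets=[0.75, 0.75, 0.5, 0.5],
    groups=["ProjectA", "ProjectA", "ProjectB", "ProjectB"],
)
```

Each entry of `report` could then be flattened into keys such as `eval_ProjectA_accuracy` in the dictionary that `compute_metrics` returns, so the values show up in the regular evaluation logs.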
What I’m Looking For
Is there a clean way to pass metadata through the evaluation loop, without hacking the Trainer class or completely rewriting the evaluation logic? I’m likely not the first person facing this issue, and I suspect there’s an elegant solution I might be missing.
Thanks in advance for your help!