If I understand correctly, you want to filter the dataset and then compute metrics on that same filtered dataset, is that right?
You should be able to do this as follows:
- get your filtered dataset
- create a dataloader over it
- iterate over the batches and run prediction on each batch
- compute the metrics.
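Roughly, in code:

    for batch in dataloader:
        model_inputs, targets = batch
        predictions = model(model_inputs)
        # accumulate this batch's predictions and references
        metric.add_batch(predictions=predictions, references=targets)
    # aggregate over all accumulated batches
    score = metric.compute()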
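
If it helps, here is a self-contained toy version of the four steps that runs without any libraries. Everything in it is a hypothetical stand-in and not your actual setup: `AccuracyMetric` only mimics the `add_batch`/`compute` interface, `batches` plays the role of the dataloader, `model` is a trivial rule, and the "keep even inputs" filter is just an example condition. Treat it as a sketch of the flow, not a definitive implementation.

```python
# Toy stand-ins: every name here is hypothetical and only illustrates the flow.

class AccuracyMetric:
    """Minimal stand-in for a metric exposing add_batch() and compute()."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def add_batch(self, predictions, references):
        # Accumulate counts so compute() reflects every batch seen so far.
        for pred, ref in zip(predictions, references):
            self.correct += int(pred == ref)
            self.total += 1

    def compute(self):
        return {"accuracy": self.correct / self.total}


def batches(examples, batch_size):
    """Yield (inputs, targets) pairs, mimicking what a dataloader returns."""
    for start in range(0, len(examples), batch_size):
        chunk = examples[start:start + batch_size]
        inputs = [x for x, _ in chunk]
        targets = [y for _, y in chunk]
        yield inputs, targets


def model(inputs):
    """Stand-in model: predicts 1 when the input is positive, else 0."""
    return [1 if x > 0 else 0 for x in inputs]


# Step 1: get the filtered dataset (assumed condition: keep even inputs only).
dataset = [(-4, 0), (-3, 0), (-2, 1), (1, 1), (2, 1), (4, 0), (6, 1)]
filtered = [(x, y) for x, y in dataset if x % 2 == 0]

# Steps 2 and 3: iterate over batches of the filtered data and predict.
metric = AccuracyMetric()
for model_inputs, targets in batches(filtered, batch_size=2):
    predictions = model(model_inputs)
    metric.add_batch(predictions=predictions, references=targets)

# Step 4: compute the metric, which now covers the filtered examples only.
score = metric.compute()
print(score)
```

The point is that the metric only ever sees the examples that survived the filter, so the score describes the filtered subset and not the full dataset.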