Hi,
Is there a good way to retrieve the examples that were filtered out when using the DatasetDict.filter() function?
At the moment, I’m calling filter() on a DatasetDict like this:
datasets = datasets.filter(lambda example: example['label_kept'] not in labels_to_remove)
Currently I compute the list of removed examples for each split before filtering, but I was wondering if there’s a better way to do that. I need the ids of these removed examples to compare with the original dev/test file at the end.
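A minimal sketch of that per-split workaround, for illustration only. It assumes each example carries an `id` column (that column name is an assumption, as is the helper name `removed_ids_per_split`), and it treats the DatasetDict as a plain mapping from split name to an iterable of example dicts, so the toy data below stands in for the real splits:

```python
from typing import Any, Dict, Iterable, List, Mapping, Set


def removed_ids_per_split(
    splits: Mapping[str, Iterable[Mapping[str, Any]]],
    labels_to_remove: Set[Any],
    label_key: str = "label_kept",
    id_key: str = "id",  # assumed column name, adjust to the real one
) -> Dict[str, List[Any]]:
    """Collect, for each split, the ids of examples the filter would drop."""
    removed: Dict[str, List[Any]] = {}
    for split_name, examples in splits.items():
        # Inverse of the filter predicate: keep what filter() would remove.
        removed[split_name] = [
            example[id_key]
            for example in examples
            if example[label_key] in labels_to_remove
        ]
    return removed


# Toy stand-in for a DatasetDict: split name -> list of example dicts.
toy_splits = {
    "dev": [{"id": 0, "label_kept": "a"}, {"id": 1, "label_kept": "b"}],
    "test": [{"id": 2, "label_kept": "b"}, {"id": 3, "label_kept": "c"}],
}
labels_to_remove = {"b"}

# Record the removed ids first, then apply the real filter afterwards, e.g.
# datasets = datasets.filter(lambda ex: ex["label_kept"] not in labels_to_remove)
removed = removed_ids_per_split(toy_splits, labels_to_remove)
```

This has to run before the filter() call, since the removed rows are gone afterwards. Iterating row by row may be slow on large splits.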
Thanks