I’d like to store some metadata in a streaming IterableDataset which would be expensive to compute on the fly - for example, the total number of positive examples of each class. Is there a way for me to store that in a custom dataset without including it in a feature that is returned every example?
Storing metadata is not natively supported, so one option is to assign this metadata to the dataset object (iterable_ds.metadata
= …) and re-assign it after each operation (the dataset ops are immutable and return a new dataset object)
Thanks for the response. It seems that if the attribute can’t be assigned at creation and does not persist through operations, I might as well keep track of it separately from the dataset object.