To my understanding, set_transform should apply transformations on the fly, so that the GPU can immediately use the result for training.
When I specify group_by_length=True on the trainer, set_transform is no longer evaluated lazily: it runs over the whole dataset. My hunch is that all the transformations have to be applied first so that the samples can be grouped by length.
Is this behavior intended? I think group_by_length should only need to look at one batch (or a small subset of the dataset) at a time, not at the whole dataset.
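
To illustrate what I mean by limiting the grouping to a subset, here is a minimal sketch in plain Python. It is not the Trainer's actual sampler, and the names chunked_length_grouped_batches, get_length and chunk_batches are hypothetical ones I made up for this example. The idea is to shuffle the indices, then measure and sort lengths only within one chunk of a few batches at a time, so the (possibly expensive) transform is only run on the samples of the current chunk.

```python
import random
from typing import Callable, Iterator, List


def chunked_length_grouped_batches(
    num_samples: int,
    get_length: Callable[[int], int],  # hypothetical: transforms sample i and returns its length
    batch_size: int,
    chunk_batches: int = 50,           # hypothetical: how many batches form one chunk
    seed: int = 0,
) -> Iterator[List[int]]:
    """Yield batches of indices, grouped by length within one chunk at a time."""
    rng = random.Random(seed)
    indices = list(range(num_samples))
    rng.shuffle(indices)

    chunk_size = batch_size * chunk_batches
    for start in range(0, num_samples, chunk_size):
        chunk = indices[start:start + chunk_size]
        # get_length is called once per index in this chunk only,
        # so later chunks are not transformed until they are reached.
        chunk.sort(key=get_length, reverse=True)
        for b in range(0, len(chunk), batch_size):
            yield chunk[b:b + batch_size]


if __name__ == "__main__":
    # Toy data: the "transform" here is just a lookup in a list of lengths.
    toy_lengths = [random.Random(i).randint(1, 30) for i in range(64)]
    batches = chunked_length_grouped_batches(
        len(toy_lengths), lambda i: toy_lengths[i], batch_size=4, chunk_batches=2
    )
    for batch in batches:
        print([toy_lengths[i] for i in batch])
```

Because the function is a generator, pulling the first batch only measures the samples in the first chunk. The trade-off is that batches are only as homogeneous in length as the chunk allows, so a larger chunk_batches gives tighter grouping at the cost of more up-front work per chunk.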