I think I may have found a way around this issue (or at least the trainer now starts and completes!). The torch.utils.data.Dataset subclass from the DistilBERT example in "Fine-tuning with custom datasets" needs changing as follows. I suspect this is because the DistilBERT example's labels are just a list of integers, whereas T5 has output texts as targets; from what I have read, I assume DataCollatorForSeq2Seq() takes care of preprocessing the labels (the target encodings) into the features expected by the T5 model's forward function (I am guessing, but this is my understanding). Code changes below: