I’m having trouble getting my data into the proper shape to train a Transformer-XL model from scratch. I have a custom PyTorch Dataset whose
__getitem__ returns a dict with the keys input_ids and
labels, both assigned the same 1-D tensor of token ids for a particular sequence. When I pass this Dataset to the Trainer with the default collator, I get a
"grad can be implicitly created only for scalar outputs" error. What am I missing here? Do I need another field for the model's
forward method? I’ve tried just about everything and cannot figure it out.
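
For reference, here is a simplified sketch of what my Dataset looks like (the class and variable names are just illustrative, not my actual code):

```python
import torch
from torch.utils.data import Dataset


class SequenceDataset(Dataset):
    """Returns each sequence as a dict with identical input_ids and labels."""

    def __init__(self, sequences):
        # sequences: a list of 1-D LongTensors of token ids
        self.sequences = sequences

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        ids = self.sequences[idx]
        # Same 1-D tensor of ids used for both keys
        return {"input_ids": ids, "labels": ids}


# Example: two short sequences of token ids
dataset = SequenceDataset([
    torch.tensor([5, 9, 2, 7], dtype=torch.long),
    torch.tensor([3, 1, 4, 8], dtype=torch.long),
])
item = dataset[0]
print(sorted(item.keys()))          # ['input_ids', 'labels']
print(torch.equal(item["input_ids"], item["labels"]))  # True
```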