Questions when doing Transformer-XL Finetune with Trainer

sgugger · April 5, 2021, 1:31pm

Note that TransformerXL is the only model in the library that does not work with Trainer, because the loss it returns is not reduced (it’s an array and not a scalar). You might get away with it by implementing your own subclass of Trainer and overriding the compute_loss function to convert that array to a scalar.
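
A minimal sketch of that workaround, untested against a real Transformer-XL checkpoint. The names `ReducedLossMixin` and `extract_loss` are hypothetical, and the output field names `loss` / `losses` are an assumption that may differ between library versions. It is written as a mixin so the snippet runs without transformers installed; averaging is only one possible reduction.

```python
def extract_loss(outputs):
    """Pull the (possibly unreduced) loss out of a model output."""
    if isinstance(outputs, dict):
        # Assumption: the loss is stored under "loss" or "losses",
        # depending on the library version.
        loss = outputs.get("loss")
        if loss is None:
            loss = outputs.get("losses")
    else:
        # Tuple-style outputs conventionally put the loss first.
        loss = outputs[0]
    if loss is None:
        raise ValueError("Model output has no loss; were labels passed?")
    return loss


class ReducedLossMixin:
    """Overrides compute_loss so a non-scalar loss is averaged to a scalar."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # **kwargs absorbs extra arguments that some Trainer versions pass.
        outputs = model(**inputs)
        # Reduce the per-token loss array to a single scalar.
        loss = extract_loss(outputs).mean()
        return (loss, outputs) if return_outputs else loss
```

To use it, put the mixin ahead of Trainer in the bases, for example `class TransfoXLTrainer(ReducedLossMixin, Trainer): pass`, and then construct `TransfoXLTrainer` with the same arguments you would give Trainer.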
