Questions about fine-tuning Transformer-XL with Trainer

Thanks for letting me know this! That's really helpful. Otherwise I would have kept trying to figure out why Trainer is not working with Transformer-XL. :sweat_smile:
I will try to rewrite the compute_loss function for it.
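
Roughly what I have in mind is below. This is only a sketch and rests on an assumption I have not fully verified: that the Transformer-XL LM head model returns a per-token `losses` entry in its output rather than a single scalar `loss`, so the override just averages it. The names `MeanLossMixin` and `TransfoXLTrainer` are made up for illustration.

```python
class MeanLossMixin:
    """Hypothetical mixin: reduce per-token losses to the scalar Trainer expects."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # `inputs` is assumed to already contain `labels`, so the model
        # computes a loss during the forward pass.
        outputs = model(**inputs)
        # Assumption: the output is dict-like and exposes unreduced `losses`.
        losses = outputs["losses"]
        # Average over all positions to get one scalar for backprop.
        loss = losses.mean()
        return (loss, outputs) if return_outputs else loss


# Intended use (requires transformers to be installed):
#
#   from transformers import Trainer
#
#   class TransfoXLTrainer(MeanLossMixin, Trainer):
#       pass
#
#   trainer = TransfoXLTrainer(model=model, args=args, train_dataset=train_ds)
```

The `**kwargs` is there only so that any extra arguments Trainer passes to `compute_loss` are accepted and ignored.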