Track multiple losses & different output sizes with Trainer and callbacks


I’ve built a setup that jointly optimizes 2 different models, and I would like to track their losses in TensorBoard or wandb. Right now, I had to subclass the following methods: training_step (to return all losses along with the one to optimize) and train (to handle the new output from training_step and to add all losses to the metrics).

Is there a better way to integrate this logic (especially one that avoids subclassing train, which is a very heavy method)?

Also, can Trainer handle outputs of different sizes (e.g., logits at the token level and logits at the sentence level)? It seems that all logits must currently have the same shape, so reshaping everything to the same dimensions may waste memory.


The Trainer class is not built to optimize two models at the same time, so no, there is no easier way than subclassing it and overriding training_step. In general, subclassing Trainer and overriding the method(s) to fit your needs is the expected approach, and we designed the Trainer API to make that as easy as possible.
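For reference, the underlying pattern (one optimizer stepping two models jointly while each component loss is tracked separately) can be sketched without Trainer at all. This is a minimal plain-PyTorch illustration, not Trainer's actual implementation; the model names, data, and `history` list are all hypothetical stand-ins for what training_step and the logging callbacks would do:

```python
import torch
from torch import nn

# Hypothetical stand-ins for the two models being optimized jointly.
model_a = nn.Linear(4, 1)
model_b = nn.Linear(4, 1)

# A single optimizer over both parameter sets, so one step updates both models.
optimizer = torch.optim.SGD(
    list(model_a.parameters()) + list(model_b.parameters()), lr=0.01
)

loss_fn = nn.MSELoss()
history = []  # stand-in for TensorBoard/wandb logging

x = torch.randn(8, 4)
y = torch.randn(8, 1)

for step in range(3):
    loss_a = loss_fn(model_a(x), y)
    loss_b = loss_fn(model_b(x), y)
    loss = loss_a + loss_b  # the combined loss that is actually optimized

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Track each component separately, as a Trainer subclass would via self.log(...)
    history.append({"loss_a": loss_a.item(), "loss_b": loss_b.item()})
```

In a Trainer subclass, the loss computation and the per-component logging would live inside the overridden training_step, while the optimizer step stays with the base class.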

For predict/evaluate, yes, Trainer needs tensors of the same size (with the exception of the batch dimension), otherwise it won’t be able to concatenate all the predictions. This is something we’ll look into more when we rewrite the token-classification examples (in the next few weeks).
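Until then, a common workaround is to pad variable-length outputs to a common size before concatenation. A minimal sketch, assuming PyTorch and token-level logits whose sequence lengths differ across batches (the shapes and the -100 fill value are illustrative; -100 is just a conventional marker that metric code can ignore):

```python
import torch
import torch.nn.functional as F

# Two batches of token-level logits with different sequence lengths.
batch1 = torch.randn(2, 5, 10)  # (batch, seq_len=5, vocab_size)
batch2 = torch.randn(2, 8, 10)  # (batch, seq_len=8, vocab_size)

max_len = max(batch1.shape[1], batch2.shape[1])

def pad_seq(t, length, value=-100.0):
    # F.pad pads from the last dim backwards: (0, 0) leaves the vocab dim
    # untouched, (0, length - seq_len) right-pads the sequence dim.
    return F.pad(t, (0, 0, 0, length - t.shape[1]), value=value)

# After padding, both tensors share a shape and can be concatenated.
all_logits = torch.cat([pad_seq(batch1, max_len), pad_seq(batch2, max_len)], dim=0)
```

The memory cost the question mentions is real: every batch is stored at the maximum sequence length, so short sequences carry padding that exists only to make concatenation possible.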

Thanks a lot for the confirmation