I’ve built a setup that jointly optimizes two different models, and I would like to track each loss in TensorBoard or Weights & Biases. Right now I had to subclass two methods: `training_step` (to return all the losses plus the one to optimize) and `train` (to handle the new output of `training_step` and to add every loss to the metrics).
Is there a better way to integrate this logic (especially one that avoids subclassing `train`, which is a very heavy method)?
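For context, the pattern I'm after looks roughly like this (a dependency-free sketch; in an actual `Trainer` subclass this logic would live in `compute_loss`, with the components reported via `self.log(...)` — the names `token_loss`, `sentence_loss`, and `alpha` here are hypothetical):

```python
# Sketch of the pattern: combine several losses into one scalar for the
# optimizer while keeping each component around for logging.

class MultiLossTracker:
    def __init__(self, alpha=0.5):
        self.alpha = alpha          # hypothetical weight between the two objectives
        self.last_components = {}   # what you would hand to the logger

    def combine(self, token_loss, sentence_loss):
        # Single scalar the optimizer sees.
        total = self.alpha * token_loss + (1 - self.alpha) * sentence_loss
        # Components kept as plain floats for TensorBoard / W&B.
        self.last_components = {
            "loss/token": float(token_loss),
            "loss/sentence": float(sentence_loss),
            "loss/total": float(total),
        }
        return total

tracker = MultiLossTracker(alpha=0.5)
total = tracker.combine(token_loss=2.0, sentence_loss=1.0)
print(total)                                    # 1.5
print(tracker.last_components["loss/token"])    # 2.0
```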
Also, can `Trainer` handle outputs of different sizes (e.g. logits at the token level and logits at the sentence level)? It seems that all logits must currently have the same shape, so reshaping everything to the same dimensions may waste memory.
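To make the memory concern concrete (a back-of-the-envelope sketch; the batch size, sequence length, and class count are hypothetical): broadcasting sentence-level logits out to the token-level shape multiplies their footprint by the sequence length.

```python
# Hypothetical sizes: batch of 32, sequences of 512 tokens,
# a 10-class head at both the token and sentence level.
batch, seq_len, n_classes = 32, 512, 10
bytes_per_float = 4  # float32

# Natural shapes:
token_elems = batch * seq_len * n_classes    # (32, 512, 10)
sentence_elems = batch * n_classes           # (32, 10)

# If sentence logits must be padded/broadcast to (batch, seq_len, n_classes)
# to match the token-level shape, they grow by a factor of seq_len:
padded_sentence_elems = batch * seq_len * n_classes

print(sentence_elems * bytes_per_float)         # 1280 bytes as-is
print(padded_sentence_elems * bytes_per_float)  # 655360 bytes after padding
```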