Training using multiple GPUs

The Trainer lets you compute the loss however you want by subclassing it and overriding compute_loss (see an example here). By default we use the loss the model itself returns, since that covers the use case of most users.
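To make the override concrete, here is a minimal sketch of a subclass that swaps in a class-weighted cross-entropy loss. The class name `WeightedLossTrainer`, the `CLASS_WEIGHTS` values, and the three-label setup are assumptions made for illustration, not something from the linked example. It also assumes the model output exposes a `logits` entry and the batch contains a `labels` key.

```python
import torch
from torch import nn
from transformers import Trainer

# Hypothetical per-class weights for a 3-label classification task.
CLASS_WEIGHTS = torch.tensor([1.0, 2.0, 3.0])


class WeightedLossTrainer(Trainer):
    """Trainer subclass (illustrative name) with a custom loss."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # **kwargs absorbs extra arguments that some Trainer versions pass in.
        labels = inputs.pop("labels")

        # Forward pass without labels, so the model does not compute its own loss.
        outputs = model(**inputs)
        logits = outputs["logits"]

        # Weighted cross-entropy; weights are moved to the device of the logits.
        loss_fct = nn.CrossEntropyLoss(weight=CLASS_WEIGHTS.to(logits.device))
        loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))

        return (loss, outputs) if return_outputs else loss
```

You would then build `WeightedLossTrainer(model=..., args=..., train_dataset=...)` exactly as you would a regular Trainer; nothing else in the training loop should need to change.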
