DeepSpeed trainer and custom loss weights

Hi all, I wrote a custom loss as suggested in this forum:

import torch
from transformers import Trainer

# class_weights_pt is a 1-D tensor of per-class weights
loss_fct = torch.nn.CrossEntropyLoss(weight=class_weights_pt)

class SentTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        # forward pass
        outputs = model(**inputs)
        logits = outputs.get("logits")
        # compute custom loss with the weighted criterion
        labels = inputs.get("labels")
        loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss

However, when I integrated DeepSpeed, it complained that the weight tensor is not on the same device as the model outputs. How should I fix this?

I think you can define loss_fct inside the __init__ method instead, and move class_weights_pt to the appropriate device there:

    def __init__(self, *args, class_weights: Optional[FloatTensor] = None, **kwargs):
        super().__init__(*args, **kwargs)
        if class_weights is not None:
            # self.accelerator is set up by Trainer.__init__, so the target device is known here
            class_weights = class_weights.to(self.accelerator.device)
        self.loss_fct = torch.nn.CrossEntropyLoss(weight=class_weights)
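Putting both pieces together, a minimal sketch of the subclass could look like the following. Note that compute_loss now calls self.loss_fct rather than a module-level criterion, so the weights live on the device chosen by the trainer. The model, training_args, dataset variables and the weight values in the usage line are just placeholders for your own setup, not part of the original post.

from typing import Optional

import torch
from torch import FloatTensor
from transformers import Trainer

class SentTrainer(Trainer):
    def __init__(self, *args, class_weights: Optional[FloatTensor] = None, **kwargs):
        super().__init__(*args, **kwargs)
        if class_weights is not None:
            # move the weights to the device the accelerator placed the model on
            class_weights = class_weights.to(self.accelerator.device)
        self.loss_fct = torch.nn.CrossEntropyLoss(weight=class_weights)

    def compute_loss(self, model, inputs, return_outputs=False):
        outputs = model(**inputs)
        logits = outputs.get("logits")
        labels = inputs.get("labels")
        # use the criterion created in __init__, whose weights are on the right device
        loss = self.loss_fct(
            logits.view(-1, self.model.config.num_labels), labels.view(-1)
        )
        return (loss, outputs) if return_outputs else loss

# usage: pass the weight tensor on CPU; __init__ moves it to the correct device
trainer = SentTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    class_weights=torch.tensor([1.0, 2.0, 0.5]),  # example values
)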