I want to train a model that performs binary classification on each token of an input sequence. I have a model architecture in place, and I want to use torch.nn.functional.binary_cross_entropy_with_logits() to calculate the loss.
logits = model(inputs)
# inputs shape: (batch_size, max_seq_length)
# logits shape: (batch_size, max_seq_length)
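For concreteness, here are dummy tensors with the shapes described above (sizes and values are placeholders, not from my real data):

import torch

batch_size, max_seq_length = 4, 16
inputs = torch.randint(0, 30000, (batch_size, max_seq_length))  # token ids
logits = torch.randn(batch_size, max_seq_length)                # one binary logit per token
probs = torch.sigmoid(logits)                                   # per-token probability of the positive class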
The compute_loss override example in the Trainer docs applies .view(-1) to the tensors when computing the loss.
This is the code I came up with after reading the docs:
import torch.nn.functional as F
from transformers import Trainer

class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.get("labels")
        outputs = model(**inputs)
        # binary_cross_entropy_with_logits expects float targets, hence .float()
        loss = F.binary_cross_entropy_with_logits(
            input=outputs.get("logits").view(-1),
            target=labels.view(-1).float(),
        )
        return (loss, outputs) if return_outputs else loss
I have two questions:
- Can I override compute_loss in a plain Trainer, or should I use Seq2SeqTrainer? The output sequence does not need to be generated with beam search.
- Is .view(-1) still required with the loss function I intend to use? (See the sketch below for what I mean.)
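To illustrate the second question, here is a self-contained check with dummy tensors (sizes and values are placeholders, not from my real data). Since binary_cross_entropy_with_logits is applied elementwise and reduces with a mean by default, I would expect the flattened and unflattened calls to agree, but I may be missing something:

import torch
import torch.nn.functional as F

batch_size, max_seq_length = 4, 16
logits = torch.randn(batch_size, max_seq_length)                    # simulated model output
labels = torch.randint(0, 2, (batch_size, max_seq_length)).float()  # per-token 0/1 targets

loss_flat = F.binary_cross_entropy_with_logits(logits.view(-1), labels.view(-1))
loss_2d = F.binary_cross_entropy_with_logits(logits, labels)
print(torch.allclose(loss_flat, loss_2d))  # mean over all elements either way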