Does the transformer automatically shift by one position when calculating the autoregressive loss during the forward pass?
In this example — https://huggingface.co/learn/nlp-course/en/chapter7/6?fw=pt#training-with-accelerate — the labels need to be shifted. Is that because the loss function is defined separately there? Or do we always need to do the shift ourselves?
loss_fct = torch.nn.CrossEntropyLoss()  # standard next-token cross-entropy
shift_logits = logits[..., :-1, :].contiguous()  # drop the last position (it has no target)
shift_labels = labels[..., 1:].contiguous()      # drop the first position (nothing predicts it)
loss = loss_fct(shift_logits.view(-1, vocab_size), shift_labels.view(-1))
Is this shift handled internally?
It seems like you have to do it manually…?
It depends on how you compute the loss. When you pass `labels` to a `*ForCausalLM` model (e.g. `GPT2LMHeadModel`), the model's forward shifts the logits and labels internally before computing the cross-entropy, so `outputs.loss` is already the correct next-token loss. But when you define the loss function separately — as in the linked Accelerate example — the model only returns raw logits, and you are responsible for shifting them yourself, as in your code, so that each position's prediction is compared against the next token in the sequence.
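To see why the shift produces the right alignment, here is a minimal plain-Python sketch (no torch, illustrative token strings only): at position i the model predicts the token at position i + 1, so predictions keep positions 0..n-2 and targets keep positions 1..n-1 — exactly what `logits[..., :-1, :]` and `labels[..., 1:]` do.

```python
# Illustrative token sequence (names are made up for the example).
tokens = ["<bos>", "The", "cat", "sat", "<eos>"]

# Mirror the tensor slicing with list slices:
# predictions drop the last position, targets drop the first.
inputs_for_loss = tokens[:-1]   # like logits[..., :-1, :]
targets_for_loss = tokens[1:]   # like labels[..., 1:]

pairs = list(zip(inputs_for_loss, targets_for_loss))
print(pairs)
# [('<bos>', 'The'), ('The', 'cat'), ('cat', 'sat'), ('sat', '<eos>')]
```

Each pair is (context position, token the model should predict there), which is what the shifted cross-entropy compares.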
Answer: It depends. If you pass labels directly to a causal LM head, e.g. `outputs = model(input_ids, labels=input_ids)`, the shifting is handled internally and `outputs.loss` is already the correct autoregressive loss. If you compute the loss from the raw logits yourself, as in the linked Accelerate example, you need to shift the logits and labels manually, as shown in your code [1].