Slower training with collator for completion only


I’m trying to fine-tune Llama 2 and I don’t want to train on the full sequence (instructions + completion), only on the completion. Because of that I’m using the DataCollatorForCompletionOnlyLM collator.

The completion in my training dataset is a very short sentence, so I was expecting faster training, because the training would be reduced to predicting the next tokens of the completion. Is this what happens when we use the DataCollatorForCompletionOnlyLM collator? The problem is that I’m not measuring any improvement in speed. Actually, I think it got slower.
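For context, my understanding is that this collator only masks the instruction tokens in the labels (setting them to -100 so the loss ignores them), while the forward pass still runs over the full sequence, so no speedup would be expected. A minimal sketch of that masking idea in plain Python (the token ids and the response boundary here are made up for illustration, not TRL internals):

```python
# Completion-only label masking: prompt tokens get label -100 (ignored by
# the cross-entropy loss), completion tokens keep their ids. The model
# still processes the *whole* sequence in the forward pass, which is why
# masking alone does not make each training step faster.

IGNORE_INDEX = -100

def mask_prompt_labels(input_ids, response_start):
    """Return labels where everything before `response_start` is ignored."""
    labels = list(input_ids)
    for i in range(response_start):
        labels[i] = IGNORE_INDEX
    return labels

# Hypothetical token ids: 5 prompt tokens followed by a 3-token completion.
input_ids = [101, 2023, 2003, 1996, 3160, 42, 43, 44]
labels = mask_prompt_labels(input_ids, response_start=5)
print(labels)  # [-100, -100, -100, -100, -100, 42, 43, 44]
```

If that understanding is right, the loss is computed on fewer positions, but the attention and MLP compute per step is unchanged.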

Any comment on this would be very appreciated.

Did you find an answer to this?