I’m trying to fine-tune Llama 2, and I don’t want to train on the full sequence (instructions + completion), only on the completion. Because of that I’m using the DataCollatorForCompletionOnlyLM collator from TRL.
The completion in my training dataset is a very short sentence, so I was expecting faster training, because training would reduce to predicting the next tokens of the completion. Is this what happens when we use the DataCollatorForCompletionOnlyLM collator? The problem is that I’m not measuring any improvement in speed. Actually, I think it got slower.
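For context, here is roughly how I’m wiring things up. It’s a minimal sketch following the standard TRL pattern; the model checkpoint, dataset file, text column, and response_template are placeholders standing in for my actual setup:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset
from trl import SFTTrainer, DataCollatorForCompletionOnlyLM

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical dataset file; each row's "text" column holds
# the full prompt string (instructions + completion).
dataset = load_dataset("json", data_files="train.jsonl", split="train")

# The collator sets the labels of every token before the response
# template to -100, so only the completion tokens contribute to the loss.
response_template = "### Answer:"  # placeholder; must match my prompt format
collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer)

trainer = SFTTrainer(
    model,
    train_dataset=dataset,
    dataset_text_field="text",
    data_collator=collator,
)
trainer.train()
```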
Any comment on this would be very appreciated.