When to use a DataCollator for SFTTrainer

Hello,

In the SFTTrainer documentation, it is stated that if the dataset is already in the right format, we don't need to specify a DataCollator with a response_template.

However, after I formatted my dataset with tokenizer.apply_chat_template from TinyLlama/TinyLlama-1.1B-Chat-v1.0, the labels in the train dataloader do not look correct to me.
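For reference, this is roughly how I build the text column (a minimal sketch only; the "question" and "answer" column names are placeholders for my actual dataset):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

def format_example(example):
    # Turn one row into TinyLlama's chat format using the tokenizer's own template.
    messages = [
        {"role": "user", "content": example["question"]},
        {"role": "assistant", "content": example["answer"]},
    ]
    example["text"] = tokenizer.apply_chat_template(messages, tokenize=False)
    return example

dataset = dataset.map(format_example)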

Here is a sample from the SFTTrainer's train dataloader; the snippet below shows roughly how I pulled it out:
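(This is just how I inspected it, assuming trainer is the SFTTrainer instance I already constructed.)

# Grab one batch from the trainer's train dataloader and print the first sample.
dataloader = trainer.get_train_dataloader()
batch = next(iter(dataloader))

print(tokenizer.decode(batch["input_ids"][0]))  # decoded input
print(batch["input_ids"][0])                    # input_ids
print(batch["labels"][0])                       # labels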

(I removed the <s> and </s> special tokens from the decoded text for readability.)
Input : <|user|>Which is bigger, the moon or the sun?'<|assistant|> The sun.
input_ids : tensor([ 1, 529, 29989, 1792, 29989, 29958, 13, 8809, 436, 338,
16600, 29892, 278, 18786, 470, 278, 6575, 29973, 2, 29871,
13, 29966, 29989, 465, 22137, 29989, 29958, 13, 1576, 6575,
29889, 2, 29871, 13, 2, 2, 2, 2])

labels : tensor([ 1, 529, 29989, 1792, 29989, 29958, 13, 8809, 436, 338,
16600, 29892, 278, 18786, 470, 278, 6575, 29973, -100, 29871,
13, 29966, 29989, 465, 22137, 29989, 29958, 13, 1576, 6575,
29889, -100, 29871, 13, -100, -100, -100, -100])

Are the labels supposed to look like that, or is this incorrect? As far as I can tell, only the </s> / padding positions are set to -100, while the user prompt tokens are not masked at all.
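For context, this is the kind of collator setup I thought the documentation was referring to. It is only a sketch: whether dataset_text_field is passed to SFTTrainer directly or through SFTConfig depends on the trl version, and model is assumed to be the loaded TinyLlama model.

from trl import SFTTrainer, DataCollatorForCompletionOnlyLM

# Mask everything before the assistant response so the loss is only
# computed on the completion tokens.
response_template = "<|assistant|>"
collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    dataset_text_field="text",
    data_collator=collator,
    tokenizer=tokenizer,
)

Should I be using something like this even though my dataset is already formatted with the chat template?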