I am getting a 0.0 loss value at the very first epoch when training the bigscience/mt0-small seq2seq model

There must be something wrong with my code, as the loss is already 0.0 at epoch 0. I suspect an issue with my dataset or with the loss calculation logic, but I am entirely new to the LLM field. Could anyone point out the error?

My dataset:

    from datasets import load_dataset
    from transformers import AutoTokenizer

    tokeniser = AutoTokenizer.from_pretrained("bigscience/mt0-small")

    max_length = 256
    dataset = load_dataset('tatsu-lab/alpaca').map(
        lambda elem: {
            # the bare instruction becomes the encoder input
            "input_ids": tokeniser.encode(
                elem["instruction"],
                padding="max_length",
                truncation=True,
                max_length=max_length
            ),
            # "text" is the fully formatted prompt + response; it becomes the target
            "label_ids": tokeniser.encode(
                elem["text"],
                padding="max_length",
                truncation=True,
                max_length=max_length
            ),
            # "label": elem["output"],
        }
    )
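
To sanity-check the mapping, I decode the first mapped example back to text (a minimal sketch, assuming the `dataset` and `tokeniser` defined above are in scope):

    # sanity check: decode the first mapped example back to text
    sample = dataset["train"][0]
    print(tokeniser.decode(sample["input_ids"], skip_special_tokens=True))
    print(tokeniser.decode(sample["label_ids"], skip_special_tokens=True))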

The training code:

    from transformers import Seq2SeqTrainer

    # training_args is a Seq2SeqTrainingArguments instance defined elsewhere
    trainer = Seq2SeqTrainer(
        model=model,
        train_dataset=dataset['train'],
        # eval_dataset=dataset['test'],
        args=training_args,
        # data_collator=data_collator,
    )
    trainer.train()
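
To rule out the Trainer itself, I can also run a single forward pass by hand and look at the raw loss (a minimal sketch, assuming the `model` and `dataset` from above; the batch construction here is my own):

    import torch

    # one manual forward pass on a single example to inspect the raw loss
    example = dataset["train"][0]
    input_ids = torch.tensor([example["input_ids"]])
    labels = torch.tensor([example["label_ids"]])
    outputs = model(input_ids=input_ids, labels=labels)
    print(outputs.loss.item())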

I also tried a modified Trainer with a custom compute_loss, but the loss is still 0.0.

My trainer:

    import torch
    from transformers import Seq2SeqTrainer

    class ModifiedTrainer(Seq2SeqTrainer):
        def compute_loss(self, model, inputs, return_outputs=False):
            # forward pass with an all-ones attention mask; the model
            # computes the loss internally from inputs["labels"]
            return model(
                input_ids=inputs["input_ids"],
                attention_mask=torch.ones_like(inputs["input_ids"]).bool(),
                labels=inputs["labels"],
            ).loss
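
As far as I understand, the default data collator is what renames my `label_ids` column to `labels`, which is why `inputs["labels"]` exists inside `compute_loss`. A minimal way to check that assumption, using the dataset defined above:

    from transformers import default_data_collator

    # inspect what the collator actually hands to compute_loss;
    # string columns such as "instruction" are skipped automatically
    features = [dataset["train"][i] for i in range(2)]
    batch = default_data_collator(features)
    print(batch.keys())  # expecting 'input_ids' and 'labels'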