I used the Trainer to pretrain a BertForMaskedLM model, but the training loss is always zero

The code is shown below:

from transformers import BertConfig, BertForMaskedLM, Trainer, TrainingArguments

config = BertConfig(vocab_size=40000, num_hidden_layers=6)
model = BertForMaskedLM(config)
print('Number of parameters: ', model.num_parameters())

pretrained_models_path = "my path"

training_args = TrainingArguments(
    output_dir=pretrained_models_path,
    overwrite_output_dir=True,
    per_device_train_batch_size=32,
    num_train_epochs=10,
    save_steps=10000,
    save_total_limit=2,
    prediction_loss_only=True,
    fp16=True,
)
trainer = Trainer(
    args=training_args,
    train_dataset=train_dataset,
    data_collator=data_collator,
    model=model,
)

trainer.train()
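
For reference, train_dataset and data_collator are built earlier in the script, roughly like this; the tokenizer path, corpus file, and masking settings here are placeholders rather than my exact values:

from transformers import BertTokenizerFast, DataCollatorForLanguageModeling, LineByLineTextDataset

# Placeholder paths; the real script uses my own tokenizer and corpus.
tokenizer = BertTokenizerFast.from_pretrained("my tokenizer path")

train_dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="my corpus.txt",
    block_size=128,
)

# mlm=True makes the collator mask tokens and emit the `labels`
# tensor that BertForMaskedLM needs to compute a loss.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,
)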
When training finishes, the log shows:

Step Training Loss
500 0.000000
1000 0.000000
1500 0.000000
2000 0.000000
2500 0.000000
3000 0.000000

TrainOutput(global_step=3130, training_loss=0.0, metrics={'train_runtime': 367.6023, 'train_samples_per_second': 272.033, 'train_steps_per_second': 8.515, 'total_flos': 1779771658266624.0, 'train_loss': 0.0, 'epoch': 10.0})
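
Even before any training, a single forward pass should produce a clearly nonzero masked-LM loss (around ln(40000) ≈ 10.6 for a randomly initialized model), so I don't understand where the 0.0 comes from. Here is a minimal sanity check I would expect to confirm that, reusing the same model, dataset, and collator as above:

import torch

# Collate a handful of examples exactly as the Trainer would.
batch = data_collator([train_dataset[i] for i in range(8)])
# Move tensors to wherever the model currently lives.
batch = {k: v.to(model.device) for k, v in batch.items()}

with torch.no_grad():
    outputs = model(**batch)

# Should print a value around 10.6 for an untrained model, not 0.0.
print(outputs.loss)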

How can I modify the code to resolve this issue?