I used the Trainer to pretrain a BertForMaskedLM model, but the training loss is always zero

The code is shown below:

from transformers import BertConfig, BertForMaskedLM, Trainer, TrainingArguments

config = BertConfig(vocab_size=40000, num_hidden_layers=6)
model = BertForMaskedLM(config)
print('Number of parameters: ', model.num_parameters())

pretrained_models_path = "my path"

training_args = TrainingArguments(
    output_dir=pretrained_models_path,
    overwrite_output_dir=True,
    per_device_train_batch_size=32,
    num_train_epochs=10,
    save_steps=10000,
    save_total_limit=2,
    prediction_loss_only=True,
    fp16=True,
)
trainer = Trainer(
    args=training_args,
    train_dataset=train_dataset,
    data_collator=data_collator,
    model=model,
)

trainer.train()
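
For reference, train_dataset and data_collator are built earlier in the script, roughly like this; the tokenizer path, corpus file, and masking settings here are placeholders rather than my exact values:

from transformers import BertTokenizerFast, DataCollatorForLanguageModeling, LineByLineTextDataset

# Placeholder paths; the real script uses my own tokenizer and corpus.
tokenizer = BertTokenizerFast.from_pretrained("my tokenizer path")

train_dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="my corpus.txt",
    block_size=128,
)

# mlm=True makes the collator mask tokens and emit the `labels`
# tensor that BertForMaskedLM needs to compute a loss.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,
)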
When training finishes, the log shows:

Step Training Loss
500 0.000000
1000 0.000000
1500 0.000000
2000 0.000000
2500 0.000000
3000 0.000000

TrainOutput(global_step=3130, training_loss=0.0, metrics={'train_runtime': 367.6023, 'train_samples_per_second': 272.033, 'train_steps_per_second': 8.515, 'total_flos': 1779771658266624.0, 'train_loss': 0.0, 'epoch': 10.0})
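
Even before any training, a single forward pass should produce a clearly nonzero masked-LM loss (around ln(40000) ≈ 10.6 for a randomly initialized model), so I don't understand where the 0.0 comes from. Here is a minimal sanity check I would expect to confirm that, reusing the same model, dataset, and collator as above:

import torch

# Collate a handful of examples exactly as the Trainer would.
batch = data_collator([train_dataset[i] for i in range(8)])
# Move tensors to wherever the model currently lives.
batch = {k: v.to(model.device) for k, v in batch.items()}

with torch.no_grad():
    outputs = model(**batch)

# Should print a value around 10.6 for an untrained model, not 0.0.
print(outputs.loss)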

How can I modify the code to resolve this issue?