Learning rate for further pretraining BERT on the masked language modeling task

I want to further pretrain BERT on my corpus. Is there a standard or typical value for the learning rate that is used when training on the masked language modeling task?
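For context, here is a minimal sketch of the kind of continued-pretraining setup I have in mind, using Hugging Face `transformers` and `datasets`. The file name `my_corpus.txt` is just a placeholder, and the `learning_rate` value shown is only an example guess, since that is exactly the hyperparameter I am asking about:

```python
# Minimal sketch: continued MLM pretraining of BERT on a plain-text corpus.
# The learning rate below is a placeholder, not a recommended value.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# "my_corpus.txt" is a hypothetical file with one document per line.
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking of 15% of tokens, as in standard BERT-style MLM training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="bert-further-pretrained",
    learning_rate=5e-5,          # <-- the hyperparameter in question
    warmup_ratio=0.1,
    weight_decay=0.01,
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```

Should the learning rate here be closer to the value used in the original pretraining run, or to the smaller values typically used for downstream fine-tuning?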