Tips for pre-training BERT from scratch

I’ve opened a new issue about pre-training. The part about training on GLUE is resolved. Thanks for asking.