Pre-trained model with open-source train/test splits

Hi everyone! I hope you are having a great day!

I am looking for an encoder-based pre-trained language model (e.g. BERT) whose train and test splits are open source.

The BERT model in the Hugging Face repository was released by Google, and it is not clear what train/test split was used to train it.