Request for Further Information on Datasets

Hi,

I am new to NLP, but I am using the pre-trained models for my research on memory characterization. I would greatly appreciate it if someone could provide details, or a pointer to where I can find information, on which datasets the following built-in models were trained on:
1. BERT
2. DistilBERT
3. RoBERTa
4. OpenAI GPT-2
5. ALBERT
6. XLNet
7. Transformer-XL

Thank you!