How were the GPT2 pretrained tensorflow models created?

I wasn’t on the team when this was done, but my understanding is that the original TF1 checkpoints were converted to PyTorch using the TF conversion scripts, and then converted back to TF2 using the functions in the convert-PyTorch-to-TF2 module (`convert_pytorch_checkpoint_to_tf2.py`).
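As a rough sketch of the second half of that round trip (PyTorch checkpoint → TF2 model), `transformers` lets you load a saved PyTorch checkpoint directly into a TF2 model class with `from_pt=True`. The tiny randomly initialized config below is just a stand-in for the real GPT-2 weights, so nothing is downloaded:

```python
# Sketch of the PyTorch -> TF2 step described above. A tiny randomly
# initialized GPT-2 stands in for the actual converted checkpoint.
from transformers import GPT2Config, GPT2LMHeadModel, TFGPT2LMHeadModel

config = GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=1000)

# Pretend this model came out of the TF1 -> PyTorch conversion
pt_model = GPT2LMHeadModel(config)
pt_model.save_pretrained("tiny-gpt2-pt")

# Cross-framework load: read the PyTorch checkpoint into a TF2 model
tf_model = TFGPT2LMHeadModel.from_pretrained("tiny-gpt2-pt", from_pt=True)
```

The actual conversion scripts do essentially this at scale, mapping each PyTorch weight name onto its TF2 counterpart.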

OpenAI did not share WebText with us, and there was no retraining involved.