How were the GPT2 pretrained tensorflow models created?

I wasn’t on the team when this was done, but my understanding is that the original TF1 checkpoints were converted to PyTorch using the TF conversion scripts, and then converted back to TF2 using the functions in the convert-PyTorch-to-TF2 module (`convert_pytorch_checkpoint_to_tf2.py`).
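As a rough sketch of the second half of that round trip (PyTorch checkpoint → TF2 model), `transformers` lets you load a saved PyTorch checkpoint directly into a TF2 model class with `from_pt=True`. The tiny randomly initialized config below is just a stand-in for the real GPT-2 weights, so nothing is downloaded:

```python
# Sketch of the PyTorch -> TF2 step described above. A tiny randomly
# initialized GPT-2 stands in for the actual converted checkpoint.
from transformers import GPT2Config, GPT2LMHeadModel, TFGPT2LMHeadModel

config = GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=1000)

# Pretend this model came out of the TF1 -> PyTorch conversion
pt_model = GPT2LMHeadModel(config)
pt_model.save_pretrained("tiny-gpt2-pt")

# Cross-framework load: read the PyTorch checkpoint into a TF2 model
tf_model = TFGPT2LMHeadModel.from_pretrained("tiny-gpt2-pt", from_pt=True)
```

The actual conversion scripts do essentially this at scale, mapping each PyTorch weight name onto its TF2 counterpart.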

OpenAI did not share WebText with us, and there was no retraining involved.