Hi Taber,
Do you know if the same script can be used for a RoBERTa model? I used the script, but it doesn't seem to work with RoBERTa. My understanding is that BERT and RoBERTa are very similar apart from token_type_vocab, hyperparameters, etc., so ideally the same code for converting a RoBERTa TF checkpoint to PyTorch should work. I am now looking into this in detail to see whether anything in the original script has to be changed.
I would appreciate it if you have tried this and have any insights. I also tried just loading the TF checkpoint directly, but that throws an error as well: Error while converting a RoBERTa TF checkpoint to Pytorch · Issue #12798 · huggingface/transformers · GitHub