RoBERTa large: HF vs. fairseq

Hi,
I’m trying to understand what the difference is between the model

…and the version available for download here:

For example, the HF version is 1.42 GB as safetensors, but the fairseq version is only 678 MB.

Happy to read documentation if this is described somewhere.
Thanks in advance,
Dan

Hi,

That’s a good question. I would recommend inspecting the conversion script, `src/transformers/models/roberta/convert_roberta_original_pytorch_checkpoint_to_pytorch.py`, in the huggingface/transformers repository on GitHub.
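
In case it helps while going through that script, here is a minimal sketch (not from the script itself) of how you might compare the two checkpoints directly. It assumes the fairseq archive has been extracted locally to `roberta.large/model.pt` and that the weights sit under a `"model"` key, which is how fairseq checkpoints are usually laid out; both the path and that key are assumptions you should check against your download. The HF side just counts parameters and multiplies by the dtype size.

```python
import torch
from transformers import RobertaModel

# HF checkpoint from the Hub: count parameters and estimate raw weight size.
hf_model = RobertaModel.from_pretrained("roberta-large")
n_params = sum(p.numel() for p in hf_model.parameters())
bytes_per_param = next(hf_model.parameters()).element_size()  # 4 for float32
print(f"HF parameters: {n_params:,} -> ~{n_params * bytes_per_param / 1e9:.2f} GB of raw weights")

# Assumed local path: the fairseq download is a tar.gz that unpacks to model.pt.
# fairseq checkpoints usually keep the weights under a "model" key, next to
# training metadata; weights_only=False lets recent PyTorch load that metadata.
ckpt = torch.load("roberta.large/model.pt", map_location="cpu", weights_only=False)
print(ckpt.keys())
if "model" in ckpt:
    fs_params = sum(t.numel() for t in ckpt["model"].values())
    print(f"fairseq parameters: {fs_params:,}")
```

Comparing the parameter counts, the dtypes, and whatever extra state each file carries should show where the size difference comes from.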
