Hi! I'm new to Hugging Face, and I need to use its models to get embeddings from text. While exploring the code and model checkpoints to do so, I came across the code below and got confused. Please let me know if I'm thinking in the right direction.
Understanding the difference between the CLM heads of different models in the docs:
In the code example from the XLM-R docs, the `"roberta-base"` checkpoint is the monolingual RoBERTa. The RoBERTa model is loaded from that checkpoint with a CLM head on top, but the base model understands only a single language.
Is this model the same as the one loaded by `RobertaForCausalLM.from_pretrained("roberta-base", config=config)`? And further, is the CLM head of `RobertaForCausalLM` the same as the one in `XLMRobertaForCausalLM`?
- Why is there a need for a separate class such as `XLMRobertaForCausalLM()` if we can do the same thing with `AutoModelForCausalLM()` and load the models (both pretrained and from their configs)?
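What I mean is something like the following sketch (again with tiny random configs to avoid downloads; this assumes the Auto class dispatches on the config's model type, which is my understanding of how it works):

```python
from transformers import (
    AutoModelForCausalLM,
    RobertaConfig,
    XLMRobertaConfig,
)

# Two tiny decoder configs, one per model type:
kwargs = dict(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    is_decoder=True,
)
roberta_cfg = RobertaConfig(**kwargs)
xlmr_cfg = XLMRobertaConfig(**kwargs)

# AutoModelForCausalLM picks the concrete class from the config type,
# so it should resolve to the model-specific classes:
roberta_lm = AutoModelForCausalLM.from_config(roberta_cfg)
xlmr_lm = AutoModelForCausalLM.from_config(xlmr_cfg)
print(type(roberta_lm).__name__)  # RobertaForCausalLM
print(type(xlmr_lm).__name__)     # XLMRobertaForCausalLM
```

So the Auto class seems to end up at the same concrete classes anyway, which is why I'm unsure what the separate class buys us beyond the different default checkpoint and tokenizer.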