Hi! I'm new to HuggingFace, and I need to use the models to get embeddings from text. While exploring the code and model checkpoints for this, I came across an example in the docs that confused me. Please let me know if I'm thinking in the right direction.
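For context, this is roughly what I am trying to do (the checkpoint name and the mean-pooling choice are just examples on my end, not a requirement):

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Example checkpoint; any encoder checkpoint would do here.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")

inputs = tokenizer("Hello world!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the last hidden state to get one embedding per sentence
# (fine here since there is a single, unpadded input).
embedding = outputs.last_hidden_state.mean(dim=1)
```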
- Understanding the difference between the CLM heads of different models in the docs: in the code example from the XLM-R docs, the `roberta-base` checkpoint is the monolingual RoBERTa one. The RoBERTa model is loaded from it with a CLM head on top, but the base model only understands a single language.
Is this model the same as the one loaded by `RobertaForCausalLM.from_pretrained("roberta-base", config=config)`? Further, is the CLM head of `RobertaForCausalLM` the same as that of `XLMRobertaForCausalLM`?
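To make the comparison concrete, here is a sketch of the two loading paths I have in mind (the `is_decoder=True` lines follow what I understand the docs require for standalone causal-LM use; the checkpoint names are the standard Hub ones):

```python
from transformers import AutoConfig, RobertaForCausalLM, XLMRobertaForCausalLM

# Monolingual RoBERTa checkpoint with a CLM head on top,
# as in the XLM-R docs example.
config = AutoConfig.from_pretrained("roberta-base")
config.is_decoder = True  # needed so the model runs as a causal LM
model_mono = RobertaForCausalLM.from_pretrained("roberta-base", config=config)

# The multilingual counterpart I am comparing it against.
xlm_config = AutoConfig.from_pretrained("xlm-roberta-base")
xlm_config.is_decoder = True
model_multi = XLMRobertaForCausalLM.from_pretrained(
    "xlm-roberta-base", config=xlm_config
)
```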
- Why is there a need for two different classes such as `RobertaForCausalLM` and `XLMRobertaForCausalLM` if we can load the models (both pretrained and from their configs) using `AutoModelForCausalLM`?
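For example, I would have expected a single helper like this sketch to cover both models (`load_clm` is just a hypothetical name I made up for illustration):

```python
from transformers import AutoConfig, AutoModelForCausalLM

def load_clm(checkpoint: str):
    """Hypothetical helper: load any checkpoint with a CLM head via the Auto class."""
    config = AutoConfig.from_pretrained(checkpoint)
    config.is_decoder = True  # so the encoder runs as a causal LM
    return AutoModelForCausalLM.from_pretrained(checkpoint, config=config)

# The Auto class dispatches to the concrete class based on the config's
# model_type, so both checkpoints load through the same call.
mono = load_clm("roberta-base")       # -> RobertaForCausalLM
multi = load_clm("xlm-roberta-base")  # -> XLMRobertaForCausalLM
```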