Difference between CausalLM and LMHeadModel

Kirti · April 24, 2022, 10:39pm

What is the difference between CausalLM and LMHeadModel? Both return similar outputs: loss, logits, etc.

Example: GPT2LMHeadModel.from_pretrained('gpt2') and AutoModelForCausalLM.from_pretrained('gpt2') have the same model structure.

sgugger · April 25, 2022, 1:00pm

LMHeadModel is an older naming convention we used for some models. We stopped using it because it’s not very informative about what kind of language model head we’re talking about. To avoid breaking changes, we won’t rename the old classes, but the auto API and all newer models should use ForCausalLM, ForMaskedLM, or ForSeq2SeqLM, depending on what kind of LM objective the model has.
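
For GPT-2 specifically, a quick way to see this is to check which class the auto API hands back. The snippet below is a minimal sketch, assuming transformers and PyTorch are installed. It builds a tiny, randomly initialised model from a config so nothing is downloaded; the config sizes are arbitrary values picked for illustration, not anything from the gpt2 checkpoint.

```python
from transformers import AutoModelForCausalLM, GPT2Config, GPT2LMHeadModel

# Tiny GPT-2 config with arbitrary illustrative sizes, so no weights are downloaded.
config = GPT2Config(n_layer=1, n_head=2, n_embd=8, vocab_size=100)

# The auto class looks up the concrete causal-LM class registered for this config type.
model = AutoModelForCausalLM.from_config(config)

# For a GPT-2 config, the concrete class is the older-named GPT2LMHeadModel.
print(type(model).__name__)
print(isinstance(model, GPT2LMHeadModel))
```

In other words, for GPT-2 the auto class is a dispatcher rather than a separate architecture, which is why both calls in the question give the same model structure.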
