from transformers import AutoModel, AutoModelForSeq2SeqLM

model_name = "t5-base"  # example checkpoint, assumed for illustration
modelSeq2Seq = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
So this modelSeq2Seq model has an extra fully connected layer (lm_head) that maps the 768-dimensional hidden states to logits over the vocabulary, but the plain model does not have it. Why?
Even in the pre-training phase, there must be a layer that converts the hidden states (embeddings) into logits over the vocabulary, right?
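For reference, here is a minimal way to see the difference using the two models loaded above (assuming model_name is a T5-style checkpoint like t5-base, where transformers names this projection lm_head):

print(hasattr(modelSeq2Seq, "lm_head"))  # True: the *ForSeq2SeqLM class adds a Linear(hidden_size -> vocab_size) head
print(hasattr(model, "lm_head"))         # False: AutoModel returns only the raw encoder/decoder hidden states
print(modelSeq2Seq.lm_head)              # e.g. Linear(in_features=768, out_features=32128, bias=False) for t5-base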