The LM head is the language modelling head. The output of the transformer is a tensor of shape (batch_size, max_target_len, model_dimension). In the final step, where you convert these transformer outputs to words, you first project them linearly to the vocabulary size and then apply a softmax, which gives you, for each position i in the target sequence, the probability of that position being each word in the vocabulary. The layer where all of this happens is the LM head.
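A minimal sketch of that last step, with made-up sizes (batch_size=2, max_target_len=5, model_dimension=512, vocab_size=1000) just for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only
batch_size, max_target_len = 2, 5
model_dimension, vocab_size = 512, 1000

# Pretend transformer output: (batch_size, max_target_len, model_dimension)
hidden_states = rng.standard_normal((batch_size, max_target_len, model_dimension))

# The LM head: a linear projection from model_dimension to vocab_size
W = rng.standard_normal((model_dimension, vocab_size)) * 0.02
b = np.zeros(vocab_size)
logits = hidden_states @ W + b  # (batch_size, max_target_len, vocab_size)

# Softmax over the vocabulary axis: for each position i, a probability
# distribution over all words in the vocabulary
exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)

print(probs.shape)  # (2, 5, 1000)
```

In real models (e.g. in `transformers`) the LM head is just an `nn.Linear(model_dimension, vocab_size)`, and the softmax is often folded into the cross-entropy loss during training rather than applied explicitly.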