Why do TFBlenderbotSmallModel and TFBlenderbotSmallForConditionalGeneration have the same trainable_variables?

I downloaded the model from facebook/blenderbot_small-90M and loaded it with BlenderbotSmallTokenizer.from_pretrained() and BlenderbotSmallForConditionalGeneration.from_pretrained() respectively. When I looked at the trainable variables using:

for v in model.trainable_variables:
    print(v)

I found they were equal. But the docs say there is a language modeling head in TFBlenderbotSmallForConditionalGeneration, so how can I get the weights of that head?

It is possible that the model uses tied embeddings, meaning the same embedding layer is used at the input and output of the model.

A lot of causal Transformer decoders use tied embeddings.
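You can check this yourself. Here is a minimal sketch using the PyTorch class (the generic get_input_embeddings() / get_output_embeddings() helpers are part of the Transformers API); if the print shows True, the head reuses the embedding matrix and therefore adds no extra trainable variables:

from transformers import BlenderbotSmallForConditionalGeneration

model = BlenderbotSmallForConditionalGeneration.from_pretrained(
    "facebook/blenderbot_small-90M"
)

# Input embedding matrix and LM head projection
embeddings = model.get_input_embeddings()   # shared token embedding
lm_head = model.get_output_embeddings()     # language modeling head

# With tied embeddings, both point at the very same weight tensor
print(embeddings.weight is lm_head.weight)  # expected: True if the weights are tied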

Thank you for your reply, that is helpful. But what confuses me is: if the final dense layer of the LM head in TFBlenderbotSmallForConditionalGeneration uses a kernel whose weights are shared with the embeddings, what about the bias? And another question: how can I get all the variables of the model, including both trainable and non-trainable ones? Thank you again.

if the final dense layer of the LM head in TFBlenderbotSmallForConditionalGeneration uses a kernel whose weights are shared with the embeddings, what about the bias?

Checking the PyTorch implementation, it seems that the language modeling head doesn’t use a bias, as seen here.
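You can confirm this with a quick check (reusing the PyTorch model from the sketch above; get_output_embeddings() returns the LM head linear layer):

lm_head = model.get_output_embeddings()
print(lm_head.bias)  # None: the layer was created without a bias parameter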

how can I get all the variables of the model, including both trainable and non-trainable ones?

In PyTorch, you can get all parameters of a model as follows:

for name, param in model.named_parameters():
    print(name, param.shape)
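Note that named_parameters() lists every parameter whether or not it requires gradients, but it does not include buffers. In the current PyTorch implementation, for example, final_logits_bias is registered as a buffer rather than a parameter; assuming you also want those tensors, you can list them separately:

# Buffers are tracked by the model but not returned by named_parameters()
for name, buf in model.named_buffers():
    print(name, buf.shape)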

cc’ing @Rocketknight1 for how to do this in Tensorflow.
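In the meantime, here is a minimal sketch for the TensorFlow side, assuming the standard keras.Model properties (which the TF Transformers models expose, since they subclass keras.Model):

# Trainable variables only
for v in model.trainable_variables:
    print(v.name, v.shape)

# Non-trainable variables only
for v in model.non_trainable_variables:
    print(v.name, v.shape)

# Everything, trainable and non-trainable (equivalent to model.weights)
for v in model.variables:
    print(v.name, v.shape)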

Thank you a lot, that's really helpful!