Parameter lm_head returning none in tensorflow but works for pytorch

TDP4you · September 4, 2021, 11:46pm

Hi all, I am working on a project where I am porting existing code that uses Hugging Face pretrained BART PyTorch models to TensorFlow. However, the lm_head parameter returns None in TensorFlow. Here is an example:

bart = TFBartForConditionalGeneration.from_pretrained('bart-base')
bart_py = BartForConditionalGeneration.from_pretrained('bart-base')

Both models load properly. When I access lm_head in PyTorch, it works fine:
bart_py.lm_head
returns Linear(in_features=768, out_features=50265, bias=False)

But in TensorFlow,
bart.get_lm_head()
returns None

I have checked the configuration for both TensorFlow and PyTorch, and it is the same. So I tried to create lm_head myself, based on the model's source on GitHub - https://github.com/huggingface/transformers/blob/master/src/transformers/models/bart/modeling_tf_bart.py

self.shared = TFSharedEmbeddings(bart.config.vocab_size, bart.config.d_model, bart.config.pad_token_id, name="model.shared")  # as shown in TFBartMainLayer

self.shared.build(outputs[0].shape)  # to create weights for linear layers

lm_logits = self.shared.call(outputs[0], mode='linear')

However, the resulting lm_logits values are very different from the PyTorch lm_logits values. Can someone please help me with this?
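
One possible cause of the mismatch, offered as an assumption rather than a confirmed diagnosis: a newly constructed TFSharedEmbeddings layer starts from freshly initialized weights instead of the pretrained ones, and in 'linear' mode the logits are the hidden states multiplied by the transpose of the embedding matrix. The NumPy sketch below illustrates only that arithmetic. The sizes, the random matrices, and the helper name tied_lm_logits are hypothetical stand-ins, not values or functions from the real models.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 10, 4  # toy sizes, not BART's real dimensions

# Hypothetical stand-in for the pretrained shared embedding matrix.
pretrained_embeddings = rng.normal(size=(vocab_size, d_model))
# Hypothetical stand-in for a newly built layer: same shape, fresh random init.
fresh_embeddings = rng.normal(scale=0.02, size=(vocab_size, d_model))


def tied_lm_logits(hidden_states, embedding_matrix):
    """Project hidden states onto the vocabulary using a tied embedding matrix."""
    # (batch, seq_len, d_model) @ (d_model, vocab_size) -> (batch, seq_len, vocab_size)
    return hidden_states @ embedding_matrix.T


# Hypothetical decoder output with shape (batch, seq_len, d_model).
hidden_states = rng.normal(size=(2, 3, d_model))

logits_pretrained = tied_lm_logits(hidden_states, pretrained_embeddings)
logits_fresh = tied_lm_logits(hidden_states, fresh_embeddings)

print(logits_pretrained.shape)
print(np.allclose(logits_pretrained, logits_fresh))
```

The same hidden states give different logits as soon as the embedding matrix differs. If this is what is happening here, reusing the embedding weights already held by the loaded model, rather than building a new layer, would be the thing to try.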

Thanks a lot.
