Hi all, I am working on a project where I am porting pre-existing code that uses Hugging Face's pretrained PyTorch BART models to TensorFlow. However, the lm_head is returning None in TensorFlow. Here is an example -
bart = TFBartForConditionalGeneration.from_pretrained('bart-base')
bart_py = BartForConditionalGeneration.from_pretrained('bart-base')
Both models load properly. When I access lm_head in PyTorch, it works fine:
bart_py.lm_head
returns Linear(in_features=768, out_features=50265, bias=False)
But in TensorFlow,
bart.get_lm_head()
returns None
I have checked the configuration for both TensorFlow and PyTorch, and they are the same. So I tried to create the lm_head myself, following the model code on GitHub - https://github.com/huggingface/transformers/blob/master/src/transformers/models/bart/modeling_tf_bart.py
self.shared = TFSharedEmbeddings(bart.config.vocab_size, bart.config.d_model, bart.config.pad_token_id, name="model.shared") # as shown in TFBartMainLayer
self.shared.build(outputs[0].shape) # to create weights for the linear projection
lm_logits = self.shared(outputs[0], mode="linear")
However, these lm_logits values are very different from the PyTorch lm_logits. Can someone please help me with this?
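My current reading of the TF implementation (which may be wrong, so please correct me) is that mode="linear" simply projects the hidden states onto the embedding matrix as hidden @ W^T, so calling build() on a freshly constructed TFSharedEmbeddings would give randomly initialized weights rather than the pretrained ones, which would explain the mismatch. A toy NumPy sketch of what I mean (the dimensions here are made up, standing in for d_model=768 and vocab_size=50265):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions standing in for BART's d_model=768 and vocab_size=50265.
d_model, vocab_size, seq_len = 8, 20, 5

hidden = rng.normal(size=(seq_len, d_model))          # decoder output, i.e. outputs[0]
trained_emb = rng.normal(size=(vocab_size, d_model))  # stands in for the pretrained shared embedding
fresh_emb = rng.normal(size=(vocab_size, d_model))    # stands in for a newly built, randomly initialized one

# mode="linear" (as I understand it) projects hidden states onto the embedding matrix: hidden @ W^T
logits_trained = hidden @ trained_emb.T
logits_fresh = hidden @ fresh_emb.T

# Both have the right shape (seq_len, vocab_size)...
print(logits_trained.shape, logits_fresh.shape)
# ...but the values disagree, because the weights differ, not because the math does.
print(np.allclose(logits_trained, logits_fresh))
```

So if this reading is right, the logits would only match the PyTorch ones when the projection reuses the model's trained shared embedding weights instead of newly built ones.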
Thanks a lot.