Loading a PyTorch model with TFOpenAIGPTLMHeadModel doesn't work well

qiu · September 8, 2021, 1:44am

I downloaded the files from CDial-GPT_LCCC-large and loaded them as follows:

from transformers import BertTokenizer, OpenAIGPTLMHeadModel, TFOpenAIGPTLMHeadModel

tokenizer = BertTokenizer.from_pretrained(……, do_lower_case=True)
model_pt = OpenAIGPTLMHeadModel.from_pretrained(……)
model_tf = TFOpenAIGPTLMHeadModel.from_pretrained(……, from_pt=True)

Loading OpenAIGPTLMHeadModel works fine, but loading TFOpenAIGPTLMHeadModel produces the following warning:

Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFOpenAIGPTLMHeadModel: ['transformer.h.2.attn.bias', 'transformer.h.5.attn.bias', 'transformer.h.9.attn.bias', 'transformer.h.6.attn.bias', 'transformer.h.0.attn.bias', 'lm_head.weight', 'transformer.h.8.attn.bias', 'transformer.h.3.attn.bias', 'transformer.h.1.attn.bias', 'transformer.h.10.attn.bias', 'transformer.h.4.attn.bias', 'transformer.h.11.attn.bias', 'transformer.h.7.attn.bias']
- This IS expected if you are initializing TFOpenAIGPTLMHeadModel from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFOpenAIGPTLMHeadModel from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
All the weights of TFOpenAIGPTLMHeadModel were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFOpenAIGPTLMHeadModel for predictions without further training.
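A quick way to read the list in this warning is to split the names into those that plausibly hold no trained values of their own and those that would be a real problem. The sketch below is only an assumption-laden illustration: it assumes `attn.bias` is the causal-mask buffer of the PyTorch attention layer and that `lm_head.weight` is tied to the token embeddings, which is the default for this architecture but is not confirmed for this particular checkpoint. The helper name `split_unused_weights` and the patterns are hypothetical, not part of transformers.

```python
import re

# Assumption: keys matching these patterns carry no independently trained values.
# "attn.bias" is assumed to be the causal-mask buffer, and "lm_head.weight" is
# assumed to be tied to the token embedding matrix.
EXPECTED_PATTERNS = (
    r"^transformer\.h\.\d+\.attn\.bias$",
    r"^lm_head\.weight$",
)


def split_unused_weights(names):
    """Split weight names from the warning into (expected, unexpected) lists."""
    expected, unexpected = [], []
    for name in names:
        if any(re.match(pattern, name) for pattern in EXPECTED_PATTERNS):
            expected.append(name)
        else:
            unexpected.append(name)
    return expected, unexpected


# Names copied from the warning above.
unused = [
    "transformer.h.2.attn.bias", "transformer.h.5.attn.bias",
    "transformer.h.9.attn.bias", "transformer.h.6.attn.bias",
    "transformer.h.0.attn.bias", "lm_head.weight",
    "transformer.h.8.attn.bias", "transformer.h.3.attn.bias",
    "transformer.h.1.attn.bias", "transformer.h.10.attn.bias",
    "transformer.h.4.attn.bias", "transformer.h.11.attn.bias",
    "transformer.h.7.attn.bias",
]

expected, unexpected = split_unused_weights(unused)
print(len(expected), unexpected)  # 13 []
```

If the assumptions hold, every name in the list falls into the expected group, so the unused weights by themselves would not explain a difference in the logits.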

I ignored this warning and went on, but the logits from the two models are quite different. How can I solve this?
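To make "quite different" concrete, it may help to report the largest element-wise gap between the two sets of logits for the same input. The sketch below is a minimal, framework-free helper; the name `max_abs_diff` is hypothetical, and the commented usage assumes both models are run on identical token ids with dropout disabled (`model_pt.eval()` on the PyTorch side, `training=False` on the TensorFlow side).

```python
def max_abs_diff(a, b):
    """Return the largest absolute element-wise difference between two flat sequences."""
    a = [float(x) for x in a]
    b = [float(x) for x in b]
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    if not a:
        raise ValueError("empty input")
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical usage with the models from the post (not executed here):
#   pt_logits = model_pt(input_ids_pt).logits.detach().numpy().ravel()
#   tf_logits = model_tf(input_ids_tf, training=False).logits.numpy().ravel()
#   print(max_abs_diff(pt_logits, tf_logits))

print(max_abs_diff([1.0, 2.0, 3.0], [1.0, 2.5, 3.0]))  # 0.5
```

A gap on the order of floating-point noise would suggest the conversion is fine, while a large gap points to a real mismatch in weights or configuration.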
