GPT2LMHeadModel.from_pretrained('gpt2') not loading attn weights

I am following the examples here. Specifically, this one:

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Passing the input ids as labels makes the model compute the LM loss
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"])
loss, logits = outputs[:2]

When I run the code, I get this warning:

Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.attn.masked_bias', 'h.1.attn.masked_bias', 'h.2.attn.masked_bias', 'h.3.attn.masked_bias', 'h.4.attn.masked_bias', 'h.5.attn.masked_bias', 'h.6.attn.masked_bias', 'h.7.attn.masked_bias', 'h.8.attn.masked_bias', 'h.9.attn.masked_bias', 'h.10.attn.masked_bias', 'h.11.attn.masked_bias', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Should I be pointing to some specific file to get these attention weights loaded?

BTW I see some friendly faces on these forums! :blush: Hi @sshleifer and @sgugger! :wave:

Hi @radek, good to see you here!

This is a known issue that is being addressed in #5922 (currently under review). You can safely ignore the warning.
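To add a bit of detail on why the warning is harmless: none of the listed keys are learned pretrained weights. In the GPT-2 implementation of that era, masked_bias was a constant buffer used for causal masking (not stored in the checkpoint), and lm_head.weight is tied to the input embedding matrix transformer.wte.weight, so the pretrained embedding values are used for the output head anyway. A minimal sketch of the weight tying, using a small randomly initialized config to avoid downloading the full checkpoint (the attribute names assume the standard transformers GPT-2 implementation):

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random GPT-2 just for illustration -- no checkpoint download needed
config = GPT2Config(n_layer=2, n_head=2, n_embd=64)
model = GPT2LMHeadModel(config)

# lm_head.weight shares storage with the input embeddings, which is why
# it is absent from the checkpoint and flagged as "newly initialized"
print(model.lm_head.weight.data_ptr() == model.transformer.wte.weight.data_ptr())
```

The same tying happens after from_pretrained('gpt2'), so the output head does get the pretrained values despite what the warning suggests.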
