Generate logits from hidden state embeddings and decoder weights


I am trying to compute prediction_logits using the BertForPreTraining model. For some reason, I don’t want to use outputs.prediction_logits directly; I want to generate them myself by multiplying the last hidden state with the decoder weights. The problem is that the results I get this way are not equal to outputs.prediction_logits. Here is the code:

import numpy as np
import torch
from transformers import BertForPreTraining

model = BertForPreTraining.from_pretrained("bert-base-multilingual-cased", output_hidden_states=True).to(device)

w = model.state_dict()['cls.predictions.decoder.weight'].cpu().numpy()
b = model.state_dict()['cls.predictions.decoder.bias'].cpu().numpy()

with torch.no_grad():
    outputs = model(**inputs)

output_logits = outputs.prediction_logits.cpu().numpy()
last_hidden_states = outputs.hidden_states[-1].cpu().numpy()

preds = output_logits[i, token_idx]
h = last_hidden_states[i, token_idx]
h_transformed = np.dot(w, h) + b

Basically, I expect h_transformed to be equal to preds, but it is not.

Thanks for your help!

I suspect dropout might be to blame here. You can create a small fully connected layer with dropout, and initialize it with decoder weights to use instead.
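A minimal sketch of what such a layer could look like, assuming PyTorch. The sizes and the random `w` and `b` are hypothetical stand-ins for the real `cls.predictions.decoder` weight and bias, and the dropout probability of 0.1 is only an assumption:

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical sizes; the real ones come from the model config.
hidden_size, vocab_size = 32, 100

# Stand-ins for cls.predictions.decoder.weight / .bias.
w = torch.randn(vocab_size, hidden_size)
b = torch.randn(vocab_size)

# Dropout followed by a fully connected layer.
layer = nn.Sequential(nn.Dropout(p=0.1), nn.Linear(hidden_size, vocab_size))

# Initialize the linear layer with the decoder weights.
with torch.no_grad():
    layer[1].weight.copy_(w)
    layer[1].bias.copy_(b)

# In eval mode dropout is inactive, so the layer reduces to w @ h + b.
layer.eval()
h = torch.randn(hidden_size)
with torch.no_grad():
    out = layer(h)

matches_manual = torch.allclose(out, w @ h + b, atol=1e-4)
print(matches_manual)
```

In training mode the same layer would give a different result on every call, because dropout zeroes a random subset of the inputs.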

Thanks for your reply. I tried that, but the results still don't match.

The two outputs will not be similar, because dropout randomly affects the output. I'm not certain that dropout is the cause, but if it is, modeling it this way would give you the same training procedure. You can verify it by calling model.eval() to disable dropout.
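As a rough illustration of that check, here is a small sketch with a standalone `nn.Dropout` module (the input tensor and `p=0.1` are made up for the example). As far as I know, `from_pretrained` already returns the model in evaluation mode, so if `model.training` is `False`, dropout is not active in the snippet from the question:

```python
import torch
from torch import nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.1)
x = torch.ones(4, 8)

# Training mode: some entries are zeroed and the rest are scaled by 1 / (1 - p).
drop.train()
train_out = drop(x)

# Eval mode: dropout is a no-op and returns the input unchanged.
drop.eval()
eval_out = drop(x)

same_in_eval = torch.equal(eval_out, x)
print(drop.training, same_in_eval)
```

The same flag is available on the full model: `model.eval()` sets `model.training` to `False` for every submodule.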

I suspect that’s not the problem, since the dropout shouldn’t be applied after getting the outputs from the last layer.
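One thing that might be worth checking: as far as I can tell, the prediction head in the Hugging Face BERT implementation does not feed the last hidden state straight into the decoder. It first passes it through `cls.predictions.transform` (a dense layer, an activation, and a LayerNorm). The sketch below tests this on a tiny, randomly initialized model so that nothing has to be downloaded; the config sizes and the random `input_ids` are hypothetical, not the ones from the question:

```python
import torch
from transformers import BertConfig, BertForPreTraining

torch.manual_seed(0)

# Tiny random model (hypothetical sizes), only used to inspect the head structure.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertForPreTraining(config).eval()

input_ids = torch.randint(0, config.vocab_size, (1, 8))

with torch.no_grad():
    outputs = model(input_ids=input_ids, output_hidden_states=True)
    h = outputs.hidden_states[-1]
    head = model.cls.predictions

    # Decoder applied directly to the last hidden state (what the question does).
    naive = head.decoder(h)
    # Decoder applied after the head's transform.
    full = head.decoder(head.transform(h))

naive_matches = torch.allclose(naive, outputs.prediction_logits, atol=1e-5)
full_matches = torch.allclose(full, outputs.prediction_logits, atol=1e-5)
print(naive_matches, full_matches)
```

If the same holds for the pretrained checkpoint, applying `model.cls.predictions.transform` to the hidden state before multiplying by the decoder weights should reproduce `outputs.prediction_logits`.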