I am trying to access the last hidden state at the pooler1 and pooler2 variables towards the end of the code below.
class BERT_Arch(nn.Module):
    def __init__(self, bert):
        super(BERT_Arch, self).__init__()
        self.bert = MODEL_TYPE.from_pretrained(PRETRAINED_MODEL_NAME)
        freeze(self.bert, params['freeze'])
        for name, param in self.bert.named_parameters():
            print("{} -> {}".format(name, param.requires_grad))
        self.dropout = nn.Dropout(params['dropout'])
        self.relu = nn.ReLU()
        self.fc1 = nn.Linear(params['linear_in'], params['linear_out'])

    def _qa_embedding(self, pooler):
        x = self.fc1(pooler)
        x = normalize_vectors(x)
        return x

    # define the forward pass
    def forward(self, tokensq, maskq, tokens1, mask1, tokens2, mask2):
        # pass the question + answer pairs to the model
        outputs1 = self.bert(torch.cat((tokensq, tokens1), 1), attention_mask=torch.cat((maskq, mask1), 1), output_hidden_states=True)
        outputs2 = self.bert(torch.cat((tokensq, tokens2), 1), attention_mask=torch.cat((maskq, mask2), 1), output_hidden_states=True)
        # last_hidden_state only exists on the non "auto-masked" model outputs;
        # use hidden_states for the rest
        pooler1 = torch.cat((outputs1.last_hidden_state[:, 0], outputs1.last_hidden_state[:, params['max_q_len']]), 1)
        pooler2 = torch.cat((outputs2.last_hidden_state[:, 0], outputs2.last_hidden_state[:, params['max_q_len']]), 1)
        v1 = self._qa_embedding(pooler1)
        v2 = self._qa_embedding(pooler2)
        return v1, v2
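For context, this is roughly how the difference can be reproduced (a sketch only; MODEL_TYPE and PRETRAINED_MODEL_NAME are placeholders from my config, as in the class above):

    import torch

    # Minimal check of the output type (MODEL_TYPE and PRETRAINED_MODEL_NAME
    # come from my config, as in the class above).
    bert = MODEL_TYPE.from_pretrained(PRETRAINED_MODEL_NAME)
    tokens = torch.randint(0, 100, (1, 16))   # dummy input ids
    mask = torch.ones_like(tokens)

    outputs = bert(tokens, attention_mask=mask, output_hidden_states=True)
    print(type(outputs))     # MaskedLMOutput in my case
    print(outputs.keys())    # no 'last_hidden_state' key here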
This worked when I used BERT multilingual uncased, whose forward pass returns a BaseModelOutput, where last_hidden_state is declared as an attribute. Unfortunately, now that I am using BERT multilingual cased, the output class is MaskedLMOutput, which does not seem to have a last_hidden_state attribute.
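The closest thing I have found is the hidden_states tuple that output_hidden_states=True adds to the output. A sketch of the workaround I am considering (I am not certain it is equivalent to last_hidden_state):

    # With output_hidden_states=True, hidden_states is a tuple of
    # (embedding output, layer 1, ..., layer N), so the last entry
    # should correspond to the final layer's hidden states.
    last_hidden1 = outputs1.hidden_states[-1]   # (batch, seq_len, hidden_size)
    pooler1 = torch.cat((last_hidden1[:, 0], last_hidden1[:, params['max_q_len']]), 1)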
How can I access the last_hidden_state?
Do I really need to copy the class MaskedLMOutput, including all its dependencies, and add last_hidden_state: torch.FloatTensor = None? That seems like overkill, since I would then have to import all the dependencies as well.
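For reference, I imagine that route would look roughly like this (a sketch only; the subclass name MaskedLMOutputWithHidden is my own invention):

    from dataclasses import dataclass
    from typing import Optional

    import torch
    from transformers.modeling_outputs import MaskedLMOutput

    # Hypothetical subclass instead of a full copy: just add a
    # last_hidden_state field on top of MaskedLMOutput.
    @dataclass
    class MaskedLMOutputWithHidden(MaskedLMOutput):
        last_hidden_state: Optional[torch.FloatTensor] = None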
Thanks so much