When we use BertModel.forward(), is the last_hidden_state the output of the encoder in the Transformer block?
Yes! It's the hidden states from the final encoder layer, a tensor of shape (batch_size, seq_len, hidden_size).
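If you want to check this yourself, here is a minimal sketch. It assumes `torch` and `transformers` are installed, and it uses a tiny randomly initialised BertConfig (the sizes are made up for illustration, not a real checkpoint) so that nothing needs to be downloaded. It compares last_hidden_state with the last entry of hidden_states, which is the output of the final encoder layer.

```python
import torch
from transformers import BertConfig, BertModel

# Hypothetical tiny config, chosen only to keep the example fast;
# a real checkpoint such as bert-base-uncased uses hidden_size=768.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
)
model = BertModel(config)  # random weights, no download
model.eval()               # disable dropout so outputs are deterministic

batch_size, seq_len = 3, 7
input_ids = torch.randint(0, config.vocab_size, (batch_size, seq_len))

with torch.no_grad():
    outputs = model(input_ids=input_ids, output_hidden_states=True)

last_hidden_state = outputs.last_hidden_state
print(last_hidden_state.shape)  # (batch_size, seq_len, hidden_size)

# hidden_states holds the embedding output plus one tensor per encoder layer,
# so its last entry is the output of the final encoder layer.
same = torch.equal(outputs.hidden_states[-1], last_hidden_state)
print(same)
```

With a pretrained model the same check should apply; only the hidden_size (and therefore the last dimension) changes.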