When we call BertModel.forward(), is last_hidden_state the output of the encoder in the Transformer block?
Yes! It’s the output of the final encoder layer: a tensor of shape (batch_size, seq_len, hidden_size), with one hidden vector per input token.
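If you want to check this yourself, here is a minimal sketch. It assumes `torch` and `transformers` are installed, and it uses a tiny, randomly initialised `BertConfig` (the sizes are made up for illustration) so that no pretrained weights need to be downloaded. It should show the shape of `last_hidden_state` and that it matches the last entry of `hidden_states`.

```python
import torch
from transformers import BertConfig, BertModel

# Tiny, randomly initialised model; these sizes are hypothetical and chosen
# only to keep the example fast. A pretrained checkpoint behaves the same way.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
)
model = BertModel(config)
model.eval()  # disable dropout so the outputs are deterministic

batch_size, seq_len = 3, 7
input_ids = torch.randint(0, config.vocab_size, (batch_size, seq_len))

with torch.no_grad():
    outputs = model(input_ids=input_ids, output_hidden_states=True)

last_hidden_state = outputs.last_hidden_state
shape = tuple(last_hidden_state.shape)  # (batch_size, seq_len, hidden_size)

# hidden_states contains the embedding output followed by one tensor per
# encoder layer, so its last entry is the final encoder layer's output.
same_as_final_layer = torch.equal(last_hidden_state, outputs.hidden_states[-1])

print(shape)
print(same_as_final_layer)
```

Note that `outputs.pooler_output` is a different thing: it is derived from the first token's hidden state and has shape (batch_size, hidden_size).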