Question about last_hidden_state of the BERT model

Hi, everyone!

from transformers import AutoConfig
from transformers.models.bert.modeling_bert import BertEncoder

bert_config = AutoConfig.from_pretrained('bert-large-uncased')
self.bert = BertEncoder(bert_config)  # encoder stack only, no embedding layer or pooler
sequence_output = self.bert(embedding_output).last_hidden_state
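
In case anyone wants to reproduce this, here is a self-contained sketch of the same setup outside my model class (the random embedding_output and its shape are just placeholders, not my real inputs):

import torch
from transformers import AutoConfig
from transformers.models.bert.modeling_bert import BertEncoder

bert_config = AutoConfig.from_pretrained('bert-large-uncased')
encoder = BertEncoder(bert_config)

# Placeholder for the embeddings my model computes upstream.
embedding_output = torch.randn(2, 16, bert_config.hidden_size)

sequence_output = encoder(embedding_output).last_hidden_state
print(sequence_output.shape)  # torch.Size([2, 16, 1024])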

After training BERT from scratch, I found while debugging that sequence_output (shape: [batch_size, seq_len, hidden_size]) holds the same vector at every position, like this:

tensor([[[ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         ...,
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338]],

        [[ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         ...,
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338]],

        [[ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         ...,
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338]],

        ...,

        [[ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         ...,
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338]],

        [[ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         ...,
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338]],

        [[ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         ...,
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338]]],
       device='cuda:0')
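
A quick sanity check confirms that every position really does hold the same vector. Here sequence_output is a stand-in tensor built to match the dump above; the shapes are placeholders:

import torch

# Stand-in for the collapsed output: one vector repeated across the
# batch and sequence dimensions, as in the printout above.
sequence_output = torch.randn(1, 1, 1024).expand(8, 128, 1024)

# Compare every position against the vector at position 0 of the first example.
reference = sequence_output[:1, :1, :].expand_as(sequence_output)
print(torch.allclose(sequence_output, reference))  # True -> all token vectors identical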

I would like to know what might be causing this. Thank you very much! :grinning: