Hi everyone,
```python
from transformers import AutoConfig
from transformers.models.bert.modeling_bert import BertEncoder

bert_config = AutoConfig.from_pretrained('bert-large-uncased')
self.bert = BertEncoder(bert_config)  # in my model's __init__

# in forward():
sequence_output = self.bert(embedding_output).last_hidden_state
```
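For context, here is a minimal, self-contained sketch of this setup. Note that I use a tiny hand-made `BertConfig` here purely for illustration (my actual code loads the `bert-large-uncased` config as shown above), and the `BertEmbeddings` layer stands in for my model's embedding layer:

```python
import torch
from transformers import BertConfig
from transformers.models.bert.modeling_bert import BertEmbeddings, BertEncoder

# Tiny config for illustration only; the real code uses
# AutoConfig.from_pretrained('bert-large-uncased').
config = BertConfig(vocab_size=100, hidden_size=32,
                    num_hidden_layers=2, num_attention_heads=4,
                    intermediate_size=64, max_position_embeddings=64)

embeddings = BertEmbeddings(config)  # randomly initialized
encoder = BertEncoder(config)        # randomly initialized, trained from scratch

input_ids = torch.randint(0, config.vocab_size, (2, 8))
embedding_output = embeddings(input_ids=input_ids)
# Note: no attention_mask is passed here, mirroring my snippet above.
sequence_output = encoder(embedding_output).last_hidden_state
print(sequence_output.shape)  # torch.Size([2, 8, 32])
```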
After training BERT from scratch, I found while debugging that `sequence_output` (shape: `[batch_size, seq_len, hidden_size]`) contains the same vector at every position of every example:
```text
tensor([[[ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         ...,
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338]],

        ...,

        [[ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338],
         ...,
         [ 0.4257, -0.5848,  1.6611,  ...,  0.5865, -0.3086,  1.7338]]],
       device='cuda:0')
```

(Output trimmed for brevity; every row of every batch element is this same identical vector.)
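To quantify the collapse (and to check whether `embedding_output` is already collapsed before the encoder, which would point at the embedding layer rather than the encoder itself), a small diagnostic along these lines can be run on both tensors; `position_spread` is a helper name of my own, not a library function:

```python
import torch

def position_spread(x: torch.Tensor) -> float:
    """Max std across the sequence dimension (dim=1): a value near
    zero means every position holds the same vector."""
    return x.std(dim=1).max().item()

# A healthy output varies across positions...
healthy = torch.randn(4, 16, 32)
# ...while a collapsed output repeats one vector at every position.
collapsed = torch.randn(4, 1, 32).expand(4, 16, 32)

print(position_spread(healthy))    # roughly 1, clearly nonzero
print(position_spread(collapsed))  # 0.0
```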
I would like to know what might be causing this problem. Thank you very much!