Hi, I am using BertConfig to create an encoder-decoder model in the following way:
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

encoder = BertConfig()
decoder = BertConfig()
config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder, decoder)
bert2bert = EncoderDecoderModel(config=config)
bert2bert.config.decoder.is_decoder = True
bert2bert.config.decoder.add_cross_attention = True
bert2bert.config.encoder.num_attention_heads = 12
print(bert2bert.encoder.num_parameters(only_trainable=True),bert2bert.encoder.config.num_attention_heads)
With the default of 12 attention heads, the encoder's trainable parameter count and number of attention heads are as follows:
86742528, 12
However, when I change the number of attention heads to 4, the number of trainable parameters stays the same, even though the reported number of attention heads changes (output below). Can anyone help me out?
bert2bert.config.decoder.add_cross_attention = True
bert2bert.config.encoder.num_attention_heads = 4
print(bert2bert.encoder.num_parameters(only_trainable=True), bert2bert.encoder.config.num_attention_heads)
(86742528, 4)
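
One possible explanation, sketched in plain Python rather than with the transformers API: in standard multi-head attention the hidden size is split evenly across the heads, so the size of the query/key/value projections depends only on hidden_size and not on how many heads share it. The function name attention_projection_params is hypothetical, and 768 is used only because it is the BertConfig default hidden_size.

```python
def attention_projection_params(hidden_size: int, num_heads: int) -> int:
    """Count Q, K and V projection weights and biases in one self-attention layer."""
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    head_dim = hidden_size // num_heads
    # Each head owns a (hidden_size x head_dim) weight slice plus head_dim biases,
    # once each for query, key and value.
    per_head = 3 * (hidden_size * head_dim + head_dim)
    return num_heads * per_head


# Compare the two head counts from the question at the same hidden size.
for heads in (12, 4):
    print(heads, attention_projection_params(768, heads))
```

Both calls give the same count, because num_heads * head_dim always equals hidden_size. If that reasoning applies here, an unchanged num_parameters() value would be expected even when the head count really does differ.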