Load EncoderDecoderModel from a checkpoint

I am trying to save the model checkpoint with transformers' save_pretrained so that the model can later be easily initialized with from_pretrained. I created a model class that inherits from PreTrainedModel so that it has a save_pretrained method, and I also created a config as described in one of the Hugging Face courses.

from transformers import (BertConfig, BertTokenizer, EncoderDecoderConfig,
                          EncoderDecoderModel, PreTrainedModel)

tokenizer = BertTokenizer.from_pretrained('bert-base-cased')

config_encoder = BertConfig()
config_decoder = BertConfig()
config = EncoderDecoderConfig.from_encoder_decoder_configs(config_encoder, config_decoder)

class TransformerShared(PreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.shared = EncoderDecoderModel.from_encoder_decoder_pretrained(
            'bert-base-cased', 'bert-base-cased')

    def forward(self, input_ids, attention_mask, labels):
        output = self.shared(input_ids=input_ids,
                             attention_mask=attention_mask,
                             labels=labels)
        return output

model = TransformerShared(config)
model.shared.config.decoder_start_token_id = tokenizer.cls_token_id
model.shared.config.eos_token_id = tokenizer.sep_token_id
model.shared.config.pad_token_id = tokenizer.pad_token_id
model.shared.config.vocab_size = model.shared.config.encoder.vocab_size
model.shared.config.bos_token_id = tokenizer.bos_token_id


Up to this point, the model works just fine, but after saving it and re-instantiating it with from_pretrained, the program warns that some weights were not initialized from the checkpoint and were newly set to random values. After that, the previously fine-tuned model produces poor results.
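For comparison, here is a minimal self-contained sketch of the save/load round trip when a bare EncoderDecoderModel is saved and reloaded directly (no wrapper class). The tiny config sizes are arbitrary and the weights are randomly initialized, so nothing is downloaded; it only illustrates that a direct round trip restores the weights exactly:

```python
import tempfile

import torch
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

# Tiny random-weight configs so nothing is downloaded (all sizes arbitrary).
enc = BertConfig(hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
                 intermediate_size=64, vocab_size=100)
dec = BertConfig(hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
                 intermediate_size=64, vocab_size=100)
config = EncoderDecoderConfig.from_encoder_decoder_configs(enc, dec)
model = EncoderDecoderModel(config=config)

with tempfile.TemporaryDirectory() as d:
    model.save_pretrained(d)
    reloaded = EncoderDecoderModel.from_pretrained(d)

# Compare every saved tensor against its reloaded counterpart.
same = all(torch.equal(p, reloaded.state_dict()[k])
           for k, p in model.state_dict().items())
print("round-trip exact:", same)
```

In my case, the warning appears only when loading through the custom wrapper class, not in a direct round trip like the one above.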
Any suggestions would be much appreciated.
Thank you!