How to load a pretrained model with custom model layers

Hi

How can I load pre-trained BART model weights into a custom model layer?

I have a custom decoder layer with additional nn.Module attributes, but a pretrained BART model such as facebook/bart-base stores its weights under keys prefixed with decoder.layers., so the checkpoint keys don't match the custom layer's keys.

from typing import Optional

from torch import nn
from transformers import BartConfig
from transformers.models.bart.modeling_bart import BartPretrainedModel

class BartDecoder(BartPretrainedModel):
    def __init__(self, config: BartConfig, embed_tokens: Optional[nn.Embedding] = None):
        super().__init__(config)
        # assume this part is the same as the original BartDecoder
        # ...

class BartDecoderLayer(nn.Module):
    def __init__(self, config: BartConfig):
        super().__init__()
        # assume the original modules are here
        # ...
        self.some_new_modules1 = something_new
        self.some_new_modules2 = something_new

How can I load the pretrained facebook/bart-base weights into this decoder while leaving self.some_new_modules1 and self.some_new_modules2 untouched (randomly initialized)? The facebook/bart-base checkpoint has keys with prefixes like encoder.layers. and decoder.layers., which don't exist in my custom decoder.
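To make the mismatch concrete, here is a rough sketch of the only manual workaround I can think of (slice the checkpoint state dict by prefix and load it with strict=False so the new modules keep their random initialization, assuming the custom BartDecoder above mirrors the original one plus the new layer modules). Is there a cleaner, from_pretrained-style way?

from transformers import BartModel

# Checkpoint keys carry the encoder./decoder. prefixes, my custom decoder's keys do not,
# and the new modules have no counterpart in the checkpoint.
pretrained = BartModel.from_pretrained("facebook/bart-base")
decoder_state = {
    key[len("decoder."):]: value
    for key, value in pretrained.state_dict().items()
    if key.startswith("decoder.")
}

custom_decoder = BartDecoder(pretrained.config)  # the custom class from the snippet above
# strict=False leaves some_new_modules1/2 untouched and reports them as missing keys.
missing, unexpected = custom_decoder.load_state_dict(decoder_state, strict=False)
print(missing)     # parameters only in the custom decoder (the new modules)
print(unexpected)  # checkpoint parameters with no match in the custom decoder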

Thanks


I have the same question. Has anyone found an answer?

I think the easiest way is the following:

  • follow the source code structure of the pre-trained model
  • name all attributes containing pre-trained modules the same as in the pre-trained model
  • write a PreTrainedModel subclass and set its base_model_prefix attribute to the name of the pre-trained model, for example "bert" or "mpnet".

An example might be my small project: GitHub - voorhs/hssa
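As a minimal sketch of the recipe above, applied to BART (the class name CustomBartModel and the module extra_proj are made up for illustration): from_pretrained fills every attribute whose name and key path match the checkpoint and leaves the rest newly initialized.

from torch import nn
from transformers import BartConfig
from transformers.models.bart.modeling_bart import BartModel, BartPretrainedModel

class CustomBartModel(BartPretrainedModel):
    # Same prefix as the original BART head classes, so from_pretrained can map
    # checkpoint keys onto the self.model submodule.
    base_model_prefix = "model"

    def __init__(self, config: BartConfig):
        super().__init__(config)
        # Keep the attribute name identical to the pre-trained architecture.
        self.model = BartModel(config)
        # New module: no counterpart in the checkpoint, so it stays randomly initialized.
        self.extra_proj = nn.Linear(config.d_model, config.d_model)
        self.post_init()

# Pre-trained weights land in self.model; a warning lists extra_proj as newly initialized.
model = CustomBartModel.from_pretrained("facebook/bart-base")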