My project is focused on imagery which doesn't have 3 channels. I would like to be able to load a pretrained TFSegformerModel, and then change the first convolutional layer within it, so that it accepts a different number of channels.
Obviously, this means that the pretrained weights for that layer will be incompatible with the newly sized convolution, but I would like to then finetune this model with randomly initialized weights in that first customized layer, leaving the rest of the pretrained model intact.
Currently, I can create a randomly initialised model by defining a SegformerConfig with a customised "num_channels". However, I cannot find a way to then load the pretrained model's weights into the other layers, and to set only the first, randomly initialised layer as trainable.
Any help or advice would be appreciated, thanks. Code below is a useful starting point for the discussion.
from transformers import SegformerConfig, TFSegformerModel
custom_config = SegformerConfig(num_channels=6)
custom_model = TFSegformerModel(custom_config)
pretrained_model = TFSegformerModel.from_pretrained("nvidia/mit-b0")
# Now I need a way to get the weights of pretrained_model into custom_model,
# except the first convolutional layer, which has a different geometry.
Here is some pseudo-code for it that shouldn't be far off from what you need:
pretrained_model = TFSegformerModel.from_pretrained("nvidia/mit-b0")
# you might want to set the entire model as non-trainable, so everything
# except your new layer stays frozen
# set the right initialization here; depending on your use case, you might
# need to copy-paste and redefine a few parts of the class
my_layer_with_six_channels = TFSegformerLayer(...)
pretrained_model.segformer.encoder.block[0][0] = my_layer_with_six_channels
Thanks for the tips @joaogante, they sent me on the right track.
Rather than using the TFSegformerLayer, it was actually easier to just create two models, one pretrained with the existing config, and then another randomly initialized one with a different config. You can then just substitute out the layer(s) you want to customise in the original.
Also, I think the first layer of the model is actually …encoder.embeddings[0] rather than …encoder.block[0][0].
Code snippet here for anyone interested.
from transformers import SegformerConfig, TFSegformerForSemanticSegmentation
NUM_CHANNELS = 6
# Get pretrained model
segformer_model = TFSegformerForSemanticSegmentation.from_pretrained("nvidia/mit-b1")
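# Reuse the configuration of the pretrained model (this is a reference, not a
# copy, so the change below also updates segformer_model.config)
new_config = segformer_model.config
# Modify config's values
new_config.num_channels = NUM_CHANNELS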
# Instantiate new (randomly initialized) model
new_model = TFSegformerForSemanticSegmentation(new_config)
# Substitute first layer of the pretrained model with the modified one
segformer_model.segformer.encoder.embeddings[0] = new_model.segformer.encoder.embeddings[0]
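For anyone who would rather go the other way round, as in the original question (keep the custom model and pull the pretrained weights into it), the idea is to copy every weight whose name and shape match and skip the rest. Below is a minimal, framework-agnostic sketch of that logic using plain dicts of NumPy arrays. The helper copy_compatible_weights and the weight names and shapes are hypothetical and only for illustration; with a Keras model you would presumably read the arrays with each layer's get_weights() and write them back with set_weights().

```python
import numpy as np


def copy_compatible_weights(source, target):
    """Copy arrays from source into target wherever name and shape match.

    Both arguments are plain dicts mapping a weight name to a NumPy array.
    Returns the list of names in target that were left untouched.
    """
    skipped = []
    for name, target_value in target.items():
        source_value = source.get(name)
        if source_value is not None and source_value.shape == target_value.shape:
            target[name] = source_value.copy()
        else:
            # Missing in source or different geometry: keep the random init.
            skipped.append(name)
    return skipped


# Hypothetical weight names and shapes, chosen only for illustration.
pretrained = {
    "patch_embed/kernel": np.ones((7, 7, 3, 32)),   # 3 input channels
    "block_0/dense/kernel": np.ones((32, 32)),
}
custom = {
    "patch_embed/kernel": np.zeros((7, 7, 6, 32)),  # 6 input channels
    "block_0/dense/kernel": np.zeros((32, 32)),
}

skipped = copy_compatible_weights(pretrained, custom)
print(skipped)  # names of the weights that kept their random initialisation
```

The returned list is a handy check that only the layers you expected (here, the first convolution) were left at their random initialisation.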