I currently have a very basic model that consists of a pre-trained backbone with an MLP head. Think something like:
```python
class MyModel(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(config.model_name_or_path)
        self.mlp = MLPLayer()
```
I'm currently using the `Trainer` object to save my model. Specifically, the code I'm using calls `self.save_model(output_dir)` inside the training loop, which saves the checkpoint as a safetensors file.
When I try to load it using `model = AutoModel.from_pretrained(PATH_TO_SAFETENSORS)`, I'm noticing that the keys for the MLP layer are missing; only the keys for the backbone model's embedding and encoder layers are there.
I took a look at the source code for `save_model`, which seems to delegate to the `_save` method, and I don't see any reason why the MLP layer shouldn't be saved: both `save_pretrained` paths use the `state_dict`, which contains the MLP layer's weights.
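This is roughly what I'd expect loading to look like if I rebuild the wrapper class and load the state dict into it directly (again a toy sketch: `MyModel` here uses `nn.Linear` stand-ins, and in practice the state dict would come from `safetensors.torch.load_file` rather than a fresh instance):

```python
import torch
import torch.nn as nn

class MyModel(nn.Module):
    """Toy stand-in for my wrapper class (backbone + MLP head)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(4, 4)  # stands in for the AutoModel backbone
        self.mlp = nn.Linear(4, 2)       # stands in for MLPLayer

saved = MyModel()
state = saved.state_dict()  # in practice: load_file(".../model.safetensors")

# Rebuild the same architecture and load all keys, backbone.* and mlp.* alike.
restored = MyModel()
restored.load_state_dict(state)
```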
Is there anything that I may be missing or may have configured incorrectly? Thanks.