How to load a torch model with transformers?

Hello, I previously fine-tuned a sentiment analysis model with PyTorch and saved it with a .pth extension, as PyTorch recommends.
I want to use this model remotely, so I uploaded it to the Hugging Face Hub, but when I load it with “AutoModelForSequenceClassification” I get an error saying the weights file must have a .bin extension.
What should I do to use this model?

Hello! Can you link to the model on the Hub? I can take a quick look at it.

I believe that the .bin extension is just a convention. You should be able to just rename your .pth file to be pytorch_model.bin. Can you try that and see if it loads the weights?

I tried renaming the file but it didn’t work, and now I have even more doubts. As I understand it, when loading the model with AutoModelForSequenceClassification, it reads the config.json file for the configuration and then the .bin file for the weights.
When I trained the model, I created a class like the following to define it:

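import torch.nn as nn
from transformers import BertModel

class BERTSentimentClassifier(nn.Module):
    def __init__(self, n_clases):
        super().__init__()  # must run before submodules are assigned
        # model_name (the pretrained checkpoint name) is defined elsewhere
        self.bert = BertModel.from_pretrained(model_name, return_dict=False)
        self.drop = nn.Dropout(p=0.35)
        self.linear = nn.Linear(self.bert.config.hidden_size, n_clases)

    def forward(self, input_ids, attention_mask):
        # With return_dict=False, BertModel returns (sequence_output, pooled_output)
        _, cls_output = self.bert(
            input_ids=input_ids,
            attention_mask=attention_mask,
        )
        drop_out = self.drop(cls_output)
        output = self.linear(drop_out)
        return output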

As a result, my model has no config attribute of its own. Its bert layer does have one, but that is not the configuration of the entire model. Because of this, and because of the file extension, I can’t load my model from the Hub.

I don’t know what I have to do to use my model from the hub. This is the link to the model:

Ah I see! The BertForSequenceClassification class is basically the same as yours, so I think instead of creating your own class when training the model, you would need to create it as

model = BertForSequenceClassification.from_pretrained(model_name, num_labels=n_classes)

(Pulled from Fine-tune a pretrained model)

If you want to set a value for nn.Dropout, you can also pass a custom BertConfig to from_pretrained, and that’s where you would set those parameters (see the BERT model documentation).
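As a rough sketch of the custom-config route: BertConfig has a classifier_dropout field, which BertForSequenceClassification uses for the dropout in front of its classification head (it falls back to hidden_dropout_prob when the field is unset). The tiny layer sizes and the label count below are assumptions, chosen only so the example builds a randomly initialised model without downloading a checkpoint.

```python
from transformers import BertConfig, BertForSequenceClassification

# Assumed toy dimensions so nothing is downloaded; a real checkpoint
# brings its own sizes.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=3,             # assumed number of sentiment classes
    classifier_dropout=0.35,  # same rate as nn.Dropout(p=0.35) in the custom class
)

# Building from a config gives random weights (no pretrained checkpoint).
model = BertForSequenceClassification(config)

print(model.dropout.p)                # dropout applied before the head
print(model.classifier.out_features)  # size of the classification head
```

With a real checkpoint, the same fields could be set with BertConfig.from_pretrained(model_name, num_labels=..., classifier_dropout=0.35) and the result passed as config= to BertForSequenceClassification.from_pretrained.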

If you had other custom stuff that you needed to add to your model, maybe this would be useful? Sharing custom models

For your current model, I can’t unpickle the pytorch_model.bin file because it looks for your BERTSentimentClassifier, but since you’ve already trained the model maybe it’s possible for you to unpickle that locally, edit the state dict manually, and use that state dict on a model created with BertForSequenceClassification.from_pretrained? (I haven’t tried doing that myself before, so I don’t know how easy/possible it is.)
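A minimal sketch of the "edit the state dict manually" idea, under the assumption that the only naming difference is the head: the custom class calls it linear, while BertForSequenceClassification calls it classifier (the bert.* keys already line up, and nn.Dropout has no parameters). The helper name remap_state_dict_keys is hypothetical, and strings stand in for tensors so the snippet runs without PyTorch.

```python
from collections import OrderedDict


def remap_state_dict_keys(state_dict, prefix_map):
    """Return a copy of state_dict with leading key prefixes renamed."""
    remapped = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        remapped[new_key] = value
    return remapped


# Placeholder strings stand in for the real weight tensors.
custom_state = OrderedDict([
    ("bert.pooler.dense.weight", "W_pool"),
    ("linear.weight", "W_head"),
    ("linear.bias", "b_head"),
])

# Assumed mapping: custom head name -> Hugging Face head name.
hf_state = remap_state_dict_keys(custom_state, {"linear.": "classifier."})
print(list(hf_state))
```

Locally, where BERTSentimentClassifier is importable, the real state dict would come from torch.load on the .pth file (calling .state_dict() on the result if the whole module was pickled), and the remapped dict would go to load_state_dict on a model created with BertForSequenceClassification.from_pretrained.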

Thanks @NimaBoscarino,
I chose to redo the training with the Hugging Face Trainer, instantiating the model as model = BertForSequenceClassification.from_pretrained(model_name, num_labels=n_classes) as the tutorial shows, and it worked.

If the models are very similar, modifying the keys in the state_dict is probably a quicker way to go! That’s what I did to load a locally fine-tuned HubertForSequenceClassification model into the HF’s HubertForSequenceClassification class.
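One way to judge whether two models are "very similar" in this sense is to diff their state_dict key names before renaming anything. The sketch below assumes the key lists come from calling .state_dict().keys() on each model; the helper name diff_state_dict_keys and the three sample keys are illustrative, not taken from a real checkpoint.

```python
def diff_state_dict_keys(source_keys, target_keys):
    """Compare key names: what the target lacks and what the source has extra."""
    source, target = set(source_keys), set(target_keys)
    missing = sorted(target - source)     # target expects these, source lacks them
    unexpected = sorted(source - target)  # source has these, target does not
    return missing, unexpected


# Illustrative keys only; real models have many more entries.
missing, unexpected = diff_state_dict_keys(
    ["bert.pooler.dense.weight", "linear.weight", "linear.bias"],
    ["bert.pooler.dense.weight", "classifier.weight", "classifier.bias"],
)
print("missing:", missing)
print("unexpected:", unexpected)
```

If both lists are short and pair up one to one, as here, renaming keys is likely the quicker path; long or unmatched lists suggest the architectures differ by more than naming.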