Using the Hugging Face Trainer for custom models

Say I want to train a simple LSTM or MLP with Trainer (plain PyTorch nn.Modules). Do I just need to ensure the model adheres to the following list from the Trainer documentation?

- the model always returns tuples or subclasses of ModelOutput
- the model can compute a loss if a labels argument is provided, and that loss is returned as the first element of the tuple (if the model returns tuples)
- the model can accept multiple label arguments (using label_names in TrainingArguments to indicate their names to the Trainer), but none of them should be named "label"

Is there an example of using Trainer to train models that are not HF Transformers models? Best practices?

I think the HF Trainer API is specifically for Transformers models, not for other models.

We don't have an example, but as long as you follow the recommendations in that list from the documentation, you should be fine.

And if you use it successfully and want to do a short write-up, publish it; we'll make sure to share it!

Confirmed that you can train a simple LSTM or MLP with Trainer. This is nice since I can just stay within the HF ecosystem. I'm not sure I'll have time to do a write-up, but as long as you follow that list in the original post, it will work.
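
In lieu of a full write-up, here is a minimal, self-contained illustration of the pattern (not my exact code; SimpleMLP, ToyData, and the column names are placeholders):

import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import Trainer, TrainingArguments


class SimpleMLP(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, inputs, labels=None):
        # Trainer calls forward with each batch as keyword arguments,
        # so "inputs" and "labels" must match the dataset's keys.
        logits = self.net(inputs)
        if labels is not None:
            # Compute the loss inside the model and return it under "loss",
            # as the documentation's list requires.
            loss = F.cross_entropy(logits, labels)
            return {"loss": loss, "logits": logits}
        return {"logits": logits}


class ToyData(torch.utils.data.Dataset):
    def __len__(self):
        return 64

    def __getitem__(self, i):
        return {"inputs": torch.randn(16), "labels": torch.tensor(i % 2)}


model = SimpleMLP(input_dim=16, hidden_dim=32, num_classes=2)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", remove_unused_columns=False),
    train_dataset=ToyData(),
)
trainer.train()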

Hello ivnle!

I am currently trying to train an LSTM model that takes as input the embeddings output by a pretrained model from the HF Hub. I am following the example at Sharing custom models, and my model class, which inherits from PreTrainedModel, is the following:

import torch
from transformers import PreTrainedModel


class LSTMModel(PreTrainedModel):
    config_class = LSTMConfig  # custom config class, defined elsewhere

    def __init__(self, config, pretrained_model):
        super().__init__(config)
        self.model = SentimentLSTM(pretrained_model,
                                   output_size=config.output_size,
                                   hidden_dim=config.hidden_dim,
                                   n_layers=config.n_layers)

    def forward(self, tensor, labels=None):
        logits = self.model(tensor)
        if labels is not None:
            # torch.nn has no cross_entropy function; use the functional API
            loss = torch.nn.functional.cross_entropy(logits, labels)
            return {"loss": loss, "logits": logits}
        return {"logits": logits}

where SentimentLSTM is my custom LSTM model.

Then I initialize the TrainingArguments and the Trainer object and try to run trainer.train(). However, I get an error due to the tensor argument, because the function cannot find it.
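
My guess is that the Trainer feeds each batch to forward as keyword arguments, so the parameter name has to match a column in my dataset. Should the signature look something like this instead (assuming the input column is named input_ids)?

    def forward(self, input_ids, labels=None):
        # "input_ids" must match the dataset column holding the inputs;
        # the Trainer calls forward with the batch as keyword arguments
        logits = self.model(input_ids)
        if labels is not None:
            loss = torch.nn.functional.cross_entropy(logits, labels)
            return {"loss": loss, "logits": logits}
        return {"logits": logits}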

Did you use the same example to perform your own training?

If so, what was the forward function in your corresponding LSTMModel class, and how can I pass an extra argument to this function through the Trainer?

Thank you in advance,
Petrina