Hello ivnle!
I am currently trying to train an LSTM model that takes as input the embeddings produced by a pretrained model from the HF Hub. I am following the Sharing custom models example, and my model class, which inherits from PreTrainedModel, is the following:
import torch.nn.functional as F
from transformers import PreTrainedModel

class LSTMModel(PreTrainedModel):
    config_class = LSTMConfig

    def __init__(self, config, pretrained_model):
        super().__init__(config)
        self.model = SentimentLSTM(pretrained_model,
                                   output_size=config.output_size,
                                   hidden_dim=config.hidden_dim,
                                   n_layers=config.n_layers)

    def forward(self, tensor, labels=None):
        logits = self.model(tensor)
        if labels is not None:
            loss = F.cross_entropy(logits, labels)
            return {"loss": loss, "logits": logits}
        return {"logits": logits}
where SentimentLSTM is my custom LSTM model.
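In case it helps, SentimentLSTM has roughly this shape (simplified; the exact embedding call depends on the pretrained model, so treat this as a sketch rather than my exact code):

import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    # Simplified sketch: a frozen pretrained encoder feeding an LSTM classifier.
    def __init__(self, pretrained_model, output_size, hidden_dim, n_layers):
        super().__init__()
        self.encoder = pretrained_model  # pretrained HF model that produces the embeddings
        self.lstm = nn.LSTM(self.encoder.config.hidden_size, hidden_dim,
                            n_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, tensor):
        # tensor: a batch of token ids; take the encoder's last hidden states
        with torch.no_grad():
            embeddings = self.encoder(tensor).last_hidden_state
        lstm_out, _ = self.lstm(embeddings)
        # classify from the final timestep
        return self.fc(lstm_out[:, -1, :])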
Then I initialize the TrainingArguments and the Trainer object and call trainer.train(). However, I get an error related to the tensor argument: it cannot be found when forward is called.
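Concretely, my setup is roughly the following (the output directory and dataset names here are just illustrative placeholders):

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="lstm-sentiment",  # illustrative path
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=model,                  # the LSTMModel instance from above
    args=training_args,
    train_dataset=train_dataset,  # illustrative: a dataset with "tensor" and "labels" columns
)

trainer.train()  # this is where the error about the missing `tensor` argument occurs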
Did you use the same example to perform your own training?
If so, what did your forward function look like in the corresponding LSTMModel class, and how can an extra argument be passed to this function through the Trainer?
Thank you in advance,
Petrina