Hi everyone,
I built a custom model on top of `xlm-roberta-large` using PyTorch Lightning. The code is something like this:
import torch
import pytorch_lightning as pl
from transformers import AutoModel

class Model(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.num_labels = 20
        self.model = AutoModel.from_pretrained(
            'xlm-roberta-large',
            config='xlm-roberta-large'
        )
        # Freeze the pretrained encoder so only the LSTM and classifier are trained.
        for param in self.model.parameters():
            param.requires_grad = False
        self.bilstm = torch.nn.LSTM(
            input_size=self.model.config.hidden_size,
            hidden_size=256,
            num_layers=1,
            bidirectional=True,
            batch_first=True,
            bias=True
        )
        self.classifier = torch.nn.Linear(256, self.num_labels)

    def forward(self, input_ids, attention_mask=None, labels=None):
        output = self.model(input_ids=input_ids,
                            attention_mask=attention_mask)
        output, (hn, cn) = self.bilstm(output.last_hidden_state)
        ...
Now I wanted to upload the model to the Hugging Face Hub, and I managed to do so with this boilerplate code:
from transformers import PretrainedConfig, PreTrainedModel

class SubjectClassifierConfig(PretrainedConfig):
    def __init__(
        self,
        **kwargs,
    ):
        super().__init__(**kwargs)

class SubjectClassifier(PreTrainedModel):
    config_class = SubjectClassifierConfig

    def __init__(self, config):
        super().__init__(config)
        # Wrap the Lightning module defined above.
        self.model = Model()

    def forward(self, *args, **kwargs):
        return self.model.forward(*args, **kwargs)
However, it’s not clear to me:
- What should go into the config?
- Should I include the arguments I used for the Tokenizer somewhere? If so, where?
- Should I modify the `forward` method of the `SubjectClassifier` class to already do the tokenization and convert the return value of the model to a class label?
- Should I include the code for the model (and training) when uploading? The model weights themselves are not particularly useful without knowing the code, I suppose.
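
To make the third question more concrete, here is a rough sketch of the "convert the return value to a class label" step on its own, independent of whether it ends up inside `forward` or in a separate helper. Plain Python lists stand in for the logits tensor, and the function name `logits_to_labels` and the label names are made up for illustration (the real model has 20 classes). The `id2label` mapping is the kind of thing `PretrainedConfig` already has a field for, which may be relevant to the first question as well.

```python
from typing import Dict, List, Sequence


def logits_to_labels(
    logits: Sequence[Sequence[float]], id2label: Dict[int, str]
) -> List[str]:
    """Pick the highest-scoring class per example and map its index to a name."""
    labels = []
    for row in logits:
        # Index of the largest score in this row (a plain-Python argmax).
        best_index = max(range(len(row)), key=lambda i: row[i])
        labels.append(id2label[best_index])
    return labels


# Hypothetical label names, only three of them to keep the example short.
id2label = {0: "math", 1: "physics", 2: "history"}

# One row of scores per input text, one column per class.
batch_logits = [[0.1, 2.3, -1.0], [1.5, 0.2, 0.3]]

predicted = logits_to_labels(batch_logits, id2label)
print(predicted)
```

If this logic lived outside `forward`, the model would keep returning raw scores and the mapping would only be applied at inference time, but I am not sure which of the two is the expected convention.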