I’m using BertForSequenceClassification with PyTorch Lightning Flash for a text classification task. I want to add features besides the text (e.g. categorical features). From what I understand, I need to override the “forward” method of BertForSequenceClassification and change (at least) the final classification layer so that it takes the CLS vector plus the features vector. However, I don’t understand how to adapt the data loading procedure to this task: the text is represented as input ids, while the remaining features are supposed to be represented differently. Is there a simple way to combine text and features for a BERT classification task? Thank you!
See this response where I explain how to modify BERT to add additional POS (part-of-speech) features to tokens to perform named-entity recognition.
Correction: my response above is useful in case one wants to add additional text features to the tokens.
However, if you want to combine text features with other features (like categorical or numerical ones) - which was actually the question above (apologies) - it makes sense to concatenate the final hidden state of the [CLS] token (which serves as a good representation of an entire piece of text) with the additional features. This is illustrated in this notebook.
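To make the concatenation idea concrete, here is a minimal sketch in plain PyTorch (not Lightning Flash, and not the code from the notebook). The class name `BertWithExtraFeatures`, the argument name `extra_features`, and all sizes are made up for illustration. It assumes the categorical features have already been encoded as a float vector per example (e.g. one-hot), and it uses a tiny randomly initialised encoder only so that it runs without downloading weights.

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel


class BertWithExtraFeatures(nn.Module):
    """Hypothetical wrapper: [CLS] hidden state + extra features -> classifier."""

    def __init__(self, encoder, num_extra_features, num_labels):
        super().__init__()
        self.encoder = encoder
        self.dropout = nn.Dropout(0.1)
        # The head sees the [CLS] vector and the extra features side by side.
        self.classifier = nn.Linear(
            encoder.config.hidden_size + num_extra_features, num_labels
        )

    def forward(self, input_ids, attention_mask, extra_features, labels=None):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # Final hidden state of the first token ([CLS]): (batch, hidden_size)
        cls_vector = outputs.last_hidden_state[:, 0]
        combined = torch.cat([self.dropout(cls_vector), extra_features], dim=1)
        logits = self.classifier(combined)
        loss = None
        if labels is not None:
            loss = nn.functional.cross_entropy(logits, labels)
        return loss, logits


# Assumption: a tiny random encoder for demonstration only. In practice one
# would more likely load BertModel.from_pretrained("bert-base-uncased").
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertWithExtraFeatures(BertModel(config), num_extra_features=3, num_labels=2)

# Fake batch: 4 examples, 8 tokens each, 3 extra features per example.
input_ids = torch.randint(0, 100, (4, 8))
attention_mask = torch.ones(4, 8, dtype=torch.long)
extra_features = torch.rand(4, 3)
labels = torch.tensor([0, 1, 1, 0])

loss, logits = model(input_ids, attention_mask, extra_features, labels)
loss.backward()  # gradients reach both the new head and the encoder
```

On the data loading side, one option is for the dataset's `__getitem__` to return a dict with `input_ids`, `attention_mask`, `extra_features` (a float tensor) and `labels`; the default PyTorch collate function stacks each key into a batch. Note that the linear head here is newly initialised, so it has to be trained (fine-tuned) on labelled data before its predictions mean anything.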
@nielsr in the notebook it doesn’t look like the classification head is actually fitted against the data? I might be missing something, but it looks like it takes a pretrained model, adds a classification head and then uses it directly for inference without actually training it? I’m trying to do exactly this, so I’m curious about the specifics.