Hi
I have sentences, each with additional features (a 5-dimensional vector of floats) and a label (True or False).
So a sample looks like this:
“hello have a nice day”, [0.2, 0.1, 0.6, 0.7, 0.2], True
I have a classification task on my data.
I want to fine-tune BERT to take the input sentence, concatenate the vector to the last hidden layer, and predict the sentence's label.
How can this be done? I can't find any code sample that concatenates additional data to the last layer before classification.
One way you could do it is by precomputing the last hidden state's [CLS] token embedding for each text in your dataset and storing them in a NumPy array. Then you could concatenate this array with your additional features to perform the classification task.
On a side note, you may want to rescale your additional features to the scale of the BERT embeddings.
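Here is a minimal sketch of that precompute-then-classify approach, assuming bert-base-uncased, a scikit-learn LogisticRegression head, and toy data; swap in your own sentences, feature vectors, and labels:

```python
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Toy data; replace with your own sentences, 5-dim feature vectors, and labels.
sentences = ["hello have a nice day", "this is another example"]
extra_features = np.array([[0.2, 0.1, 0.6, 0.7, 0.2],
                           [0.3, 0.9, 0.1, 0.4, 0.5]])
labels = np.array([1, 0])  # True / False

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

# Precompute the [CLS] token embedding of the last hidden state for each sentence.
with torch.no_grad():
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    last_hidden = model(**enc).last_hidden_state      # (batch, seq_len, 768)
    cls_embeddings = last_hidden[:, 0, :].numpy()     # (batch, 768)

# Rescale the extra features so they are on a comparable scale to the embeddings.
extra_scaled = StandardScaler().fit_transform(extra_features)

# Concatenate and train any off-the-shelf classifier on the combined features.
X = np.concatenate([cls_embeddings, extra_scaled], axis=1)  # (batch, 768 + 5)
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```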
@nielsr Thanks, it is working great. Just a few questions about your notebook:
In CustomSequenceClassification, why do you need to call post_init()?
Let's say that instead of a single sentence, the input consists of two sentences (“This is a sentence 1”, “This is a sentence 2”). How would you do that?
Also, if you could please help with an additional question I just posted, it would be highly appreciated…
In CustomSequenceClassification, why do you need to call post_init()?
The post_init method takes care of initializing all the weights as defined in the _init_weights method of the xxxPreTrainedModel. See e.g. the _init_weights method of BERT here.
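For context, here is a minimal sketch of a custom classification head that concatenates extra features to BERT's pooled output and calls post_init(). The class name mirrors the one discussed in the thread, but the exact layers and forward signature here are assumptions, not the notebook's code:

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertPreTrainedModel

class CustomSequenceClassification(BertPreTrainedModel):
    def __init__(self, config, num_extra_dims=5, num_labels=2):
        super().__init__(config)
        self.bert = BertModel(config)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        # The classifier sees the pooled [CLS] representation plus the extra features.
        self.classifier = nn.Linear(config.hidden_size + num_extra_dims, num_labels)
        # post_init() runs _init_weights on every submodule, so the newly added
        # classifier layer gets a proper initialization as well.
        self.post_init()

    def forward(self, input_ids, attention_mask=None, extra_features=None, labels=None):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled = self.dropout(outputs.pooler_output)             # (batch, hidden)
        combined = torch.cat([pooled, extra_features], dim=-1)   # (batch, hidden + 5)
        logits = self.classifier(combined)
        loss = None
        if labels is not None:
            loss = nn.CrossEntropyLoss()(logits, labels)
        return {"loss": loss, "logits": logits}
```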
Let's say that instead of a single sentence, the input consists of two sentences (“This is a sentence 1”, “This is a sentence 2”). How would you do that?
You can leverage the tokenizer for that, as it supports sentence pairs besides single sentences. Just like this:
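A minimal sketch, assuming bert-base-uncased: the tokenizer accepts the second sentence as a second argument and inserts the [SEP] token for you.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Passing the second sentence as a separate argument encodes the pair as
# [CLS] sentence 1 [SEP] sentence 2 [SEP], with matching token_type_ids.
encoding = tokenizer("This is a sentence 1", "This is a sentence 2",
                     padding=True, truncation=True, return_tensors="pt")
print(tokenizer.decode(encoding["input_ids"][0]))
```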