Say I have a binary classification problem, but in addition to the sentence I’d like to also input some scalar value as well. Is it possible to just tack on this scalar as input to the last linear layer of BERT?
For example, I’d like to detect if a particular sentence is from my source data or generated. And I know that many instances of a repeated word increases the likelihood that it is a generated sentence. So I’d like to pass the sentence itself into BERT as well as a scalar feature such as the number of unique words in the sentence.