Is there any conceptual/practical mistake I am committing in using “roberta-base” or “bert-base-uncased” pre-trained model as a Sequence Classifier for QNLI task on a custom dataset?
Say I have a dataset of questions and answers (related to general business) where human annotators determine whether a sentence contains the answer to a question (labels 0 and 1). I instantiate one of the aforementioned pre-trained models with a SequenceClassification head and fine-tune it with the data, just like the GLUE tasks colab example given in the transformers documentation: I tokenize each question/answer pair, let the tokenizer add the special tokens ([CLS] and [SEP] for BERT; RoBERTa uses its own equivalents), fine-tune the model, and then use it to make predictions, roughly as in the sketch below.
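To make the setup concrete, here is a minimal sketch of what I mean, assuming a toy in-memory dataset and hypothetical example sentences (the real data is my annotated business corpus); it just pairs each question with a candidate sentence and fine-tunes a two-label classification head:

```python
import torch
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Hypothetical stand-in for the custom annotated data:
# label 1 = the sentence answers the question, 0 = it does not.
raw = {
    "question": ["What is the refund policy?", "Who approves purchase orders?"],
    "sentence": ["Refunds are issued within 30 days of purchase.",
                 "Our office is located in Berlin."],
    "label": [1, 0],
}
dataset = Dataset.from_dict(raw)

model_name = "bert-base-uncased"  # or "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def tokenize(batch):
    # Passing both texts as a pair lets the tokenizer insert the model's own
    # special tokens ([CLS]/[SEP] for BERT, <s>/</s> for RoBERTa) automatically.
    return tokenizer(batch["question"], batch["sentence"],
                     truncation=True, padding="max_length", max_length=128)

encoded = dataset.map(tokenize, batched=True)

training_args = TrainingArguments(
    output_dir="qnli-custom",
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

trainer = Trainer(model=model, args=training_args, train_dataset=encoded)
trainer.train()

# Prediction on a new question / candidate-answer pair.
inputs = tokenizer("What is the refund policy?",
                   "All refunds must be requested within 30 days.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # 1 -> sentence answers the question
```

Is this approach sound, or is there something about QNLI-style sentence-pair classification on a domain-specific dataset that makes it a bad fit for these models?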