Which model of transformers to use if I want to do multiclassification of a pair of sentences containing a questionair

I have a corpus of sentences and each sentence is assigned to a code which look like a question. My aim is to classify the pair as a unit.

Given the sentence, sentence_question , the model give a classification for example A . I have three classes. Note that the question are fixed and are 21 in total.

For exemple : je suis bien; quel est son statut ? === class A. I want the sentenceQ to be an element to help the classifier choose the overall class of the whole pair. Is it possible to do it with BERT sentence similarity ? or is not the good model ?
Should I give the pair as a single sentence to BERT or not ?