RoBERTa for Sentence-pair classification

Hi,
I’m new to using HF, and my current task involves sentence pair classification: the input is a pair of sentences and the output should be a binary label, 0 or 1.

I referred to the documentation, and tried some code out.
I know from theory, and confirmed in code, that some models like bert-base-uncased can take a pair of inputs because they use token_type_ids (segment embeddings) to differentiate sentence 1 from sentence 2, like so:

from transformers import AutoModel, AutoTokenizer

bert_model = 'bert-base-uncased'
bert_layer = AutoModel.from_pretrained(bert_model)  # base model, not used in the tokenization below
tokenizer = AutoTokenizer.from_pretrained(bert_model)
sent1 = 'how are you'
sent2 = 'all good'

# Passing two strings makes the tokenizer encode them as a single pair
encoded_pair = tokenizer(sent1, sent2,
                         padding='max_length',  # Pad to max_length
                         truncation=True,       # Truncate to max_length
                         max_length=50,
                         return_tensors='pt')   # Return PyTorch tensors
print(encoded_pair)

gives this:

{'input_ids': tensor([[ 101, 2129, 2024, 2017,  102, 2035, 2204,  102,    0,    0,    0,    0,
            0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
            0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
            0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
            0,    0]]), 'token_type_ids': tensor([[0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0]])}
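
To check my understanding of how the pair is laid out, here is a minimal pure-Python sketch that rebuilds the three fields above by hand. The helper name `encode_pair_bert_style` is my own (hypothetical, not part of transformers), the special-token ids 101, 102 and 0 are read off the output above, and truncation is deliberately left out.

```python
def encode_pair_bert_style(ids_a, ids_b, max_length, cls_id=101, sep_id=102, pad_id=0):
    """Hypothetical helper: lay out a pair as [CLS] A [SEP] B [SEP] plus padding."""
    first = [cls_id] + ids_a + [sep_id]   # segment 0: [CLS], sentence 1, [SEP]
    second = ids_b + [sep_id]             # segment 1: sentence 2, [SEP]
    input_ids = first + second
    if len(input_ids) > max_length:
        # Truncation is not sketched here; the real tokenizer handles it.
        raise ValueError("pair is longer than max_length")
    n_pad = max_length - len(input_ids)
    return {
        "input_ids": input_ids + [pad_id] * n_pad,
        # 0 for the first segment, 1 for the second, 0 again for padding
        "token_type_ids": [0] * len(first) + [1] * len(second) + [0] * n_pad,
        # 1 for real tokens, 0 for padding
        "attention_mask": [1] * len(input_ids) + [0] * n_pad,
    }


# Word-piece ids for 'how are you' and 'all good', copied from the output above
encoded = encode_pair_bert_style([2129, 2024, 2017], [2035, 2204], max_length=50)
print(encoded["input_ids"][:8])       # [101, 2129, 2024, 2017, 102, 2035, 2204, 102]
print(encoded["token_type_ids"][:8])  # [0, 0, 0, 0, 0, 1, 1, 1]
```

So, as far as I can tell, token_type_ids only marks which side of the first [SEP] a token falls on; the separator tokens themselves are already present in input_ids.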

While this is fine, the tokenizers of other models, such as ‘roberta-base’, do not return token_type_ids at all.
Does this mean these models cannot be used for sentence pair classification?

I eventually need to use ClimateBERT for my task, but it is domain-adapted from DistilRoBERTa, so I’m asking this in the context of all such models that do not use token_type_ids.

I have only studied the BERT paper, so I’m not sure whether models like RoBERTa are meant to be used for sentence pair classification tasks. Please help me answer this.

Thanks in advance for any help.