NLI 2-sentence classification with GPT2, XLNet, etc.?

I’m doing research on NLI with 2-sentence classification. I have already successfully used BERT and its BertForSequenceClassification class to feed in two sentences in the form of an input string [CLS] sent1 [SEP] sent2 [SEP] and then perform classification.

I’d like to do the same with the other models available, such as GPT2, XLNet, and RoBERTa. However, I can’t seem to find any example code that takes two sentences for those models. Do they use the same type of input string as BERT? Can someone point me to the relevant webpages for more information?

Thank you for any help.

Surely there must be someone who’s done NLI with one of these other non-BERT models?

Hi @facehugger2020
XLNet and RoBERTa can take sequence pairs just like BERT does. For RoBERTa, the pair is encoded as `<s> sent1 </s></s> sent2 </s>`. If you pass the sentence pair to the tokenizer, it will automatically encode it like that.
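To make the special-token layouts concrete, here's a small sketch that builds the BERT and RoBERTa pair formats by hand (the sentences are made up for illustration). In practice you never construct these strings yourself: calling `tokenizer(sent1, sent2)` on the corresponding tokenizer inserts the special tokens for you.

```python
# Illustrative only -- shows the special-token layout each model expects
# for a sentence pair. The real tokenizers do this automatically.

def bert_pair(sent1: str, sent2: str) -> str:
    # BERT layout: [CLS] sent1 [SEP] sent2 [SEP]
    return f"[CLS] {sent1} [SEP] {sent2} [SEP]"

def roberta_pair(sent1: str, sent2: str) -> str:
    # RoBERTa layout: <s> sent1 </s></s> sent2 </s>
    # (note the doubled separator between the two sentences)
    return f"<s>{sent1}</s></s>{sent2}</s>"

premise = "A man is playing a guitar."
hypothesis = "A person is making music."
print(bert_pair(premise, hypothesis))
print(roberta_pair(premise, hypothesis))
```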

You can use the run_glue.py script here to fine-tune XLNet or RoBERTa on NLI.
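An invocation would look roughly like the following (MNLI used as the NLI task; the model name, output path, and hyperparameters are placeholders, and the exact flag names vary across transformers versions, so check `run_glue.py --help` for your version):

```shell
# Sketch of a run_glue.py fine-tuning run on MNLI -- adjust flags
# to match the script version you have checked out.
python run_glue.py \
  --model_name_or_path roberta-base \
  --task_name mnli \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir ./mnli-roberta
```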

With GPT-2, you can use the GPT2Model class (without the LM head), take the final hidden states returned by the model, and pass them to a classification layer. Here’s one example I found.
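The idea above can be sketched schematically: pool the per-token hidden states into one vector, then apply a linear classification layer over the NLI labels. Plain Python lists stand in for tensors here so the sketch is self-contained; with transformers you would instead take `GPT2Model(...).last_hidden_state` (and, since GPT-2 is unidirectional, using the last token's hidden state rather than the mean is also common). Dimensions and values below are toy examples.

```python
# Schematic of "final embeddings -> classification layer" for GPT-2.
# hidden_states: one vector per token, as returned by the model body.

def mean_pool(hidden_states):
    """Average the per-token hidden vectors into one sequence vector."""
    seq_len = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[i] for h in hidden_states) / seq_len for i in range(dim)]

def linear_classify(vec, weights, bias):
    """One linear layer: logits[c] = vec . weights[c] + bias[c]."""
    return [sum(v * w for v, w in zip(vec, row)) + b
            for row, b in zip(weights, bias)]

# Toy setup: 3 tokens, hidden size 2, 3 NLI classes
# (entailment / neutral / contradiction).
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = mean_pool(hidden)
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy, untrained
bias = [0.0, 0.0, 0.0]
logits = linear_classify(pooled, weights, bias)
prediction = logits.index(max(logits))
```

In a real setup the weights and bias come from a trained `torch.nn.Linear` head, and you'd train the whole stack end to end with cross-entropy loss.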
