How does transformers.pipeline work for NLI?

I am applying pretrained NLI models such as roberta-large-mnli to my own sentence pairs. However, I am slightly confused about how to separate the premise and hypothesis sentences. Looking through the models available on Hugging Face and the examples shown in their hosted inference API, some use </s></s> between sentences, some use [CLS] ... [SEP] ... [SEP], and some, such as your own model, do not add any placeholders.

I just want to know more about how pipeline(task="sentiment-analysis", model="xxx-nli") works under the hood. I assume it feeds each sentence pair separately into tokenizer.encode_plus, like what is done here. But what max_length does the model use, though?

Any information would be really appreciated! Thanks!

Hi @bwang482 ,

You are perfectly correct: the responsibility for choosing the proper layout for this task lies with the tokenizer (encode_plus or just encode).
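To make the layouts concrete, here is a minimal sketch. The two string helpers are hypothetical names that only illustrate the BERT-style and RoBERTa-style pair templates mentioned in the question; real tokenizers insert the special tokens at the token-id level. `real_layout` is also a hypothetical helper: it asks a model's own tokenizer to build the pair and decodes the result, assuming the `transformers` package is installed and the tokenizer can be downloaded.

```python
def bert_style_pair(premise, hypothesis):
    """Illustrative string view of the BERT-style pair layout."""
    return f"[CLS] {premise} [SEP] {hypothesis} [SEP]"


def roberta_style_pair(premise, hypothesis):
    """Illustrative string view of the RoBERTa-style pair layout."""
    return f"<s>{premise}</s></s>{hypothesis}</s>"


def real_layout(model_name, premise, hypothesis):
    """Decode what the model's own tokenizer builds for a sentence pair.

    Downloads the tokenizer, so it is not called at import time.
    """
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Passing two texts tells the tokenizer to encode them as a pair,
    # so it inserts whichever separators the model was trained with.
    encoded = tokenizer(premise, hypothesis)
    return tokenizer.decode(encoded["input_ids"])
```

Calling `real_layout("roberta-large-mnli", premise, hypothesis)` should show the separators that tokenizer actually uses, so there is no need to type them by hand.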

What do you mean by max_length? I am not sure what you are referring to.

Cheers,
Nicolas

Thanks very much for your reply @Narsil !!

By max_length, I meant the max_length parameter in encode_plus, because I saw that the author of the Adversarial NLI model actually used this parameter. But now that I think about it again, it might be best to just leave it unset.
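For reference, a small sketch of how the parameter could be passed or left out. `encode_pair` is a hypothetical helper, not something from the thread; it assumes a Hugging Face tokenizer object, where `truncation=True` without `max_length` falls back to the tokenizer's own `model_max_length`.

```python
def encode_pair(tokenizer, premise, hypothesis, max_length=None):
    """Encode a premise/hypothesis pair, truncating only if it is too long.

    With max_length left as None, the tokenizer's default limit applies.
    """
    kwargs = {"truncation": True}
    if max_length is not None:
        # Only override the limit when the caller asks for one explicitly.
        kwargs["max_length"] = max_length
    return tokenizer(premise, hypothesis, **kwargs)
```

Leaving `max_length` unset this way keeps the limit tied to whatever the model's tokenizer declares, rather than a number copied from another model's example.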

Just to confirm: if I want to use the correct separation notation for a model, I should first check what its Hosted Inference API example looks like on Hugging Face, and then use the same separation tokens (e.g. </s></s> or [CLS] ... [SEP] ... [SEP])? It seems the models don’t usually use the same notation.

Yes, the README of a model should include snippets showing how to use it. If it doesn’t, however, you will have to dive into the model to see how it behaves, or use pipelines (see the Pipelines page of the transformers 4.5.0.dev0 documentation).
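As a rough sketch of the pipeline route: the helper names below are hypothetical, and passing a dict with `text` and `text_pair` keys is an assumption about the installed transformers version (recent releases of the text-classification pipeline accept it; older ones may not).

```python
def load_nli_classifier(model_name):
    """Build a text-classification pipeline (downloads the model)."""
    from transformers import pipeline

    return pipeline("text-classification", model=model_name)


def classify_pair(classifier, premise, hypothesis):
    """Send the pair as two separate fields, not one hand-joined string."""
    # The pipeline's tokenizer then inserts the model-specific separators.
    return classifier({"text": premise, "text_pair": hypothesis})
```

Usage would look like `classify_pair(load_nli_classifier("roberta-large-mnli"), premise, hypothesis)`.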

Many thanks @Narsil !

I find that most models do not have such information in their README. For example, this huggingface model doesn’t have a README at all. It’s also difficult to get any information from its vocab file. By “dive into the model”, do you mean just trying a bunch of examples? That’d be tricky, I think. Even for the models that instruct you to use separation tokens, it seems you can get somewhat reasonable results without them. For example, using this model, if you remove </s></s>, the inference API still returns similar results…
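One way to look inside a model without guessing from examples is to read the special tokens off the loaded tokenizer and the label names off the config. This is only a sketch: both function names are hypothetical, and `load_summary` assumes `transformers` is installed and the model files can be downloaded.

```python
def special_token_summary(tokenizer):
    """Collect the special tokens and length limit a tokenizer declares."""
    names = ("bos_token", "cls_token", "sep_token", "eos_token", "model_max_length")
    # getattr with a default keeps this safe for objects missing a field.
    return {name: getattr(tokenizer, name, None) for name in names}


def load_summary(model_name):
    """Summarise a hosted model's tokenizer and label mapping (downloads files)."""
    from transformers import AutoConfig, AutoTokenizer

    summary = special_token_summary(AutoTokenizer.from_pretrained(model_name))
    # id2label shows which output index maps to which NLI label.
    summary["id2label"] = AutoConfig.from_pretrained(model_name).id2label
    return summary
```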