How should one separate the premise and hypothesis in the Hosted Inference API on the model hub for a natural language inference task?
For example, for roberta-large-mnli (roberta-large-mnli · Hugging Face),
the “Hosted inference API” widget has this example input:
I like you. </s></s> I love you.
Does the model separate the premise and hypothesis based on </s></s>? Where is this documented? I could not find it.
For another NLI model, ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli · Hugging Face, the example input is
I like you. I love you.
which does not contain </s></s>.
If I want to upload my own model fine-tuned on MNLI, what should I do? I.e., how should I write the
metadata in the model repo? What should the input look like? Thanks in advance!
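For reference, the RoBERTa tokenizer docs describe the sequence-pair layout as `<s> A </s></s> B </s>`, so the widget string with `</s></s>` looks like a hand-written version of the middle of that layout. Below is a minimal sketch of the two string shapes. The function names are hypothetical and only illustrate the format; this is not how the hosted widget is implemented, and I have not confirmed how the widget parses its input.

```python
def roberta_pair_layout(premise: str, hypothesis: str) -> str:
    """Text view of the documented RoBERTa pair format: <s> A </s></s> B </s>.

    Illustration only: a real tokenizer works on token ids, not on strings.
    """
    bos, sep = "<s>", "</s>"
    return f"{bos}{premise}{sep}{sep}{hypothesis}{sep}"


def widget_style_input(premise: str, hypothesis: str) -> str:
    """Single-string form like the roberta-large-mnli widget example.

    Assumption: the separator is typed literally between the two texts.
    """
    return f"{premise} </s></s> {hypothesis}"


print(widget_style_input("I like you.", "I love you."))
print(roberta_pair_layout("I like you.", "I love you."))
```

The first helper only shows where the special tokens sit relative to the premise and hypothesis; whether a given model's widget expects the literal separator is something to check against that model's own example.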
Good question. It also seems related to this unanswered question: RoBERTa classification (with article + sentence)
I’m pretty confused about this too. These docs seem relevant: RoBERTa
If our classification problem involves two pieces of text, should we be injecting these
</s></s> characters between our two pieces of text? Presumably that would give the model more of a hint that it is supposed to look at them (somewhat) separately?
So far, my tests indicate that when I do that, training uses much more RAM and accuracy is much worse. That makes me think that using those tokens somehow separates the two texts completely (so they end up as separate instances in each batch), which isn’t what I want: the model should look at both texts “together” when making the classification.
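One way to check whether the two texts stay in a single example is to skip typing the separators by hand and pass the texts to the tokenizer as a pair, so it inserts its own special tokens. A minimal sketch, assuming the Hugging Face `transformers` tokenizer call `tokenizer(text, text_pair, ...)`; `encode_pairs` and `demo` are hypothetical helper names, and `demo` needs `transformers` plus network access, so it is defined but not called here.

```python
from typing import Any, Callable, Dict, List


def encode_pairs(tokenizer: Callable[..., Dict[str, Any]],
                 premises: List[str],
                 hypotheses: List[str]) -> Dict[str, Any]:
    """Encode each (premise, hypothesis) pair as ONE sequence per example."""
    if len(premises) != len(hypotheses):
        raise ValueError("premises and hypotheses must have the same length")
    # Passing the two lists as text and text_pair lets the tokenizer place
    # its own special tokens between the two segments of each example.
    return tokenizer(premises, hypotheses, padding=True, truncation=True)


def demo() -> None:
    """Not called automatically: requires `transformers` and a model download."""
    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
    enc = encode_pairs(tok, ["I like you."], ["I love you."])
    # Number of encoded rows; compare it with the number of input pairs.
    print(len(enc["input_ids"]))
    # Decoding the ids shows where the separators were actually inserted.
    print(tok.decode(enc["input_ids"][0]))
```

If the number of rows matches the number of pairs, the batch is not being split into separate instances at the tokenizer level, and the RAM and accuracy problems would have to come from somewhere else in the pipeline.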