I am comparing MNLI fine-tuning performance across different models, and I am having trouble fine-tuning GPT-2. Since NLI takes two texts as input, the premise and the hypothesis, I am unsure how to format the input text. Is it sufficient to feed in a single string of the form "Premise: {premise} Hypothesis: {hypothesis}", or do I need special tokens to separate the two? Any suggestions on how to tokenize or preprocess the input would be much appreciated. Thanks!
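To make the plain-text option concrete, this is roughly the formatting I have in mind. It is only a sketch: the helper name `format_nli_example`, its `sep` parameter, and the example sentences are made up for illustration and are not from MNLI or any library.

```python
def format_nli_example(premise: str, hypothesis: str, sep: str = " ") -> str:
    """Join a premise and a hypothesis into one input string.

    `sep` is a hypothetical knob: a plain space gives the
    "Premise: ... Hypothesis: ..." format, while passing a special
    token string would give the separator-token variant instead.
    """
    return f"Premise: {premise.strip()}{sep}Hypothesis: {hypothesis.strip()}"


# Made-up sentence pair, used only to show the resulting string.
example = format_nli_example(
    "A man is playing a guitar on stage.",
    "A person is performing music.",
)
print(example)
```

Passing something like `sep=" <|endoftext|> "` (GPT-2's end-of-text token) would produce the separator-token version; whether that is needed, or even helpful, is the part I am unsure about.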