I want my output to look like [CLS] [SEP] text1 [SEP] text2 [SEP] text3 [SEP] eos token.
By default, the tokenizer expects either a single string or a pair of strings.
tokenizer(sentence1, sentence2) # returns a single input_ids vector; I want this, but for three sentences
I want the pair-of-strings behavior for three sentences. I can pass a list of sentences, but that creates three separate lists of input_ids.
tokenizer([sentence1, sentence2, sentence3]) # returns three separate input_ids sequences, one per sentence
I want a single tensor representing the output I wrote above.
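To make the target concrete, here is a rough sketch of the kind of result I mean, built by hand from per-sentence ids. The helper name `join_segments` and the numeric ids are placeholders I made up for illustration. With a real tokenizer, I assume the per-sentence ids would come from `tokenizer(sentence, add_special_tokens=False)["input_ids"]` and the special ids from `tokenizer.cls_token_id` and `tokenizer.sep_token_id`. This version produces the simpler layout [CLS] text1 [SEP] text2 [SEP] text3 [SEP]; the extra [SEP] after [CLS] and the trailing eos token are left out and would have to be added the same way.

```python
from typing import List, Sequence


def join_segments(
    segments: Sequence[Sequence[int]], cls_id: int, sep_id: int
) -> List[int]:
    """Concatenate token ids as [CLS] s1 [SEP] s2 [SEP] ... sN [SEP]."""
    ids = [cls_id]
    for segment in segments:
        ids.extend(segment)  # ids of one sentence, without special tokens
        ids.append(sep_id)   # close every sentence with a separator
    return ids


# Placeholder ids for illustration only (hypothetical values).
example = join_segments([[7, 8], [9], [10, 11]], cls_id=101, sep_id=102)
print(example)  # [101, 7, 8, 102, 9, 102, 10, 11, 102]
```

This only covers input_ids; the attention mask and any segment ids would still have to be built separately, so it feels more like a workaround than a proper solution.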
Is there a good way to do this?