Combine multiple sentences during tokenization

I want my output to look like [CLS] text1 [SEP] text2 [SEP] text3 [SEP], ending with the EOS token. By default, the tokenizer expects either a single string or a pair of strings.
tokenizer(sentence1, sentence2) # returns a single list of input_ids; I want this, but for three sentences
I want the pair-of-strings behavior for three sentences. I can pass a list of sentences, but that creates three separate lists of input_ids.
tokenizer([sentence1, sentence2, sentence3]) # returns three separate lists of input_ids

I want a single tensor representing the output I wrote above.
Is there a good way of doing it?

I don’t think the tokenizer handles this case directly.

You could join the sentences with [SEP] yourself and then encode the result as a single text.

from transformers import BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-cased")
text = "sent1 [SEP] sent2 [SEP] sent3"
ids = tok(text, add_special_tokens=True).input_ids
tok.decode(ids)
=> '[CLS] sent1 [SEP] sent2 [SEP] sent3 [SEP]'
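
If you'd rather not hard-code the separator string, a small variation (assuming sent1, sent2, and sent3 hold your sentences) is to build the text from the tokenizer's own sep_token, which also carries over to tokenizers whose separator isn't literally [SEP]:

text = f" {tok.sep_token} ".join([sent1, sent2, sent3])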

Okay, thanks. I may have to add them manually then.
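
For reference, here's a minimal sketch of that manual route, assuming the standard transformers API (the sentence strings below are placeholders): encode each sentence without special tokens, then splice in the [CLS]/[SEP] ids yourself.

from transformers import BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-cased")
sentences = ["sent1", "sent2", "sent3"]

# encode each sentence on its own, without [CLS]/[SEP]
pieces = [tok(s, add_special_tokens=False).input_ids for s in sentences]

# splice the ids together as [CLS] s1 [SEP] s2 [SEP] s3 [SEP]
ids = [tok.cls_token_id]
for piece in pieces:
    ids += piece + [tok.sep_token_id]

tok.decode(ids)
=> '[CLS] sent1 [SEP] sent2 [SEP] sent3 [SEP]'

This gives the same result as the string-join approach, but doesn't rely on the tokenizer recognizing the literal "[SEP]" inside raw text.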

@prajjwal1 … I want to highlight a point here: using more than two [SEP] tokens with BERT is not a scientifically sound solution. BERT wasn't pretrained for that: it only ever saw at most two segments, and its token_type_ids only distinguish two.