Configure RobertaTokenizer

Hi, I would like to configure RobertaTokenizer so that it outputs token_type_ids, which it does not do by default. Is there a way to do that?

I have changed the model configuration and updated its type_vocab_size to 2, like so:

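import torch.nn as nn
from transformers import RobertaModel

model = RobertaModel.from_pretrained('roberta-base')

# Update the config so the token type embeddings can be fine-tuned
model.config.type_vocab_size = 2

# Create a new embedding layer with 2 possible segment IDs instead of 1
model.embeddings.token_type_embeddings = nn.Embedding(2, model.config.hidden_size)

# Initialize it from a normal distribution (mean=0.0 is assumed here)
model.embeddings.token_type_embeddings.weight.data.normal_(mean=0.0, std=model.config.initializer_range)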

I want to pass token_type_ids to the model like so:

model(token_ids, attn_masks, token_type_ids)
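To make clear what I mean by token_type_ids, here is a rough sketch of how they could be built by hand from the token IDs. It assumes RoBERTa's sentence-pair layout `<s> A </s></s> B </s>` and the roberta-base special-token IDs (`</s>` = 2, `<pad>` = 1). The helper name `build_token_type_ids` is hypothetical and not part of the transformers library.

```python
from typing import List

# Assumed special-token IDs for roberta-base (not stated in the post above)
SEP_ID = 2  # </s>
PAD_ID = 1  # <pad>


def build_token_type_ids(
    input_ids: List[int], sep_id: int = SEP_ID, pad_id: int = PAD_ID
) -> List[int]:
    """Hypothetical helper: assign segment 0 up to and including the first
    </s>, segment 1 to every later non-padding token, and 0 to padding."""
    token_type_ids = []
    segment = 0
    for tok in input_ids:
        if tok == pad_id:
            # Padding positions are masked out anyway; give them segment 0
            token_type_ids.append(0)
            continue
        token_type_ids.append(segment)
        # The first </s> closes sentence A; everything after belongs to B
        if tok == sep_id and segment == 0:
            segment = 1
    return token_type_ids


# <s> A A </s> </s> B </s> <pad> <pad>, with made-up IDs for the A and B tokens
pair_ids = [0, 100, 200, 2, 2, 300, 2, 1, 1]
pair_segments = build_token_type_ids(pair_ids)
```

The resulting list could then be wrapped in a tensor (one row per example) and passed as token_type_ids in the call above. What I would prefer is for the tokenizer to produce this directly.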