How do I know which models will produce token_type_ids and which won't?

alpyne · April 22, 2022, 8:37pm

The "Processing the data" section of the Hugging Face Course mentions that certain checkpoints don't return token_type_ids.

Are there certain features of a model that indicate whether or not it supports token_type_ids? If so, what are they?

Thanks

marshmellow77 · April 22, 2022, 10:14pm

Hi Edward!

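The token_type_ids are returned if the model has seen them in pre-training and knows what to do with them, so it all depends on how the model was pre-trained. For example, BERT was pre-trained with a next-sentence-prediction objective that uses them, whereas DistilBERT has no token-type embeddings, so its tokenizer does not return them.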

But as the course also mentions, you usually don’t have to worry about the token_type_ids - as long as you use the same checkpoint for the tokenizer and the model, everything will be fine as the tokenizer knows what to provide to its model.
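To make that concrete, here is a minimal pure-Python sketch of what a BERT-style tokenizer does for a sentence pair. It is not the library's implementation: the function names `bert_style_token_type_ids` and `build_inputs` are hypothetical, and the token lists are assumed to be already tokenized. It only illustrates the `[CLS] A [SEP] B [SEP]` layout and the idea that token_type_ids are emitted only when the model expects them.

```python
from typing import Dict, List


def bert_style_token_type_ids(first: List[str], second: List[str]) -> List[int]:
    """Segment ids for the layout [CLS] first [SEP] second [SEP]."""
    # [CLS], the first sentence, and its [SEP] belong to segment 0.
    segment_a = [0] * (len(first) + 2)
    # The second sentence and the final [SEP] belong to segment 1.
    segment_b = [1] * (len(second) + 1)
    return segment_a + segment_b


def build_inputs(
    first: List[str], second: List[str], model_input_names: List[str]
) -> Dict[str, list]:
    """Hypothetical helper: add token_type_ids only if the model uses them."""
    tokens = ["[CLS]"] + first + ["[SEP]"] + second + ["[SEP]"]
    encoding: Dict[str, list] = {"tokens": tokens}
    if "token_type_ids" in model_input_names:
        encoding["token_type_ids"] = bert_style_token_type_ids(first, second)
    return encoding


# A model pre-trained with segment embeddings (BERT-like).
with_types = build_inputs(
    ["this", "is"], ["second"], ["input_ids", "token_type_ids", "attention_mask"]
)
# A model pre-trained without them (DistilBERT-like).
without_types = build_inputs(["this", "is"], ["second"], ["input_ids", "attention_mask"])

print(with_types)
print(without_types)
```

With the real library, one way to check a given checkpoint is to load its tokenizer and look at `tokenizer.model_input_names`, which lists the inputs the tokenizer will produce for its model.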

Hope that helps,

Cheers
Heiko
