Custom Dataset with Custom Tokenizer

Oh, you should wrap your tokenizer in a PreTrainedTokenizerFast from the Transformers library (you can pass your tokenizer object directly with the tokenizer_object keyword argument).
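
Something like this should work as a minimal sketch. It assumes your custom tokenizer was built with the tokenizers library; the tiny in-memory corpus, the WordLevel model, and the [UNK]/[PAD] special tokens are placeholders I made up for illustration, so swap in your own tokenizer and tokens.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import WordLevelTrainer
from transformers import PreTrainedTokenizerFast

# Hypothetical toy corpus standing in for your custom dataset.
corpus = ["hello world", "custom tokenizer for a custom dataset"]

# Build and train a small word-level tokenizer (stand-in for your own tokenizer).
raw_tokenizer = Tokenizer(WordLevel(unk_token="[UNK]"))
raw_tokenizer.pre_tokenizer = Whitespace()
trainer = WordLevelTrainer(special_tokens=["[UNK]", "[PAD]"])
raw_tokenizer.train_from_iterator(corpus, trainer=trainer)

# Wrap it so it behaves like any other Transformers tokenizer.
# Special tokens are not picked up automatically, so declare them here.
tokenizer = PreTrainedTokenizerFast(
    tokenizer_object=raw_tokenizer,
    unk_token="[UNK]",
    pad_token="[PAD]",
)

# The wrapped tokenizer now supports padding, batching, and so on.
batch = tokenizer(["hello world", "hello"], padding=True)
print(batch["input_ids"])
```

After wrapping, you can use it with map on your dataset or hand it to a data collator like a regular pretrained tokenizer. If you saved your tokenizer to a JSON file instead, PreTrainedTokenizerFast also accepts a tokenizer_file argument pointing at that file.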
