Custom Dataset with Custom Tokenizer

Call the tokenizer object directly, which invokes its __call__ method: tokenizer(example["text"]).
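As a minimal sketch of what that looks like end to end, the snippet below applies a callable tokenizer to every example in a small dataset. The `WhitespaceTokenizer` class, the toy vocabulary, the `tokenize_function` helper, and the list-of-dicts dataset are all hypothetical stand-ins, since the original question's tokenizer and dataset are not shown. It assumes the tokenizer returns a dict of features when called.

```python
class WhitespaceTokenizer:
    """Hypothetical stand-in for a custom tokenizer, callable as tokenizer(text)."""

    def __init__(self, vocab):
        self.vocab = vocab
        self.unk_id = vocab["[UNK]"]

    def __call__(self, text):
        # Split on whitespace and map each token to its id (unknown -> [UNK]).
        ids = [self.vocab.get(tok, self.unk_id) for tok in text.lower().split()]
        return {"input_ids": ids, "attention_mask": [1] * len(ids)}


# Toy vocabulary, assumed for illustration only.
vocab = {"[UNK]": 0, "hello": 1, "world": 2}
tokenizer = WhitespaceTokenizer(vocab)


def tokenize_function(example):
    # Calling the object directly invokes __call__.
    return tokenizer(example["text"])


# A plain list of dicts stands in for the custom dataset here.
dataset = [{"text": "hello world"}, {"text": "hello there"}]

# Merge the tokenizer output into each example, keeping the original text.
tokenized = [{**example, **tokenize_function(example)} for example in dataset]
```

If the dataset is a Hugging Face `datasets.Dataset`, which is an assumption here, the same function would typically be passed to `dataset.map(tokenize_function)` instead of the list comprehension.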