num_proc only makes sense for slow tokenizers. If tokenizer.is_fast returns True, use map in batched mode (fast tokenizers tokenize a batch of samples in parallel automatically) and leave num_proc=None: fast tokenizers are written in Rust and parallelize internally, so Python multiprocessing adds nothing and can conflict with the Rust-level parallelism.
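Something like this (a minimal sketch; the dataset, model name, and "text" column are just placeholders for illustration):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
dataset = load_dataset("imdb", split="train")

assert tokenizer.is_fast  # Rust-backed tokenizer, parallelizes internally

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

# batched=True hands the fast tokenizer many samples at once;
# num_proc=None (the default) avoids Python multiprocessing
tokenized = dataset.map(tokenize, batched=True, num_proc=None)
```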
If you try that and it doesn't work, try passing the custom tokenizer in via fn_kwargs instead of relying on a global. map doesn't always play nicely with globally defined variables, since it serializes the mapping function before running it. For example:
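A sketch of the same setup with the tokenizer passed explicitly (again assuming a "text" column):

```python
def tokenize(batch, tokenizer):
    return tokenizer(batch["text"], truncation=True)

# fn_kwargs forwards the tokenizer to every call of the mapping
# function, so nothing depends on a module-level global
tokenized = dataset.map(
    tokenize,
    batched=True,
    fn_kwargs={"tokenizer": tokenizer},
)
```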