NameError when tokenizing with num_proc

On my Windows machine, running the multiprocessing code in "The map() method’s superpowers" fails with "NameError: name 'slow_tokenizer' is not defined".

Binding slow_tokenizer as a default argument of slow_tokenize_function makes the code run …

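from transformers import AutoTokenizer

slow_tokenizer = AutoTokenizer.from_pretrained("bert-base-cased", use_fast=False)
# Bind the tokenizer as a default argument so it travels with the function.
def slow_tokenize_function(examples, slow_tokenizer=slow_tokenizer):
    return slow_tokenizer(examples["review"], truncation=True)
# drug_dataset is the dataset loaded earlier in that course section.
tokenized_dataset = drug_dataset.map(slow_tokenize_function, batched=True, num_proc=8)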

… but it’s much slower than without num_proc. I think the Python multiprocessing issues on Jupyter and Windows are pretty well known (o:
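
For anyone wondering why the default-argument trick helps, here is a minimal sketch in plain Python with no transformers or datasets involved. fake_tokenizer and tokenize_function are hypothetical stand-ins I made up for illustration. It only shows that a default value is stored on the function object itself rather than looked up as a global at call time. My assumption is that this is why the worker processes stop raising the NameError, but I have not checked how datasets serializes the function.

```python
# Hypothetical stand-in for slow_tokenizer: any callable will do for this demo.
def fake_tokenizer(texts):
    return [t.split() for t in texts]

def tokenize_function(examples, tokenizer=fake_tokenizer):
    # `tokenizer` comes from the function's defaults, not from module globals.
    return tokenizer(examples["review"])

# The default value is stored on the function object itself.
bound = tokenize_function.__defaults__[0]

# Delete the global name, loosely mimicking a process that never defined it.
del fake_tokenizer

# The call still works because the default was captured at definition time.
result = tokenize_function({"review": ["great drug", "no effect"]})
print(result)
```

Without the default argument, the same call after the del would raise a NameError, much like the one in the title.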