Hmm… Like this?
# tokenizer.pre_tokenizer = AtomicUnitPreTokenizer(atomic_units)
tokenizer.pre_tokenizer = PreTokenizer.custom(AtomicUnitPreTokenizer(atomic_units))
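
For reference, here is a rough sketch of what a class passed to `PreTokenizer.custom` might look like. I don't know how your `AtomicUnitPreTokenizer` is actually written, so the body below is an assumption: it treats `atomic_units` as a list of literal strings that must never be split, and the `split_fn` helper name is mine. The only part the `tokenizers` library requires, as far as I know, is a `pre_tokenize(self, pretok)` method that receives a `PreTokenizedString`.

```python
import re


class AtomicUnitPreTokenizer:
    """Hypothetical sketch: keeps every string in `atomic_units` as one piece."""

    def __init__(self, atomic_units):
        # Longest units first, so overlapping units resolve to the longest match.
        units = sorted(atomic_units, key=len, reverse=True)
        self.pattern = (
            re.compile("|".join(re.escape(u) for u in units)) if units else None
        )

    def split_fn(self, i, normalized_string):
        # Only needs str() and slicing, which NormalizedString supports.
        if self.pattern is None:
            return [normalized_string]
        text = str(normalized_string)
        pieces, last = [], 0
        for m in self.pattern.finditer(text):
            if m.start() > last:
                # Text between two atomic units.
                pieces.append(normalized_string[last:m.start()])
            # The atomic unit itself, kept whole.
            pieces.append(normalized_string[m.start():m.end()])
            last = m.end()
        if last < len(text):
            pieces.append(normalized_string[last:len(text)])
        return pieces

    def pre_tokenize(self, pretok):
        # Called by the library; pretok.split applies split_fn to each piece.
        pretok.split(self.split_fn)
```

With something like this, the `PreTokenizer.custom(...)` line above should accept the object, given `from tokenizers.pre_tokenizers import PreTokenizer`. One thing to watch for: as far as I know, a tokenizer with a custom Python pre-tokenizer cannot be serialized, so you may need to swap it out before saving.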