No, they will enter the context after the main process, and since everything Datasets does is cached, it will use the cache and not redo the tokenization.
No, they will enter the context after the main process, and since everything Datasets does is cached, it will use the cache and not redo the tokenization.