Hey @Oigres,
Which tokenization step do you mean: the one for training or the one for inference?
For training, tokenization happens during the preprocessing step in the notebook.
For inference, tokenization is handled by the sagemaker-huggingface-inference-toolkit, which leverages the transformers pipeline under the hood.