Trainer for MT with separate source and target tokenizers

I’m looking for examples of using HuggingFace’s Trainer with different source and target tokenizers for NMT. I see there is an as_target_tokenizer context manager available, but I could not find how to specify two separate tokenizers in the configuration.
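
For reference, this is the pattern I have seen so far, as a minimal sketch assuming a single shared tokenizer from a pretrained checkpoint (the checkpoint and strings below are just placeholders, not my actual setup):

```python
# Minimal sketch of the as_target_tokenizer pattern with one shared
# tokenizer; "t5-small" and the example strings are placeholders only.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

sources = ["Hello world."]
targets = ["Bonjour le monde."]

model_inputs = tokenizer(sources, max_length=128, truncation=True)
with tokenizer.as_target_tokenizer():
    # Inside this context the tokenizer applies its target-side settings.
    labels = tokenizer(targets, max_length=128, truncation=True)
model_inputs["labels"] = labels["input_ids"]
```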

To be specific, I have an EN-JP translation task that I want to train with HuggingFace’s Trainer. I have HuggingFace tokenizers trained separately for EN and JP, and I want to use them to train a vanilla seq2seq transformer model from scratch. What is the best way to set up this configuration?
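
To make the question concrete, this is roughly what I am trying to do; en_tokenizer.json and ja_tokenizer.json are my own trained tokenizer files, and the preprocessing below is only my guess at how to wire the two tokenizers together:

```python
# My guess at preprocessing with two separate tokenizers: EN tokenizer for
# the inputs, JP tokenizer for the labels. The file names and the tiny
# example dataset are placeholders for my own data.
from datasets import Dataset
from transformers import PreTrainedTokenizerFast

en_tokenizer = PreTrainedTokenizerFast(tokenizer_file="en_tokenizer.json")
ja_tokenizer = PreTrainedTokenizerFast(tokenizer_file="ja_tokenizer.json")

def preprocess(batch):
    model_inputs = en_tokenizer(batch["en"], max_length=128, truncation=True)
    labels = ja_tokenizer(batch["ja"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

raw = Dataset.from_dict({"en": ["Hello world."], "ja": ["こんにちは世界。"]})
tokenized = raw.map(preprocess, batched=True, remove_columns=["en", "ja"])
```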

I was able to find the T5 fine-tuning example for EN-FR (Translation), which uses a single PreTrainedTokenizer that handles both the source and the target side, but I could not find a good way to do this from scratch, where I provide a non-pretrained model, the two tokenizers, and the data.
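
For the from-scratch part, the closest I have come up with is something like the sketch below, using EncoderDecoderModel so the encoder and decoder can have different vocabulary sizes. All of the config values are arbitrary, and handing only the target tokenizer to the data collator and trainer is a guess on my part, which is really the core of my question.

```python
# My attempt at a from-scratch setup; continues from the preprocessing
# sketch above (en_tokenizer, ja_tokenizer, tokenized). The sizes are
# arbitrary and the tokenizer passed to the collator/trainer is a guess.
from transformers import (
    BertConfig, EncoderDecoderConfig, EncoderDecoderModel,
    DataCollatorForSeq2Seq, Seq2SeqTrainer, Seq2SeqTrainingArguments,
)

encoder_config = BertConfig(vocab_size=en_tokenizer.vocab_size,
                            hidden_size=512, num_hidden_layers=6, num_attention_heads=8)
decoder_config = BertConfig(vocab_size=ja_tokenizer.vocab_size,
                            hidden_size=512, num_hidden_layers=6, num_attention_heads=8)
config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder_config, decoder_config)

model = EncoderDecoderModel(config=config)  # randomly initialized, no pretrained weights
model.config.decoder_start_token_id = ja_tokenizer.bos_token_id  # assumes my tokenizer defines one
model.config.pad_token_id = ja_tokenizer.pad_token_id

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="enja-scratch", per_device_train_batch_size=16),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(ja_tokenizer, model=model),  # or en_tokenizer?
    tokenizer=ja_tokenizer,
)
# trainer.train()
```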

Any help would be appreciated.