How to train target tokenizer

I’m trying to retrain t5-small on a Japanese-to-Spanish dataset, and I want to retrain the tokenizer so that it handles words in both languages.

Currently I’ve done this:

from transformers import AutoTokenizer

def get_training_corpus(lang: str):
    ds = dataset["train"]
    for start_idx in range(0, len(ds), 1000):
        samples = ds[start_idx : start_idx + 1000]
        yield samples[lang]

tokenizer = AutoTokenizer.from_pretrained("t5-small")
new_tokenizer = tokenizer.train_new_from_iterator(

But I don’t know how to train the target side of the tokenizer as well. I would like to be able to tokenize Japanese like this:

model_inputs = tokenizer(examples["ja"], max_length=max_input_length, truncation=True)

and Spanish like this:

with tokenizer.as_target_tokenizer():
    labels = tokenizer(examples["es"], max_length=max_target_length, truncation=True)
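
For reference, here is a minimal sketch of one possible direction, assuming a single shared vocabulary is acceptable. T5 uses one SentencePiece vocabulary for both the input and the target side, so there may be no separate target tokenizer to train, and feeding text from both columns into one training iterator might be enough. The helper name `get_bilingual_corpus` and the toy data are hypothetical, and `columns` is assumed to be a column-oriented mapping (language code to list of sentences) with "ja" and "es" keys.

```python
from typing import Dict, Iterator, List, Sequence


def get_bilingual_corpus(
    columns: Dict[str, Sequence[str]],
    langs: Sequence[str] = ("ja", "es"),
    batch_size: int = 1000,
) -> Iterator[List[str]]:
    """Yield batches of raw text, alternating between the language columns.

    Hypothetical helper: `columns` maps a language code to a list of
    sentences, and all columns are assumed to have the same length.
    """
    n_rows = len(columns[langs[0]])
    for start in range(0, n_rows, batch_size):
        for lang in langs:
            # One batch per language for each slice of rows.
            yield list(columns[lang][start : start + batch_size])


# Toy stand-in for the real training split (made-up data, not from the post).
toy_columns = {
    "ja": ["こんにちは", "ありがとう", "さようなら"],
    "es": ["hola", "gracias", "adiós"],
}
batches = list(get_bilingual_corpus(toy_columns, batch_size=2))
```

An iterator like this could then be handed to `tokenizer.train_new_from_iterator(...)` together with a vocabulary size of your choosing, so that the same retrained tokenizer is used for both the `ja` inputs and the `es` labels. Whether one shared vocabulary works well for this language pair is something to check on the actual data.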