Customizing T5 tokenizer for finetuning

Hi,
I am finetuning a T5 model for QA on my dataset, but my vocabulary is very different from the tokenizer's, which results in excessively long token_id/token sequences. Can I train a new tokenizer from the existing one and use it for finetuning? If so, any tips or resources that would help?
Thanks

What I did was build a set of words I wanted to be tokenized and use tokenizer.add_tokens(new_tokens).
Remember to resize the embedding weights in the model as well: model.resize_token_embeddings(len(tokenizer))
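
A minimal sketch of that, assuming a t5-small checkpoint and a made-up list of domain words (swap in your own):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumptions: "t5-small" as the base checkpoint and a hypothetical
# domain word list -- replace both with your own.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

new_tokens = ["myoglobin", "immunoassay"]  # hypothetical domain vocabulary

# add_tokens skips words already in the vocab and returns how many were added
num_added = tokenizer.add_tokens(new_tokens)
print(f"Added {num_added} tokens")

# Grow the embedding matrix so the new token ids have embedding rows
model.resize_token_embeddings(len(tokenizer))
```

The new embedding rows are randomly initialized, so the model only learns useful representations for them during finetuning.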