Hi,
I use the Helsinki-NLP/opus-mt-fr-en model for translation from French to English.
When I load the tokenizer, I see that it isn't fast even though I pass the use_fast=True flag:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-fr-en", use_fast=True)
print(tokenizer)
PreTrainedTokenizer(name_or_path='Helsinki-NLP/opus-mt-fr-en', vocab_size=59514, model_max_len=512, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'eos_token': '</s>', 'unk_token': '<unk>', 'pad_token': '<pad>'})
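In case it helps, here is a minimal sketch of how the same thing can be checked in code rather than by reading the printed output. It relies on the `is_fast` attribute that appears in the output above; the helper names `is_fast_tokenizer` and `load_and_check` are hypothetical, made up for this post.

```python
def is_fast_tokenizer(tokenizer) -> bool:
    """Return True if the given tokenizer reports itself as a fast tokenizer."""
    # `is_fast` is the flag shown in the printed tokenizer above.
    # Fall back to False if the object does not expose it at all.
    return bool(getattr(tokenizer, "is_fast", False))


def load_and_check(model_name: str = "Helsinki-NLP/opus-mt-fr-en") -> bool:
    """Load a tokenizer with use_fast=True and report whether it is actually fast."""
    # Imported here so the helper above can be used without transformers installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
    return is_fast_tokenizer(tokenizer)
```

Calling `load_and_check()` downloads the tokenizer files. Given the output above, I would expect it to return False for this model.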
Is there no fast tokenizer for MarianMTModel?