I am trying to train google/long-t5-local-base to generate some demo data for me. I wrote a function that tokenizes my training data and adds the new tokens to a tokenizer, then saves it. When I tried to load that tokenizer in my training loop, it complained that no config.json file existed. I then copied config.json over from the Hugging Face repo, but nothing changed.
How can I get the tokenizer to load properly? Originally, I had the following files:
- added_tokens.json
- special_tokens_map.json
- tokenizer.json
- tokenizer_config.json
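A minimal reproduction of the error I'm seeing, assuming the load goes through `AutoTokenizer` (the directory name and the stripped-down `tokenizer_config.json` below are placeholders, not my real files, which also contain the other three JSONs listed above):

```python
import json
import pathlib

from transformers import AutoTokenizer

# Placeholder directory standing in for my saved-tokenizer folder.
d = pathlib.Path("demo_tokenizer")
d.mkdir(exist_ok=True)

# A tokenizer_config.json with no "tokenizer_class" key, like mine.
(d / "tokenizer_config.json").write_text(json.dumps({"model_max_length": 512}))

try:
    AutoTokenizer.from_pretrained(str(d))
except OSError as e:
    # Fails complaining that the directory has no config.json
    print(e)
```

Is `from_pretrained` supposed to need the model's config.json here, or is something missing from my tokenizer files?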