Trouble loading HF community's OpenAI Whisper models

I followed Sanchit Gandhi’s tutorial (Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers), trained my own model, and pushed it to the HF Hub (happy dance). But I am having trouble loading it from the Hub:

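import huggingface_hub
from transformers import WhisperProcessor

huggingface_hub.login(token=token)  # token holds my HF access token
MODEL = "Pardner/whisper-small-fa"
processor = WhisperProcessor.from_pretrained(MODEL)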

and I get the following error:

processor = WhisperProcessor.from_pretrained(MODEL)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/whisper/.venv/lib/python3.11/site-packages/transformers/processing_utils.py", line 465, in from_pretrained
    args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whisper/.venv/lib/python3.11/site-packages/transformers/processing_utils.py", line 511, in _get_arguments_from_pretrained
    args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whisper/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2032, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for 'Pardner/whisper-small-fa'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'Pardner/whisper-small-fa' is the correct path to a directory containing all relevant files for a WhisperTokenizer tokenizer.

I get an identical error when I try to load the processor with “processor = AutoProcessor.from_pretrained()”:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/whisper/.venv/lib/python3.11/site-packages/transformers/models/auto/processing_auto.py", line 312, in from_pretrained
    return processor_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whisper/.venv/lib/python3.11/site-packages/transformers/processing_utils.py", line 465, in from_pretrained
    args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whisper/.venv/lib/python3.11/site-packages/transformers/processing_utils.py", line 511, in _get_arguments_from_pretrained
    args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whisper/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2032, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for 'steja/whisper-large-persian'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'steja/whisper-large-persian' is the correct path to a directory containing all relevant files for a WhisperTokenizer tokenizer.

I tried moving my trained model into my local ~/.cache/huggingface/hub directory and got the same error.

I have tried one other community model (steja/whisper-small-persian) and I get the same results.

Any help would be great.

~Pardner

**Admins, feel free to move this to the “AutoTrain” subforum.

I have made very little progress. I have been able to load a community Whisper model “jonatasgrosman/whisper-large-zh-cv11”. Looking at jonatasgrosman’s files, I see that they are different from the files generated from my training.

jonatasgrosman/whisper-large-zh-cv11

  • README.md
  • added_tokens.json
  • all_results.json
  • config.json
  • eval_results.json
  • evaluation_cv11_test.json
  • evaluation_fleurs_test.json
  • evaluation_whisper-large-v2_cv11_test.json
  • evaluation_whisper-large-v2_fleurs_test.json
  • merges.txt
  • normalizer.json
  • preprocessor_config.json
  • pytorch_model.bin
  • runs
  • special_tokens_map.json
  • tokenizer_config.json
  • train_results.json
  • trainer_state.json
  • training_args.bin
  • vocab.json

Pardner/whisper-small-fa

  • README.md
  • config.json
  • generation_config.json
  • model.safetensors
  • preprocessor_config.json
  • runs
  • training_args.bin
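
The difference between the two listings above can be checked mechanically. The following is a minimal sketch, not part of the original post: the set of tokenizer file names is an assumption inferred from the working repo's listing, and `missing_tokenizer_files` is a hypothetical helper name.

```python
# File names copied from the two listings above.
WORKING_REPO = [
    "README.md", "added_tokens.json", "all_results.json", "config.json",
    "eval_results.json", "evaluation_cv11_test.json",
    "evaluation_fleurs_test.json",
    "evaluation_whisper-large-v2_cv11_test.json",
    "evaluation_whisper-large-v2_fleurs_test.json",
    "merges.txt", "normalizer.json", "preprocessor_config.json",
    "pytorch_model.bin", "runs", "special_tokens_map.json",
    "tokenizer_config.json", "train_results.json", "trainer_state.json",
    "training_args.bin", "vocab.json",
]
MY_REPO = [
    "README.md", "config.json", "generation_config.json",
    "model.safetensors", "preprocessor_config.json", "runs",
    "training_args.bin",
]

# Assumption: these are the files a WhisperTokenizer is saved with, inferred
# from the working repo's listing rather than from the library source.
TOKENIZER_FILES = {
    "added_tokens.json", "merges.txt", "normalizer.json",
    "special_tokens_map.json", "tokenizer_config.json", "vocab.json",
}


def missing_tokenizer_files(repo_files):
    """Return the tokenizer files that are absent from a repo listing."""
    return sorted(TOKENIZER_FILES - set(repo_files))


print(missing_tokenizer_files(WORKING_REPO))  # []
print(missing_tokenizer_files(MY_REPO))
```

Under that assumption, the second call reports all six tokenizer files as missing from Pardner/whisper-small-fa, which would be consistent with the "Can't load tokenizer" error. The weights format (model.safetensors versus pytorch_model.bin) is a separate matter and is not what the error message complains about.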

I see that I am missing “normalizer.json”, “pytorch_model.bin”, and “tokenizer_config.json”, but I have a “model.safetensors”. I believe I may have missed something in the trainer. I used the HF Seq2SeqTrainer to train my model and to push it to the HF Hub:

training_args = Seq2SeqTrainingArguments(
    output_dir="./training/whisper-small-fa",  # change to a repo name of your choice
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,  # increase by 2x for every 2x decrease in batch size
    learning_rate=1e-5,
    warmup_steps=500,
    max_steps=5000,
    gradient_checkpointing=True,
    fp16=False,
    evaluation_strategy="steps",
    per_device_eval_batch_size=8,
    predict_with_generate=True,
    generation_max_length=225,
    save_steps=1000,
    eval_steps=1000,
    logging_steps=25,
    report_to=["tensorboard"],
    load_best_model_at_end=True,
    metric_for_best_model="wer",
    greater_is_better=False,
    push_to_hub=True,            
)

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    train_dataset=common_voice["train"],
    eval_dataset=common_voice["test"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=processor.feature_extractor,
)

trainer.train()

kwargs = {
    "dataset_tags": "mozilla-foundation/common_voice_16_0",
    "dataset": "Common Voice 16.0", 
    "dataset_args": "config: fa, split: test",
    "language": "fa",
    "model_name": "Whisper Small Fa - Brett OConnor",
    "finetuned_from": "openai/whisper-small",
    "tasks": "automatic-speech-recognition",
    "tags": "hf-asr-leaderboard",
}
trainer.push_to_hub(**kwargs)

~Pardner

Okay, I think I solved it. I changed:

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    train_dataset=common_voice["train"],
    eval_dataset=common_voice["test"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=processor.feature_extractor,
)

to

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    train_dataset=common_voice["train"],
    eval_dataset=common_voice["test"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=processor,
)

And I am able to load both the model and the processor after the training is completed!
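
A likely explanation is that the trainer saves whatever object is passed as `tokenizer` next to the model, so passing only `processor.feature_extractor` writes preprocessor_config.json but no tokenizer files. For a model that was already pushed without them, retraining may not be needed. The following is a minimal sketch, assuming the tokenizer was left unchanged during fine-tuning and that `language="Persian"` matches the training setup; `push_processor` is a hypothetical helper name, not something from the tutorial.

```python
def push_processor(repo_id, base_checkpoint="openai/whisper-small"):
    """Upload the base checkpoint's processor files to an existing model repo."""
    # Imported lazily so the helper can be defined without transformers installed.
    from transformers import WhisperProcessor

    # Assumption: fine-tuning did not modify the tokenizer, so the base
    # checkpoint's processor still matches the fine-tuned model.
    processor = WhisperProcessor.from_pretrained(
        base_checkpoint, language="Persian", task="transcribe"
    )
    # Writes the tokenizer files and preprocessor_config.json to the repo.
    processor.push_to_hub(repo_id)
    return processor


# Example call (needs network access and a logged-in Hub token):
# push_processor("Pardner/whisper-small-fa")
```

After the upload, `WhisperProcessor.from_pretrained("Pardner/whisper-small-fa")` should find the tokenizer files it was missing.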
