OpenAI Whisper fine-tuned checkpoint in local directory

Unspoiled-Egg · March 21, 2024, 4:44am

Hello, I have a question about running predictions with the OpenAI Whisper model. Suppose I fine-tuned the model using Trainer and saved the model and tokenizer to my local directory with .save_pretrained(). When I later create a pipeline for prediction or transcription from that directory, will the pipeline work correctly? To be specific, we can save a model locally using

trainer.save_model("local/dir/path")

but ASR models also require a feature extractor. How should I save from the trainer so that I can later create a pipeline just by loading the local folder?

Also, as I understand it, an ASR pipeline has three components:

Feature extractor
Model
Tokenizer

So my question is: why don’t we have to pass a feature extractor argument to the trainer, the way we pass the model, with something like this:

trainer = Seq2SeqTrainer(model=model, 
                         feature_extractor=processor.feature_extractor, 
                         tokenizer=processor.tokenizer, ....)

Edit: We can save the model and the other required components with:

trainer.save_model("local/dir/path")
processor.save_pretrained("local/dir/path")
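Once both calls have written to the same folder, loading it back might look like the sketch below. This is a minimal example, not a definitive recipe. The helper names (missing_files, load_asr_pipeline) are hypothetical, and the file names in EXPECTED_FILES are an assumption about what save_pretrained() typically writes, so they may differ between transformers versions.

```python
from pathlib import Path

# Assumed file names (not taken from the post): the configs that the model,
# feature extractor, and tokenizer typically write via save_pretrained().
# Adjust this tuple if your transformers version writes different files.
EXPECTED_FILES = ("config.json", "preprocessor_config.json", "tokenizer_config.json")


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


def load_asr_pipeline(model_dir):
    """Build an ASR pipeline from a local folder, failing early if files are missing."""
    missing = missing_files(model_dir)
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")
    # Imported lazily so the file check above works without transformers installed.
    from transformers import pipeline
    # When given a path string, pipeline() also looks in that folder for the
    # tokenizer and feature extractor, so they need not be passed separately.
    return pipeline("automatic-speech-recognition", model=str(model_dir))
```

Usage would be along the lines of asr = load_asr_pipeline("local/dir/path") followed by asr("sample.wav")["text"], where sample.wav stands in for any audio file. If preprocessor_config.json is reported missing, the feature extractor was not saved, which is the case that processor.save_pretrained() covers.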
