Hello, I have a question about the prediction process for the OpenAI Whisper model. Suppose I fine-tuned the model using Trainer and saved the model and tokenizer to my local directory using .save_pretrained().
When I then create a pipeline for prediction or transcription, will the pipeline work correctly? What I mean is, we can save models locally by using
trainer.save_model("local/dir/path")
but ASR models also require a feature extractor, so how do I save everything from the trainer so that I can later create a pipeline just by loading the local folder?
Also, as I understand it, an ASR pipeline has 3 components:
- Feature extractor
- Model
- Tokenizer
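To make the three components concrete, here is a minimal sketch of how I understand they map onto the pipeline arguments. It assumes the folder already contains the files for all three components; the function name build_asr_pipeline and the model_dir argument are placeholders of mine, not anything from the library.

```python
def build_asr_pipeline(model_dir):
    """Build an ASR pipeline with each component loaded explicitly from model_dir."""
    # Imported inside the function so the heavy dependency loads only when needed.
    from transformers import (
        WhisperFeatureExtractor,
        WhisperForConditionalGeneration,
        WhisperTokenizer,
        pipeline,
    )

    # 1. Feature extractor: turns raw audio into log-mel input features.
    feature_extractor = WhisperFeatureExtractor.from_pretrained(model_dir)
    # 2. Model: the fine-tuned Whisper weights.
    model = WhisperForConditionalGeneration.from_pretrained(model_dir)
    # 3. Tokenizer: decodes the predicted token ids back into text.
    tokenizer = WhisperTokenizer.from_pretrained(model_dir)

    return pipeline(
        "automatic-speech-recognition",
        model=model,
        tokenizer=tokenizer,
        feature_extractor=feature_extractor,
    )
```

If any of the three from_pretrained() calls fails, that would tell me which component was not saved to the folder.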
So my question is, why don’t we have to pass a feature extractor argument to the trainer, alongside the model, with something like this:
trainer = Seq2SeqTrainer(model=model,
feature_extractor=processor.feature_extractor,
tokenizer=processor.tokenizer, ....)
Edit: We can save the model and the other required components with:
trainer.save_model("local/dir/path")
processor.save_pretrained("local/dir/path")
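For anyone landing here later, this is a rough sketch of how the saved folder could be checked and then loaded into a pipeline. The file names in EXPECTED_FILES are the ones I have seen save_pretrained() write and may differ between transformers versions, so treat them as an assumption; missing_components and load_local_pipeline are hypothetical helper names.

```python
from pathlib import Path

# Assumed file names written by save_pretrained(); may vary across versions.
EXPECTED_FILES = {
    "model config": "config.json",
    "feature extractor": "preprocessor_config.json",
    "tokenizer": "tokenizer_config.json",
}


def missing_components(model_dir):
    """Return the names of components whose config file is absent from model_dir."""
    folder = Path(model_dir)
    return [
        name
        for name, filename in EXPECTED_FILES.items()
        if not (folder / filename).is_file()
    ]


def load_local_pipeline(model_dir):
    """Create an ASR pipeline from a local folder, failing early if files are missing."""
    missing = missing_components(model_dir)
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")
    # Imported lazily so the file check above works without loading transformers.
    from transformers import pipeline

    # Passing the folder path lets the pipeline load all three components itself.
    return pipeline("automatic-speech-recognition", model=str(model_dir))
```

Usage would be something like asr = load_local_pipeline("local/dir/path") followed by asr("sample.wav"), where sample.wav stands in for any audio file.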