Hey there, I'm looking for a way to package a fine-tuned Whisper model together with the base model in a single pickled file, so I can upload it to a server and then load it once and keep it on the GPU for quick inference.
Since my fine-tuned model (40 MB) needs the 6.7 GiB base Whisper model, it takes a while to download all the files, and the service depends on the openAI/whisper-model repository to work correctly. If I want to deploy it, I need to make the service more robust and avoid external dependencies.
I'd like something like Whisper's preset loading style: whisper.load_model('my-model'), and then whisper.transcribe('audio.wav')['text'].
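For that kind of one-call usage, a close equivalent with transformers would be wrapping the ASR pipeline in a small loader. This is only a sketch: "merged-whisper" is a placeholder for whatever self-contained checkpoint folder you end up shipping, and load_transcriber is a name I made up.

```python
def load_transcriber(model_dir: str = "merged-whisper"):
    """Load the model once and keep it on the GPU for repeated calls.

    Imports are kept inside the function so this sketch can be defined
    even on a machine without transformers/torch installed.
    """
    import torch
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_dir,
        device=0 if torch.cuda.is_available() else -1,  # pin to GPU 0 if present
        torch_dtype=torch.float16,                      # halves GPU memory use
    )

# Intended usage (mirrors whisper.transcribe(...)['text']):
# asr = load_transcriber()
# text = asr("audio.wav")["text"]
```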
For now, the only way I've found is to download the base and fine-tuned files into two folders and build a pipeline from the model, processor, feature extractor, and tokenizer.
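If the 40 MB fine-tune is a PEFT/LoRA adapter (which its size suggests, though I'm assuming that here), the two folders can be collapsed into one offline checkpoint by merging the adapter into the base weights and saving everything together. merge_adapter and the paths below are placeholders:

```python
def merge_adapter(base_id: str, adapter_dir: str, out_dir: str) -> None:
    """Fold a LoRA adapter into the base Whisper weights and save one
    self-contained folder (weights + tokenizer + feature extractor),
    so the deployed service no longer needs the base repo at runtime.

    Imports are inside the function so the sketch can be defined
    without transformers/peft installed.
    """
    from transformers import WhisperForConditionalGeneration, WhisperProcessor
    from peft import PeftModel

    base = WhisperForConditionalGeneration.from_pretrained(base_id)
    # Attach the small adapter, then merge it into the base weights.
    merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()

    merged.save_pretrained(out_dir, safe_serialization=True)
    WhisperProcessor.from_pretrained(base_id).save_pretrained(out_dir)

# Intended usage (run once, then ship only out_dir to the server):
# merge_adapter("openai/whisper-large-v2", "my-adapter", "merged-whisper")
```

After this, the service only loads "merged-whisper" locally and never touches the hub.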
Other things I'd like to improve: choosing the best format for serializing this model file so it's small and quick to load, or deleting the trained weights for languages the service will never use (Hebrew, Arabic, etc.), though I understand that this could be difficult or impossible since the base model was already trained on them.
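On the format question (this doesn't address language pruning, which really is hard because the multilingual knowledge is entangled in shared weights): casting to float16 and saving in safetensors instead of pickle roughly halves the file size and gives faster, safer loading. A sketch, with export_fp16 and the folder names as placeholders:

```python
def export_fp16(src_dir: str, out_dir: str) -> None:
    """Re-save a Whisper checkpoint in fp16 + safetensors.

    fp32 -> fp16 roughly halves disk and GPU memory use with little
    quality loss for inference; safetensors avoids pickle entirely and
    loads via zero-copy memory mapping. Imports are inside the function
    so the sketch can be defined without transformers installed.
    """
    from transformers import WhisperForConditionalGeneration

    model = WhisperForConditionalGeneration.from_pretrained(src_dir)
    model.half()  # cast all parameters to float16
    model.save_pretrained(out_dir, safe_serialization=True)

# Intended usage:
# export_fp16("merged-whisper", "merged-whisper-fp16")
```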