Finetune whisper-tiny in German for TFLite runtime

Hello everyone,

I’ve been trying to fine-tune a customized whisper-tiny model on German, which I want to serve with the TFLite runtime in a Rust application for fast inference on mobile devices.
Loading the stock model from Hugging Face (openai/whisper-tiny at main) and converting it to TFLite works fine. What I haven’t found is a way to fine-tune the model in TensorFlow format so that I end up with the weights as an .h5 file, which I could then load with TFWhisperForConditionalGeneration.from_pretrained() for the TFLite conversion.
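For context, the stock-model conversion that already works for me looks roughly like this (it follows the common recipe of wrapping generate() in a tf.function; the (1, 80, 3000) log-mel shape is what whisper-tiny expects, and max_new_tokens=128 is just an example value):

import tensorflow as tf
from transformers import TFWhisperForConditionalGeneration

model = TFWhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

# Wrap generate() so the SavedModel has a fixed-signature serving function
class GenerateModel(tf.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    @tf.function(
        input_signature=[tf.TensorSpec((1, 80, 3000), tf.float32, name="input_features")]
    )
    def serving(self, input_features):
        # max_new_tokens=128 is an example value, not a requirement
        return {"sequences": self.model.generate(input_features, max_new_tokens=128)}

generate_model = GenerateModel(model)
tf.saved_model.save(
    generate_model, "whisper_saved", signatures={"serving_default": generate_model.serving}
)

converter = tf.lite.TFLiteConverter.from_saved_model("whisper_saved")
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # native TFLite kernels
    tf.lite.OpsSet.SELECT_TF_OPS,    # TF fallback ops that Whisper still needs
]
tflite_model = converter.convert()
with open("whisper-tiny.tflite", "wb") as f:
    f.write(tflite_model)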

All fine-tuning tutorials I’ve found so far are PyTorch-based, and converting the model from PyTorch via ONNX to TensorFlow format results in a saved_model.pb file, which I can’t load with TFWhisperForConditionalGeneration.from_pretrained() (as far as I can tell, it expects a tf_model.h5 checkpoint rather than a SavedModel).

Hugging Face provides the .h5 format for the whisper-tiny model at the link above, so I assume there must be a way to either fine-tune the model directly in TensorFlow or convert it from PyTorch to TensorFlow format, but I haven’t found a solution so far.
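In code, the route I’m hoping exists would be something like the following untested sketch: fine-tune in PyTorch as the tutorials do, then cross-load the checkpoint into the TF class ("whisper-tiny-de-pt" is a placeholder path, and I’m assuming from_pt=True cross-loading works for Whisper as it does for other architectures):

from transformers import TFWhisperForConditionalGeneration

# Placeholder path to a local PyTorch fine-tuning output (hypothetical)
pt_checkpoint = "whisper-tiny-de-pt"

# Assumption: from_pt=True cross-loads the PyTorch weights into the TF class
tf_model = TFWhisperForConditionalGeneration.from_pretrained(pt_checkpoint, from_pt=True)

# save_pretrained writes tf_model.h5, which from_pretrained() can read again
tf_model.save_pretrained("whisper-tiny-de-tf")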

Does anyone have a hint on how to solve this?

My current PyTorch-to-TF conversion script looks as follows:

import torch
import onnx
import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor
import onnx_tf
import numpy as np

# onnx_tf still references the np.bool alias removed in newer numpy versions
np.bool = np.bool_

path = "openai/whisper-tiny"
# only model and processor need to be kept during training
processor = WhisperProcessor.from_pretrained(path)

model = WhisperForConditionalGeneration.from_pretrained(path)
model.eval()  # disable dropout etc. before export

# load an example input
y, sr = librosa.load("audio/en.wav", sr=16_000)
assert sr == 16_000, "Only 16k sr supported"
input_features = processor(y, sampling_rate=sr, return_tensors="pt").input_features
decoder_input_ids = torch.tensor([[50258]])  # <|startoftranscript|>

# sanity check: run one forward pass before exporting
res = model(input_features, None, decoder_input_ids)
# print(res.logits.shape)

torch.onnx.export(model,
                  (input_features, None, decoder_input_ids),
                  'whisper.onnx',
                  input_names=['input_features',
                               'attention_mask',
                               'decoder_input_ids'],
                  output_names=['output'],
                  opset_version=14)

onnx_model = onnx.load('whisper.onnx')

# print("Model Inputs: ", [inp.name for inp in onnx_model.graph.input])

# ONNX -> TensorFlow; export_graph writes a SavedModel directory
tf_model = onnx_tf.backend.prepare(onnx_model)
tf_model.export_graph("whisper.tf")
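
For debugging, the export can presumably still be loaded as a generic SavedModel, just not via from_pretrained() (an assumption here: that onnx_tf registers the usual serving_default signature):

import tensorflow as tf

# Load the onnx_tf export as a plain SavedModel and inspect its signature
loaded = tf.saved_model.load("whisper.tf")
infer = loaded.signatures["serving_default"]
print(infer.structured_input_signature)
print(infer.structured_outputs)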

requirements:

python 3.10.13
tensorflow 2.15.0
torch 2.1.4
transformers 4.45.2

Scripts that HF staff use for their own purposes may be available on GitHub, and in many cases there is no manual. The scripts below are just examples; it’s worth exploring GitHub.

Thanks, I will definitely take a look at it :)
