Hello everyone,
I am currently using the experimental-webgpu branch of whisper-web to run Whisper models via Hugging Face's Transformers.js.
My setup uses local models with the following environment configuration:

```js
// Transformers.js v3 (the WebGPU backend and dtype options require v3).
import { env, pipeline } from "@huggingface/transformers";

env.allowLocalModels = true;     // resolve model IDs against localModelPath first
env.localModelPath = "./models"; // served relative to the app root
```
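For context, here is how my ./models folder is laid out. I mirrored what I see in the hosted onnx-community Whisper repos, but I am not certain these suffixes are what the dtype keys actually resolve to:

```text
models/
└── my-whisper-model/
    ├── config.json
    ├── generation_config.json
    ├── preprocessor_config.json
    ├── tokenizer.json
    ├── tokenizer_config.json
    └── onnx/
        ├── encoder_model.onnx            <- intended for dtype "fp32" (no suffix)
        └── decoder_model_merged_q4.onnx  <- intended for dtype "q4"
```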
I am trying to load a small ONNX Whisper model with specific dtype settings for the encoder and decoder to optimize memory and performance. Here’s the pipeline code:
```js
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "my-whisper-model",
  {
    dtype: {
      encoder_model: "fp32",        // full precision for stability
      decoder_model_merged: "q4",   // 4-bit quantization for memory savings
    },
    device: "webgpu",
  }
);
```
However, the pipeline initialization fails with the following error:
```text
Uncaught (in promise) Error: Can't create a session. ERROR_CODE: 7, ERROR_MESSAGE: Failed to load model because protobuf parsing failed.
```
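One thing I have tried to rule out: as I understand it, protobuf parsing failures can occur when a dev server answers an .onnx request with an HTML 404 fallback page instead of the binary. Here is the quick sanity check I used; the path is my guess at which file the library fetches for the q4 decoder:

```js
// Hypothetical path: my assumption about what the library requests
// for dtype { decoder_model_merged: "q4" }.
const url = "./models/my-whisper-model/onnx/decoder_model_merged_q4.onnx";
const res = await fetch(url);
const bytes = new Uint8Array(await res.arrayBuffer());
const head = new TextDecoder().decode(bytes.slice(0, 15));
console.log(res.status, res.headers.get("content-type"), head);
// If `head` starts with "<", the server returned HTML, not ONNX.
```

That check aside, my questions: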
- Precision Levels: Are there recommended or supported dtype precision levels for Whisper models on WebGPU? Is it always necessary to use fp32 for the encoder, or can other settings work reliably?
- File Naming Conventions: Do the ONNX file names or directory structure need to follow specific conventions for the dtype keys (e.g., encoder_model, decoder_model_merged) to resolve correctly?
- General Guidance: Are there guidelines or best practices for configuring dtype with local models in this library, especially with the WebGPU backend?
Thanks in advance!