Deploying private model to inference endpoint "./ does not appear to have a file named config.json"

TL;DR: How can I load my private model from within a custom inference endpoint handler?

Longer version:

My ultimate goal is to run a private fine-tuned Whisper model that forces the language and transcribe tokens. I fine-tuned the model for a specific language, but the model does not reliably identify that language and frequently transcribes into very bad English.

First I tried the default ASR inference endpoint setup, but there seemed to be no way to force tokens (it seems there are no parameters that can be passed, per this [post]).

So in order to force the tokens, I’m now using a custom handler. Because it’s a private model, I couldn’t download it by repo name, because I don’t have a way of accessing an HF token in the handler.

I found this code, which seems to suggest there’s a way of accessing the model locally using “./”, which seems to make sense.

But when I try it, I get variations on this error:

OSERROR: ./ does not appear to have a file named config.json. Checkout '' for available files.

Even though my repo does have a config.json file at the root.

When I sneak an os.listdir('.') into the script, I don’t see any files or directories.
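For reference, here is a minimal sketch of that kind of check, written as a small helper. A relative path such as "./" is resolved against the process's current working directory, which need not be the directory that holds the repo files. The helper name describe_dir and the temporary directory in the demo are hypothetical and only for illustration.

```python
import os
import tempfile


def describe_dir(path: str) -> dict:
    """Report what a directory contains and whether it has a config.json."""
    # Relative paths are resolved against os.getcwd(), not the script location.
    resolved = os.path.abspath(path)
    files = sorted(os.listdir(resolved)) if os.path.isdir(resolved) else []
    return {
        "resolved": resolved,
        "files": files,
        "has_config": "config.json" in files,
    }


if __name__ == "__main__":
    # Hypothetical stand-in for a model directory, created only for this demo.
    with tempfile.TemporaryDirectory() as model_dir:
        open(os.path.join(model_dir, "config.json"), "w").close()
        print(describe_dir(model_dir))
    # The result for "." depends on where the process was started.
    print(describe_dir("."))
```

Calling it on both "." and whatever directory the handler is given makes it easy to see in the logs which of the two actually contains config.json.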

Here’s my code -

from typing import Dict
from transformers.pipelines.audio_utils import ffmpeg_read
import torch
# from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
from transformers import WhisperTokenizer
from transformers import WhisperProcessor
from transformers import WhisperForConditionalGeneration
import os

SAMPLE_RATE = 16000  # sampling rate Whisper expects, in Hz


class EndpointHandler:
    def __init__(self, path=""):
        self.processor = WhisperProcessor.from_pretrained("openai/whisper-large-v3", language='french', task='transcribe')
        self.model = WhisperForConditionalGeneration.from_pretrained("./.")  # THIS IS WHERE THE ERROR OCCURS
        self.model.config.forced_decoder_ids = self.processor.get_decoder_prompt_ids(
            language="french", task="transcribe"
        )

    def __call__(self, data: Dict[str, bytes]) -> Dict[str, str]:
        """
        Args:
            data (:obj:):
                includes the deserialized audio file as bytes
        Return:
            A :obj:`dict`: with the transcription under the "text" key
        """
        # process input
        inputs = data.pop("inputs", data)
        audio_nparray = ffmpeg_read(inputs, SAMPLE_RATE)
        #  audio_tensor = torch.from_numpy(audio_nparray)

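        # run inference: extract input features, generate token ids, decode to text
        # (WhisperForConditionalGeneration has no transcribe() method)
        input_features = self.processor(
            audio_nparray, sampling_rate=SAMPLE_RATE, return_tensors="pt"
        ).input_features
        with torch.no_grad():
            predicted_ids = self.model.generate(input_features)

        # postprocess the prediction
        text = self.processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
        return {"text": text}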

I hadn’t noticed it before, but it looks like the way to do this is to use the path argument that is passed to the __init__ function, rather than a hard-coded “./”.

self.model = WhisperForConditionalGeneration.from_pretrained(path)
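
Putting that together, a minimal sketch of the whole handler might look like the following. It assumes that path points at the directory holding the private model files, that the processor can still come from the public openai/whisper-large-v3 repo as in my original code, and that the installed transformers version accepts language and task as arguments to generate(). The imports sit inside the methods only so that the module can be imported without transformers installed; top-level imports would work just as well.

```python
from typing import Any, Dict

SAMPLE_RATE = 16000  # Whisper models expect 16 kHz audio


class EndpointHandler:
    def __init__(self, path: str = ""):
        from transformers import WhisperForConditionalGeneration, WhisperProcessor

        # The processor is loaded from the public base repo, so no token is needed.
        self.processor = WhisperProcessor.from_pretrained("openai/whisper-large-v3")
        # Load the private fine-tuned weights from the directory passed in as `path`.
        self.model = WhisperForConditionalGeneration.from_pretrained(path)
        self.model.eval()

    def __call__(self, data: Dict[str, Any]) -> Dict[str, str]:
        import torch
        from transformers.pipelines.audio_utils import ffmpeg_read

        # Decode the raw audio bytes into a waveform at the expected sampling rate.
        audio = ffmpeg_read(data["inputs"], SAMPLE_RATE)
        features = self.processor(
            audio, sampling_rate=SAMPLE_RATE, return_tensors="pt"
        ).input_features
        with torch.no_grad():
            # Assumption: this transformers version supports language/task here.
            ids = self.model.generate(features, language="french", task="transcribe")
        text = self.processor.batch_decode(ids, skip_special_tokens=True)[0]
        return {"text": text}
```

On older transformers versions, setting forced_decoder_ids from get_decoder_prompt_ids, as in the original code, may be needed instead of the language and task arguments.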
