Deploying OpenAI's Whisper on SageMaker

Found the fix; this needs to be added to the config:

    env={
        'HF_TASK':'automatic-speech-recognition'
    }

Without this, the model is unable to identify the given task.
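For context, this is roughly where that env goes when creating the model with the SageMaker Python SDK. This is only a sketch: the model path, role name, instance type, and framework versions below are placeholders, not values from this thread.

```python
# Hedged sketch: deploying a Whisper model with the HF_TASK fix.
# model_data, role, versions, and instance_type are placeholders.
from sagemaker.huggingface import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    model_data="s3://my-bucket/whisper-model.tar.gz",  # placeholder
    role="my-sagemaker-execution-role",                # placeholder
    transformers_version="4.26",                       # placeholder
    pytorch_version="1.13",                            # placeholder
    py_version="py39",                                 # placeholder
    env={"HF_TASK": "automatic-speech-recognition"},   # the fix from above
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",                    # placeholder
)
```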


I was able to raise the max output limit, which is set in `generation_config.json` as `"max_length": 448`; you can update this to some bigger number like 6000000.
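A minimal sketch of that edit, assuming the model artifacts are unpacked locally before repackaging. The `whisper_model_dir` path is a placeholder, and the fabricated config file is only there so the snippet runs standalone; with a real model the file already exists.

```python
import json
import os

model_dir = "whisper_model_dir"  # placeholder: your unpacked model folder
config_path = os.path.join(model_dir, "generation_config.json")

# For illustration only: create a minimal config if none exists,
# so this sketch runs standalone. A real Whisper checkpoint ships one.
os.makedirs(model_dir, exist_ok=True)
if not os.path.exists(config_path):
    with open(config_path, "w") as f:
        json.dump({"max_length": 448}, f)

# Read the config, bump the limit, and write it back.
with open(config_path) as f:
    gen_config = json.load(f)
gen_config["max_length"] = 6000000
with open(config_path, "w") as f:
    json.dump(gen_config, f, indent=2)
```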

I'm getting a warning afterwards:

    UserWarning: Neither `max_length` nor `max_new_tokens` has been set, `max_length` will default to 36000 (`generation_config.max_length`). Controlling `max_length` via the config is deprecated and `max_length` will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.

Then the next issue I'm getting is error 413 (request payload too large), which is related to how SageMaker works and has nothing to do with the HF model: real-time endpoints cap the invocation payload at around 6 MB.

Currently the working code streams the file directly to the endpoint using its path; so the solution is rather to pass an S3 path and download and process the file inside the endpoint,
which needs a custom `inference.py`.
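A rough sketch of what such an `inference.py` could look like, using the `model_fn`/`input_fn`/`predict_fn`/`output_fn` hooks that the SageMaker inference toolkit recognizes. The JSON payload shape `{"s3_uri": ...}` is my own assumption, not an established convention, and the heavy imports are deferred so the handlers stay cheap to load.

```python
# Hedged sketch of a custom inference.py: the endpoint receives an S3 URI
# in a JSON body, downloads the audio, and runs the Whisper pipeline on it.
import json
import os
import tempfile
from urllib.parse import urlparse


def parse_s3_uri(uri):
    """Split "s3://bucket/key" into (bucket, key)."""
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")


def model_fn(model_dir):
    # Load the Whisper ASR pipeline from the unpacked model artifacts.
    from transformers import pipeline  # deferred import
    return pipeline("automatic-speech-recognition", model=model_dir)


def input_fn(request_body, content_type="application/json"):
    # Assumed payload shape: {"s3_uri": "s3://my-bucket/audio.wav"}
    payload = json.loads(request_body)
    return payload["s3_uri"]


def predict_fn(s3_uri, asr_pipeline):
    import boto3  # deferred import; available on SageMaker containers
    bucket, key = parse_s3_uri(s3_uri)
    with tempfile.TemporaryDirectory() as tmp:
        local_path = os.path.join(tmp, os.path.basename(key))
        boto3.client("s3").download_file(bucket, key, local_path)
        return asr_pipeline(local_path)


def output_fn(prediction, accept="application/json"):
    return json.dumps(prediction)
```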

Let me know if anyone is still working on this.