I got it up and running by doing it slightly differently:
def infer_async(input_location="s3://async-inf/input.json"):
    """Invoke a SageMaker asynchronous-inference endpoint.

    Parameters
    ----------
    input_location : str
        S3 URI of the request payload. Should be JSON describing the
        input audio file (example in the 02_deploy_whisper-Async.ipynb
        notebook).

    Returns
    -------
    dict
        The raw ``invoke_endpoint_async`` response (includes the output
        location where SageMaker will write the inference result).
    """
    sagemaker_runtime = boto3.client("sagemaker-runtime")

    # NOTE(review): `endpoint_name` is read from module scope — it must be
    # defined before this function is called. The endpoint name must be
    # unique within an AWS Region in your AWS account. After you deploy a
    # model using SageMaker hosting services, client applications use this
    # API to get inferences from the model hosted at that endpoint.
    response = sagemaker_runtime.invoke_endpoint_async(
        EndpointName=endpoint_name,
        # ContentType='audio/mpeg',  # content type is implied by the JSON payload
        InputLocation=input_location,
    )
    print(response)
    return response