Payload too large for Async Inference on SageMaker

For Async Inference, one more configuration is needed to prevent the 413 (payload too large) error: raise the model server's request and response size limits through environment variables.


env = {
    'MMS_MAX_REQUEST_SIZE': '2000000000',
    'MMS_MAX_RESPONSE_SIZE': '2000000000',
    'MMS_DEFAULT_RESPONSE_TIMEOUT': '900'
}

HuggingFaceModel(env=env, …)
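
If you set these limits in more than one place, a small helper keeps the three values consistent. This is only a sketch: `build_mms_env` and its parameter names are made up for illustration and are not part of the SageMaker SDK. It builds the same dictionary as above, and the values must be strings because they are passed as environment variables.

```python
def build_mms_env(max_payload_bytes=2_000_000_000, response_timeout=900):
    """Build the env dict from the post (hypothetical helper, not an SDK function)."""
    if max_payload_bytes <= 0 or response_timeout <= 0:
        raise ValueError("limits must be positive integers")
    return {
        # Same limit for request and response, as in the post.
        "MMS_MAX_REQUEST_SIZE": str(max_payload_bytes),
        "MMS_MAX_RESPONSE_SIZE": str(max_payload_bytes),
        "MMS_DEFAULT_RESPONSE_TIMEOUT": str(response_timeout),
    }


env = build_mms_env()
print(env)
```

The resulting `env` is then passed to `HuggingFaceModel(env=env, …)` together with your usual arguments, exactly as in the snippet above.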

@philschmid, it would be nice to have this mentioned in the documentation.
