Hello,
- There is no fix for it yet, but there is a workaround. You can set the environment variable
MMS_DEFAULT_WORKERS_PER_MODEL=1
when creating the endpoint (see the sketch below).
- Since Serverless Inference is powered by AWS Lambda, and AWS Lambda doesn't have GPU support yet, Serverless Inference won't have it either. I assume it will get GPU support once AWS Lambda does.
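
If it helps, here is a minimal sketch of how that environment variable could be passed when deploying a serverless endpoint with the SageMaker Python SDK. The model_data path, IAM role, container versions, and serverless settings are placeholders I made up; only the MMS_DEFAULT_WORKERS_PER_MODEL=1 part is the actual workaround.

```python
from sagemaker.huggingface import HuggingFaceModel
from sagemaker.serverless import ServerlessInferenceConfig

# Placeholders: replace model_data, role, and the container versions with your own values.
huggingface_model = HuggingFaceModel(
    model_data="s3://my-bucket/model.tar.gz",
    role="arn:aws:iam::111122223333:role/MySageMakerRole",
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
    # The workaround: limit the model server to a single worker per model
    env={"MMS_DEFAULT_WORKERS_PER_MODEL": "1"},
)

# Deploy to a serverless endpoint (CPU only, since it runs on AWS Lambda)
predictor = huggingface_model.deploy(
    serverless_inference_config=ServerlessInferenceConfig(
        memory_size_in_mb=4096,
        max_concurrency=1,
    )
)
```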