We are interested in cost effective solution and also interested in hosting multiple models in one container.
But I think we can not host multiple models in one container behind one endpoint with both elastic inference and Inferentia but it’s possible with only cpu based instances. Thanks
1 Like