Using the sample code below:
I’m getting the following error:
{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027bloom\u0027"
}
Any ideas on what could be causing this issue?
Having the same issue. Did you end up resolving it?
@mel-zheng it’s because BLOOM requires transformers version 4.21.0, but the inference containers on offer only support up to version 4.17.0. I ended up not using SageMaker. I went with a serverless approach and leveraged our custom container, which has transformers version 4.21.0 installed. Even though I got the BLOOM model working, it is so large that it is practically unusable. Inference is unbearably slow.
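For anyone who wants to confirm the version mismatch before deploying, here is a minimal sketch of a check. The 4.21.0 threshold is simply the number quoted in this thread, not something verified here, and the helper names (`parse_version`, `supports_bloom`, `installed_transformers_version`) are made up for illustration.

```python
from importlib import metadata

# Minimum transformers version for BLOOM as quoted in this thread (an assumption).
MIN_BLOOM_VERSION = (4, 21, 0)


def parse_version(text):
    """Turn a string such as '4.21.0.dev0' into a (major, minor, patch) tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_bloom(version_text):
    """Return True if the given transformers version meets the threshold."""
    return parse_version(version_text) >= MIN_BLOOM_VERSION


def installed_transformers_version():
    """Return the installed transformers version string, or None if absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    else:
        print(found, "OK for BLOOM" if supports_bloom(found) else "too old for BLOOM")
```

Running this inside the container (or logging the result from the inference script) should show whether the installed version is below the threshold.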
Thanks for the update! I actually resolved it by supplying my own custom inference.py with the latest transformers version. Ended up using bloom-560m because bloom is too large like you said.
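For reference, a custom inference.py along these lines might look roughly like the sketch below. This is not the actual script from the post above. It assumes the SageMaker Hugging Face inference toolkit picks up `model_fn` and `predict_fn` overrides from `code/inference.py` in the model archive, that a `code/requirements.txt` next to it pins the newer transformers version, and that the model files (for example bloom-560m) are already in `model_dir`.

```python
# code/inference.py (sketch, under the assumptions stated above)


def model_fn(model_dir):
    """Load a text-generation pipeline from the unpacked model directory."""
    # Imported here so the newer transformers from requirements.txt is the
    # one in use by the time the model loads.
    from transformers import pipeline

    return pipeline("text-generation", model=model_dir)


def predict_fn(data, model):
    """Run generation on a request shaped like {"inputs": ..., "parameters": {...}}."""
    inputs = data.get("inputs", "")
    parameters = data.get("parameters") or {}
    return model(inputs, **parameters)
```

A request body such as `{"inputs": "Hello", "parameters": {"max_new_tokens": 20}}` would then be passed straight through to the pipeline.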
Hi @mel-zheng, did you solve the issue?
Could you share your solution?