This is a continuation of my post here. I’m trying to deploy BERT for text classification with TensorFlow. When I use the model.deploy() method, I can successfully get inferences from BERT. Here’s my problem: I have four different classification models, and I want to run them all on the same instance rather than on multiple instances, to save on cost. So I tried using the MultiDataModel class, but I keep getting the following error:
The CloudWatch logs don’t add any additional information, unfortunately. Here’s the structure of counterargument.tar.gz in the S3 bucket, which I cloned from my Hugging Face account and zipped.
@wsunadawong, when using Multi-Model Endpoints, SageMaker stores the models differently. That’s why model.deploy() works while MME does not. We (Amazon & HF) are looking into it. Hopefully, we can come back with a fix as soon as possible!
Could you share the logs from the endpoint’s CloudWatch log group? Also, is counterargument.tar.gz a flat archive, or does the tar have a directory inside which these files live?
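For reference, a flat archive (files at the root of the tar, no wrapping folder) is usually produced by tar-ing from inside the model directory with `-C`. This is just a sketch; `model_dir` and the file names below are placeholders, not your actual artifacts:

```shell
# stand-in for the directory cloned from the Hugging Face hub
mkdir -p model_dir
touch model_dir/config.json model_dir/tf_model.h5 model_dir/vocab.txt

# -C changes into model_dir first, so entries land at the archive root
# (a flat layout), instead of being nested under model_dir/
tar -czf counterargument.tar.gz -C model_dir .

# list the contents to check: entries should appear as ./config.json,
# ./tf_model.h5, ... with no extra top-level folder
tar -tzf counterargument.tar.gz
```

If the listing instead shows something like `model_dir/config.json`, the archive is nested, which is a common cause of model-loading failures on SageMaker.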
The container seems to be using the default model directory to look up the model, instead of the MME platform’s model directory. We will try to reproduce this and get back to you.