Sagemaker/mms/models/model does not appear to have a file named config.json

alvations · December 20, 2022, 9:49am

When I deploy an EncoderDecoderModel on SageMaker, the predictor throws an error saying it can’t find config.json.

The model is created as such:

from transformers import PreTrainedTokenizerFast
from transformers import EncoderDecoderModel
from transformers import pipeline

tokenizer = PreTrainedTokenizerFast.from_pretrained("bert-base-multilingual-uncased")

tokenizer.bos_token = tokenizer.cls_token
tokenizer.eos_token = tokenizer.sep_token
tokenizer.add_special_tokens({'pad_token': '[PAD]'})

multibert = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-multilingual-uncased", "bert-base-multilingual-uncased"
)

# set special tokens
multibert.config.decoder_start_token_id = tokenizer.bos_token_id
multibert.config.eos_token_id = tokenizer.eos_token_id
multibert.config.pad_token_id = tokenizer.pad_token_id


m = pipeline("translation", model=multibert, tokenizer=berttokenizer)
m.save_pretrained('test-model')

Note: When the model is not properly trained, it outputs poor translations but it’s a valid model object.

Then I compressed the model and pushed it to S3 like this:

! tar -cvzf test-model.tar.gz test-model
! aws s3 cp test-model.tar.gz s3://mybucket/test-model.tar.gz

And deployed the model like this:

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

import boto3

client = boto3.client('sts')
account = client.get_caller_identity()['Account']
sess = boto3.session.Session()

role = sagemaker.get_execution_role()

ecr_uri = '123456789000.dkr.ecr.us-east-2.amazonaws.com/huggingface-pytorch-inference-custom'

hub = {
    'HF_TASK':'translation',
    'SAGEMAKER_CONTAINER_LOG_LEVEL': 10
}

huggingface_model = HuggingFaceModel(
    model_data="s3://mybucket/test-model.tar.gz",
    image_uri=ecr_uri,
    env=hub,
    role=role,
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge"
)

Note: The custom ECR image is just an extension of the canonical ones listed in deep-learning-containers/available_images.md at master · aws/deep-learning-containers · GitHub. I’ve tried deploying other off-the-shelf models and they work out of the box, e.g. Helsinki-NLP/opus-mt-de-en · Hugging Face

The deployment looks successful, but when I try to predict, e.g.

predictor.predict(["hello world"])

It throws the error:

---------------------------------------------------------------------------
ModelError                                Traceback (most recent call last)
<ipython-input-20-c5d2aedef5cb> in <module>
----> 1 predictor.predict(["hello world"])

/opt/conda/lib/python3.8/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model, target_variant, inference_id)
    159             data, initial_args, target_model, target_variant, inference_id
    160         )
--> 161         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    162         return self._handle_response(response)
    163 

/opt/conda/lib/python3.8/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    512                 )
    513             # The "self" in this scope is referring to the BaseClient.
--> 514             return self._make_api_call(operation_name, kwargs)
    515 
    516         _api_call.__name__ = str(py_operation_name)

/opt/conda/lib/python3.8/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    936             error_code = parsed_response.get("Error", {}).get("Code")
    937             error_class = self.exceptions.from_code(error_code)
--> 938             raise error_class(parsed_response, operation_name)
    939         else:
    940             return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "/.sagemaker/mms/models/model does not appear to have a file named config.json. Checkout \u0027https://huggingface.co//.sagemaker/mms/models/model/None\u0027 for available files."
}

Q: Is there additional configuration that I need to do after `m = pipeline("translation", model=multibert, tokenizer=berttokenizer); m.save_pretrained('test-model')`?

Q: Are there examples of deploying `EncoderDecoderModel` with different `pipeline` tasks?

marshmellow77 · December 21, 2022, 9:34am

Hi @alvations

I see a few potential issues in your code, but before I list them - what are you eventually trying to achieve? What is the job of the endpoint going to be? Just trying to get more context in case there are other ways to tackle the underlying problem.

In any case, here are some of my observations re your code:

The first block of code won’t run because you don’t define berttokenizer.
Replacing berttokenizer with tokenizer (which I assume was your intention), I can set up the pipeline. But when running the pipeline with a sample text I get an error: ValueError: 'decoder_start_token_id' or 'bos_token_id' has to be defined for encoder-decoder generation.
When tarballing the model directory, I don’t think you’re supposed to include the root directory (test-model in your case). The inference script will look in mms/models/model/ for the config file, but I believe your config file will end up in mms/models/model/test-model/.
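
To illustrate the last point, here is a minimal sketch of building the archive so that config.json sits at the top level instead of under test-model/. It uses only Python's standard tarfile module; `make_flat_model_archive` is a hypothetical helper name chosen for this example, not part of SageMaker or transformers, and the archive should be written outside the model directory so it does not include itself.

```python
import tarfile
from pathlib import Path


def make_flat_model_archive(model_dir, archive_path):
    """Archive every file under model_dir without the parent directory prefix.

    Hypothetical helper for illustration. Returns the member names written.
    """
    root = Path(model_dir)
    names = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store the path relative to model_dir, so config.json is
                # written as "config.json", not "test-model/config.json".
                arcname = path.relative_to(root).as_posix()
                tar.add(path, arcname=arcname)
                names.append(arcname)
    return names


# Example usage (assumes a local 'test-model' directory exists):
# make_flat_model_archive("test-model", "test-model.tar.gz")
```

From a shell, changing into the directory first and archiving its contents (for example `tar -czvf ../test-model.tar.gz *` run inside test-model) should give the same flat layout.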

Hope that helps!

Cheers
Heiko

alvations · March 8, 2023, 2:18pm

Thanks Heiko for the response. Sorry, I had to be away for a while. Apologies for the non-working example in the previous comment.

The goal is to save a model and load it so that I can deploy it on SageMaker. Given the model:

from transformers import PreTrainedTokenizerFast
from transformers import EncoderDecoderModel
from transformers import pipeline

tokenizer = PreTrainedTokenizerFast.from_pretrained("bert-base-multilingual-uncased")

tokenizer.bos_token = tokenizer.cls_token
tokenizer.eos_token = tokenizer.sep_token
tokenizer.add_special_tokens({'pad_token': '[PAD]'})

multibert = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-multilingual-uncased", "bert-base-multilingual-uncased"
)

# set special tokens
multibert.config.decoder_start_token_id = tokenizer.bos_token_id
multibert.config.eos_token_id = tokenizer.eos_token_id
multibert.config.pad_token_id = tokenizer.pad_token_id


m = pipeline("translation", model=multibert, tokenizer=tokenizer)
m.save_pretrained('test-model')

It saves in the directory with a structure that looks like:

! ls test-model/*

test-model/config.json		       test-model/special_tokens_map.json
test-model/generation_config.json  test-model/tokenizer_config.json
test-model/pytorch_model.bin	   test-model/tokenizer.json

After that I tarball it and push it into an S3 bucket:

! tar -cvzf test-model.tar.gz test-model/*
! aws s3 cp test-model.tar.gz s3://mybucket/test-model.tar.gz

Is that the right way to compress the model into tar.gz format?

And to deploy the model, are there additional steps that need to be checked so that the model can be loaded? E.g. do I need an inference.py file to use the predictor?

katespada97 · August 28, 2023, 9:01am

I experienced the same problem. Has it been resolved?

nickprock · August 31, 2023, 10:51am

I experienced the same problem.
My problem was in how I created the tar.gz file. I compressed the folder itself, which is wrong. Go into the folder and compress the files inside it instead.
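
A quick way to check an existing archive before uploading it is to list where config.json ended up. This is only a sketch using the standard tarfile module; `find_config_paths` is a hypothetical name for this example. If the result is `["config.json"]` the layout is flat; a result like `["test-model/config.json"]` means the folder itself was compressed.

```python
import tarfile


def find_config_paths(archive_path):
    """Return every member path in the archive whose file name is config.json.

    Hypothetical helper for illustration only.
    """
    with tarfile.open(archive_path, "r:*") as tar:
        # Drop a leading "./" in case the archive was created from "."
        names = [n[2:] if n.startswith("./") else n for n in tar.getnames()]
    return [n for n in names if n.split("/")[-1] == "config.json"]


# Example usage (assumes the archive exists locally):
# print(find_config_paths("test-model.tar.gz"))
```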
