Use my finetuned Bert Model in SageMaker BatchTransform

Hi,

I finetuned a BERT model on AWS EC2 (i.e. not on SageMaker), and would like to use the resulting model in a SageMaker Pipeline that ultimately does a BatchTransform step.

I’ve saved the model as a .pt file and added it to a tar.gz archive, but I get an error when I try to use it for inference. I’ve tried adding a config.json file from sample HuggingFace model repos on the HuggingFace website, but still get the same error. I’ve also tried following tutorials to train/finetune a HuggingFace model on SageMaker, but unfortunately I do not have quota for GPU machines for SageMaker training, only for EC2 training outside SageMaker.

from sagemaker.huggingface import HuggingFaceModel
import sagemaker 

role = sagemaker.get_execution_role()

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   model_data="https://sagemaker-studio-....s3.amazonaws.com/awsmodel2.tar.gz",  # location of the model archive in S3
   role=role,  # IAM role with permissions to create an Endpoint
   transformers_version="4.12.3",  # transformers version used
   pytorch_version="1.9.1",  # pytorch version used
   py_version="py38",  # python version of the DLC
   env={"HF_TASK": "text-classification"},  # task used to build the inference pipeline
)

predictor = huggingface_model.deploy(
   initial_instance_count=1,
   instance_type="ml.m5.xlarge"
)

Error:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "Can\u0027t load config for \u0027/.sagemaker/mms/models/model\u0027. Make sure that:\n\n- \u0027/.sagemaker/mms/models/model\u0027 is a correct model identifier listed on \u0027https://huggingface.co/models\u0027\n (make sure \u0027/.sagemaker/mms/models/model\u0027 is not a path to a local directory with something else, in that case)\n\n- or \u0027/.sagemaker/mms/models/model\u0027 is the correct path to a directory containing a config.json file\n\n"
}
". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/huggingface-pytorch-inference-2022-04-28-20-10-46-917 in account XX for more information.

Hi Rony,

It sounds like the model.tar.gz file hasn’t been created properly. Please first make sure that you have all the required files in the tar.gz file; see the "Deploy models to Amazon SageMaker" documentation page.

One question regarding the config.json file: you say you used a sample file from the model hub, is that correct? You should use the config file that belongs to the model you trained.

Hope that helps!

Cheers
Heiko

Thank you @marshmellow77!

I used

model.save_pretrained("dir_name") after training a model and putting it in eval mode

The command created two files: config.json and pytorch_model.bin

I compressed the directory using the following command

 tar -czf model_try_429.tar.gz outputs_429/

and uploaded the tar.gz file to S3

I used the same code as in my first post to create the model, and got an error when predicting with:

data = {
   "inputs": "The new Hugging Face SageMaker DLC makes it super easy to deploy models in production. I love it!",
}
predictor.predict(data)

2022-04-29T20:09:05,830 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Traceback (most recent call last):
2022-04-29T20:09:05,831 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/configuration_utils.py", line 594, in _get_config_dict
2022-04-29T20:09:05,831 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - resolved_config_file = cached_path(
2022-04-29T20:09:05,831 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/file_utils.py", line 1936, in cached_path
2022-04-29T20:09:05,831 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - raise EnvironmentError(f"file {url_or_filename} not found")
2022-04-29T20:09:05,831 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - OSError: file /.sagemaker/mms/models/model/config.json not found
2022-04-29T20:09:05,831 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -
2022-04-29T20:09:05,832 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - During handling of the above exception, another exception occurred:
2022-04-29T20:09:05,832 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -
2022-04-29T20:09:05,832 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Traceback (most recent call last):
2022-04-29T20:09:05,832 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 219, in handle
2022-04-29T20:09:05,832 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - self.initialize(context)
2022-04-29T20:09:05,832 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 77, in initialize
2022-04-29T20:09:05,833 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - self.model = self.load(self.model_dir)
2022-04-29T20:09:05,833 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 104, in load
2022-04-29T20:09:05,833 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - hf_pipeline = get_pipeline(task=os.environ["HF_TASK"], model_dir=model_dir, device=self.device)
2022-04-29T20:09:05,833 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/sagemaker_huggingface_inference_toolkit/transformers_utils.py", line 272, in get_pipeline
2022-04-29T20:09:05,834 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
2022-04-29T20:09:05,834 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/pipelines/__init__.py", line 541, in pipeline
2022-04-29T20:09:05,834 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - config = AutoConfig.from_pretrained(model, revision=revision, _from_pipeline=task, **model_kwargs)
2022-04-29T20:09:05,835 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 637, in from_pretrained
2022-04-29T20:09:05,835 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
2022-04-29T20:09:05,835 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/configuration_utils.py", line 546, in get_config_dict
2022-04-29T20:09:05,836 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
2022-04-29T20:09:05,836 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/configuration_utils.py", line 630, in _get_config_dict
2022-04-29T20:09:05,836 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - raise EnvironmentError(
2022-04-29T20:09:05,837 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - OSError: Can't load config for '/.sagemaker/mms/models/model'. If you were trying to load it from 'Models - Hugging Face', make sure you don't have a local directory with the same name. Otherwise, make sure '/.sagemaker/mms/models/model' is the correct path to a directory containing a config.json file
2022-04-29T20:09:05,837 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -
2022-04-29T20:09:05,837 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - During handling of the above exception, another exception occurred:
2022-04-29T20:09:05,837 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -
2022-04-29T20:09:05,838 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Traceback (most recent call last):
2022-04-29T20:09:05,841 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/mms/service.py", line 108, in predict
2022-04-29T20:09:05,844 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - ret = self._entry_point(input_batch, self.context)
2022-04-29T20:09:05,845 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 243, in handle
2022-04-29T20:09:05,845 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - raise PredictionException(str(e), 400)
2022-04-29T20:09:05,845 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - mms.service.PredictionException: Can't load config for '/.sagemaker/mms/models/model'. If you were trying to load it from 'Models - Hugging Face', make sure you don't have a local directory with the same name. Otherwise, make sure '/.sagemaker/mms/models/model' is the correct path to a directory containing a config.json file : 400
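
The OSError in this log says that config.json was not found directly inside the model directory. One way to check an archive locally before uploading it is to list which files sit at its root. The following is only a sketch: the helper names root_level_files and make_archive are made up for illustration, and the two in-memory archives merely imitate the layouts discussed in this thread.

```python
import io
import tarfile


def root_level_files(tar_bytes):
    """Return the sorted names of regular files stored at the root of a gzipped tar."""
    names = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            # Some tar invocations prefix members with "./"; strip it before checking.
            name = member.name[2:] if member.name.startswith("./") else member.name
            if member.isfile() and "/" not in name:
                names.append(name)
    return sorted(names)


def make_archive(paths):
    """Build a small in-memory .tar.gz containing dummy files (demonstration only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path in paths:
            payload = b"{}"
            info = tarfile.TarInfo(name=path)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


# Layout produced by archiving the directory itself (files under a prefix).
nested = make_archive(["outputs_429/config.json", "outputs_429/pytorch_model.bin"])
# Layout with the files at the archive root.
flat = make_archive(["config.json", "pytorch_model.bin"])

print(root_level_files(nested))
print(root_level_files(flat))
```

For a real archive, the bytes can be read with open(path, "rb").read() and passed to root_level_files; an empty result suggests the files are nested under a directory.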

I suspect that this tar command is incorrect:

  1. You seem to archive the directory itself rather than the files in it, so inside the archive config.json ends up under outputs_429/ instead of at the root, which is where the container looks for it.
  2. The file name HAS to be model.tar.gz, as far as I know.

Instead, you could try navigating into the directory and running tar zcvf model.tar.gz *, as described in the documentation.

In the end you want the model files (in your case config.json and pytorch_model.bin) to sit at the root of model.tar.gz, not inside a subdirectory.
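
The same packaging can also be done from Python instead of the shell. This is a minimal sketch using the standard tarfile module, not the documented procedure: pack_model_dir is a hypothetical helper, and the placeholder files only stand in for what save_pretrained wrote in this thread.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_model_dir(model_dir, archive_path):
    """Write every entry of model_dir into a gzipped tar without the directory prefix."""
    model_dir = Path(model_dir)
    with tarfile.open(archive_path, mode="w:gz") as archive:
        for path in sorted(model_dir.iterdir()):
            # arcname drops the parent directory so each file lands at the archive root.
            archive.add(path, arcname=path.name)
    return archive_path


# Demonstration with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "outputs_429"
    model_dir.mkdir()
    (model_dir / "config.json").write_text("{}")
    (model_dir / "pytorch_model.bin").write_bytes(b"\x00")

    archive_path = pack_model_dir(model_dir, Path(tmp) / "model.tar.gz")
    with tarfile.open(archive_path, mode="r:gz") as archive:
        member_names = sorted(archive.getnames())

print(member_names)
```

Writing the archive outside the model directory avoids accidentally packing the archive into itself.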

@marshmellow77 That fixed it, thank you so much!
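
For the BatchTransform step mentioned in the first post, a rough sketch with the SageMaker Python SDK might look like the following. It was not part of this thread and is untested here: run_batch_transform and its four arguments are hypothetical, the instance type is simply reused from the deploy call above, and the input is assumed to be a JSON Lines file in S3 with one {"inputs": "..."} object per line.

```python
def run_batch_transform(model_data, role, input_s3_uri, output_s3_uri):
    """Start a batch transform job for a packaged Hugging Face model (sketch only)."""
    # Imported lazily so the function can be defined without the SageMaker SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    huggingface_model = HuggingFaceModel(
        model_data=model_data,  # S3 location of model.tar.gz
        role=role,  # IAM role with SageMaker permissions
        transformers_version="4.12.3",
        pytorch_version="1.9.1",
        py_version="py38",
        env={"HF_TASK": "text-classification"},
    )

    # Create a transformer from the model instead of deploying an endpoint.
    batch_job = huggingface_model.transformer(
        instance_count=1,
        instance_type="ml.m5.xlarge",
        output_path=output_s3_uri,
        strategy="SingleRecord",
    )

    # Assumption: the input file holds one JSON object per line.
    batch_job.transform(
        data=input_s3_uri,
        content_type="application/json",
        split_type="Line",
    )
    return batch_job
```

Inside a SageMaker Pipeline the same model would typically be wrapped in a transform step rather than called directly, so treat this only as a starting point.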