I am creating a batch transform job with the code below, but it fails immediately with a 403 Forbidden client error. My CloudWatch logs show the following output (full traceback below):
This is an experimental beta features, which allows downloading model from the Hugging Face Hub on start up. It loads the model defined in the env var `HF_MODEL_ID`.
immediately followed by:
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/api/models/sentence-transformers/all-mpnet-base-v2
after which the batch job fails. Deploying the same model to a real-time endpoint works fine.
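For comparison, the endpoint path is just the usual deploy call on the same HuggingFaceModel defined below (a rough sketch rather than the exact deploy code; the instance type here is an assumption):

# real-time endpoint using the same HuggingFaceModel object created below
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='ml.p3.2xlarge',  # assumption, any supported instance type
)

# the Hugging Face inference toolkit expects an "inputs" field in the request payload
print(predictor.predict({'inputs': 'This sentence gets embedded.'}))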
The full code for the batch job:
from sagemaker import get_execution_role
from sagemaker.huggingface import HuggingFaceModel

role = get_execution_role()  # execution role defined earlier in the notebook

# Hub model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'sentence-transformers/all-mpnet-base-v2',
    'HF_TASK': 'feature-extraction'
}

# create Hugging Face Model class
huggingface_model = HuggingFaceModel(
    transformers_version='4.6',
    pytorch_version='1.7',
    py_version='py36',
    env=hub,
    role=role,
)

# create the batch transform job
batch_job = huggingface_model.transformer(
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    output_path='s3://kj-temp/hf/out',  # output is written to the same S3 prefix as the input
    strategy='SingleRecord'
)

# start the batch transform job, using the S3 data as input
batch_job.transform(
    data=test_input,
    content_type='application/json',
    split_type='Line',
    wait=False
)
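For reference, test_input is an S3 URI to a JSON Lines file; with content_type='application/json' and split_type='Line', each line is sent to the model as one request. A minimal sketch of how such an input could be built (the bucket/key here are hypothetical, and the "inputs" key is what the Hugging Face inference toolkit's default handler expects):

import json
import boto3

# two example records; the "inputs" key is assumed to be the payload format
records = [
    {"inputs": "first sentence to embed"},
    {"inputs": "second sentence to embed"},
]
body = "\n".join(json.dumps(r) for r in records)

# hypothetical bucket/key; test_input then points at this object
boto3.client("s3").put_object(Bucket="kj-temp", Key="hf/in/input.jsonl", Body=body.encode("utf-8"))
test_input = "s3://kj-temp/hf/in/input.jsonl"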
and the full traceback:
Traceback (most recent call last):
  File "/usr/local/bin/dockerd-entrypoint.py", line 23, in <module>
    serving.main()
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/serving.py", line 34, in main
    _start_mms()
  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 49, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 206, in call
    return attempt.get(self._wrap_exception)
  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 247, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/opt/conda/lib/python3.6/site-packages/six.py", line 719, in reraise
    raise value
  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 200, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/serving.py", line 30, in _start_mms
    mms_model_server.start_model_server(handler_service=HANDLER_SERVICE)
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/mms_model_server.py", line 75, in start_model_server
    use_auth_token=HF_API_TOKEN,
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/transformers_utils.py", line 154, in _load_model_from_hub
    model_info = _api.model_info(repo_id=model_id, revision=revision, token=use_auth_token)
  File "/opt/conda/lib/python3.6/site-packages/huggingface_hub/hf_api.py", line 155, in model_info
    r.raise_for_status()
  File "/opt/conda/lib/python3.6/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
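The failing call in the last frames is HfApi.model_info from huggingface_hub. A quick way to check the same call outside of SageMaker (a sketch; run locally, without a token, to confirm the model repo itself is publicly reachable):

from huggingface_hub import HfApi

# same Hub API call that _load_model_from_hub makes inside the container,
# run locally without a token against the public model repo
info = HfApi().model_info(repo_id="sentence-transformers/all-mpnet-base-v2")
print(info)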