Deploying TheBloke/Luna-AI-Llama2-Uncensored-GGML

I’m trying to deploy the Hugging Face model TheBloke/Luna-AI-Llama2-Uncensored-GGML to AWS SageMaker.

I created a domain, launched SageMaker Studio, and opened a new notebook with:

Image: Data Science 3.0
Kernel: Python 3

I tried running the following code:

import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration.
hub = {
    'HF_MODEL_ID': 'TheBloke/Luna-AI-Llama2-Uncensored-GGML',
    'SM_NUM_GPUS': json.dumps(1)
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

# send request
predictor.predict({
    "inputs": "My name is Clara and I am",
})

I’m getting the following errors and warning:

UnexpectedStatusException: Error hosting endpoint huggingface-pytorch-tgi-inference-2023-09-10-11-59-20-948: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint…

ERROR: pip’s dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
distributed 2022.7.0 requires tornado<6.2,>=6.0.3, but you have tornado 6.3.2 which is incompatible.

WARNING: Running pip as the ‘root’ user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead:

I checked the CloudWatch logs following the instructions in the first error, and I found many DownloadError logs for different files. For example:
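In case it helps anyone retrace this, the logs can also be pulled programmatically instead of through the console. This is a minimal boto3 sketch (my own helper, not an AWS API): SageMaker endpoints write to the CloudWatch log group `/aws/sagemaker/Endpoints/<endpoint-name>`, with the endpoint name taken from the error message above.

```python
def endpoint_log_group(endpoint_name: str) -> str:
    # SageMaker endpoint containers log to this CloudWatch log group.
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"

def print_endpoint_logs(endpoint_name: str) -> None:
    """Print recent CloudWatch log events for a SageMaker endpoint."""
    # Imported here so the helper above works without AWS dependencies.
    import boto3

    logs = boto3.client("logs")
    group = endpoint_log_group(endpoint_name)
    # Newest streams first, so recent failures show up immediately.
    streams = logs.describe_log_streams(
        logGroupName=group, orderBy="LastEventTime", descending=True
    )
    for stream in streams["logStreams"]:
        events = logs.get_log_events(
            logGroupName=group, logStreamName=stream["logStreamName"]
        )
        for event in events["events"]:
            print(event["message"])

# print_endpoint_logs("huggingface-pytorch-tgi-inference-2023-09-10-11-59-20-948")
```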

Error: DownloadError
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/", line 182, in download_weights
    utils.convert_files(local_pt_files, local_st_files, discard_names)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/", line 106, in convert_files
    convert_file(pt_file, sf_file, discard_names)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/", line 65, in convert_file
    loaded = torch.load(pt_file, map_location="cpu")
  File "/opt/conda/lib/python3.9/site-packages/torch/", line 815, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/opt/conda/lib/python3.9/site-packages/torch/", line 1033, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)

2023-09-12T02:08:45.377+08:00 _pickle.UnpicklingError: could not find MARK
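One detail that may help with diagnosis: the traceback shows torch.load falling back to pickle, and this particular error is what pickle produces when pointed at a GGML file. A GGML checkpoint starts with the magic number 0x67676d6c ("ggml") written little-endian, so the first byte on disk is b"l", which pickle interprets as the LIST opcode; LIST expects a prior MARK on the stack, finds none, and fails. A minimal sketch of that failure mode (illustrative only, not the container's actual code):

```python
import io
import pickle

# A GGML checkpoint begins with the magic number 0x67676d6c ("ggml"),
# stored little-endian, so the first byte on disk is b"l".
ggml_header = (0x67676D6C).to_bytes(4, "little") + b"\x00" * 16
assert ggml_header.startswith(b"lmgg")

# pickle reads b"l" as the LIST opcode, which pops items back to the most
# recent MARK -- there is none, hence "could not find MARK".
try:
    pickle.load(io.BytesIO(ggml_header))
except pickle.UnpicklingError as exc:
    print(f"_pickle.UnpicklingError: {exc}")  # -> could not find MARK
```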