HF Model Deployment Trust Remote Code

Hi,

I am deploying Alibaba-NLP/gte-large-en-v1.5 to SageMaker via the sagemaker.huggingface.HuggingFaceModel.deploy method.

When requesting inference I get the following error:

An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "Loading /.sagemaker/mms/models/Alibaba-NLP__gte-large-en-v1.5 requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option `trust_remote_code\u003dTrue` to remove this error."
}

This is my Hub env var dictionary for deployment:

# Hub Model configuration. <https://huggingface.co/models> 
hub = {
    'HF_MODEL_ID':'Alibaba-NLP/gte-large-en-v1.5', # model_id from hf.co/models
    'HF_TASK':'feature-extraction', 
    'MMS_MAX_REQUEST_SIZE': json.dumps(200000000),
    'MMS_MAX_RESPONSE_SIZE': json.dumps(200000000),
    'TRUST_REMOTE_CODE': json.dumps(True) # https://github.com/huggingface/text-generation-inference/issues/493
}

I have also tried using HF_TRUST_REMOTE_CODE instead, with the same result. How can I set this boolean to true in my SageMaker endpoint environment?

Thanks.

My full deployment code is here:

import json

from sagemaker.huggingface import HuggingFaceModel

# Hub Model configuration. <https://huggingface.co/models>
hub = {
    'HF_MODEL_ID': 'Alibaba-NLP/gte-large-en-v1.5', # model_id from hf.co/models
    'HF_TASK': 'feature-extraction', # NLP task you want to use for predictions
    'MMS_MAX_REQUEST_SIZE': json.dumps(200000000),
    'MMS_MAX_RESPONSE_SIZE': json.dumps(200000000),
    'TRUST_REMOTE_CODE': json.dumps(True), # https://github.com/huggingface/text-generation-inference/issues/493
    # 'MAX_INPUT_LENGTH': json.dumps(3000) # https://github.com/huggingface/text-embeddings-inference/issues/141
    # 'SM_NUM_GPUS': '1',
}

huggingface_model = HuggingFaceModel(
    env=hub, # configuration for loading model from Hub
    role=role, # iam role with permissions to create an Endpoint
    py_version='py310',
    transformers_version="4.37.0", # transformers version used
    pytorch_version="2.1.0", # pytorch version used
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    endpoint_name=ENDPOINT_NAME,
)
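
For reference, every value in the env dictionary reaches the endpoint as a string, so json.dumps(True) sends the lowercase string "true" rather than a Python bool. Below is a minimal sketch of that serialization. parse_bool_env is a hypothetical helper that only illustrates how a container might read such a value. It is an assumption, not the Hugging Face container's actual parsing logic, and it does not show which variable name the container looks for.

```python
import json

# Same keys as the hub dictionary above. Values are serialized up front
# because container environment variables are always strings.
hub = {
    "HF_MODEL_ID": "Alibaba-NLP/gte-large-en-v1.5",
    "HF_TASK": "feature-extraction",
    "MMS_MAX_REQUEST_SIZE": json.dumps(200000000),
    "MMS_MAX_RESPONSE_SIZE": json.dumps(200000000),
    "TRUST_REMOTE_CODE": json.dumps(True),
}


def parse_bool_env(value: str) -> bool:
    """Hypothetical helper: one way a container might interpret a boolean env string."""
    return value.strip().lower() in {"1", "true", "yes", "y", "on"}


# Every value must already be a string before it is passed as env=hub.
for key, value in hub.items():
    assert isinstance(value, str), f"{key} is not a string"

print(hub["TRUST_REMOTE_CODE"])  # true
```

So the value itself is serialized as expected. Whether the container honors it depends on the variable name and the container version it reads.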