Hi,
I want to deploy my own model on SageMaker. Usually I do this by providing just a model ID, but in this case I want to use a local model (Llama 2 converted to the Hugging Face format).
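For context, the hub-based flow I normally use passes the model ID through the container environment rather than through `model_data`; a minimal sketch of that (the model ID and instance settings below are placeholders, not what I actually deploy):

```python
# Hedged sketch of the hub-based path: the container downloads the model
# identified by HF_MODEL_ID at startup instead of reading it from model_data.
hub_env = {
    "HF_MODEL_ID": "meta-llama/Llama-2-7b-hf",  # placeholder model id
    "SM_NUM_GPUS": "1",                         # assumption: single-GPU instance
}
# This dict would then be passed as env=hub_env to HuggingFaceModel
# before calling .deploy().
```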
```python
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

# retrieve the Hugging Face LLM (TGI) container image
llm_image = get_huggingface_llm_image_uri(
    "huggingface",
    version="0.8.2",
)

# create HuggingFaceModel pointing at the archive in S3
llm_model = HuggingFaceModel(
    model_data=s3_location,
    transformers_version="4.31.0",
    role=role,
    model_server_workers=4,
    image_uri=llm_image,
    # env=config
)
```
The archive is structured as described in the docs:

```
model.tar.gz/
|- pytorch_model.bin
|- ...
|- code/
   |- inference.py
   |- requirements.txt
```
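For completeness, this is roughly how the archive gets built (the file names below are dummies standing in for the real weights, and the S3 upload line is illustrative with a placeholder path):

```shell
# Build a dummy layout matching the documented structure, then archive it
# from inside the directory so the files sit at the archive root.
mkdir -p model/code
touch model/pytorch_model.bin model/config.json
touch model/code/inference.py model/code/requirements.txt
tar -czf model.tar.gz -C model .
tar -tzf model.tar.gz   # should list ./pytorch_model.bin, ./code/inference.py, ...
# aws s3 cp model.tar.gz s3://<bucket>/<prefix>/model.tar.gz   # placeholder path
```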
`inference.py`:

```python
import torch
from transformers import pipeline

def model_fn(model_dir):
    # load the model from the unpacked artifact directory
    device = 0 if torch.cuda.is_available() else "cpu"
    pipe = pipeline(model=model_dir, device=device)
    return pipe

def predict_fn(data, model):
    print("inside predict_fn")
    print("data")
    print(data)
    return model(data["text"])
```
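A quick way to sanity-check the `predict_fn` contract locally is to pass a stub in place of the pipeline (the stub and its output shape below are placeholders, not real model output):

```python
# Stub standing in for the transformers pipeline, so the handler contract
# can be exercised without torch or a GPU.
def stub_pipeline(text):
    return [{"generated_text": text + " ... (stubbed)"}]

def predict_fn(data, model):
    return model(data["text"])

result = predict_fn({"text": "Hello"}, stub_pipeline)
print(result)  # [{'generated_text': 'Hello ... (stubbed)'}]
```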
`requirements.txt`:

```
git+https://github.com/huggingface/transformers.git
torch==1.13.1
boto3
```
When deploying, I get an error about a missing model ID:

```
HF_MODEL_ID must be set
```

I don't quite understand this, since I do provide `model_data`.
@philschmid Do you have any idea why the ID is required here?
Your help is highly appreciated. Kindest regards,
Philip