I am trying to deploy a Hugging Face model (flan-xxl) to a SageMaker endpoint on a g4dn instance with an increased volume size. Based on this documentation, that should be possible. But when I try to deploy the endpoint, I get the following error:
ParamValidationError: Parameter validation failed:
Unknown parameter in ProductionVariants[0]: "VolumeSizeInGB", must be one of: VariantName, ModelName, InitialInstanceCount, InstanceType, InitialVariantWeight, AcceleratorType, CoreDumpConfig, ServerlessConfig.
My code looks like this:
from sagemaker.huggingface import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    # Zip file with requirements and an inference script that downloads flan-xxl
    model_data=s3_location,
    role=role,
    transformers_version="4.17",
    pytorch_version="1.10",
    py_version="py38",
)
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.12xlarge",
    endpoint_name="g4-12xl-flan-ul2",
    volume_size=250,  # this is the argument that triggers the error
)
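A check that might help narrow this down: `ParamValidationError` is raised by botocore's client-side validation, before any request is sent, so the "must be one of" list reflects what the installed botocore knows about `ProductionVariants`. Below is a minimal sketch that asks the local botocore whether it lists `VolumeSizeInGB` at all. The helper name `variant_knows_member` is made up for illustration, and this only inspects the local service model; it says nothing about whether the service accepts the parameter for a given instance type.

```python
try:
    import botocore.session
except ImportError:  # botocore is not installed in this environment
    botocore = None


def variant_knows_member(member="VolumeSizeInGB"):
    """Hypothetical helper: report whether the installed botocore lists
    `member` on SageMaker's ProductionVariant shape.

    Returns True or False, or None if botocore is not installed.
    """
    if botocore is None:
        return None
    # Loads the service model bundled with botocore; no credentials,
    # region, or network access are needed for this lookup.
    model = botocore.session.get_session().get_service_model("sagemaker")
    return member in model.shape_for("ProductionVariant").members


if __name__ == "__main__":
    print("VolumeSizeInGB known to botocore:", variant_knows_member())
```

If this prints `False`, the installed botocore does not know the parameter, which would be consistent with the error above.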
Any help would be greatly appreciated!