I am trying to deploy quantized DeepSeek models on AWS SageMaker, following this guide that uses the Bring Your Own Container (BYOC) approach: GitHub - aws-samples/deploy-gguf-model-to-sagemaker.
Error: [Errno 28] No space left on device

During endpoint deployment I get an Errno 28 (not enough space) while the model is being loaded, inside the update_model(bucket, key) function, which downloads the model from S3 using the MODELPATH parameter.
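To narrow this down, here is a stdlib-only sketch of a pre-flight check I could run before the download, comparing free space at the target path against the artifact size. The `ensure_free_space` helper and the 27 GB figure are my own placeholders (a rough size for this quant), not part of the sample repo:

```python
import errno
import shutil

def ensure_free_space(path: str, required_bytes: int) -> None:
    """Raise the same ENOSPC error early if `path` cannot hold the artifact."""
    free = shutil.disk_usage(path).free
    if free < required_bytes:
        raise OSError(
            errno.ENOSPC,
            f"Need {required_bytes} bytes at {path}, only {free} free",
        )

# Hypothetical usage before calling update_model(bucket, key):
# check that the download directory can hold a ~27 GB GGUF file.
# ensure_free_space("/tmp", 27 * 1024**3)
ensure_free_space(".", 1024)  # trivially small check as a smoke test
```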
The DeepSeek model I downloaded and stored in S3 is https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF, using the Q6_K_L quant type, saved in GGUF format.
The only change I made to the sample code was pointing it at my DeepSeek model artifact on S3, by changing the MODELPATH variable to reference that S3 object. I also used a different instance type from the sample, ml.g4dn.2xlarge, for my endpoint.
This method of deployment works fine for smaller models, like the one in the sample, but seems to fail when I reference a much larger one.
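For a sense of scale, a rough back-of-the-envelope estimate of the artifact size, assuming a Q6_K-class quant averages about 6.6 bits per weight (an approximation on my part, not an exact spec value):

```python
# Rough size estimate for a 32B-parameter GGUF at a Q6_K-class quantization.
# The ~6.6 bits/weight figure is an approximation, not an exact spec value.
params = 32_000_000_000
bits_per_weight = 6.6
approx_bytes = params * bits_per_weight / 8
print(f"~{approx_bytes / 1e9:.0f} GB")  # roughly 26 GB
```

So the download alone needs on the order of 25-30 GB of free disk, far more than the sample's small model.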
Maybe I'm missing something about how instance storage works here. Could someone guide me on how to resolve this issue?
Output of `df -h`: