GGUF BYOC Deployment with AWS SageMaker: [Errno 28] No space left on device

I am trying to deploy DeepSeek quantized models on AWS SageMaker following this guide, which uses the Bring Your Own Container (BYOC) approach: GitHub - aws-samples/deploy-gguf-model-to-sagemaker.

Error: [Errno 28] No space left on device
I am getting an Errno 28 (not enough disk space) during endpoint deployment, while the model is being loaded: the update_model(bucket, key) function downloads the model artifact from S3 at the location given by the MODELPATH parameter.
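For context, here is a minimal sketch of what that download step typically looks like. This is a hypothetical reconstruction, not the sample's actual code: the local target path /tmp/model.gguf is an assumption, and which filesystem it lands on determines what fills up.

```python
import boto3

def update_model(bucket: str, key: str) -> str:
    """Download the GGUF artifact from S3 to local instance storage.

    Hypothetical sketch of the helper from
    aws-samples/deploy-gguf-model-to-sagemaker; paths are assumptions.
    """
    local_path = "/tmp/model.gguf"  # assumed target directory
    s3 = boto3.client("s3")
    s3.download_file(bucket, key, local_path)  # streams to disk, not RAM
    return local_path
```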

The DeepSeek model I downloaded and stored in S3 is https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF, using the Q6_K_L quant type, saved in GGUF format.

The only change I made to the sample code was pointing it at my DeepSeek model on S3, i.e. changing the MODELPATH variable to reference my GGUF artifact. I also used a different instance type from the sample: ml.g4dn.2xlarge for my endpoint.
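Roughly, the deployment looks like this (a sketch using the SageMaker Python SDK; the image URI and the S3 path are placeholders, and passing MODELPATH as a container environment variable follows the sample's convention):

```python
import sagemaker
from sagemaker.model import Model

role = sagemaker.get_execution_role()

model = Model(
    image_uri="<your-ecr-byoc-image-uri>",  # BYOC image built from the sample
    role=role,
    env={
        # Only change from the sample: point MODELPATH at my GGUF artifact
        "MODELPATH": "s3://<my-bucket>/DeepSeek-R1-Distill-Qwen-32B-Q6_K_L.gguf",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.2xlarge",  # changed from the sample's instance type
)
```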

This method of deployment works fine for smaller models, like the one in the sample, but seems to fail when I reference a much larger model.

Maybe I'm missing something with regard to how instance storage works.
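One sanity check worth running before the download starts is comparing the artifact's size on S3 against the free space on the target filesystem. A sketch, assuming boto3 access to the artifact and /tmp as the download target:

```python
import shutil
import boto3

def check_space(bucket: str, key: str, target_dir: str = "/tmp") -> None:
    """Compare the S3 object's size against free space at the download target."""
    size = boto3.client("s3").head_object(Bucket=bucket, Key=key)["ContentLength"]
    free = shutil.disk_usage(target_dir).free
    print(f"artifact: {size / 1e9:.1f} GB, free at {target_dir}: {free / 1e9:.1f} GB")
    if free < size * 1.1:  # ~10% headroom for partial/temp files
        raise RuntimeError("Not enough disk space for the model download")
```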

Could someone guide me on how I could potentially resolve this issue?

df -h output: (screenshot attached, not reproduced here)


It seems that the same error occurs in SageMaker due to inode exhaustion as well as disk space exhaustion.

However, the fact that it only occurs with large models suggests that either llama.cpp is using too much RAM and causing disk swapping, or something is going wrong with the cache when the model is downloaded or during inference. If it's the cache, you may be able to specify its location and maximum size using environment variables.
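For example, if the container happens to use huggingface_hub for downloads, its cache can be redirected to a filesystem with more room. A sketch under that assumption: HF_HOME and HF_HUB_CACHE are documented huggingface_hub variables and TMPDIR is honored by Python's tempfile module, but the /opt/ml/model paths are just guesses at where a larger volume might be mounted in your container.

```python
import os

# Set these before huggingface_hub is imported, since it reads them at import
# time. All paths below are assumptions; adjust for your container layout.
os.environ["HF_HOME"] = "/opt/ml/model/hf-home"
os.environ["HF_HUB_CACHE"] = "/opt/ml/model/hf-hub-cache"
os.environ["TMPDIR"] = "/opt/ml/model/tmp"  # where partial downloads land

for path in (os.environ["HF_HOME"],
             os.environ["HF_HUB_CACHE"],
             os.environ["TMPDIR"]):
    os.makedirs(path, exist_ok=True)
```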

It’s also worth noting that DeepSeek R1 is a relatively new model, but if smaller ones are working, it’s probably not the architecture…
