GGUF BYOC Deployment with AWS SageMaker: [Errno 28] No space left on device

It seems that on SageMaker this same error can be caused by inode exhaustion as well as by running out of disk space.
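To tell the two cases apart, something like the following could be run inside the container. It is only a rough sketch using the Python standard library; the function name `storage_report` is made up for illustration, and the paths in the loop (`/`, `/tmp`, `/opt/ml/model`) are assumptions about the container layout, so adjust them to wherever your model is actually written.

```python
import os
import shutil


def storage_report(path="/"):
    """Return free-space and free-inode figures for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # bytes: total, used, free
    vfs = os.statvfs(path)           # f_files = total inodes, f_favail = inodes free for non-root users
    return {
        "path": path,
        "bytes_total": usage.total,
        "bytes_free": usage.free,
        "inodes_total": vfs.f_files,
        "inodes_free": vfs.f_favail,
    }


if __name__ == "__main__":
    # Candidate paths are assumptions; not every container has all of them.
    for p in ("/", "/tmp", "/opt/ml/model"):
        if os.path.exists(p):
            print(storage_report(p))
```

If `bytes_free` is still large but `inodes_free` is at or near zero, the problem is inodes rather than raw disk space.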

However, the fact that it only occurs with large models suggests one of two things: either llama.cpp is using so much RAM that the system starts swapping to disk, or something is going wrong with a cache, either when the model is downloaded or during inference. If it’s the cache, you may be able to set its location and maximum size with environment variables.
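As a rough sketch of the "move the cache" idea, assuming the model is downloaded through Hugging Face libraries (which I don't know for sure in your setup): `HF_HOME` is read by `huggingface_hub` and `TMPDIR` by Python's `tempfile` and many command-line tools. The helper names `pick_roomiest` and `redirect_caches` and the candidate directories are hypothetical. This only covers the location part; whether a size limit can be set depends on the tool doing the caching.

```python
import os
import shutil


def pick_roomiest(candidates):
    """Return the existing candidate directory with the most free bytes, or None."""
    existing = [d for d in candidates if os.path.isdir(d)]
    if not existing:
        return None
    return max(existing, key=lambda d: shutil.disk_usage(d).free)


def redirect_caches(base_dir, env=os.environ):
    """Point Hugging Face and temp-file locations at subdirectories of base_dir."""
    hf_home = os.path.join(base_dir, "hf_home")
    tmp_dir = os.path.join(base_dir, "tmp")
    os.makedirs(hf_home, exist_ok=True)
    os.makedirs(tmp_dir, exist_ok=True)
    env["HF_HOME"] = hf_home  # read by huggingface_hub
    env["TMPDIR"] = tmp_dir   # read by tempfile and many CLI tools
    return hf_home, tmp_dir


if __name__ == "__main__":
    # Candidate directories are assumptions about the container layout.
    base = pick_roomiest(["/tmp", "/opt/ml/model", "/home"])
    if base is not None:
        print(redirect_caches(base))
```

This has to run before the libraries that read these variables are imported (or the variables can be set in the container's environment instead), otherwise the old locations may already be in use.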

It’s also worth noting that DeepSeek R1 is a relatively new model, but if smaller models are working, the architecture itself is probably not the problem…
