Transformers cache not loading from a new VM

We have set up an AWS EC2 instance and initialized the Qwen/Qwen2-VL-7B-Instruct (https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct) and mistralai/Pixtral-12B-2409 models from Hugging Face.

When we create an AMI from this VM and launch a new instance from it, the transformers library isn't able to load the pretrained models from the cache.

When I delete .cache on the newly created instance, it is able to download and load the models again.

Is this expected behaviour? The instance configuration is exactly the same.
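For anyone trying to reproduce this, one way to see what the baked-in cache actually looks like on the new instance is to inspect it directly (a minimal diagnostic sketch, assuming the default cache location and the hub's `models--{org}--{name}` on-disk layout; the helper names are mine):

```python
import os

def cache_dir_for(repo_id, cache_root=None):
    # Map a repo id like "Qwen/Qwen2-VL-7B-Instruct" to its cache folder,
    # assuming the hub's "models--{org}--{name}" naming convention.
    root = cache_root or os.path.expanduser("~/.cache/huggingface/hub")
    return os.path.join(root, "models--" + repo_id.replace("/", "--"))

def inspect_cache(repo_id, cache_root=None):
    # Report whether the cache folder exists and whether any downloads were
    # interrupted (partial files are left with an ".incomplete" suffix, which
    # would explain a cache that refuses to load after the AMI snapshot).
    path = cache_dir_for(repo_id, cache_root)
    info = {"path": path, "exists": os.path.isdir(path), "incomplete": []}
    blobs = os.path.join(path, "blobs")
    if info["exists"] and os.path.isdir(blobs):
        info["incomplete"] = [f for f in os.listdir(blobs)
                              if f.endswith(".incomplete")]
    return info

print(inspect_cache("Qwen/Qwen2-VL-7B-Instruct"))
```

If `incomplete` is non-empty on the new instance, the snapshot may have captured a partially written cache.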


The symptoms are different but the problem could be similar to this.

Space doesn't seem to be the issue; I have enough free space on the VM. The moment I delete .cache, it re-downloads everything and starts working.

Your error is coming from the dataset cache. Datasets caches the dataset on disk to work with it properly. The default cache_dir is ~/.cache/huggingface/datasets. This directory doesn't seem to be on the mounted EBS volume.

I don't think that's the cause; I suspect it's around here.

I'm not trying to cache a dataset; it's just the model weights, and the AMI has the weights on its root volume. The primary purpose is to boot the process as fast as possible.
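Since the weights are supposed to be baked into the image, one thing worth trying is forcing offline loading so a bad cache fails fast instead of silently re-downloading or hanging. A sketch (assumes the model was fully downloaded before the AMI was created):

```python
import os

# Tell the Hugging Face hub client never to touch the network; any missing or
# corrupt cache entry will then raise immediately instead of re-downloading.
# Must be set before any hub access happens.
os.environ["HF_HUB_OFFLINE"] = "1"

# The actual load (commented out here, as it needs the 7B weights on disk):
# from transformers import AutoModelForVision2Seq
# model = AutoModelForVision2Seq.from_pretrained(
#     "Qwen/Qwen2-VL-7B-Instruct", local_files_only=True
# )
```

`local_files_only=True` on `from_pretrained` has the same effect per call; the error it raises should say which cached file is missing or unusable.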

Perhaps it's an unresolved issue?

I wonder if it’s possible to avoid this by manually clearing the cache before execution.
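Something like this could run on first boot before the app starts (a sketch; `clear_hf_cache` is a name I made up, and the default cache root is an assumption):

```python
import os
import shutil

def clear_hf_cache(cache_root=None):
    # Remove the Hugging Face cache so the new instance re-downloads clean
    # weights on first use. Returns the path that was (or would be) removed.
    root = cache_root or os.path.expanduser("~/.cache/huggingface")
    if os.path.isdir(root):
        shutil.rmtree(root)
    return root
```

This trades away the fast-boot benefit of baking weights into the AMI, though, so it's more a workaround than a fix.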

I have seen this issue reported in the GitHub issues.

I'm not facing that exact error, but in my case it starts loading the weights and stays stuck at 0% when I spin up a new instance.