Downloading a model from the hub without loading it

Is there a way to download a model with the same API as .from_pretrained, but without loading it? I want to separate the two steps. I’m already aware of huggingface_hub.snapshot_download, but it downloads the whole repo, whereas .from_pretrained is more fine-grained: for instance, it only downloads the .safetensors files if I pass the appropriate flags.

Hi,
I tried:

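from huggingface_hub import hf_hub_url, hf_hub_download

# Placeholder values for illustration; replace with your own repo and file
model = "bert-base-uncased"
filename = "config.json"

# Generate/show the URL
url = hf_hub_url(
    repo_id=model,
    filename=filename,
)
print(url)

# Download the file
hf_hub_download(
    repo_id=model,
    filename=filename,
    local_dir="/kaggle/working/",
)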

I do not know your use case, but would love to hear about it.
Regards,
Ankush Singal

Hi, thank you for your answer. I was aware of this solution too, but unfortunately it requires knowing in advance all the files that need to be downloaded. I could write a wrapper for this, it’s not too difficult, but I figured this is common enough that there may be a simpler way, as easy as .from_pretrained.
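For reference, a wrapper of the kind mentioned above could be sketched roughly as follows. It relies on the allow_patterns argument of huggingface_hub.snapshot_download to restrict the download to files matching a set of glob patterns. The pattern lists and the helper names (build_allow_patterns, filter_files, download_model_files) are assumptions made for illustration, not part of any library, and the patterns only approximate what .from_pretrained would fetch.

```python
from fnmatch import fnmatch

# Assumed glob patterns (illustrative only); adjust them for the model at hand.
CONFIG_PATTERNS = ["*.json", "*.txt", "*.model"]  # configs and tokenizer files
SAFETENSORS_PATTERNS = ["*.safetensors"]
PYTORCH_PATTERNS = ["*.bin"]


def build_allow_patterns(use_safetensors=True):
    """Return patterns for config/tokenizer files plus one weight format."""
    weights = SAFETENSORS_PATTERNS if use_safetensors else PYTORCH_PATTERNS
    return CONFIG_PATTERNS + weights


def filter_files(filenames, patterns):
    """Preview locally which file names the patterns would keep (no network)."""
    return [f for f in filenames if any(fnmatch(f, p) for p in patterns)]


def download_model_files(repo_id, use_safetensors=True):
    """Download only the matching files to the local cache; return the snapshot path."""
    # Imported here so the helpers above can be used without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=build_allow_patterns(use_safetensors),
    )
```

Calling download_model_files("some-org/some-model") (a hypothetical repo id) should populate the cache without loading anything. A later .from_pretrained call on the same repo id, possibly with local_files_only=True, would then be expected to read from the cache, provided the patterns covered every file it needs.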

Hi @bilelomrani, did you ever resolve this? I’m running into the same issue and wondering if there’s a more idiomatic way to download files without knowing beforehand which ones I need.

Hi @trunghnguyen, unfortunately I haven’t found an idiomatic way of doing it with transformers. I switched to the curated-transformers library maintained by Explosion for this use case. It provides a from_hf_hub_to_cache method that integrates well with the Hugging Face Hub. It is exactly what I was looking for.

Hey @bilelomrani, could you share the code snippet you used to download with from_hf_hub_to_cache?