I am trying to download a few models that are very large and contain LFS files. For this I was using Dask for parallelization, but the code fails to download the models and reports that the model name is not available on the Hub. Here is the code I used. Can someone suggest how to download multiple large LFS files in parallel without using git clone, which could be very slow?
import dask
from dask.diagnostics import ProgressBar
import huggingface_hub
import time
import os
import subprocess
from huggingface_hub import hf_hub_download
# Your Hugging Face token (replace with your actual token)
token = "your_actual_token"

# Model IDs
model_ids = ["lysandre/arxiv-nlp", "tiiuae/falcon-180B"]
def download_model(model_id):
    start_time = time.time()
    try:
        # Download the model metadata (including LFS pointers)
        hf_hub_download(repo_id=model_id, token=token, cache_dir='/path/to/your/directory')
        # Change to the directory where the model is downloaded
        model_dir = os.path.join('/path/to/your/directory', model_id)
        os.chdir(model_dir)
        # Run git-lfs fetch to download the LFS files
        subprocess.run(["git", "lfs", "fetch", "--all"], check=True)
        subprocess.run(["git", "lfs", "checkout"], check=True)
    except Exception as e:
        print(f"Error downloading {model_id}: {e}")
    elapsed_time = time.time() - start_time
    print(f"Downloaded {model_id} in {elapsed_time:.2f} seconds")
# Create a list of Dask delayed objects, one per model
model_infos = [dask.delayed(download_model)(model_id) for model_id in model_ids]
# Download models in parallel and track progress
with ProgressBar():
    dask.compute(model_infos)
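
For reference, this is a minimal sketch of the kind of alternative I am looking for, using snapshot_download instead of hf_hub_download plus git-lfs. I am assuming snapshot_download also pulls the LFS-tracked weight files over HTTP; the cache_dir path, token, and max_workers value are placeholders, and I have not verified this on a repo as large as falcon-180B:

import dask
from dask.diagnostics import ProgressBar
from huggingface_hub import snapshot_download

token = "your_actual_token"
model_ids = ["lysandre/arxiv-nlp", "tiiuae/falcon-180B"]

def download_snapshot(model_id):
    # snapshot_download is expected to fetch every file in the repo,
    # including large LFS-tracked weights, without needing git or git-lfs.
    # max_workers (assumed parameter) would control per-repo download concurrency.
    return snapshot_download(
        repo_id=model_id,
        token=token,
        cache_dir="/path/to/your/directory",
        max_workers=8,
    )

tasks = [dask.delayed(download_snapshot)(m) for m in model_ids]
with ProgressBar():
    dask.compute(tasks)

Would something along these lines be the recommended approach, or is there a better way to parallelize multiple large LFS downloads?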