Hello everyone,
When I use upload_large_folder, I see a .cache folder that contains a folder called “upload”. It is created in the same directory as the folder I want to upload. Is there a way to change the location of this .cache folder?
I tried setting HF_HOME, but this doesn’t seem to work.
Thanks!
There doesn’t seem to be a clean way to do this with environment variables or arguments. If you really want to, you could edit the library code in your Python installation, but…
        revision: Optional[str] = None,
        private: Optional[bool] = None,
        allow_patterns: Optional[Union[List[str], str]] = None,
        ignore_patterns: Optional[Union[List[str], str]] = None,
        num_workers: Optional[int] = None,
        print_report: bool = True,
        print_report_every: int = 60,
    ) -> None:
        """Upload a large folder to the Hub in the most resilient way possible.

        Several workers are started to upload files in an optimized way. Before being committed to a repo, files must be
        hashed and be pre-uploaded if they are LFS files. Workers will perform these tasks for each file in the folder.

        At each step, some metadata information about the upload process is saved in the folder under `.cache/.huggingface/`
        to be able to resume the process if interrupted. The whole process might result in several commits.

        Args:
            repo_id (`str`):
                The repository to which the file will be uploaded.
                E.g. `"HuggingFaceTB/smollm-corpus"`.
            folder_path (`str` or `Path`):
                Path to the folder to upload on the local file system.
        """

    paths = get_local_download_paths(local_dir, filename)
    with WeakFileLock(paths.lock_path):
        with paths.metadata_path.open("w") as f:
            f.write(f"{commit_hash}\n{etag}\n{time.time()}\n")

    def _huggingface_dir(local_dir: Path) -> Path:
        """Return the path to the `.cache/huggingface` directory in a local directory."""
        # Wrap in lru_cache to avoid overwriting the .gitignore file if called multiple times
        path = local_dir / ".cache" / "huggingface"
        path.mkdir(exist_ok=True, parents=True)
        # Create a .gitignore file in the .cache/huggingface directory if it doesn't exist
        # Should be thread-safe enough like this.
        gitignore = path / ".gitignore"
        gitignore_lock = path / ".gitignore.lock"
        if not gitignore.exists():
            try:
                with WeakFileLock(gitignore_lock, timeout=0.1):
                    gitignore.write_text("*")
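Going only by the `_huggingface_dir` snippet, the path is built from the upload folder itself (`local_dir / ".cache" / "huggingface"`) and then created with `mkdir(exist_ok=True, parents=True)`, which would explain why HF_HOME has no effect. One possible workaround, which I have not tested against huggingface_hub itself, is to create `.cache` as a symlink to another location before starting the upload, so the metadata physically lives elsewhere. The helper name `redirect_upload_cache` below is hypothetical (it is not part of the library), and the sketch assumes a platform where creating symlinks is allowed.

```python
import tempfile
from pathlib import Path


def redirect_upload_cache(folder_path: Path, cache_target: Path) -> Path:
    """Make `<folder_path>/.cache` a symlink pointing at `cache_target`.

    Hypothetical helper, not part of huggingface_hub. Assumption: the
    library only does `mkdir(exist_ok=True, parents=True)` on
    `<folder_path>/.cache/huggingface`, as in the snippet above.
    """
    folder_path = Path(folder_path)
    cache_target = Path(cache_target)
    cache_target.mkdir(parents=True, exist_ok=True)

    link = folder_path / ".cache"
    if link.is_symlink():
        # Already redirected: accept it only if it points where we expect.
        if link.resolve() == cache_target.resolve():
            return link
        raise FileExistsError(f"{link} already points somewhere else")
    if link.exists():
        # A real .cache directory may hold resume metadata; do not touch it.
        raise FileExistsError(f"{link} already exists and is not a symlink")

    link.symlink_to(cache_target.resolve(), target_is_directory=True)
    return link


if __name__ == "__main__":
    # Demo in temporary directories only; no upload is performed.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp) / "my_dataset"
        folder.mkdir()
        target = Path(tmp) / "scratch" / "hf_upload_cache"

        redirect_upload_cache(folder, target)

        # Same two lines as in `_huggingface_dir` above.
        path = folder / ".cache" / "huggingface"
        path.mkdir(exist_ok=True, parents=True)

        print((target / "huggingface").is_dir())
```

The `.cache` entry still shows up inside the folder, but only as a link. Whether the upload workers treat a symlinked `.cache` exactly like a real directory is an assumption here, so try it on a small folder first.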
Closed by system on March 28, 2025, 11:24pm.
This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.