Logging in to HF Git repository from the code

Hi, I have an issue authenticating to a private Hugging Face repository from code. I have three repositories: the first holds the source code of an application I've developed over the past months, the second holds my HF code that will generate summaries of the code commits using JetBrains-Research/cmg-codereviewer-with-history, and the third is an empty repository where I want to save the summaries of all the code commits.

I use this code:

# Log in to Hugging Face Hub
login(token=os.getenv("HF_TOKEN"), write_permission=True) 

# Git configuration
subprocess.run(["git", "config", "--global", "user.name", "my-hf-user-name"])

...

log("Cloning repository")
repo_url = "https://huggingface.co/spaces/my-organization-name/my-real-application"

# Clone the repository (no need for token in URL)
subprocess.run(["git", "clone", repo_url, output_dir])

Of course, my-hf-user-name, my-organization-name and my-real-application are replaced with real values. HF_TOKEN is also defined as an environment variable.

However, I receive:

[LOG] Cloning repository
Cloning into 'repo'...
fatal: could not read Username for 'https://huggingface.co': No such device or address
[LOG] Saving summaries to file
[LOG] Summaries saved to repo/commit_summaries.json
[LOG] Configuring Git
error: could not lock config file //.gitconfig: Permission denied
[LOG] Committing and pushing changes
fatal: detected dubious ownership in repository at '/app'
To add an exception for this directory, call:
git config --global --add safe.directory /app

In my Dockerfile, I have additionally:

RUN git config --system --add safe.directory /app/repo
RUN git config --global user.name "my-hf-user-name"

Currently I'm trying to save to the summaries repository (this is where it fails), but later I'll also have to read from the source repository.

How can I correctly authenticate to an HF Git repository from my code?

Thanks!


I think you need to set up your SSH credentials first.

If you don't want to set up SSH credentials, Hugging Face offers other tools in the huggingface_hub Python library for downloading and uploading repos or files:

  • download :
    • entire repo download : snapshot_download
    • single file download : hf_hub_download
  • upload :
    • entire folder : upload_folder
    • single file : upload_file
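
As a rough illustration of the upload side, here is a minimal sketch that pushes the summaries file with upload_file instead of git commit/push. It assumes HF_TOKEN has write access; the function names, the commit message, and the "dataset" repo type are hypothetical choices for this example, not something stated in the thread.

```python
import json
import os


def summaries_to_bytes(summaries):
    """Serialize commit summaries to UTF-8 encoded JSON bytes."""
    return json.dumps(summaries, indent=2, ensure_ascii=False).encode("utf-8")


def push_summaries(summaries, repo_id, repo_type="dataset"):
    """Upload the summaries as commit_summaries.json in a single Hub commit."""
    # Imported here so the helper above stays usable without huggingface_hub.
    from huggingface_hub import upload_file

    return upload_file(
        path_or_fileobj=summaries_to_bytes(summaries),  # raw bytes are accepted
        path_in_repo="commit_summaries.json",
        repo_id=repo_id,        # hypothetical, e.g. "my-organization-name/my-summaries"
        repo_type=repo_type,    # assumption: the target is a dataset repo
        token=os.getenv("HF_TOKEN"),
        commit_message="Add commit summaries",
    )
```

Since this goes through the Hub HTTP API, it should not need a local clone, a Git username, or a writable .gitconfig.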

Thank you! So I’m doing it this way now:

repo_id_to_summarize = "my-organization-name/my-real-project"
local_dir = "repo_to_summarize"
snapshot_download(
    repo_id=repo_id_to_summarize, 
    local_dir=local_dir, 
    repo_type="space", 
    cache_dir=".cache"
)

if os.path.exists(os.path.join(local_dir, ".git")):
    log("Repository downloaded successfully")
else:
    log("Error: Repository not found")

But here I get "Error: Repository not found", and later, when the script reads the Git commits, it also fails:

Traceback (most recent call last):
  File "summarize_commits.py", line 65, in <module>
    commits = get_git_commits()
  File "summarize_commits.py", line 27, in get_git_commits
    repo = Repo(local_dir)
  File "/usr/local/lib/python3.8/site-packages/git/repo/base.py", line 289, in __init__
    raise InvalidGitRepositoryError(epath)
git.exc.InvalidGitRepositoryError: /app/repo_to_summarize

Try this instead:

repo_id_to_summarize = "my-organization-name/my-real-project"
local_dir = "repo_to_summarize"
snapshot_download(
    repo_id=repo_id_to_summarize, 
    local_dir=local_dir, 
    repo_type="space", 
    cache_dir=".cache"
)

-if os.path.exists(os.path.join(local_dir, ".git")):
+if os.path.exists(local_dir):
    log("Repository downloaded successfully")
else:
    log("Error: Repository not found")

Thanks! But that only fixes my custom validation of the path. It won't solve the "git.exc.InvalidGitRepositoryError: /app/repo_to_summarize" issue.
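
For context on that last error: snapshot_download copies the files of one revision and does not create a .git directory, so GitPython has nothing to open. If the commit history itself is needed, one option is a regular clone over HTTPS with the token embedded in the URL, which also avoids the interactive username prompt. A minimal sketch, assuming HF_TOKEN has read access to the repo; both helper names are hypothetical.

```python
import os
import subprocess
from urllib.parse import quote


def authed_clone_url(repo_path, user, token):
    """Build an HTTPS clone URL with the credentials embedded.

    repo_path is the part after the host, e.g.
    "spaces/my-organization-name/my-real-application".
    """
    return (
        f"https://{quote(user, safe='')}:{quote(token, safe='')}"
        f"@huggingface.co/{repo_path}"
    )


def clone_with_history(repo_path, dest, user):
    """Clone the full repository (including .git) into dest."""
    token = os.environ["HF_TOKEN"]  # raises KeyError if the variable is unset
    url = authed_clone_url(repo_path, user, token)
    # check=True raises CalledProcessError when git exits with an error
    subprocess.run(["git", "clone", url, dest], check=True)
```

Note that the token ends up in the remote URL stored in the clone's .git/config, so this is best kept to throwaway containers.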