How to sync Hugging Face model commits with GitHub?

Hi everyone,

What’s the best way to sync Hugging Face model commit activities with their corresponding GitHub repositories? From what I understand, GitHub Actions can only be used to sync Hugging Face Spaces, not models. Is there any way to automate this process for models too?

Would love to hear how others are handling this!

Hmm… I don’t know anything about GitHub automation, so I asked the chatbot anyway, and it gave me a confident-sounding answer.

I wonder if it’s really usable…:sweat_smile:


by HuggingChat

Based on the sources provided, there are a few approaches to automate syncing Hugging Face model commit activities with GitHub repositories. While the sources primarily discuss GitHub Actions for syncing with Hugging Face Spaces, there are ways to adapt this process for models as well.

Approach 1: Using GitHub Actions with Custom Workflows [1]

You can create a custom GitHub Action workflow to push your main branch to Hugging Face. Here’s an example workflow:

name: Sync to Hugging Face hub
on:
  push:
    branches: [main]
  workflow_dispatch:

jobs:
  sync-to-hub:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
          lfs: true
      - name: Push to hub
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: git push https://HF_USERNAME:$HF_TOKEN@huggingface.co/spaces/HF_USERNAME/SPACE_NAME main

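This workflow can be adapted for models by replacing the Spaces URL with your model’s Hugging Face repository URL. Model repositories have no spaces/ prefix, so the target becomes huggingface.co/HF_USERNAME/MODEL_NAME.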
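
As a minimal sketch of that URL swap, assuming the usual Hub layout where model repositories sit at the root of huggingface.co while datasets and Spaces use the datasets/ and spaces/ prefixes, a small shell helper could build the push URL for any repo type. The function name hf_remote_url and the sample values in the comments are hypothetical, not part of any Hugging Face tooling.

```shell
#!/usr/bin/env bash
# Build an authenticated push URL for a Hugging Face repo.
# Arguments: <repo_type> <repo_id> <user> <token>
hf_remote_url() {
  local repo_type="$1" repo_id="$2" user="$3" token="$4"
  local prefix
  case "$repo_type" in
    model)   prefix="" ;;            # models live at the root
    dataset) prefix="datasets/" ;;
    space)   prefix="spaces/" ;;
    *) echo "unknown repo type: $repo_type" >&2; return 1 ;;
  esac
  printf 'https://%s:%s@huggingface.co/%s%s\n' "$user" "$token" "$prefix" "$repo_id"
}

# Possible use in the "Push to hub" step (placeholders as in the workflow):
#   git push "$(hf_remote_url model HF_USERNAME/MODEL_NAME HF_USERNAME "$HF_TOKEN")" main
```

With this helper, only the first argument changes between a Space and a model; the rest of the workflow stays as it is.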

Approach 2: Using the huggingface-sync-action [2]

The GitHub Action huggingface-sync-action can be used to sync files from a GitHub repository to a Hugging Face repository, including model repositories. Here’s how to use it:

  1. Set up a Hugging Face token with write access and add it to your GitHub repository as a secret named HF_TOKEN.

  2. Create a workflow file (e.g., .github/workflows/sync-with-huggingface.yml) in your GitHub repository with the following content:

name: Sync to Hugging Face
on:
  push:
    branches: [main]

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
      - name: Sync to Hugging Face
        uses: nateraw/huggingface-sync-action@v0.0.5
        with:
          github_repo_id: '' # Replace with your GitHub repository ID
          huggingface_repo_id: 'username/model-name' # Replace with your Hugging Face model repository ID
          hf_token: ${{ secrets.HF_TOKEN }}
          repo_type: 'model' # Set this to 'model' for model repositories
          private: false # Optional: Set to true if you want the repository to be private

This workflow will sync your GitHub repository with your Hugging Face model repository whenever there is a push to the main branch.

Approach 3: Using a Custom GitHub Action with File Modifications [3]

If you need to remove certain files before syncing (e.g., due to Hugging Face’s restrictions), you can modify the workflow to include a step to remove unwanted files:

name: Sync to Hugging Face Spaces
on:
  push:
    branches:
      - main

jobs:
  sync:
    name: Sync
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3
      - name: Remove bad files
        run: rm -rf my-bad-files
      - name: Sync to Hugging Face Spaces
        uses: JacobLinCool/huggingface-sync@v1
        with:
          github: ${{ secrets.HF_TOKEN }}
          # Other parameters as needed

Conclusion

While GitHub Actions are commonly used for syncing Hugging Face Spaces, they can also be adapted for models. The best approach is to use the huggingface-sync-action with a custom workflow that specifies repo_type: 'model'. This allows you to automate syncing your GitHub repository with your Hugging Face model repository.

If you need to handle file restrictions, you can modify the workflow to remove unwanted files before syncing.

You can check the following link.

https://huggingface.co/docs/hub/en/spaces-github-actions

However, if I needed to sync, I would add a pre-push hook to my Git workflow that pushes to both upstreams.
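
One way such a pre-push hook might look is sketched below. It assumes the GitHub remote is called origin and that a second remote pointing at the Hugging Face repository has already been added under the hypothetical name huggingface. The helper names mirror_refspecs and run_hook are made up for this example. Git passes the remote name as the first argument and one line per pushed ref on stdin.

```shell
#!/usr/bin/env bash
# Sketch of .git/hooks/pre-push (make it executable with chmod +x).
# Git invokes it as: pre-push <remote-name> <remote-url>
# stdin lines: <local ref> <local sha> <remote ref> <remote sha>

MIRROR_REMOTE="${MIRROR_REMOTE:-huggingface}"  # hypothetical remote name
ZERO_SHA_RE='^0+$'

# Turn the hook's stdin into refspecs for the mirror, skipping ref deletions
# (Git reports a deleted ref with an all-zero local sha).
mirror_refspecs() {
  local local_ref local_sha remote_ref remote_sha
  while read -r local_ref local_sha remote_ref remote_sha; do
    [[ -z "$local_sha" || "$local_sha" =~ $ZERO_SHA_RE ]] && continue
    printf '%s:%s\n' "$local_sha" "$remote_ref"
  done
}

# Mirror only pushes aimed at origin. This guard also stops the hook from
# recursing when its own push to the mirror triggers pre-push again.
run_hook() {
  local remote_name="$1" spec
  [ "$remote_name" = "origin" ] || return 0
  mirror_refspecs | while read -r spec; do
    git push "$MIRROR_REMOTE" "$spec" </dev/null \
      || echo "warning: mirror push failed for $spec" >&2
  done
  return 0  # never block the original push because the mirror failed
}

# In the real hook file, finish with:
#   run_hook "$@"
```

The hook deliberately returns success even when the mirror push fails, so a Hugging Face outage does not block pushes to GitHub. Whether that trade-off is right depends on how strictly the two repositories must match.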

Thank you so much, John — I really appreciate your detailed response! If you don’t mind me asking, which method have you personally used that worked well for you? Just trying to understand from experience.

Also, maybe there’s really no need to synchronize the two platforms for model management after all — just a thought.

Thank you very much, Talismanic — I really appreciate your response. Just to be sure, has this method worked well for you in practice, or have you not had a need to synchronize activities between the two platforms so far?

Hi, I’m the creator of SE-Arena, an innovative crowd-sourced evaluation platform for software engineering-related tasks. My project is hosted on GitHub (SE-Arena/Software-Engineer-Arena) and also on Hugging Face Spaces (the SE-Arena Space).

Here’s how I resolved the sync between GitHub and Hugging Face:

  1. Start with GitHub: I create the repository there and then use GitHub Actions for the sync.
  2. Configure GitHub Actions: I insert the following script in my GitHub Action, which checks out the repo, installs Git LFS, configures Git, and pushes changes to my Hugging Face space.
name: Sync to Hugging Face Space

on:
  push:
    branches:
      - main

jobs:
  sync:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout GitHub Repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 0  # Fetch the entire history to avoid shallow clone issues

      - name: Install Git LFS
        run: |
          curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
          sudo apt-get install git-lfs
          git lfs install

      - name: Configure Git
        run: |
          git config --global user.name "GitHub Actions Bot"
          git config --global user.email "actions@github.com"

      - name: Push to Hugging Face
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          git remote add huggingface https://user:${HF_TOKEN}@huggingface.co/spaces/SE-Arena/Software-Engineering-Arena
          git fetch huggingface
          git push huggingface main --force
  3. Create and Sync HF Repo: Check the corresponding repo in Hugging Face.
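
The final check could also be scripted rather than done by eye. The sketch below assumes the remote is named huggingface as in the workflow above; the helper names sha_for_ref and is_synced are hypothetical. It compares the local commit against the branch tip reported by git ls-remote, whose output is one "<sha><TAB><ref>" line per ref.

```shell
#!/usr/bin/env bash
# Extract the SHA for one ref from `git ls-remote` output read on stdin.
sha_for_ref() {
  local ref="$1"
  awk -v ref="$ref" '$2 == ref { print $1 }'
}

# Succeed only when the remote SHA is non-empty and equals the local SHA.
is_synced() {
  local local_sha="$1" remote_sha="$2"
  [ -n "$remote_sha" ] && [ "$local_sha" = "$remote_sha" ]
}

# Possible use inside a clone, after the push step:
#   remote_sha="$(git ls-remote huggingface | sha_for_ref refs/heads/main)"
#   is_synced "$(git rev-parse HEAD)" "$remote_sha" && echo "in sync"
```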

This setup works smoothly for me and might help others with similar synchronization requirements, even for a Hugging Face model repository. Happy syncing!

Thank you very much. I will try it and see whether an adaptation of this Space synchronization code works effectively. In short, there is no dedicated tool for commit synchronization other than the ones for Space synchronization.

which method have you personally used that worked well for you? Just trying to understand from experience.

I’m managing it in a very primitive way, using folders instead of git, so I’m not using any of them…:sweat_smile:

Thank you very much for your responses so far. I appreciate it a lot.
