How to fork (in the git sense) a model repository?


For production, we need to use models from the hub whose updates we control.

Ideally, we need a way to fork the model repo into our own (public) organization, so that we control the updates ourselves. We cannot take the risk that someone deletes a repo or changes it without us knowing.

We could create a “copy” of the model manually and re-commit it to our organization, but this is not ideal, as we would lose the ability to track and merge future updates of the original repo.

Is there any way to achieve a fork in the current state of the huggingface hub?

Thanks in advance,

Alex Combessie

You can add the original repository as an “upstream” remote in order to track and merge future updates, like so:

git remote add upstream <URL of model>.git

You can then sync again by doing:

git fetch upstream
git rebase upstream/main
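To make the sync step concrete, here is a minimal sketch that replays it on throwaway local repositories instead of hub URLs. The directory names (`original`, `ours.git`, `fork`), the `config.json` file and the `main` branch name are all assumptions made for illustration, and Git LFS is left out entirely, so this only shows the plain git mechanics.

```shell
#!/bin/sh
set -eu

# Identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp="$(mktemp -d)"
cd "$tmp"

# Stand-in for the original model repository, with one commit on main.
git init -q original
git -C original symbolic-ref HEAD refs/heads/main
echo "v1" > original/config.json
git -C original add config.json
git -C original commit -q -m "model v1"

# Stand-in for the copy hosted in our own organization, plus a working clone.
git clone -q --bare original ours.git
git clone -q ours.git fork
git -C fork remote add upstream "$tmp/original"

# The original authors publish an update.
echo "v2" > original/config.json
git -C original commit -q -am "model v2"

# Sync our copy: fetch the update, rebase onto it, publish to our own remote.
cd fork
git fetch -q upstream
git rebase -q upstream/main
git push -q origin main

synced="$(cat config.json)"
echo "fork now contains: $synced"
```

Because the fork has no commits of its own in this sketch, the rebase is a simple fast-forward and a normal push is enough.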

Thanks Niels! Let me try that and report findings.

So after investigating, I hit a blocker right here:

(py36_bert) alexandrecombessie@MacBook-Pro-7 average_word_embeddings_glove.6B.300d % git rebase upstream/main

First, rewinding head to replay your work on top of it...
Downloading 0_WordEmbeddings/pytorch_model.bin (480 MB)
Error downloading object: 0_WordEmbeddings/pytorch_model.bin (d819348): Smudge error: Error downloading 0_WordEmbeddings/pytorch_model.bin (d819348e583fca49cf3980e34505d52a3f842064ebd9dc255484125357771240): [d819348e583fca49cf3980e34505d52a3f842064ebd9dc255484125357771240] Object does not exist: [404] Object does not exist

Errors logged to /Users/alexandrecombessie/huggingface/dataikunlp/average_word_embeddings_glove.6B.300d/.git/lfs/logs/20210901T164955.693927.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: 0_WordEmbeddings/pytorch_model.bin: smudge filter lfs failed

I have googled the error and it seems linked to this issue, which seems pretty complex. Do you have some advice?



After more investigating, I managed to make the rebase work, using this script:

huggingface-cli login
huggingface-cli repo create ${MODEL_NAME} --organization ${NEW_ORG}
git lfs install --skip-smudge
git clone https://huggingface.co/${NEW_ORG}/${MODEL_NAME}
cd ${MODEL_NAME}
git remote add upstream https://huggingface.co/${ORIGINAL_ORG}/${MODEL_NAME}
git fetch upstream
git rebase upstream/main
git push --force-with-lease
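The final `--force-with-lease` matters because the rebase rewrites the commits that already exist in the new repository. The sketch below reproduces that situation with throwaway local repositories; the names (`upstream-model`, `seed`, `ours.git`, `work`) and file contents are hypothetical, and Git LFS is deliberately left out, so it illustrates only the history rewrite, not the large-file handling.

```shell
#!/bin/sh
set -eu

# Identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo="$(mktemp -d)"
cd "$demo"

# Stand-in for the original model repository.
git init -q upstream-model
git -C upstream-model symbolic-ref HEAD refs/heads/main
echo "weights-v1" > upstream-model/model.txt
git -C upstream-model add model.txt
git -C upstream-model commit -q -m "original model"

# Stand-in for the freshly created repo in our organization:
# it starts with its own, unrelated first commit.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "owned by our org" > seed/NOTICE
git -C seed add NOTICE
git -C seed commit -q -m "initial commit"
git clone -q --bare seed ours.git

# Clone our repo, then replay its history on top of the original one.
git clone -q ours.git work
cd work
git remote add upstream "$demo/upstream-model"
git fetch -q upstream
git rebase -q upstream/main

# A plain push is refused: the rebased branch no longer descends from origin/main.
if git push -q origin main 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# Overwrite the remote branch, but only if nobody else has updated it meanwhile.
git push -q --force-with-lease origin main
```

Compared with a bare `--force`, the lease makes the push fail if the remote branch moved since the last fetch, which is the safer choice on a shared organization repo.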

I thought I had solved the case… except that loading the new model from my code is somehow forbidden:

requests.exceptions.HTTPError: 403 Client Error: Forbidden for url:

Is this issue solvable from my end or does it require a change on your model hub infrastructure?

Thanks for your help,



This seems to work:

git lfs clone https://huggingface.co/${NEW_ORG}/${MODEL_NAME}
cd ${MODEL_NAME}
git lfs install --skip-smudge --local # --local affects only this clone, try without it
git remote add upstream https://huggingface.co/${ORIGINAL_ORG}/${MODEL_NAME}
git fetch upstream
git checkout -b temp upstream/main
git rebase main # resolve conflicts if needed and finish rebasing
git lfs pull upstream
git push origin temp
git lfs push --all origin temp
git lfs install --force --local

For reference: how to rebase with git lfs? · Issue #1287 · git-lfs/git-lfs · GitHub

Hi @dataiku, did the last snippet work?