Google Colab: tokenizer.push_to_hub(repo_name) leads to "Repository not found" error

Hello,

I’m using a Google Colab notebook.

This line worked out:
!pip install huggingface_hub

Next, I wanted to write a json-file, which worked out, too:
import json
with open('my_language_vocab.json', 'w') as vocab_file:
    json.dump(vocab_dict, vocab_file)

Then I ran the following lines and logged in with an access token (with write permission) from my own account:
from huggingface_hub import notebook_login
notebook_login()

Next, I created a repository, and I can see it when I visit my profile page:
from huggingface_hub import create_repo
create_repo("Kristinabckr/testpublic2", private=False)

But if I run this line:
tokenizer.push_to_hub(repo_name)  # I tried "Kristinabckr/testpublic2" and "testpublic2" for repo_name

I get the following Error:

HTTPError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
212 try:
→ 213 response.raise_for_status()
214 except HTTPError as e:

9 frames
HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/repos/create

The above exception was the direct cause of the following exception:

RepositoryNotFoundError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
240 + "\nIf the repo is private, make sure you are authenticated."
241 )
→ 242 raise RepositoryNotFoundError(message, response) from e
243
244 elif response.status_code == 400:

RepositoryNotFoundError: 401 Client Error. (Request ID: AfU1Ldq9xvEPG6pNZ_3jE)

Repository Not Found for url: https://huggingface.co/api/repos/create.
Please make sure you specified the correct repo_id and repo_type.
If the repo is private, make sure you are authenticated.
Unauthorized - Unauthorized

I am facing the same problem.

I’m getting the same error:(

Hey all! I was not able to reproduce the error. This is my code:

# !pip install huggingface_hub transformers
from huggingface_hub import notebook_login, create_repo
from transformers import AutoTokenizer

notebook_login()
create_repo("osanseviero/test_bug", private=False)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.push_to_hub("osanseviero/test_bug")
# Result commit https://huggingface.co/osanseviero/test_bug/commit/42718dae9d039325f55143cbff23f57c5098eb7d

Would you be able to share a fully reproducible example that causes this bug?

cc @Wauplin

I wasn’t able to reproduce the error either (with @osanseviero's snippet).

Just to be sure, notebook_login() has to be run in a separate code block, but I guess that’s already what you are doing.

Once logged in, could you try running the following lines and tell us the result?

from huggingface_hub import whoami

whoami()
# you should see something like {'type': 'user',  'id': '...',  'name': 'Wauplin', ...}

and

from huggingface_hub import HfFolder

HfFolder().get_token()
# you should see your saved token

and

create_repo("test_bug_temporary", token="hf_***") # explicitly paste your token
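
The first two checks above can be bundled into one small helper. This is only a sketch: `diagnose_auth` is a hypothetical name, not part of huggingface_hub, and it takes the two lookups as arguments so that it can be tried without a network connection.

```python
def diagnose_auth(whoami_fn, get_token_fn):
    """Run the two login checks and return a small report dict.

    Hypothetical helper for illustration (not part of huggingface_hub).
    Pass in the functions to call, for example huggingface_hub.whoami
    and huggingface_hub.HfFolder.get_token.
    """
    report = {"token_saved": False, "user": None, "error": None}

    # Check 1: is a token stored locally? Only record yes/no, never the token.
    report["token_saved"] = bool(get_token_fn())

    # Check 2: does the Hub accept the current credentials?
    try:
        report["user"] = whoami_fn().get("name")
    except Exception as exc:  # for example an HTTP 401 when not logged in
        report["error"] = f"{type(exc).__name__}: {exc}"

    return report


# In the notebook, after notebook_login() has run in its own cell:
# from huggingface_hub import whoami, HfFolder
# print(diagnose_auth(whoami, HfFolder.get_token))
```

If `token_saved` is False or `error` is set, passing the token explicitly to create_repo, as in the last check, is the workaround to try.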

Thank you so much, this solved my problem. Afterwards, I was able to successfully run these lines from @osanseviero:

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.push_to_hub("osanseviero/test_bug")

Hello,
I tried the Wav2Vec2CTCTokenizer instead of AutoTokenizer. Unfortunately, when I run the following cell:

tokenizer.push_to_hub("test_bug_temporary_private_333")

then I get the following Error:


TypeError Traceback (most recent call last)
in
1 #tokenizer = AutoTokenizer.from_pretrained("gpt2") #works
2 #tokenizer.push_to_hub("test_bug_temporary_private_333", organization = None) #does not work
----> 3 tokenizer.push_to_hub("test_bug_temporary_private_333") #does not work
4 #tokenizer.push_to_hub(repo_name) #does not work
5

4 frames
/usr/local/lib/python3.7/dist-packages/huggingface_hub/utils/_deprecation.py in inner_f(*args, **kwargs)
43 )
44 kwargs.update(zip(sig.parameters, args))
---> 45 return f(**kwargs)
46
47 return inner_f

TypeError: create_repo() got an unexpected keyword argument 'organization'

Thank you very much in advance for your help!

Hi @Kristinabckr, in the latest version of huggingface_hub (v0.11, which you are probably using), you can no longer pass organization as a separate keyword argument. It had been deprecated for some time and has now been removed completely.

What you should do is build a repo_id and provide it to push_to_hub. Example:

# Will default to user namespace
tokenizer.push_to_hub("my-tokenizer")

# Explicit username
tokenizer.push_to_hub("username/my-tokenizer")

# Explicit organization
tokenizer.push_to_hub("my-organization/my-tokenizer")

Hope this helps! Please let me know if you need any further information.

Hi @Wauplin - I got the same error. Could you please elaborate on what you meant by "What you should do is build a repo_id and provide it to push_to_hub"?

I tried:
token = "my-tokenizer"
tokenizer.push_to_hub(token)

Thank you!

@caozxin1230 What I meant by “building the repo_id yourself” is that instead of doing

tokenizer.push_to_hub("test_bug", organization="my_organization")

You should do

tokenizer.push_to_hub("my_organization/test_bug")

But in your case it seems that you don’t need that.
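
To make "building the repo_id yourself" concrete, here is a minimal sketch. `build_repo_id` is a hypothetical helper written for this example, not a huggingface_hub function; it only joins strings and does not contact the Hub.

```python
from typing import Optional


def build_repo_id(repo_name: str, namespace: Optional[str] = None) -> str:
    """Join an optional user/organization namespace and a repo name.

    Hypothetical helper for illustration; not part of huggingface_hub.
    """
    if "/" in repo_name:
        # The name is already a full repo_id such as "my_organization/test_bug".
        if namespace is not None:
            raise ValueError("repo_name already contains a namespace")
        return repo_name
    if namespace is None:
        # No namespace given: the Hub falls back to the logged-in user.
        return repo_name
    return f"{namespace}/{repo_name}"


# Instead of push_to_hub("test_bug", organization="my_organization"):
# tokenizer.push_to_hub(build_repo_id("test_bug", "my_organization"))
```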

To help you debug it, could you tell me which versions of huggingface_hub, transformers and tokenizers you are using? You can find this information by running pip freeze in your local environment.

Hi,

These are my versions:

  • huggingface-hub==0.11.0
  • transformers==4.17.0
  • tokenizers==0.13.2

I’m still having the same error with this:

tokenizer.push_to_hub("my_organization/test_bug") → error:

create_repo() got an unexpected keyword argument 'organization'

Thank you

Hi @Wauplin,
yes, I have v0.11.0. I created a repo named "MyUsername/test_pushtohub" (since I am not in any organization, I used my user account name). I can see this newly created repo in my Hugging Face account.

Unfortunately, when I run the tokenizer and then run tokenizer.push_to_hub("MyUsername/test_pushtohub"), the same error as before occurs:

TypeError: create_repo() got an unexpected keyword argument 'organization'

The error output also contains this warning:
/usr/local/lib/python3.7/dist-packages/huggingface_hub/utils/_deprecation.py:42: FutureWarning: Deprecated positional argument(s) used in 'create_repo': pass token='test_pushtohub' as keyword args. From version 0.12 passing these as positional arguments will result in an error,
FutureWarning,

Is there a problem with the Python 3.7 version?

Hi @enrybds, you have to upgrade transformers to a newer version. Your version (4.17) was released in March 2022 and is therefore not compatible with the latest version of huggingface_hub. Another solution is to downgrade huggingface_hub to version 0.10.

Hi @Kristinabckr, I guess your issue is similar, even though I don’t know which versions are installed in your environment. In any case, it is not a Python 3.7 issue.
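
A quick way to spot this mismatch in a notebook is to compare the two version strings. The sketch below is an illustration only: both function names are hypothetical, and the thresholds are assumptions taken from what is reported in this thread (transformers 4.17.0 failing with huggingface_hub 0.11.0, and transformers 4.24.0 working), not an official compatibility table. Versions in between were not tested here.

```python
import re


def version_tuple(version):
    """Turn '4.17.0' into (4, 17, 0), stopping at the first non-numeric part."""
    numbers = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def looks_like_thread_mismatch(transformers_version, hub_version):
    """True for the combination reported as failing in this thread.

    Assumed thresholds: huggingface_hub >= 0.11 together with
    transformers older than 4.24.
    """
    return (
        version_tuple(hub_version) >= (0, 11)
        and version_tuple(transformers_version) < (4, 24)
    )


# With the installed packages (importlib.metadata needs Python 3.8+):
# from importlib.metadata import version
# looks_like_thread_mismatch(version("transformers"), version("huggingface_hub"))
```

If the check returns True, the two options above apply: upgrade transformers, or downgrade huggingface_hub to 0.10.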

Hi everyone,

I tried different versions and found that upgrading transformers to 4.24.0 (released this month!) fixed it. My tokenizer could finally be pushed to the Hub with:

tokenizer.push_to_hub("MyName/MyRepo")

These are my versions:
transformers==4.24.0
tokenizers==0.13.2
huggingface_hub==0.11

I hope it works for you, too! @enrybds