### Describe the bug
I was granted access to Llama-3.2, but when I tried to access the `meta-llama/Llama-3.2-1B-Instruct` model, I got a 401 error:
```
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct.
401 Client Error. (Request ID: Root=1-66fc667a-79545df845c9253539ea509f;f6f84d32-d328-45e5-b78d-96461057d7d0)
Cannot access gated repo for url https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct/resolve/main/config.json.
Access to model meta-llama/Llama-3.2-1B-Instruct is restricted. You must have access to it and be authenticated to access it. Please log in.
```
This means either I don't have access to the model or I am not authenticated. My access request was approved, so it's not the former (more on this under Reproduction).
I added my HF token using `huggingface-cli login`, but I am still facing the same issue after that.
I tried to debug and found that `huggingface-cli whoami` gives this error:
`Invalid user token. If you didn't pass a user token, make sure you are properly logged in by executing huggingface-cli login, and if you did pass a user token, double-check it's correct. {"error":"Invalid username or password."}`
So basically, I get this error even if I log in, whether via the CLI or in code:
```
from huggingface_hub import login
login(token="xxx")
```
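For reference, the same check as `huggingface-cli whoami` can also be run programmatically (a minimal sketch; `xxx` is a placeholder for the real token), which should surface the same invalid-token message if the token itself is the problem:
```
from huggingface_hub import HfApi

# Programmatic equivalent of `huggingface-cli whoami`;
# raises if the Hub rejects the token.
api = HfApi(token="xxx")  # placeholder token
print(api.whoami())
```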
**Note:** The only change I made between yesterday (when Llama-2 was working) and today is setting a few environment variables, i.e.
```
export HF_HOME=xxx
export HF_ASSETS_CACHE=xxx
export XDG_CACHE_HOME=xxx
export HF_TOKEN=xxx
export HF_HUB_CACHE=xxx
```
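Since the `HF_TOKEN` environment variable takes precedence over the token saved by `huggingface-cli login`, a stale value in that variable would explain an invalid-token error even after a successful login. A quick way to check which token `huggingface_hub` actually resolves (a minimal sketch, using `get_token` from recent `huggingface_hub` versions):
```
import os
from huggingface_hub import get_token

# HF_TOKEN (env var) wins over the token file saved by `huggingface-cli login`,
# so a stale export here would shadow a freshly saved token.
print("HF_TOKEN set in env:", "HF_TOKEN" in os.environ)
print("Resolved token:", get_token())
```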
### Reproduction
I also experimented with `Llama-2-7b-chat-hf` (which I had been using until yesterday without problems) but got the same `GatedRepoError`.
```
from transformers import AutoTokenizer
import transformers
import torch
from huggingface_hub import login

login(token="xxx")

model = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)
sequences = pipeline(
    'I liked "Breaking Bad" and "Band of Brothers". Do you have any recommendations of other shows I might like?\n',
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=200,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
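To rule out `transformers` itself, the failure can presumably be reproduced with `huggingface_hub` alone (a minimal sketch; `xxx` is again a placeholder token):
```
from huggingface_hub import hf_hub_download

# Fetching config.json directly should raise the same GatedRepoError
# if authentication is the problem.
hf_hub_download(
    repo_id="meta-llama/Llama-2-7b-chat-hf",
    filename="config.json",
    token="xxx",  # placeholder token
)
```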
### Logs
```
The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /qumulo/satya/huggingface/token
Login successful
Traceback (most recent call last):
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 406, in hf_raise_for_status
response.raise_for_status()
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/resolve/main/config.json
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/transformers/utils/hub.py", line 403, in cached_file
resolved_file = hf_hub_download(
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
return f(*args, **kwargs)
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1232, in hf_hub_download
return _hf_hub_download_to_cache_dir(
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1339, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1854, in _raise_on_head_call_error
raise head_call_error
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1746, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1666, in get_hf_file_metadata
r = _request_wrapper(
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 364, in _request_wrapper
response = _request_wrapper(
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 388, in _request_wrapper
hf_raise_for_status(response)
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 423, in hf_raise_for_status
raise _format(GatedRepoError, message, response) from e
huggingface_hub.errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-66fc6aa2-51edf73166f9906f2c6e4de2;44b58636-df14-4a1a-bb9c-4d83423f68ed)
Cannot access gated repo for url https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/resolve/main/config.json.
Access to model meta-llama/Llama-2-7b-chat-hf is restricted. You must have access to it and be authenticated to access it. Please log in.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/qumulo/satya/Compress_Align/scratch_code/pytorch_test.py", line 15, in <module>
tokenizer = AutoTokenizer.from_pretrained(model)
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 864, in from_pretrained
config = AutoConfig.from_pretrained(
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 1006, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/transformers/configuration_utils.py", line 567, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/transformers/configuration_utils.py", line 626, in _get_config_dict
resolved_config_file = cached_file(
File "/qumulo/satya/anaconda3/envs/prune_llm/lib/python3.9/site-packages/transformers/utils/hub.py", line 421, in cached_file
raise EnvironmentError(
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/meta-llama/Llama-2-7b-chat-hf.
401 Client Error. (Request ID: Root=1-66fc6aa2-51edf73166f9906f2c6e4de2;44b58636-df14-4a1a-bb9c-4d83423f68ed)
Cannot access gated repo for url https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/resolve/main/config.json.
Access to model meta-llama/Llama-2-7b-chat-hf is restricted. You must have access to it and be authenticated to access it. Please log in.
```
### System info
```shell
- huggingface_hub version: 0.25.1
- Platform: Linux-5.15.0-1065-nvidia-x86_64-with-glibc2.35
- Python version: 3.9.19
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Running in Google Colab Enterprise ?: No
- Token path ?: /qumulo/satya/huggingface/token
- Has saved token ?: True
- Configured git credential helpers:
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.4.1
- Jinja2: 3.1.4
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: N/A
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.26.4
- pydantic: 2.9.2
- aiohttp: 3.10.8
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /qumulo/satya/huggingface
- HF_ASSETS_CACHE: /qumulo/satya/huggingface
- HF_TOKEN_PATH: /qumulo/satya/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10
```