I tried CodeLlama-34B and was able to access it, but when I tried the 7B model, I got this error:
HfHubHTTPError: (Request ID: Root=1-67f55b30-5693f55f2f0a7fc92412a1dc;b6d8c553-8194-4992-8a92-ac28976802f1)
403 Forbidden: None.
Cannot access content at: https://api-inference.huggingface.co/models/codellama/CodeLlama-7b-Instruct-hf/v1/chat/completions.
If you are trying to create or update content, make sure you have a token with the `write` role.
The model codellama/CodeLlama-7b-Instruct-hf is too large to be loaded automatically (13GB > 10GB).
So I am wondering whether there is a list of all models accessible via the API?
The method is described to some extent in the following post. If you want to limit the results to models that are currently available, you can add the option `inference="warm"`.
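
For what it's worth, here is a rough sketch of how that filter could be used from Python. It assumes a reasonably recent `huggingface_hub` release in which `HfApi.list_models` accepts the `inference`, `pipeline_tag`, and `limit` arguments; the helper names `warm_model_ids` and `list_warm_models`, the `text-generation` task, and the limit of 20 are just illustrative choices on my part, not something from the post above.

```python
from typing import Iterable, List


def warm_model_ids(models: Iterable, limit: int = 20) -> List[str]:
    """Collect up to `limit` model IDs from objects that expose an `id` attribute."""
    ids: List[str] = []
    for model in models:
        if len(ids) >= limit:
            break
        ids.append(model.id)
    return ids


def list_warm_models(task: str = "text-generation", limit: int = 20) -> List[str]:
    """Ask the Hub for models of the given task that are currently 'warm'.

    Assumption: the installed huggingface_hub version supports the
    `inference` and `pipeline_tag` arguments of `HfApi.list_models`.
    """
    # Imported lazily so the pure helper above works without the library.
    from huggingface_hub import HfApi

    api = HfApi()
    models = api.list_models(inference="warm", pipeline_tag=task, limit=limit)
    return warm_model_ids(models, limit=limit)
```

Calling `list_warm_models()` should return a list of model IDs that you can print or check your target model against. A model showing up as warm does not guarantee that every endpoint (for example chat completions) works for it, so treat the result as a starting point.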