Inference Llama-2-13b not working

Following the blog post, I was able to run inference on my GPU server for meta-llama/Llama-2-7b-chat-hf as shown in the tutorial. (I have already been granted access to Llama-2.)

Now I wish to run inference on meta-llama/Llama-2-13b. I keep getting the error:
“OSError: meta-llama/Llama-2-13b does not appear to have a file named config.json. Checkout ‘’ for available files.”

Checking the mentioned repo for config.json, I see there actually isn’t one for meta-llama/Llama-2-13b (or for most other Llama-2 models except meta-llama/Llama-2-7b-chat-hf).

Could the missing file(s) be added, please? Alternatively, if there’s another way to run the model privately, that would be great. Thanks!


I was able to load llama-2-13b-chat-hf with this Google Colab notebook, but inference failed for me as well, due to a RuntimeError. I’m not sure whether it’s related to your problem, but maybe you’d like to take a look?

Got the model working. You just need to use the models with ‘-hf’ in the name (e.g. meta-llama/Llama-2-13b-chat-hf); those repos include config.json.
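For anyone hitting the same error, here is a minimal sketch of what that looks like. The `to_hf_repo` helper is hypothetical, just to illustrate the repo-id difference; the loading code follows the usual transformers pattern and assumes you have been granted access to the gated repo and are logged in with a valid token.

```python
# The base meta-llama/Llama-2-13b repo has no config.json, so transformers
# cannot load it directly; the "-hf" variants ship the converted weights
# plus config.json. to_hf_repo is a hypothetical helper for illustration.
def to_hf_repo(repo_id: str) -> str:
    """Append '-hf' to a base Llama-2 repo id if it is missing."""
    return repo_id if repo_id.endswith("-hf") else repo_id + "-hf"

repo = to_hf_repo("meta-llama/Llama-2-13b-chat")
print(repo)  # meta-llama/Llama-2-13b-chat-hf

# Loading then works as in the tutorial (requires gated-repo access and
# an authenticated Hugging Face token; needs a GPU with enough memory):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained(repo)
# model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")
```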


Just to make it clearer, this runs perfectly fine:


But it says that a PRO subscription is required to use the ‘-hf’ ones.