I thought today I would give Hugging Face a try, because it seems to be the easiest option to run Llama 2. So I followed this example:
> [Open in Colab](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/llm-field-guide/llama-2/llama-2-70b-chat-agent.ipynb) · [View on nbviewer](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/llm-field-guide/llama-2/llama-2-70b-chat-agent.ipynb)
>
> **LLaMa 70B Chatbot in Hugging Face and LangChain**
>
> In this notebook we'll explore how we can use the open source **Llama-70b-chat** model in both Hugging Face transformers and LangChain. At the time of writing, you must first request access to Llama 2 models via this form (access is typically granted within a few hours).
>
> 🚨 _Note that running this on CPU is practically impossible. It will take a very long time. If running on Google Colab, go to **Runtime > Change runtime type > Hardware accelerator > GPU > GPU type > A100**. Using this notebook requires ~38GB of GPU RAM._

(notebook excerpt truncated)
I also pasted in my access token. However, I always get an HTTP 403 error. What could be wrong?
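For context, the relevant part of my setup boils down to this (a minimal sketch, not the full notebook; the token string is a placeholder):

```python
from huggingface_hub import login
from transformers import AutoModelForCausalLM, AutoTokenizer

# Log in with the access token from https://huggingface.co/settings/tokens
login(token="hf_...")  # placeholder; use your real token

model_id = "meta-llama/Llama-2-70b-chat-hf"

# Both downloads hit the gated repo; this is where the HTTP 403 shows up
# if access hasn't been granted yet.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```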
After googling some more, I figured out what the problem was: after getting authorized by Meta, you also need to request access from Hugging Face, which can be done by pressing the button here: [meta-llama/Llama-2-7b-chat-hf · Hugging Face](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
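If you're not sure whether the Hugging Face side has granted you access yet, you can probe the repo before downloading anything (a sketch assuming a recent `huggingface_hub`; the token string is again a placeholder):

```python
from huggingface_hub import model_info
from huggingface_hub.utils import GatedRepoError

try:
    # Succeeds only once Hugging Face has granted access to the gated repo
    info = model_info("meta-llama/Llama-2-7b-chat-hf", token="hf_...")
    print("Access granted:", info.id)
except GatedRepoError:
    print("Still gated: request access on the model page first")
```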
So much drama with the llama!