Hello there!
I'm trying to build a basic RAG application using llama_index, but I'm unable to load the Llama 2 model from Hugging Face. The code I'm using is:
import torch
from llama_index.llms.huggingface import HuggingFaceLLM  # llama_index >= 0.10; older versions: from llama_index.llms import HuggingFaceLLM

# system_prompt and query_wrapper_prompt are defined earlier in my script
llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.0, "do_sample": False},
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name="meta-llama/Llama-2-7b-chat-hf",
    model_name="meta-llama/Llama-2-7b-chat-hf",
    device_map="auto",
    model_kwargs={"torch_dtype": torch.float16, "load_in_8bit": True},
)
I made sure to log into Hugging Face before executing this.
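For reference, this is roughly how I authenticate before running the script (assuming huggingface_hub is installed; the token value below is a placeholder, not my real token):

from huggingface_hub import login, whoami

# Log in with a read-access token from https://huggingface.co/settings/tokens
# (the token string here is a placeholder)
login(token="hf_xxxxxxxxxxxxxxxx")

# Sanity check: prints the account the token belongs to
print(whoami())

The error I'm getting is: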
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like meta-llama/Llama-2-7b-chat-hf is not the path to a directory containing a file named config.json. Checkout your internet connection or see how to run the library in offline mode at 'Installation'.