I am using Vertex AI Workbench with a notebook, RAG_llama2_vertexai.ipynb,
to run Llama 2 models via transformers:
pip install -U git+https://github.com/huggingface/transformers.git git+https://github.com/huggingface/accelerate.git
Then I authenticate with Hugging Face:
from huggingface_hub import notebook_login
# Login to Huggingface to get access to the model
notebook_login()
Then I import the libraries:
import os
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch
And when I try to load the model:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
I get
File Save Error for RAG_llama2_vertexai.ipynb
Invalid response: 524
and the kernel crashes.
I tried spinning up a VM with more memory, and it crashes at the same point. Any idea what is going on, or what else I could try? Many thanks!
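(In case it's relevant: I was wondering whether loading the weights in half precision with `device_map="auto"` would avoid the memory spike of the default float32 load. Something like the sketch below, which roughly halves the host RAM needed and lets accelerate place the shards, is what I had in mind to try next:)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Half-precision weights + accelerate device placement; low_cpu_mem_usage
# avoids materializing a full float32 copy of the model in host RAM.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    low_cpu_mem_usage=True,
)
```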