I use an M1 Mac (macOS 13.7.8), a standard Python virtual environment, and JupyterLab.
I generated multiple access tokens during my first trials; I have now saved the latest one I created.
Procedure I used (square brackets designate the beginning and end of code or output):
- Logged in with hf auth login in the terminal (using my access token).
- Opened JupyterLab from the terminal.
- Nothing worked, so I logged in a second time from the JupyterLab notebook using [from huggingface_hub import notebook_login
notebook_login()] and the same access token.
Still nothing works, e.g. executing the code from the quickstart tutorial:
[from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")]
Unfortunately, I cannot share the error message, as apparently new users can only put two links in a post.
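As a sanity check on whether the terminal login actually saved anything, here is a small stdlib-only sketch that looks for a token file. The paths are an assumption based on huggingface_hub defaults (newer releases write to ~/.cache/huggingface/token, older ones to ~/.huggingface/token; the HF_HOME environment variable can relocate them):

```python
from pathlib import Path
from typing import Optional

# Candidate locations where `hf auth login` stores the access token.
# These are assumptions based on huggingface_hub defaults; HF_HOME
# can move them elsewhere.
CANDIDATE_PATHS = (".cache/huggingface/token", ".huggingface/token")

def find_hf_token(home: Path) -> Optional[Path]:
    """Return the first existing token file under `home`, else None."""
    for rel in CANDIDATE_PATHS:
        candidate = home / rel
        if candidate.is_file():
            return candidate
    return None

print(find_hf_token(Path.home()) or "no token file found")
```

If this prints "no token file found", the notebook process is probably not seeing the token from the terminal login at all.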
Maybe a free account is not sufficient for access, but in that case I would find it odd that this code is used for the quickstart tutorial. Very weird.
I also tried to implement a model using code provided by Hugging Face. Executed code (test.py):
[import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("openbmb/MiniCPM-V", trust_remote_code=True, torch_dtype=torch.bfloat16)
model = model.to(device="mps", dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("openbmb/MiniCPM-V", trust_remote_code=True)
model.eval()

image = Image.open(image_file_path).convert("RGB")
question = "What is in the image?"
msgs = [{"role": "user", "content": question}]
res, context, _ = model.chat(
    image=image,
    msgs=msgs,
    context=None,
    tokenizer=tokenizer,
    sampling=True,
    temperature=0.7
)
print(res)]
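Since the code moves the model to "mps", here is a quick device sanity check I can run first. This is only a sketch: if PyTorch is importable it defers to torch's own MPS availability check; otherwise it falls back to a platform heuristic (macOS on arm64), which is an assumption, not a guarantee that MPS works:

```python
import platform

def mps_likely_supported() -> bool:
    """Heuristic: MPS requires macOS on Apple Silicon (arm64)."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"

def pick_device() -> str:
    """Return "mps" when it looks usable, otherwise "cpu"."""
    try:
        import torch
        # torch.backends.mps exists in PyTorch >= 1.12; guard for older builds.
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            return "mps"
    except ImportError:
        # No torch installed here: fall back to the platform heuristic.
        if mps_likely_supported():
            return "mps"
    return "cpu"

print(pick_device())
```

If this prints "cpu" on the M1 machine, the MPS backend is not usable in that environment and model.to(device="mps") would fail regardless of authentication.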
In short, I tried several models (with code provided by Hugging Face) and nothing works (several different kinds of error messages). On the other hand, a workaround by qnguyen3 (solving the flash_attn problem for M1 Macs) worked immediately. So I am confused: if qnguyen3 can provide a working solution, why can Hugging Face not provide examples that are easy to get working? Especially at the beginning of a tutorial, that would make sense. Or is this some kind of filter meant to signal that only people with a computer science background (as opposed to data scientists) should be using this platform? Or is it some kind of compatibility problem? Can someone help or give a hint?