New to HF, cannot get any model to work in jupyter lab

I use an M1 Mac (macOS 13.7.8), a normal virtual environment (as is customary in Python), and JupyterLab.

I generated multiple access tokens during my first trials; I have now saved the latest one I created.

Procedure I used (square brackets are used to designate the beginning and end of code or output):

  • logged in with hf auth login in the terminal (using my access token)

  • opened JupyterLab from the terminal.

  • nothing worked, so I logged in a second time from the JupyterLab notebook, using [from huggingface_hub import notebook_login
    notebook_login()] and the same access token

  • still nothing works, e.g. when executing the code from the quickstart tutorial: [from transformers import AutoModelForCausalLM, AutoTokenizer
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", dtype="auto", device_map="auto")
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")]. Unfortunately, I cannot share the error message, as apparently new users can only put two links in a post.

    Maybe a free account is not sufficient to have access, but in that case I would find it weird that this code is used for the quickstart tutorial. Very weird.

I also tried to implement a model using code provided by Hugging Face. Executed code: [test.py

import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained('openbmb/MiniCPM-V', trust_remote_code=True, torch_dtype=torch.bfloat16)

model = model.to(device='mps', dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-V', trust_remote_code=True)
model.eval()

image = Image.open(image_file_path).convert('RGB')
question = 'What is in the image?'
msgs = [{'role': 'user', 'content': question}]

res, context, _ = model.chat(
image=image,
msgs=msgs,
context=None,
tokenizer=tokenizer,
sampling=True,
temperature=0.7
)
print(res) ]

In short, I tried several models (with code provided by Hugging Face) and nothing works (several different kinds of error messages). On the other hand, a workaround by qnguyen3 (solving the flash_attn problem for M1 Macs) immediately worked. So I am confused: if qnguyen3 can provide a working solution, why can Hugging Face not provide examples that are easy to get working? It seems to me that would make sense, especially at the beginning of a tutorial. Or is this some kind of filter meant to make clear that only people with a background in computer science (as opposed to data scientists) should be using this platform? Or is it some kind of compatibility problem? Can someone help or give a hint?


I think you’re encountering multiple errors simultaneously. Personally, I recommend using MLX or Ollama. They make it easier to effectively utilize MPS.

Have you tried using Ollama or Llama.cpp?
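
For example, a minimal mlx-lm sketch (assuming mlx-lm is installed via pip install mlx-lm; the repo id below is just one example 4-bit conversion from the mlx-community org):

    from mlx_lm import load, generate

    # Load a pre-converted, quantized model published by the MLX community
    model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

    messages = [{"role": "user", "content": "Explain MPS on Apple Silicon in one sentence."}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

    # Runs natively on Apple Silicon via MLX (no CUDA involved)
    text = generate(model, tokenizer, prompt=prompt, max_tokens=128, verbose=True)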

To run any model you need to run pip install transformers torch accelerate in your venv. Then call the script.

I had a lot of struggles at the beginning too.


Yeah. Transformers is standard and relatively versatile for experiments and modifications. While inference is straightforward if you’re only using the pipeline, the setup makes it more suitable for intermediate users and above…

I think it’s best to start by trying Ollama for CLI or LM Studio for GUI. Once you can use one, there’s no fundamental difference beyond speed or usage. Getting started is the first hurdle.
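
If you want to stay inside a Jupyter notebook, Ollama can also be driven from Python once the app is running. A minimal sketch (assumes pip install ollama and that a model such as llama3.2 has already been pulled with ollama pull llama3.2):

    from ollama import chat

    # Talks to the local Ollama server (the desktop app / daemon must be running)
    response = chat(
        model="llama3.2",
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )
    print(response.message.content)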

ChatGPT said:

You’re running into a combo of two things:

  1. Access / account permissions on Hugging Face

    • Some models (like Llama-2) require you to:

      • Sign the license on Hugging Face’s model page (Meta requires approval).

      • Be logged in with a valid token tied to that license acceptance.

    • If you skip that, the code fails even if your token is correct.

    • A free account is fine, but you must accept the license before download.

  2. Compatibility issues on Apple Silicon (M1, macOS)

    • A lot of Hugging Face examples assume Linux + CUDA (NVIDIA).

    • On M1/M2 Macs, you only have CPU or MPS (Metal Performance Shaders).

    • Many models (like Llama-2, MiniCPM-V) try to use CUDA by default → errors.

    • That’s why qnguyen3’s workaround (patching FlashAttention + MPS) worked — Hugging Face hasn’t fully baked Apple-friendly defaults yet.


✅ How to Fix It (Step-by-Step)

  1. Make sure your token is active

    huggingface-cli whoami
    
    

    If it shows your username, you’re good. If not, run:

    huggingface-cli login
    
    

    and paste your token.
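
    The same check also works from inside the notebook (a minimal sketch using the huggingface_hub client; it raises an error if no valid token is stored):

    from huggingface_hub import whoami

    # Prints the account name tied to the currently stored token
    print(whoami()["name"])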

  2. Accept the license on model page

    • Go to meta-llama/Llama-2-7b-hf.

    • Click “Agree and access model”.

    • Try again with your token.
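
    To confirm from the notebook that your token has actually been granted access, here is a minimal sketch (assuming huggingface_hub raises GatedRepoError when the license has not been accepted yet):

    from huggingface_hub import model_info
    from huggingface_hub.utils import GatedRepoError

    try:
        model_info("meta-llama/Llama-2-7b-hf")
        print("Access granted")
    except GatedRepoError:
        print("Gated repo: accept the license on the model page first")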

  3. Force MPS backend on M1
    In your notebook:

    import torch
    from transformers import AutoModelForCausalLM
    
    device = "mps" if torch.backends.mps.is_available() else "cpu"
    
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",
        torch_dtype=torch.float16,
        device_map={"": device},
    )
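
    If loading succeeds, a short generation run (a minimal sketch; assumes the tokenizer from the same repo) confirms the MPS device is actually usable:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
    inputs = tokenizer("The capital of France is", return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))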
    
    
  4. Install MPS-friendly PyTorch
    Make sure you’re on a version that supports Metal (the standard macOS arm64 wheels from PyPI include MPS support since PyTorch 1.12):

    pip install --upgrade torch torchvision torchaudio
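
    A quick check in the notebook that the install actually exposes MPS:

    import torch

    print(torch.__version__)
    print(torch.backends.mps.is_available())  # should print True on Apple Silicon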
    
    
  5. Stick to smaller models first
    Large ones (7B, 13B) often OOM on M1. Start with:

    • openlm-research/open_llama_3b

    • tiiuae/falcon-rw-1b
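
    A quick sanity check with one of these smaller, non-gated models (a minimal sketch using the pipeline API; swap in any comparable repo id):

    from transformers import pipeline
    import torch

    pipe = pipeline(
        "text-generation",
        model="tiiuae/falcon-rw-1b",
        torch_dtype=torch.float16,
        device="mps",
    )
    print(pipe("Hello, my name is", max_new_tokens=20)[0]["generated_text"])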


⚡ Why Hugging Face tutorials feel “broken”

  • They’re written for Linux + NVIDIA GPUs.

  • Apple Silicon support is still patchy, especially for models needing FlashAttention, bfloat16, or CUDA kernels.

  • That’s why community fixes (like qnguyen3’s) often work better.