I use Google Colab, but connected to a local runtime on my computer through Jupyter. I am on Windows 10 with an RTX 3070, and I have no CUDA toolkit or cuDNN installed because I didn't manage to get them working.
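To show what the local kernel sees on the GPU side, here is the quick sanity check I run there (standard PyTorch calls only; the values printed on my machine are not shown here):

import torch

# Installed PyTorch build and whether it was compiled with CUDA support.
print(torch.__version__)          # a "+cpu" suffix would indicate a CPU-only wheel
print(torch.version.cuda)         # None if this build has no CUDA runtime bundled
print(torch.cuda.is_available())  # needs to be True for 4-bit bitsandbytes loading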
Reproduction
!pip install transformers trl accelerate torch bitsandbytes peft datasets -qU
!pip install flash-attn --no-build-isolation
from datasets import load_dataset
instruct_tune_dataset = load_dataset("mosaicml/instruct-v3")
...
model_id = "mistralai/Mixtral-8x7B-v0.1"
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=nf4_config,
    use_cache=False,
    attn_implementation="flash_attention_2",
)
Error message:
ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes
Expected behavior
This code works on the Google Colab free T4, but not when I run it locally, even though I have installed bitsandbytes and accelerate.
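In case it helps, this is the kind of check I can run in the local kernel to see what transformers itself detects (the two helper functions are from transformers.utils; whether they return True on my machine is exactly what I am unsure about):

import importlib.metadata as md

# Confirm the packages the error message asks for are actually installed.
for pkg in ("bitsandbytes", "accelerate", "transformers"):
    print(pkg, md.version(pkg))

# transformers gates quantized loading on these availability checks;
# I believe is_bitsandbytes_available() also requires torch.cuda.is_available()
# to be True in recent versions.
from transformers.utils import is_accelerate_available, is_bitsandbytes_available
print("accelerate available:", is_accelerate_available())
print("bitsandbytes available:", is_bitsandbytes_available())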
Please help me.
Thank you very much!