I ran into a problem loading the Llama-2-13b-hf model with the following code:

```python
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained(
    base_model,
    trust_remote_code=True,
    device_map="cuda:0",
    load_in_8bit=True,
)
```
An error was returned:
```
ImportError: Using `load_in_8bit=True` requires Accelerate: `pip install accelerate`
and the latest version of bitsandbytes: `pip install -i https://test.pypi.org/simple/ bitsandbytes`
or `pip install bitsandbytes`
```
But I am pretty sure both packages are installed: bitsandbytes 0.41.3 and accelerate 0.25.0. Has anyone encountered this issue before and knows how to resolve it? Thank you very much!
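In case it is useful, here is a minimal sanity check I would run to confirm that the interpreter executing the script actually sees both packages (the `is_accelerate_available` / `is_bitsandbytes_available` helpers are my assumption based on recent transformers releases):

```python
import sys

# Print the interpreter path -- if pip installed the packages into a
# different environment, from_pretrained will not see them.
print(sys.executable)

import accelerate
import bitsandbytes

print("accelerate:", accelerate.__version__)
print("bitsandbytes:", bitsandbytes.__version__)

# These are the helpers that recent transformers versions appear to consult
# before raising the ImportError above (helper names assumed, not verified
# against my exact transformers version).
from transformers.utils import is_accelerate_available, is_bitsandbytes_available

print("accelerate available:", is_accelerate_available())
print("bitsandbytes available:", is_bitsandbytes_available())
```

If either helper prints `False`, or the imports fail, that would suggest the packages are installed in a different environment than the one running the loading code.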