Running mpt-7b on Mac m1

darrenoakey · May 18, 2023, 9:26pm

I have a 64 GB M1 Max. If I try to use mpt-7b from Python:

import transformers

model = transformers.AutoModelForCausalLM.from_pretrained(
  'mosaicml/mpt-7b-chat',
  trust_remote_code=True
)

I get an error about flash_attn, and there doesn’t seem to be any way to install flash_attn on a Mac; it seems to be CUDA-only.
However, if I run gpt4all, I can use that model, so it is definitely possible to run it on my machine.
What am I doing wrong?

abhinavkulkarni · May 22, 2023, 4:07am

Hey @darrenoakey,

You may want to take a look at this answer of mine to see how to load the model (fully or partially) on CPU: How to use trust_remote_code=True with load_checkpoint_and_dispatch? - #2 by abhinavkulkarni
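
For reference, a minimal sketch of a CPU-only load might look like the following. It assumes the MPT remote code exposes `attn_config['attn_impl']` and `init_device` on its config, as the MPT model card describes, and that the `'torch'` attention implementation is enough to sidestep flash_attn. Whether it actually avoids the import error depends on the version of the remote code and of transformers, so treat it as a starting point rather than a confirmed fix. The function name `load_mpt_on_cpu` is made up for this example.

```python
def load_mpt_on_cpu(name="mosaicml/mpt-7b-chat"):
    """Sketch: load an MPT checkpoint on CPU with plain PyTorch attention.

    Assumption: the checkpoint's remote config has `attn_config` (a dict)
    and `init_device`, as shown on the MPT model card.
    """
    # Imported lazily so that defining this function does not need the libraries.
    import torch
    import transformers

    config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
    # Ask for the pure-PyTorch attention path instead of flash/triton.
    config.attn_config["attn_impl"] = "torch"
    # Initialise the weights on CPU rather than on a CUDA device.
    config.init_device = "cpu"

    return transformers.AutoModelForCausalLM.from_pretrained(
        name,
        config=config,
        torch_dtype=torch.float32,  # full precision; needs about 4 bytes per parameter
        trust_remote_code=True,
    )
```

Calling `model = load_mpt_on_cpu()` downloads the checkpoint on first use, so expect it to take a while and to need a lot of RAM.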
