I have a 64 GB M1 Max. If I try to use mpt-7b from Python:
model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-7b-chat', trust_remote_code=True)
I get an error about flash_attn, and there doesn't seem to be any way to install flash_attn on a Mac; it appears to be CUDA-only.
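For context, the MPT model card mentions that the attention implementation is configurable via the model config. This is a sketch of the override I tried (the `attn_config['attn_impl']` key is taken from the model card; I'm not certain it avoids the flash_attn import on macOS):

```python
import transformers

name = 'mosaicml/mpt-7b-chat'

# Load the remote config first so the attention backend can be overridden
# before the weights are loaded. 'torch' is the pure-PyTorch attention
# implementation described on the MPT model card (no flash_attn / triton).
config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.attn_config['attn_impl'] = 'torch'

model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    trust_remote_code=True,
)
```

(Running this downloads the full model weights, so it's slow on the first attempt.)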
If I run gpt4all, I can use that model, so it is definitely possible to run it on my machine.
What am I doing wrong?