load_in_8bit requires a device_map, but device_map='auto' is not supported

Hi! I am trying to optimize inference for a MarianMTModel, following the Hugging Face guide on optimizing inference on a single GPU. When I try to load the model in mixed-int8 precision with

model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, load_in_8bit=True).to(device)

I get an error asking me to provide a device map:

ValueError: A device map needs to be passed to run convert models into mixed-int8 format. Please run`.from_pretrained` with `device_map='auto'`

Adding the device_map

model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, device_map="auto", load_in_8bit=True).to(device)

results in the following:

ValueError: MarianMTModel does not support `device_map='auto'` yet.
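
Is there a way around the auto check? My (unverified) understanding is that this error only fires when the device map has to be inferred, so I wondered whether pinning the whole model to one GPU with an explicit map would sidestep it. A sketch of what I mean; the {"": 0} mapping (everything on cuda:0) is my assumption, not something from the guide:

# Untested idea: pass an explicit device map instead of "auto".
# Note: no .to(device) afterwards, since an 8-bit model is supposed
# to stay on the device it was loaded onto.
model = AutoModelForSeq2SeqLM.from_pretrained(
    checkpoint,
    device_map={"": 0},  # assumption: place the whole model on GPU 0
    load_in_8bit=True,
)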

I have also tried recreating my Conda environment from scratch, and I still get the same errors. Has anyone run into similar issues?

I am using PyTorch 1.12.0, Accelerate 0.15.0, bitsandbytes 0.35.4, and transformers 4.24.0.
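
For completeness, here is a minimal repro sketch of what I am running; the checkpoint name is just an example Marian model, not my actual one:

from transformers import AutoModelForSeq2SeqLM

checkpoint = "Helsinki-NLP/opus-mt-en-de"  # example Marian checkpoint
device = "cuda"

# First attempt: raises "A device map needs to be passed to run convert
# models into mixed-int8 format"
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, load_in_8bit=True).to(device)

# Second attempt: raises "MarianMTModel does not support `device_map='auto'` yet."
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, device_map="auto", load_in_8bit=True).to(device)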
