load_in_8bit requires a device_map, but device_map='auto' is not supported

Hi! I am trying to optimize inference for a MarianMTModel, following the Hugging Face guide on optimizing inference on a single GPU. When I try to load the model in mixed-int8 precision with

model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, load_in_8bit=True).to(device)

I get an error asking me to provide a device map:

ValueError: A device map needs to be passed to run convert models into mixed-int8 format. Please run`.from_pretrained` with `device_map='auto'`

Adding the device_map

model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, device_map="auto", load_in_8bit=True).to(device)

results in the following:

ValueError: MarianMTModel does not support `device_map='auto'` yet.
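
Is there a way around the auto check? My (unverified) understanding is that this error only fires when the device map has to be inferred, so I wondered whether pinning the whole model to one GPU with an explicit map would sidestep it. A sketch of what I mean; the {"": 0} mapping (everything on cuda:0) is my assumption, not something from the guide:

# Untested idea: pass an explicit device map instead of "auto".
# Note: no .to(device) afterwards, since an 8-bit model is supposed
# to stay on the device it was loaded onto.
model = AutoModelForSeq2SeqLM.from_pretrained(
    checkpoint,
    device_map={"": 0},  # assumption: place the whole model on GPU 0
    load_in_8bit=True,
)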

I have also tried recreating my Conda environment from scratch, and I still get the same errors. Has anyone run into similar issues?

I am using PyTorch 1.12.0, Accelerate 0.15.0, bitsandbytes 0.35.4, and transformers 4.24.0.
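
For completeness, here is a minimal repro sketch of what I am running; the checkpoint name is just an example Marian model, not my actual one:

from transformers import AutoModelForSeq2SeqLM

checkpoint = "Helsinki-NLP/opus-mt-en-de"  # example Marian checkpoint
device = "cuda"

# First attempt: raises "A device map needs to be passed to run convert
# models into mixed-int8 format"
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, load_in_8bit=True).to(device)

# Second attempt: raises "MarianMTModel does not support `device_map='auto'` yet."
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, device_map="auto", load_in_8bit=True).to(device)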
