How to merge an ORPO fine-tuned Llama 3 model without OOM?

Hi!
I am trying to do a simple ORPO fine-tuning using a very small dataset: “celsowm/auryn_dpo_orpo”.

The problem is, when I try to do this after training:

import torch
from transformers import AutoModelForCausalLM
from trl import setup_chat_format
from peft import PeftModel

# reload the base model in fp16, letting accelerate place the layers automatically
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map="auto"
)

model, tokenizer = setup_chat_format(model, tokenizer)

# load the ORPO adapter and fold it into the base weights
model = PeftModel.from_pretrained(model, new_model)
model = model.merge_and_unload()

I get an OOM error!

The complete Kaggle notebook is here: https://www.kaggle.com/code/celsofontes/fine-tunning-orpo

Any hints?

This worked for me:

import gc

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
gc.collect()

model, tokenizer = setup_chat_format(model, tokenizer)
gc.collect()

model = PeftModel.from_pretrained(model, new_model)
gc.collect()

model = model.merge_and_unload()
gc.collect()
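
Note that gc.collect() only frees Python-level objects; PyTorch keeps its own cache of CUDA blocks, which can additionally be returned to the driver with torch.cuda.empty_cache(). A small sketch (it may or may not be enough in this case):

import gc
import torch

gc.collect()
# release cached, unused CUDA memory back to the driver
torch.cuda.empty_cache()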

Unfortunately, it is still not working:

CUDA out of memory. Tried to allocate 1.96 GiB. GPU 0 has a total capacty of 15.89 GiB of which 670.12 MiB is free. Process 2065 has 15.22 GiB memory in use. Of the allocated memory 14.80 GiB is allocated by PyTorch, and 120.73 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
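
For what it's worth, the allocator setting mentioned in the error is passed through the PYTORCH_CUDA_ALLOC_CONF environment variable, which must be set before the first CUDA allocation. The value below is only illustrative:

import os

# arbitrary example value; tune for your workload
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch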

I even tried the new offload_buffers parameter:

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    low_cpu_mem_usage=True,
    return_dict = True,
    torch_dtype=torch.float16,
    device_map="auto",
    offload_buffers=True
)

But when I tried to merge, OOM again :face_exhaling:

I’ve discovered the fix: instead of

device_map="auto"

use:

device_map="cpu"

This way the base model is loaded into system RAM and the merge runs on the CPU instead of the 16 GiB GPU.
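
Putting it together, here is a minimal merge-on-CPU sketch based on the steps above. The checkpoint name, adapter directory, and output path are placeholders, not the exact values from the notebook:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import setup_chat_format
from peft import PeftModel

# placeholders: base_model is the original Llama 3 checkpoint,
# new_model is the directory where the ORPO adapter was saved
base_model = "meta-llama/Meta-Llama-3-8B"
new_model = "llama3-orpo-auryn"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# load the fp16 base weights on the CPU so the merge uses system RAM, not the GPU
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map="cpu",
)

model, tokenizer = setup_chat_format(model, tokenizer)

# attach the ORPO adapter and fold it into the base weights
model = PeftModel.from_pretrained(model, new_model)
model = model.merge_and_unload()

# save the merged model; it can be reloaded later with device_map="auto" for inference
model.save_pretrained("llama3-orpo-merged")
tokenizer.save_pretrained("llama3-orpo-merged")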
