OutOfMemoryError: CUDA out of memory while tuning an LLM

I am currently working on a large language model (LLM) and frequently encounter the error: "OutOfMemoryError: CUDA out of memory." Despite trying various solutions from multiple sources, I have been unable to resolve the issue.
Could anyone please guide me on how to resolve this? TIA


There are probably many unknown bugs related to multi-GPU setups, so it may be quicker to investigate the cause of the error and resolve each case individually. In some cases, the issue can be resolved by simplifying the device placement, for example by setting device_map="sequential".
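A minimal sketch of what that looks like, assuming the Hugging Face transformers library (with accelerate installed for device_map support); the model name and function name here are placeholders, not anything from the thread:

```python
def load_model_sequentially(model_name: str):
    """Load a causal LM with sequential device placement.

    device_map="sequential" fills GPU 0 before moving on to GPU 1, etc.,
    instead of spreading layers across all GPUs at once, which can avoid
    some multi-GPU placement issues.
    """
    # Imports kept inside the function so this sketch can be defined
    # without transformers/torch installed.
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="sequential",    # fill GPUs one at a time
        torch_dtype=torch.float16,  # half precision roughly halves weight memory
        low_cpu_mem_usage=True,     # avoid materializing a full fp32 copy in RAM
    )


# Usage (hypothetical model name):
# model = load_model_sequentially("my-org/my-llm")
```

If the error occurs during training rather than loading, the usual first steps are reducing the batch size, enabling gradient checkpointing, or using gradient accumulation to keep the effective batch size.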