I see, the dataset could also be a possible cause…
Well, the best practices for datasets are probably available in this forum or on GitHub if you search for them…
Also, depending on the model, gradient checkpointing may not be available (I think it should be available for Llama 3.2 1B, though…), and there may still be some lingering bugs in multi-GPU environments.
When trying to isolate the issue, it’s usually faster to temporarily switch to a smaller, simpler model or dataset.
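For instance, something along these lines is usually enough to tell whether the problem follows the model/dataset or the training setup. This is only a rough sketch: the tiny model, the IMDB slice, and the exact SFTConfig fields are assumptions to swap out for your own setup (field names vary a bit between trl versions).

```python
# Quick isolation run: tiny model, tiny dataset slice, short training.
# Model name, dataset, and config fields are assumptions; swap in your own.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

train_ds = load_dataset("imdb", split="train[:64]")  # any small text dataset with a "text" column

args = SFTConfig(
    output_dir="debug-run",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,   # drop this if the model doesn't support it
    max_steps=10,                  # just enough steps to see whether the error reproduces
    logging_steps=1,
    dataset_text_field="text",
    max_seq_length=512,            # renamed to max_length in newer trl versions
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.2-1B",  # or any small causal LM you have access to
    args=args,
    train_dataset=train_ds,
)
trainer.train()
```

If this tiny run is clean, the original model or dataset is the more likely culprit; if it still fails, the training setup itself (launcher, device mapping, collator) is worth a closer look.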
Quoted GitHub issue (opened 25 Nov 2023, closed 7 Jan 2024):
Hi,
I'm trying to supervised fine-tune a phi-1.5 model on a custom dataset with the SFTTrainer; my script closely follows [sft_llama2.py](https://github.com/huggingface/trl/blob/main/examples/research_projects/stack_llama_2/scripts/sft_llama2.py).
I'm training the model on 4x 2080 Ti (11 GB each); its 1.3B parameters should comfortably fit in the combined VRAM of the GPUs, but I see CUDA OOM errors as soon as I start training.
My hyperparameters are as follows:
```python
per_device_train_batch_size: Optional[int] = field(default=1, metadata={"help": "The batch size per GPU."})
per_device_eval_batch_size: Optional[int] = field(default=1, metadata={"help": "The batch size per GPU for evaluation."})
gradient_accumulation_steps: Optional[int] = field(default=8, metadata={"help": "The number of gradient accumulation steps."})
gradient_checkpointing: Optional[bool] = field(default=False, metadata={"help": "Whether to use gradient checkpointing."})
```
Furthermore, this is how I'm instantiating my model:
```python
self.base_model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=self.script_args.model_name,
    quantization_config=self.bnb_config,
    device_map="auto",  # {"": Accelerator().local_process_index}
    trust_remote_code=True,
    # torch_dtype=torch.float16,
    # use_flash_attention_2=False
)
```
I cannot use PEFT or Gradient Checkpointing as Phi models are not supported.
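(Side note for anyone hitting the same OOM: the commented-out `device_map` above is the per-process mapping used in sft_llama2.py. Here is a minimal sketch of that variant, assuming the script is launched with `accelerate launch` so each process owns one GPU; the argument names mirror the script and are not a verified fix for this exact setup.)

```python
from accelerate import Accelerator
from transformers import AutoModelForCausalLM

# Pin one full copy of the (quantized) model to this process's GPU instead of letting
# device_map="auto" shard a single copy across every visible GPU in every process.
base_model = AutoModelForCausalLM.from_pretrained(
    script_args.model_name,            # placeholder for the model name used in the script
    quantization_config=bnb_config,    # the BitsAndBytesConfig already defined in the script
    device_map={"": Accelerator().local_process_index},
    trust_remote_code=True,
)
```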
Hardware:
- CPU: Xeon® E5-2630 v2, limited to 16 GB of RAM as this is what the vast.ai instance has
- GPU: 4x A40 → 180 GB total

OS: Linux
Python: 3.10
CUDA: 12.2

Packages:
```
torch==2.3.1
transformers==4.41.2
peft==0.11.1
datasets==2.20.0
accelerate==0.31.0
evaluate==0.4.1
bitsandbytes==0.43.1
huggingface_hub==0.23.4
trl==0.9.4
```
Issue
Introduction
Hi!
I’m trying to fine-tune Llama 3 8B on a summarization dataset of about 1,500 instances. The dataset contains long documents, often over 8K tokens. I…
Trying to SFT Qwen2.5-VL-3B-Instruct, but I get this same error over and over again. I’ve looked at all the past threads and tried their solutions, but it’s just not working. I don’t think downgrading to a smaller model will do any good, because the error comes during attention, which is quadratic in the sequence length N, not the model size.
Maybe it’s an issue with my collate_fn, but I can’t find anything; I’ve even chopped max_token_length down to 1024 and it’s the same error, so I feel like there’s something els…
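For what it's worth, here is a minimal sketch of a length-capping collate_fn for a text-only sanity check. The processor name and the `"text"` field are assumptions, it needs a recent transformers release, and a real Qwen2.5-VL batch would also need the image inputs (`pixel_values`, etc.):

```python
from transformers import AutoProcessor

# Processor name is an assumption; any processor exposing .tokenizer works the same way here.
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")

def collate_fn(examples, max_token_length=1024):
    # Text-only sanity check: hard-truncate at max_token_length, pad to the longest
    # sequence in the batch, and mask the padded positions out of the loss.
    texts = [ex["text"] for ex in examples]  # the "text" field is an assumption
    batch = processor.tokenizer(
        texts,
        truncation=True,
        max_length=max_token_length,
        padding=True,
        return_tensors="pt",
    )
    labels = batch["input_ids"].clone()
    labels[batch["attention_mask"] == 0] = -100  # ignore padding in the loss
    batch["labels"] = labels
    return batch
```

If a capped, text-only collator like this still OOMs at batch size 1, the sequence length probably isn't the culprit and the memory is going somewhere else (optimizer states, image features, or duplicated model copies across GPUs).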