PEFT LoRA GPT-NeoX - Backward pass failing

I have written a training script that uses the Accelerate and PEFT libraries to fine-tune GPT-NeoX, and I repeatedly run into the following two messages, which end in a runtime error.

The first message is:

/opt/conda/envs/accelerate/lib/python3.7/site-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")

and the second is:

File "/opt/conda/envs/accelerate/lib/python3.7/site-packages/torch/autograd/__init__.py", line 199, in backward
    allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
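
For anyone unfamiliar with this error: it is PyTorch's generic complaint that backward() was reached without any tensor in the graph requiring gradients. A minimal standalone illustration (nothing to do with PEFT, just the mechanism):

import torch

x = torch.randn(4)    # requires_grad defaults to False
loss = (x * 2).sum()  # so the result carries no grad_fn
loss.backward()       # RuntimeError: element 0 of tensors does not require grad ...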

I load the model with the following code excerpt:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value", "xxx"],
    bias="none",
    task_type="CAUSAL_LM",
)

# model_name comes from the script's CLI args ("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained(model_name)

model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

The terminal command I am executing is:

accelerate launch train.py --data_path_file ./prompts.jsonl -m EleutherAI/gpt-neox-20b -te 3 -lr 1.41e-5 --eval_size 0.1 --batch_size 7 --gradient_checkpointing False

Any tips on successfully backpropagating using LoRA would be appreciated!

Environment details

(accelerate) root@de1305f1fa1f:/mnt/training# python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 1.13.1+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.26.1
Libc version: glibc-2.10

Python version: 3.7.3 (default, Mar 27 2019, 22:11:17)  [GCC 7.3.0] (64-bit runtime)
Python platform: Linux-5.15.0-58-generic-x86_64-with-debian-bullseye-sid
Is CUDA available: True
CUDA runtime version: 11.2.152
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
GPU 0: GRID A100D-7-80C
  MIG 7g.80gb     Device  0:

Nvidia driver version: 525.85.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.1.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.13.1
[conda] numpy                     1.21.6                   pypi_0    pypi
[conda] torch                     1.13.1                   pypi_0    pypi

Hi @eusip !
Can you share the entire training script with us? I suspect gradient checkpointing is being called under the hood for some reason (even if the gradient_checkpointing flag is set to False).
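One thing worth double-checking (purely an assumption about how train.py parses its flags): argparse with type=bool does not turn the string "False" into False, so the flag may silently stay on.

import argparse

parser = argparse.ArgumentParser()
# Hypothetical definition -- if train.py does this, "False" is truthy:
parser.add_argument("--gradient_checkpointing", type=bool, default=False)

args = parser.parse_args(["--gradient_checkpointing", "False"])
print(args.gradient_checkpointing)  # True, because bool("False") is True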
The error you are seeing is due to the inputs not having requires_grad set to True. To fix that, you might need to call:

if hasattr(model, "enable_input_require_grads"):
    model.enable_input_require_grads()
else:
    # Fallback for older transformers versions: force the embedding output
    # to require gradients so the checkpointed blocks have a grad_fn.
    def make_inputs_require_grad(module, input, output):
        output.requires_grad_(True)

    model.get_input_embeddings().register_forward_hook(make_inputs_require_grad)

Place this somewhere in your training script, right before the call to get_peft_model.
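
Putting it together, a rough sketch of the intended ordering (using the names from your snippet; adjust to your script):

model = AutoModelForCausalLM.from_pretrained(model_name)

# Only if you actually train with gradient checkpointing:
model.gradient_checkpointing_enable()

# Make sure the checkpointed blocks receive inputs that require grad:
if hasattr(model, "enable_input_require_grads"):
    model.enable_input_require_grads()

model = get_peft_model(model, peft_config)
model.print_trainable_parameters()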
Let me know if this works.

Thanks for the prompt response @ybelkada !

Your code snippet did the trick! For general reference, my updated training script can be found here.

For future users: I had the same error messages, but the posted solution didn’t work for me. In my case, the problem was caused by these lines:

from peft import prepare_model_for_kbit_training

model = prepare_model_for_kbit_training(model)

The fix was to remove them.
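
(For context, this is my reading of why removing them helps, not something confirmed in this thread: prepare_model_for_kbit_training is intended for models loaded in 8-bit or 4-bit, so a plain full-precision load doesn’t need it. A sketch of the quantized case where it does belong:)

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# Hypothetical quantized load; only then is the prepare step appropriate.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
model = prepare_model_for_kbit_training(model)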

You saved my day.
Thank you.

I think the comments in this link may be helpful.

Same here; removing those lines fixed it.