A potential in-place operation that caused a RuntimeError

Hi, I’m using transformers to generate sentences while keeping gradients, by calling model.generate after removing the @torch.no_grad() decorator ahead of def generate(...):, since the current version (4.2.1) of model.generate doesn’t support retaining gradients.

Because I set do_sample=True and num_beams>1 in generate, the return type is BeamSampleEncoderDecoderOutput. According to the documentation, the scores of a BeamSampleEncoderDecoderOutput consist of the log-softmax score for each vocabulary token plus the sum of the log-softmax scores of the previously generated tokens in that beam.
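For reference, the call looks roughly like the sketch below. The model name and argument values are placeholders, not my actual settings; output_scores=True and return_dict_in_generate=True are what make generate return a BeamSampleEncoderDecoderOutput with a populated scores field.

outputs = model.generate(
    input_ids,
    do_sample=True,                 # sampling + beams -> beam sample decoding
    num_beams=4,                    # placeholder; any value > 1
    max_length=50,                  # placeholder
    output_scores=True,             # ask generate to return the per-step scores
    return_dict_in_generate=True,   # return a BeamSampleEncoderDecoderOutput instead of a plain tensor
)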

What I want to do is gather the non-inf values from the last step and later apply gradient descent to train the network (a sketch of the intended training step follows the code). The key pseudo-code is:

outputs = self.generate(input_ids, ..., **model_kwargs)
# The type of outputs is BeamSampleEncoderDecoderOutput
scores = outputs.scores
# Scores of the last generation step (one row per beam, over the vocabulary)
last_step_score = scores[-1]
# Keep only the finite (non -inf) entries
last_step_score = last_step_score[torch.where(last_step_score != -float('inf'))]
# Take one entry per group of num_beams beams
last_step_score = last_step_score[::num_beams]
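What I plan to do with these scores afterwards is roughly the following; the objective and the optimizer name here are just illustrative placeholders, not my actual training code.

# Hypothetical objective: push up the kept log-softmax scores,
# i.e. minimize their negative mean.
loss = -last_step_score.mean()
optimizer.zero_grad()
loss.backward()        # this is where the RuntimeError below is raised
optimizer.step()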

However, when I run the program, I receive an error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [16, 50265]], which is output 0 of LogSoftmaxBackward, is at version 17; expected version 0 instead.
Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

which means that there is an in-place operation somewhere in generate. I suspect it lies in BeamSearchScorer.finalize, but I can’t figure out which part of the source code to change to make this work.
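For anyone unfamiliar with this class of error, here is a minimal, self-contained toy reproduction (unrelated to the actual generate code) that trips the same version-counter check, since LogSoftmaxBackward needs the log-softmax output to compute the gradient:

import torch

x = torch.randn(4, 10, requires_grad=True)
y = torch.log_softmax(x, dim=-1)   # y is saved for backward (output 0 of LogSoftmaxBackward)
y[0, 0] = 0.0                      # in-place write bumps y's version counter
y.sum().backward()                 # RuntimeError: ... modified by an inplace operation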


cc @patrickvonplaten


I’m seeing a similar error when fine-tuning led-large-16384-arxiv on a custom dataset. It gets 2006 steps in before failing with:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.HalfTensor [16, 8192, 1]], which is output 0 of ViewBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Setting torch.autograd.set_detect_anomaly(True) gives me the following stack trace, which I’ve truncated to the most recent calls:

File "C:\Users\ThomasWood\source\repos\LifeBio.Memory.AI\seq2seq\fine_tune_snapshots.py", line 166, in <module>           trainer.train(resume_from_checkpoint='checkpoint-2000')                                                               
File "C:\Users\ThomasWood\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\transformers\trainer.py", line 1269, in train                                                 tr_loss += self.training_step(model, inputs)                                                                          
File "C:\Users\ThomasWood\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\transformers\trainer.py", line 1764, in training_step                                         self.scaler.scale(loss).backward()                                                                                    
File "C:\Users\ThomasWood\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\torch\_tensor.py", line 255, in backward                                                      torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)                                    
File "C:\Users\ThomasWood\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\torch\autograd\__init__.py", line 147, in backward                                            Variable._execution_engine.run_backward(                                                                              
File "C:\Users\ThomasWood\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\torch\autograd\function.py", line 87, in apply                                                return self._forward_cls.backward(self, *args)  # type: ignore[attr-defined]                                          
File "C:\Users\ThomasWood\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\torch\utils\checkpoint.py", line 138, in backward                                             torch.autograd.backward(outputs_with_grad, args_with_grad)                                                            
File "C:\Users\ThomasWood\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\torch\autograd\__init__.py", line 147, in backward                                            Variable._execution_engine.run_backward(                                                                            
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.HalfTensor [16, 8192, 1]], which is output 0 of ViewBackward, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!                                                                                     
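In case it helps anyone reproduce this, anomaly detection was enabled by adding the call right before starting training (assuming trainer is the Trainer instance from my script, as in the trace above):

import torch

torch.autograd.set_detect_anomaly(True)   # record forward-pass stack traces for backward errors
trainer.train(resume_from_checkpoint='checkpoint-2000')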

The only change I’ve made to this notebook is substituting my own dataset for the arxiv dataset.