LongFormer inplace modification error

sb1 · July 26, 2021, 8:38pm

I keep getting this error: “one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [12, 4096, 159]], which is output 0 of ViewBackward, is at version 1; expected version 0 instead”.
Here is the code where I specify the global attention mask I feed into longformer:
attention_mask = mask_src
cls_ids = clss.detach().cpu().data.numpy()
global_attention_mask = torch.zeros(src.shape,dtype=torch.long,device=src.device)
global_attention_mask_curr = global_attention_mask.clone()
global_attention_mask_curr[:,cls_ids] = 1
I realize that global_attention_mask_curr[:,cls_ids] = 1 is an inplace operation, since I’m setting specific indices to 1. However, in the example, you do pretty much exactly the same thing. How do I get around this error? (Btw, if I comment out global_attention_mask_curr[:,cls_ids] = 1, it trains just fine.)
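
For reference, here is a minimal sketch of building the same mask without any indexed assignment. It rests on assumptions that are not stated in the post: clss is taken to be a (batch, n_cls) integer tensor of valid token positions (no negative padding values), and src a (batch, seq_len) tensor of token ids. The helper name build_global_attention_mask is made up for illustration. Note that the tensor named in the error is a float tensor of shape [12, 4096, 159], not the long mask itself, so this may not make the error go away; it only rules out the mask construction as the in-place write. Also, if cls_ids is two-dimensional, the slice assignment `[:, cls_ids] = 1` marks the union of all rows' positions in every row, whereas the sketch marks each row's own positions only.

```python
import torch


def build_global_attention_mask(src: torch.Tensor, clss: torch.Tensor) -> torch.Tensor:
    """Return a (batch, seq_len) long mask with 1 at each row's CLS positions.

    Hypothetical helper: assumes clss has shape (batch, n_cls) and holds
    valid, non-negative positions into the sequence dimension of src.
    """
    # Index tensor on the same device as src; detached so it carries no graph.
    index = clss.detach().to(device=src.device, dtype=torch.long)
    zeros = torch.zeros(src.shape, dtype=torch.long, device=src.device)
    # Tensor.scatter (no trailing underscore) is the out-of-place variant:
    # it returns a new tensor and leaves `zeros` untouched.
    return zeros.scatter(1, index, 1)


# Toy shapes for illustration only (batch of 2, sequence length 8).
src = torch.randint(0, 100, (2, 8))
clss = torch.tensor([[0, 3, 6], [0, 2, 5]])
global_attention_mask = build_global_attention_mask(src, clss)
print(global_attention_mask)
```

If the error persists with a mask built this way, wrapping the forward and backward pass in `torch.autograd.set_detect_anomaly(True)` should print a traceback of the forward operation whose output was later modified, which helps narrow down where the in-place write actually happens.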
