I am getting this error. Here is my code:
import torch
import torch.nn as nn

class Adversarial_Multitask(nn.Module):
    def __init__(self, bert):
        super(Adversarial_Multitask, self).__init__()
        # Note: bert1, bert2 and bert3 all reference the same module here,
        # so the task-specific and shared encoders currently share weights.
        self.bert1 = bert
        self.bert2 = bert
        self.bert3 = bert
        self.drop = nn.Dropout(0.1)
        self.fc1 = nn.Linear(1536, 2)       # main task head
        self.fc2 = nn.Linear(1536, length)  # section identification head; `length` is defined elsewhere in my script
        self.act = nn.ReLU()

    def diff_loss(self, output1, output2):
        # center both pooled outputs over the batch (in place)
        output1 -= torch.mean(output1, 0)
        output2 -= torch.mean(output2, 0)
        # collapse each [batch, 768] matrix to a [batch] vector of row norms
        output1 = torch.norm(output1, dim=1)
        output2 = torch.norm(output2, dim=1)
        # dot product of the two norm vectors (a scalar)
        loss_dif = torch.matmul(output1, output2)
        return loss_dif

    # define the forward pass
    def forward(self, batch, x):
        # pass the inputs to the shared encoder
        output2 = self.bert2(batch[0], attention_mask=batch[1])
        if x == 1:
            output1 = self.bert1(batch[0], attention_mask=batch[1])
            pooled_output = torch.cat((output1[1], output2[1]), dim=1)
            output = self.fc1(pooled_output)
            loss_dif = self.diff_loss(output1[1], output2[1])
        else:
            output3 = self.bert3(batch[0], attention_mask=batch[1])
            pooled_output = torch.cat((output3[1], output2[1]), dim=1)
            output = self.fc2(pooled_output)
            loss_dif = self.diff_loss(output3[1], output2[1])
        return output, loss_dif
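For context, this is roughly how I build and call the model during training (a sketch: the model name, `length`, the batch shapes, and the 0.01 loss weight are placeholders from my setup, not shown above):

import torch
import torch.nn.functional as F
from transformers import AutoModel

length = 10  # number of section labels (placeholder)
bert = AutoModel.from_pretrained('bert-base-uncased')
model = Adversarial_Multitask(bert)

input_ids = torch.randint(0, 30522, (8, 128))          # dummy token ids
attention_mask = torch.ones(8, 128, dtype=torch.long)  # dummy mask
labels = torch.randint(0, 2, (8,))                     # dummy main-task labels

# x=1 selects the main-task head; the loss weight is a placeholder
logits, loss_dif = model((input_ids, attention_mask), x=1)
loss = F.cross_entropy(logits, labels) + 0.01 * loss_dif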
I am trying to use the orthogonality constraint loss for multi-task learning, as used in this paper: https://www.aclweb.org/anthology/P17-1001.pdf

I am multiplying the two pooled output vectors in diff_loss to get loss_dif, but that is what creates this issue. I need the loss for better performance. How can this be solved?
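For reference, here is my understanding of what the constraint in the paper computes: the squared Frobenius norm ||S^T P||_F^2 of the product of the shared and private feature matrices, rather than a product of per-row norms. This is only a sketch of that idea, assuming both pooled outputs are [batch_size, 768] matrices; the centering and row-normalization steps and the name orthogonality_loss are my own choices, and the subtraction is done out of place so tensors reused elsewhere in the graph are not modified:

import torch
import torch.nn.functional as F

def orthogonality_loss(shared, private):
    # center each [batch, hidden] matrix over the batch, out of place
    shared = shared - shared.mean(dim=0, keepdim=True)
    private = private - private.mean(dim=0, keepdim=True)
    # L2-normalize each row so the penalty is scale-invariant
    shared = F.normalize(shared, p=2, dim=1)
    private = F.normalize(private, p=2, dim=1)
    # squared Frobenius norm of S^T P (a [hidden, hidden] correlation matrix)
    correlation = torch.matmul(shared.t(), private)
    return torch.sum(correlation ** 2)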