My model architecture is defined as follows:
import torch
import torch.nn as nn

class NN(torch.nn.Module):
    def __init__(self):
        super(NN, self).__init__()
        self.linear1 = nn.Linear(3, 9)
        self.act1 = nn.ReLU(inplace=False)
        self.linear2 = nn.Linear(9, 3)

    def forward(self, x):
        x = self.linear1(x)
        x = self.act1(x)
        x = self.linear2(x)
        return x
And I am trying to train the model with code similar to the following:
model = NN()
model = model.double()
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
epochs = 100
batch_size = 5
iterations = total_num // batch_size  # integer division, so range() below works

for epoch in range(1, epochs + 1):
    train_a, ans = dataset_init(A, B, ratio)
    prediction = torch.zeros(total_num, 2, 3, 1, 3, dtype=torch.float64)
    for iteration in range(iterations):
        loss_train = torch.empty(batch_size)
        for i in range(batch_size * iteration, batch_size * (iteration + 1)):
            prediction[i] = model(torch.unsqueeze(train_a[i].to(torch.float64), 0))
            loss_train[i - batch_size * iteration] = loss_fn(prediction[i].clone(), ans[i])
        loss_train_avg = torch.mean(loss_train.clone())
        optimizer.zero_grad()
        loss_train_avg.backward(retain_graph=True)
        optimizer.step()
After running the code, I got:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.DoubleTensor [9, 3]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
However, I did not find any inplace operations in my code. The ‘torch.DoubleTensor [9, 3]’ in the error message looks like it is related to my 2nd Linear layer, so I deleted that layer to check, and the code then ran without the error. I have also tried ‘x = self.linear2(x.clone())’, but the code reports the same error message as above.
Now I have no idea how to fix it, and I don't know why my linear layer is considered an inplace operation. Can anybody help me?