Manually applied gradient to neural network raises Error: Can't detach views in-place

Hi, I'm working on a gradient estimator, so I need to implement a model whose gradients are computed and applied manually.

Here is my current script:

import torch
from itertools import chain
from torch.optim import Adam

net1 = Net1() # neural network
net2 = Net2() # neural network

opt = Adam(list(net1.parameters()) + list(net2.parameters()), lr=1e-3)

# materialize as a list: a chain iterator would be exhausted after the first pass
params = list(chain(net1.parameters(), net2.parameters()))

for _ in range(epochs):
    opt.zero_grad()  # raises the error on the second iteration
    # (the forward pass that computes loss is omitted here)
    grad = torch.autograd.grad(loss, params)

    for i, p in enumerate(params):
        p.grad = grad[i]

    opt.step()
    

On the second iteration,

opt.zero_grad()

leads to

/anaconda3/lib/python3.7/site-packages/torch/optim/optimizer.py in zero_grad(self)
    161             for p in group['params']:
    162                 if p.grad is not None:
--> 163                     p.grad.detach_()
    164                     p.grad.zero_()
    165 

RuntimeError: Can't detach views in-place. Use detach() instead

I can't figure out where my code goes wrong. Any help?

Here is a short script with a simple model that reproduces the error:

import torch
import torch.nn as nn
import torch.optim as optim

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(1,4)
        self.fc2 = nn.Linear(4,1)
    def forward(self,x):
        h = self.fc1(x)
        h = self.fc2(nn.ReLU()(h))
        return h


# create data
x = torch.randn(20, 1)
y = 1.2 * x ** 2 - 1
loss_fn = lambda y_hat,y: (y_hat - y).pow(2).sum()


net = Net()
optimizer = optim.SGD(net.parameters(), lr=1e-3)

Loss = []
for _ in range(1000):
    optimizer.zero_grad()
    y_hat = net(x)
    loss = loss_fn(y_hat,y)
    grad = torch.autograd.grad(loss, net.parameters())
    for i,p in enumerate(net.parameters()):
        p.grad = grad[i]
        print(p.grad)
    optimizer.step()
    Loss.append(loss.item())

Thanks in advance

Could you post the definition of the models so that we can reproduce this issue?

Thanks for the reply. I've attached a simple example that raises the error; my PyTorch version is 1.1.0.

Thanks for the code.
Could you clone the grad, so that you won’t create a view of this tensor?

p.grad = grad[i].clone()
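
For reference, here is a minimal sketch of the example loop with the clone applied. It reuses the Net model and data from the question; the names params and losses, the fixed seed, and the shorter loop of 100 steps are choices made for this sketch only. Whether the original error reproduces without the clone may depend on the PyTorch version, so this only shows the suggested pattern.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # fixed seed, only to make the sketch repeatable


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(1, 4)
        self.fc2 = nn.Linear(4, 1)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


# same toy data as in the question
x = torch.randn(20, 1)
y = 1.2 * x ** 2 - 1

net = Net()
params = list(net.parameters())  # a list can be iterated on every step
optimizer = optim.SGD(params, lr=1e-3)

losses = []
for _ in range(100):
    optimizer.zero_grad()
    loss = (net(x) - y).pow(2).sum()
    # compute gradients manually instead of calling loss.backward()
    grads = torch.autograd.grad(loss, params)
    for p, g in zip(params, grads):
        # clone so that p.grad owns its memory and is not a view
        p.grad = g.clone()
    optimizer.step()
    losses.append(loss.item())
```

Iterating with zip over params and grads avoids indexing by hand, and keeping the parameters in a list means the same sequence is passed to both torch.autograd.grad and the assignment loop.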

Thanks! That solves the problem!

Hi, I was also hitting RuntimeError: Can't detach views in-place. Use detach() instead when calling x.detach_().
Both x = x.detach() and x = x.contiguous().detach_() resolved it.

However, I still don't understand the differences between these three lines. And when exactly is a view of a variable created?

Thanks. Any help would really be appreciated!
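
A small sketch, assuming a recent PyTorch release, may help illustrate the three variants. Operations such as slicing and transposing return views, meaning tensors that share memory with the tensor they were taken from. detach() returns a new tensor that is cut off from the graph, detach_() modifies the tensor itself and is refused on views, and contiguous() only makes a copy when the tensor is not already contiguous. The helper try_inplace_detach and all variable names are made up for this illustration.

```python
import torch


def try_inplace_detach(t):
    """Return True if t.detach_() succeeds, False if PyTorch refuses it."""
    try:
        t.detach_()
        return True
    except RuntimeError:
        return False


base = torch.ones(2, 3, requires_grad=True)
y = base * 2   # ordinary result tensor that owns its memory
v = y.t()      # transpose returns a non-contiguous view of y

# a view shares the underlying memory with the tensor it came from
shares_memory = v.data_ptr() == y.data_ptr()

# 1) in-place detach on a view is refused
view_ok = try_inplace_detach(v)

# 2) out-of-place detach returns a new tensor outside the graph;
#    v itself is left untouched
d = v.detach()

# 3) v is not contiguous, so contiguous() makes a copy; the copy is
#    not a view, and in-place detach on it is allowed
c = v.contiguous()
copy_ok = try_inplace_detach(c)

# caveat: on an already contiguous view, contiguous() returns the
# tensor itself, so it is still a view and detach_() is still refused
w = y[:1]      # slice that happens to be contiguous
same_view_ok = try_inplace_detach(w.contiguous())
```

So x.contiguous().detach_() presumably worked in the case above because the view was not contiguous and contiguous() produced a copy. x = x.detach() does not depend on the memory layout.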