I am working on a network where I need to regularize the gradient, so I must compute a second derivative.
But I got the following error:

```
RuntimeError: the derivative for _cudnn_rnn_backward is not implemented
```
I minimized my code to reproduce the error:

```python
import torch
import torch.nn as nn
from torch.autograd import grad

cell = nn.GRUCell(10, 10).cuda()
parameters = list(cell.parameters())
x = torch.rand(1, 10).cuda()
y = torch.rand(1, 10).cuda()

incoming = cell(x, torch.zeros(1, 10).cuda())
incoming = cell(y, incoming)
loss = torch.sum(incoming)

grad_all = grad(loss, parameters, retain_graph=True,
                create_graph=True, only_inputs=True)
print([g.requires_grad for g in grad_all])  # grad() returns a tuple of tensors

loss2 = torch.sum(torch.cat([v.view(-1) for v in grad_all]))
loss2.backward()
```
and ran into another error:

```
RuntimeError: trying to differentiate twice a function that was marked with @once_differentiable
```
So is there any workaround for me to get the second-order gradient? (I'm on PyTorch 0.4.1.)
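For reference, the workaround I've seen suggested (an assumption on my part; I haven't verified it on 0.4.1 specifically) is to disable cuDNN for the RNN computation via `torch.backends.cudnn.enabled`, so that PyTorch falls back to its native kernels, whose backward pass is itself differentiable. A minimal sketch, with a CPU fallback so it also runs without a GPU:

```python
import torch
import torch.nn as nn
from torch.autograd import grad

# Disable cuDNN so the RNN uses native (double-differentiable) kernels.
torch.backends.cudnn.enabled = False

device = "cuda" if torch.cuda.is_available() else "cpu"

cell = nn.GRUCell(10, 10).to(device)
parameters = list(cell.parameters())
x = torch.rand(1, 10, device=device)
y = torch.rand(1, 10, device=device)

incoming = cell(x, torch.zeros(1, 10, device=device))
incoming = cell(y, incoming)
loss = torch.sum(incoming)

# First-order gradients, kept in the graph via create_graph=True.
grad_all = grad(loss, parameters, retain_graph=True,
                create_graph=True, only_inputs=True)

# Differentiate the gradient norm-like quantity a second time.
loss2 = torch.sum(torch.cat([v.view(-1) for v in grad_all]))
loss2.backward()
```

The trade-off is that the RNN runs without cuDNN's fused kernels and will be slower, but second-order gradients then flow through `loss2.backward()` into `parameters[i].grad`.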