I'm trying to compute a Hessian-vector product, but the following code from the DARTS model doesn't work as expected while iterating over the (n, p) pairs from self.named_parameters().
The error is: RuntimeError: One of the differentiated Tensors does not require grad
How can I fix this error? Thank you in advance.
def _hessian_vector_product(self, vector, input, target, r=1e-2):
    # Finite-difference step size, scaled by the norm of the direction vector.
    R = r / _concat(vector).norm()

    # Perturb the weights: w <- w + R * v, then take arch gradients at w+.
    for p, v in zip(self.model.parameters(), vector):
        p.data.add_(R, v)
    loss = self.model._loss(input, target)
    grads_p = torch.autograd.grad(loss, self.model.arch_parameters())

    # Perturb in the opposite direction: w <- w - R * v, arch gradients at w-.
    for p, v in zip(self.model.parameters(), vector):
        p.data.sub_(2 * R, v)
    loss = self.model._loss(input, target)
    grads_n = torch.autograd.grad(loss, self.model.arch_parameters())

    # Restore the original weights.
    for p, v in zip(self.model.parameters(), vector):
        p.data.add_(R, v)

    # Central finite difference of the architecture gradients.
    return [(x - y).div_(2 * R) for x, y in zip(grads_p, grads_n)]
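As far as I understand, this function is meant to approximate the mixed second-order term from the DARTS paper with a central finite difference over the architecture gradients, roughly

\nabla^2_{\alpha, w} L_{train}(w, \alpha)\, v \approx \frac{\nabla_\alpha L_{train}(w + R v, \alpha) - \nabla_\alpha L_{train}(w - R v, \alpha)}{2R}, \qquad R = \frac{r}{\lVert v \rVert}

Since the tensors passed to torch.autograd.grad are the ones returned by self.model.arch_parameters(), I was going to add the quick sanity check below right before the grad calls (just a sketch, assuming arch_parameters() returns the architecture/alpha tensors), but I am not sure whether that is the right place to look:

    for i, alpha in enumerate(self.model.arch_parameters()):
        if not alpha.requires_grad:
            print(f"arch parameter {i} does not require grad, shape={tuple(alpha.shape)}")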