Hi,
I am working on a project where I need to compute gradients for a particular model layer independently for each input in the batch. I have this working unvectorized, but when I try to vectorize it to speed things up I get the following error: “RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn”
Has anyone seen this issue before?
Some relevant code:
```python
import numpy as np
from torch.autograd import grad
from torch.func import vmap  # functorch.vmap on older PyTorch versions

def point_gradients(loss, params=None):
    # Gradients of a single sample's loss w.r.t. the layer's parameters,
    # flattened into one numpy vector
    gradients = grad(loss, params, retain_graph=True)
    gradients = np.concatenate([g.cpu().numpy().flatten() for g in gradients])
    return gradients

if vectorize:
    grad_fn = vmap(point_gradients, in_dims=0)
    batch_gradients = grad_fn(losses, params=layer_params)
else:
    batch_gradients = [point_gradients(loss, params=layer_params) for loss in losses]
```
If I set vectorize to False, so that point_gradients is called on each individual loss, everything works as expected. But if I set vectorize to True, so the whole tensor of losses is passed to the vmap-ed function, I get the error:
“RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn”
The shape of losses is torch.Size([256, 1]), in case that is important.
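
In case it helps to reproduce, here is a simplified sketch of how losses and layer_params are set up. The model here is just a stand-in for my real one, and the names are placeholders:

```python
import torch
import torch.nn as nn

# Stand-in model; the real one is larger but structured the same way
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
layer_params = list(model[0].parameters())  # the layer whose per-sample gradients I want

inputs = torch.randn(256, 10)
targets = torch.randn(256, 1)

preds = model(inputs)
# Per-sample losses, shape [256, 1]
losses = nn.functional.mse_loss(preds, targets, reduction="none")
```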
Thanks for the help!