Stack expects a non-empty Tensor List in PyTorch while using gradient clipping

I want to use gradient clipping:

torch.nn.utils.clip_grad_norm(loss, model.parameters())

I’m getting the error "stack expects a non-empty Tensor List". I read in the forums that @tom highlighted that the gradients need to be computed first (via loss.backward()) before calling clip_grad_norm_ (there isn’t much info on this in the documentation). However, I’m updating the model’s parameters in a slightly different way:

grad = torch.autograd.grad(loss, model.parameters())

and then I do the update manually.
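Roughly, the manual update looks like this (a plain SGD-style step purely for illustration; lr is a placeholder learning rate):

with torch.no_grad():
    for p, g in zip(model.parameters(), grads):
        p -= lr * g  # SGD-style step; the actual update rule may differ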

How can gradient clipping be used in this case?


Personally, I’d call that variable grads rather than grad.

You can grab the implementation of clip_grad_norm_ from the PyTorch source and then convert

for p in parameters:
    something with p.grad

into

for gr in grads:
    something with gr
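
For instance, here is a minimal sketch of that conversion (it mirrors the logic of clip_grad_norm_; the name clip_grads_norm_, the max_norm/norm_type arguments, and the exact details are just an illustration and may differ from the current PyTorch source):

import torch

def clip_grads_norm_(grads, max_norm, norm_type=2.0):
    # Same idea as torch.nn.utils.clip_grad_norm_, but operating on a
    # tuple/list of gradient tensors (e.g. from torch.autograd.grad)
    # instead of reading p.grad from a list of parameters.
    grads = [g for g in grads if g is not None]
    if len(grads) == 0:
        # An empty list like this is what triggers "stack expects a
        # non-empty Tensor List" in the built-in version when no
        # gradients have been computed yet.
        return torch.tensor(0.0)
    total_norm = torch.norm(
        torch.stack([torch.norm(g.detach(), norm_type) for g in grads]),
        norm_type,
    )
    clip_coef = max_norm / (total_norm + 1e-6)
    if clip_coef < 1:
        for g in grads:
            g.detach().mul_(clip_coef)  # clip in place
    return total_norm

You would then call it on the grads tuple before the manual update, e.g.

grads = torch.autograd.grad(loss, model.parameters())
total_norm = clip_grads_norm_(grads, max_norm=1.0)
# ... proceed with the manual parameter update using the clipped grads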

Best regards

Thomas

Thanks for replying, but I’m still not sure how to apply the clip norm in this case. Inside the function, we don’t calculate the grads, so do I compute them first and then make the modifications? I see that the clipping operation is applied to the params; what changes do I make if I want an equivalent operation on the grads? And is all of this really needed? The only difference is in how I update the model params.