Custom loss function with a gradient operation

I am attempting to implement adversarial training with the Fast Gradient Sign Method, a widely-used adversarial attack (see section 6 of [1412.6572] Explaining and Harnessing Adversarial Examples). This involves writing a custom loss function that itself contains a gradient operation. What is the proper way to do this? I don't believe it is sufficient to call .backward() inside the loss computation, because that would accumulate gradients into the .grad attributes of my leaf tensors as a side effect.
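For concreteness, here is a minimal sketch of what I am trying to do. The tiny linear model, the epsilon step size, and the alpha weighting are stand-ins for my real setup, not anything from the paper:

```python
import torch
import torch.nn.functional as F

# Stand-in classifier and a fake batch (my real model is larger).
model = torch.nn.Linear(4, 3)
x = torch.randn(8, 4)
y = torch.randint(0, 3, (8,))
epsilon = 0.1

# Gradient of the clean loss with respect to the input.
x_adv = x.clone().requires_grad_(True)
clean_loss = F.cross_entropy(model(x_adv), y)
grad_x, = torch.autograd.grad(clean_loss, x_adv, create_graph=True)

# FGSM perturbation: a step of size epsilon along the sign of the gradient.
x_perturbed = x_adv + epsilon * grad_x.sign()
adv_loss = F.cross_entropy(model(x_perturbed), y)

# Combined adversarial-training objective (alpha is an assumed weighting).
alpha = 0.5
loss = alpha * clean_loss + (1 - alpha) * adv_loss
loss.backward()
```

The question is whether the inner gradient computation here has side effects on my tensors, and whether this is the idiomatic way to express it.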

From reading the documentation, it seems that the function I need is torch.autograd.grad, but the docs don’t clearly specify what side effects this function has. Will it modify the .grad attributes of my tensors?

Calling torch.autograd.grad will not accumulate gradients into your tensors' .grad attributes; it returns the gradients instead. By default it will free the computation graph after the call (unless you pass retain_graph=True).
Can your problem be solved by using a strategically-placed .detach somewhere in your loss computation?
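A quick demonstration of both points, on toy tensors rather than your model:

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = (x ** 2).sum()

# The gradients are returned, not accumulated into .grad:
grads, = torch.autograd.grad(y, x)
assert x.grad is None  # nothing was stored on the tensor

# But the graph was freed by the call, so differentiating y a second
# time fails unless retain_graph=True had been passed:
try:
    torch.autograd.grad(y, x)
except RuntimeError:
    print("graph already freed")
```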

Thanks for your reply! I did consider using .detach but it doesn’t seem to fit my use case, as I need to backpropagate through the gradient operation. I ended up using torch.autograd.grad with retain_graph=True and create_graph=True and it appears to be doing the right thing.
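In case it helps anyone who finds this thread, the behavior of create_graph=True in a reduced form. Note that retain_graph defaults to the value of create_graph, so passing both is redundant but harmless:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 3

# create_graph=True records the gradient computation itself in the graph,
# so the returned gradient can be differentiated again.
g, = torch.autograd.grad(y, x, create_graph=True)  # dy/dx = 3x^2 = 27
g2, = torch.autograd.grad(g, x)                    # d2y/dx2 = 6x = 18
```

This double differentiation through the gradient operation is exactly what plain .detach would have cut off.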

I do have one follow-up question: in torch.autograd.grad, what is the grad_outputs argument used for? The description in the docs is unclear to me. I’m leaving it as the default value and my code appears to be working fine.
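Answering my own follow-up after some digging, in case the thread helps someone later: grad_outputs is the vector v in the vector-Jacobian product vᵀJ that autograd actually computes. For a scalar output it defaults to 1.0, which is why leaving it unset works for a scalar loss; for a non-scalar output you have to supply it. A toy illustration:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2  # non-scalar output; the Jacobian is 2 * I

# grad_outputs supplies the vector v in the vector-Jacobian product
# v^T J. Here v picks out the first row, so g == [2., 0., 0.].
v = torch.tensor([1.0, 0.0, 0.0])
g, = torch.autograd.grad(y, x, grad_outputs=v)
```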