Please help. I have a simple net that takes several inputs and outputs a single value. I can compute the gradient of the output with respect to all the inputs via autograd. Now I want to minimize the norm of this gradient with respect to the model parameters. Is that possible? I remember it was possible with TensorFlow 1.x, but I'm not sure about PyTorch. The sample code would be something like

```python
x.requires_grad_()
y = Model(x)
# create_graph=True keeps the graph, so the input-gradient is itself differentiable
grad_x, = torch.autograd.grad(y, x, create_graph=True)
loss = grad_x.square().sum()
loss.backward()  # gradients now flow into the model parameters
```
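
For context, here is a self-contained toy version of what I'm after; the two-layer model and the sizes are just stand-ins for my real net:

```python
import torch
import torch.nn as nn

# Stand-in for the real network: several inputs -> one scalar output.
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))

x = torch.randn(4, requires_grad=True)
y = model(x)

# create_graph=True makes the input-gradient differentiable,
# so the loss below can be backpropagated into the parameters.
grad_x, = torch.autograd.grad(y, x, create_graph=True)

loss = grad_x.square().sum()  # squared norm of dy/dx
loss.backward()               # populates .grad on the model parameters

print(grad_x.shape)  # same shape as x
```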

Thanks!