How can I optimise the gradient? Something like grad.backward()

I want to optimise the gradient as in the paper “Improved Training of Wasserstein GANs”. Something like: D(input), D.backward(), loss = D_loss + || input.grad - 1 ||. I found that input.grad has no creator, so the graph of the gradient is not connected to the net, and something like input.grad.backward() won’t work. How could I do this? By the way, the authors of that paper use TensorFlow.

Currently not supported. Check here for more discussion: How to implement gradient penalty in PyTorch

It’s already implemented, but the PR is waiting for review. It will probably be merged next week.

Thank you, I’ll take a look~

@Sora it was already merged two days ago
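
For anyone landing here later, a minimal sketch of the WGAN-GP gradient penalty using torch.autograd.grad with create_graph=True (which is the feature the PR above added). Names like D, real, fake, and lambda_gp are placeholders for your own discriminator, data batches, and penalty weight:

```python
import torch
from torch import autograd

def gradient_penalty(D, real, fake, lambda_gp=10.0):
    # Random interpolation between real and fake samples
    # (alpha has shape (N, 1, 1, ...) so it broadcasts over each sample)
    alpha = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
    interpolates = (alpha * real + (1 - alpha) * fake).requires_grad_(True)

    d_interpolates = D(interpolates)

    # create_graph=True keeps the graph of this gradient computation,
    # so the penalty term itself can be backpropagated through
    grads = autograd.grad(
        outputs=d_interpolates,
        inputs=interpolates,
        grad_outputs=torch.ones_like(d_interpolates),
        create_graph=True,
        retain_graph=True,
        only_inputs=True,
    )[0]

    # Penalise deviation of the gradient norm from 1, per sample
    grads = grads.view(grads.size(0), -1)
    penalty = lambda_gp * ((grads.norm(2, dim=1) - 1) ** 2).mean()
    return penalty
```

You would then add this penalty to the discriminator loss before calling backward(), e.g. d_loss = d_loss_real_fake + gradient_penalty(D, real, fake). Note this sketch uses current PyTorch idioms (requires_grad_, device keyword) rather than the Variable-based API that was in use when this thread was written.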