How to implement gradient penalty in PyTorch

chenyuntc · May 8, 2017, 2:42am

You are right: most functions are still old-style and don’t support grad of grad.
There is a temporary fix: use a finite difference rather than the derivative.

x_1 and x_2 are both sampled from x_hat.
idea from 郑华滨
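
For anyone who wants to try it, here is a rough sketch of how the difference-based penalty could look. It is my reading of the idea, not necessarily the exact formula intended above: the slope of the critic between x_1 and x_2 is pushed towards 1. The names `difference_penalty` and `sample_x_hat`, the `eps` term, and the choice to draw x_hat as a random interpolation between real and fake samples (as in WGAN-GP) are all assumptions for illustration, and the code assumes a recent PyTorch where tensors track gradients directly.

```python
import torch
import torch.nn as nn


def sample_x_hat(real, fake):
    # Assumption: x_hat is a random interpolation between real and fake samples.
    # alpha has shape (N, 1, ..., 1) so it broadcasts over all non-batch dims.
    alpha = torch.rand(real.size(0), *([1] * (real.dim() - 1)))
    return alpha * real + (1 - alpha) * fake


def difference_penalty(critic, x_1, x_2, eps=1e-12):
    # Finite-difference slope of the critic between x_1 and x_2, per sample.
    n = x_1.size(0)
    d_diff = (critic(x_1) - critic(x_2)).reshape(n, -1).norm(dim=1)
    x_dist = (x_1 - x_2).reshape(n, -1).norm(dim=1)
    slope = d_diff / (x_dist + eps)  # eps guards against x_1 == x_2
    # Penalize deviation of the slope from 1; only first-order autograd is needed.
    return ((slope - 1.0) ** 2).mean()


# Toy usage with a hypothetical linear critic.
torch.manual_seed(0)
critic = nn.Linear(4, 1)
real, fake = torch.randn(8, 4), torch.randn(8, 4)
x_1 = sample_x_hat(real, fake)
x_2 = sample_x_hat(real, fake)
penalty = difference_penalty(critic, x_1, x_2)
penalty.backward()  # ordinary backward pass, no grad of grad involved
```

Since the penalty is built only from forward passes of the critic, one plain `backward()` is enough, which is the whole point of the workaround. You would add `penalty` (times some weight) to the critic loss as usual.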