First derivative matrix in L2 norm constraint

Dear all,
Recently, I have been working on a loss function that has a special L2-norm constraint.
[image: the loss function with the L2-norm constraint on G]
Here, G denotes the first-derivative matrix with respect to the first layer of the neural network. I have searched for many methods, but none of them work for this constraint. How can I implement it? Should I write a new autograd function for the first layer, or implement a new optimizer?
Thank you for your kind help!

Hi,

If G = d(ce_loss)/dW1, where ce_loss is the cross-entropy loss, then you can compute this as:

from torch import autograd

ce_loss = criterion(output, target)
# create_graph=True keeps the graph so the penalty on G is itself differentiable
G = autograd.grad(ce_loss, W1, create_graph=True)[0]
loss = ce_loss + (G * W1).pow(2).sum()
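
For context, here is a minimal end-to-end sketch of this double-backward pattern on a toy classifier. The model sizes, the random data, and the lambda_penalty weight are illustrative assumptions, not part of the original question:

import torch
import torch.nn as nn
from torch import autograd

# Toy setup (sizes and data are illustrative assumptions)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
W1 = model[0].weight  # first-layer weight matrix

x = torch.randn(16, 10)
target = torch.randint(0, 3, (16,))

optimizer.zero_grad()
output = model(x)
ce_loss = criterion(output, target)

# First backward pass: G = d(ce_loss)/dW1; create_graph=True so the
# penalty term below can be differentiated again (double backward)
G = autograd.grad(ce_loss, W1, create_graph=True)[0]

lambda_penalty = 0.1  # assumed penalty weight
loss = ce_loss + lambda_penalty * (G * W1).pow(2).sum()

# Second backward pass: differentiates through G as well
loss.backward()
optimizer.step()

Note that loss.backward() differentiates through G, so the first-layer weights receive gradient from both the cross-entropy term and the penalty; with this approach, no custom autograd function or custom optimizer should be needed.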