I am trying to add a custom regularization term to the standard cross entropy loss. However, the total loss diverges, and the regularization term seems to have no impact whatsoever, as if its gradients never backpropagate at all.
I have a custom regularization function implemented as follows:
def li_regularizer(net, loss):
    li_reg_loss = 0
    for m in net.modules():
        temp_loss = torch.sum(((torch.sum(((m.weight.data) ** 2), 1)) ** 0.5), 0)
        li_reg_loss += 0.1 * temp_loss
    loss = loss + Variable(0.01 * li_reg_loss, requires_grad=True)
    return loss
In the training loop, my loss is defined as follows:
. . . . . . . . . . . . . .
criterion = nn.CrossEntropyLoss()
loss = criterion(outputs, labels)
loss = li_regularizer(net, loss)
Am I missing something? How do I ensure that the regularization term actually contributes to the total loss and is used when computing gradients? As can be seen from the code, I have already wrapped it in a Variable with requires_grad=True. This is a very common way to add a regularizer and works in TensorFlow. In PyTorch, however, the regularization term appears to be ignored.
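For reference, here is a minimal self-contained sketch of what I expected to happen, i.e. a row-wise L2 (group-lasso-style) penalty whose gradients flow into the weights. The coefficient value and the restriction to nn.Linear layers are my own choices for this example, not part of my actual model:

```python
import torch
import torch.nn as nn

def li_regularizer(net, loss, coeff=0.01):
    # Sum of row-wise L2 norms over all 2-D weight matrices.
    # Note: this sketch uses m.weight directly, so the term stays
    # inside the autograd graph.
    reg = 0.0
    for m in net.modules():
        if isinstance(m, nn.Linear):  # only layers with 2-D weights
            reg = reg + torch.sum(torch.sqrt(torch.sum(m.weight ** 2, dim=1)))
    return loss + coeff * reg  # caller must use the returned tensor

# Toy setup just to exercise the function.
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
criterion = nn.CrossEntropyLoss()
x = torch.randn(16, 4)
labels = torch.randint(0, 2, (16,))

loss = criterion(net(x), labels)
total = li_regularizer(net, loss)
total.backward()
```

With this version, total.requires_grad is True and net[0].weight.grad is populated after backward(), which is the behavior I am trying to get from my original code.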