I used nn.Module to write my own loss, which I combine with nn.CrossEntropyLoss:
import torch
import torch.nn as nn

class MyLoss(nn.Module):
    def __init__(self):
        super().__init__()
        XXXX  # (initialization details omitted)

    def forward(self, input):
        """
        Args:
            input: tensor from a network
        """
        loss = Compute(input)  # Compute() is the actual loss computation (omitted)
        # Numerical issue here: loss should be greater than 0,
        # but it may come out as loss < 0
        return torch.sqrt(loss) if loss.item() > 0 else 0

mymodel = Net()
myloss = MyLoss()
ce = nn.CrossEntropyLoss()
output = mymodel(x)  # x: a batch of network inputs
total_loss = myloss(output) + ce(output, target)  # target: class labels
When MyLoss returns 0, the optimizer should still backpropagate through nn.CrossEntropyLoss, but it turns out the gradient is zero. The problem might be the constant return, but cross-entropy alone should still have a gradient.
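For what it's worth, a quick standalone check (a minimal sketch with made-up shapes; logits and target here are assumptions, not the original data) confirms that a constant Python 0 added to the cross-entropy term does not by itself block its gradient:

    import torch
    import torch.nn as nn

    # Minimal check, not the original model: add a constant zero to a
    # cross-entropy loss and verify its gradient still flows.
    logits = torch.randn(4, 3, requires_grad=True)
    target = torch.tensor([0, 2, 1, 0])
    ce = nn.CrossEntropyLoss()

    total_loss = 0 + ce(logits, target)  # as when MyLoss returns the scalar 0
    total_loss.backward()
    print(logits.grad)  # generically non-zero

If this prints non-zero gradients, the zero gradient in the real training loop likely comes from somewhere else. Returning a tensor zero (e.g. torch.zeros_like(loss)) in the else branch still keeps the return type consistent in both branches.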
Thanks for the reply. I am double-checking the numerical issue. Currently, it seems that tensor broadcasting introduces a non-negligible error: an entry that is expected to be exactly zero comes out as something like 7e-7.
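For context, float32 machine epsilon is about 1.19e-7, so a residue of ~7e-7 is only a few ulps at values of order 1. A tiny illustration of float32 rounding (illustration only, not the original computation):

    import torch

    # float32 keeps ~7 significant decimal digits, so exact cancellations
    # can leave small residues, and small terms can be absorbed entirely.
    x = torch.tensor(1e8, dtype=torch.float32)
    print((x + 1.0) - x)                   # tensor(0.): the 1.0 is lost to rounding
    print(torch.finfo(torch.float32).eps)  # ~1.1920929e-07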
The loss should be greater than or equal to 0, but it sometimes comes out as a very small negative value. I haven't figured out the reason yet.
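Assuming those small negatives really are rounding noise, a common workaround (a sketch, not a verified fix for this exact loss; eps is an assumed tolerance to tune) is to clamp before the square root. This keeps sqrt away from negative inputs, sidesteps its infinite derivative at exactly 0, and lets forward() always return a tensor instead of branching to a Python 0:

    import torch

    def safe_sqrt(loss, eps=1e-12):
        # Rounding noise like -7e-7 is clamped up to eps, so sqrt never
        # sees a negative input and its gradient stays finite.
        # Note: for inputs below eps, the gradient through clamp is zero.
        return torch.sqrt(torch.clamp(loss, min=eps))

With this, total_loss = safe_sqrt(Compute(output)) + ce(output, target) is a tensor in all cases.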