I used nn.Module to write my own loss function, which is combined with nn.CrossEntropyLoss:
import torch
import torch.nn as nn

class MyLoss(nn.Module):
    def forward(self, input):
        # input: tensor from a network
        loss = Compute(input)
        # Numerical issue here: loss should be > 0, but may come out < 0
        return torch.sqrt(loss) if loss.item() > 0 else 0

mymodel = Net()
myloss = MyLoss()
ce = nn.CrossEntropyLoss()
total_loss = myloss(output) + ce(output, target)
When MyLoss returns 0, the optimizer should still backpropagate through nn.CrossEntropyLoss, but it turns out the gradient is zero. The problem might be the constant return value, but the cross-entropy term should still have a gradient.
Has anyone come across this type of problem?
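For reference, here is a self-contained sanity check of that claim: a plain 0 from the custom term should not stop gradients from the cross-entropy term. The toy model below is a stand-in for Net(), not the original code.

import torch
import torch.nn as nn

model = nn.Linear(5, 3)  # stand-in for Net()
ce = nn.CrossEntropyLoss()
output = model(torch.randn(4, 5))
target = torch.randint(0, 3, (4,))

# the custom loss hit its constant branch and returned a plain 0
total_loss = 0 + ce(output, target)
total_loss.backward()
for name, param in model.named_parameters():
    # if these all print 0.0, the graph is broken somewhere else
    print(name, param.grad.abs().max().item())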
If you don’t initialize the parameters in the network, you’re likely to have a gradient problem.
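For reference, explicit initialization usually looks something like this (the Xavier-uniform scheme and the toy model are just examples, not the poster's setup):

import torch.nn as nn

def init_weights(m):
    # Xavier uniform for weights, zeros for biases; the scheme is just an example
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)

model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 3))
model.apply(init_weights)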
Your code snippet should work, even if you return a zero in your custom loss function, as seen here:
import torch
import torch.nn as nn
output = torch.randn(10, 10, requires_grad=True)
target = torch.randint(0, 10, (10,))
criterion = nn.CrossEntropyLoss()
loss = 0 + criterion(output, target)
loss.backward()
print(output.grad)  # valid, non-zero gradients
Could you check the value of ce before calling backward()?
Thanks for the reply. I am double-checking the numerical issue. Currently, it seems that tensor broadcasting introduces a non-negligible error: an entry that should be exactly zero comes out as something like 7e-7.
The loss should be greater than or equal to 0, but it sometimes takes a very small negative value. I haven’t figured out the reason yet.
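A common workaround for this kind of round-off (a sketch, not something from the thread; safe_sqrt and eps are illustrative names) is to clamp the loss at zero before the square root, so tiny negative values neither trigger the constant branch nor produce NaNs:

import torch

def safe_sqrt(loss, eps=1e-12):
    # clamp tiny negative round-off (e.g. the -7e-7 above) to zero;
    # the small eps keeps the sqrt gradient finite when loss == 0
    return torch.sqrt(torch.clamp(loss, min=0.0) + eps)

print(safe_sqrt(torch.tensor(-7e-7)))  # ~1e-6 instead of NaN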