# Torch.logsumexp returning nan gradients when inputs are -inf

Hello,

When I backpropagate through `torch.logsumexp` on a tensor full of -inf, the gradients become nan, as in the code below:

``````
>>> import torch
>>> a = torch.nn.Parameter(torch.tensor([-float("inf"), -float("inf"), -float("inf")]))
>>> torch.logsumexp(a, dim=0).backward()
>>> a.grad
tensor([nan, nan, nan])
``````

But if even one of the elements in the tensor is not -inf, the gradients are propagated properly:

``````
>>> a = torch.nn.Parameter(torch.tensor([1.0, -float("inf"), -float("inf")]))
>>> torch.logsumexp(a, dim=0).backward()
>>> a.grad
tensor([1., 0., 0.])
``````

How do I ensure that my gradients are zero even when all the inputs are -inf?

The problem is that at the point where the final result is `-inf`, the gradient of the outer log is infinite (1/0), and during backprop it gets multiplied by exp(-inf) = 0, so the gradient becomes nan.
In your second example, the result at the point 1. is finite, so the gradients are finite and everything works fine.
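To make this concrete: the gradient of `logsumexp` with respect to each input is the softmax of the inputs, `exp(a_i) / sum_j exp(a_j)`, which is 0/0 when every input is -inf. A minimal check:

``````
>>> import torch
>>> a = torch.tensor([-float("inf")] * 3)
>>> torch.softmax(a, dim=0)  # = gradient of logsumexp, here 0/0 everywhere
tensor([nan, nan, nan])
``````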

So I would say this is expected behavior, no?

But in larger graphical models like CTC or HMMs, this introduces nans into my backpropagation. What would be the best way to avoid this behavior? I want my model to keep training on the non-infinite parameters at the other locations. What I mean is, consider this matrix:

``````
>>> import torch
>>> from torch import nn
>>> x = torch.tensor([[1.0, -float("inf")], [-float("inf"), -float("inf")]])
>>> x.shape
torch.Size([2, 2])
>>> x
tensor([[1., -inf],
        [-inf, -inf]])
>>> x = nn.Parameter(x)
>>> y = torch.logsumexp(x, dim=1)
>>> y
tensor([1., -inf], grad_fn=<LogsumexpBackward0>)
>>> z = torch.sum(y, dim=0)
>>> z.backward()
>>> x.grad
tensor([[1., 0.],
        [nan, nan]])
``````

Take this example, say: how can I make changes so that I get zero gradients in `x.grad` instead of nan?

I would argue that you need to make sure you never get a whole row of -inf values. At that point, your logsumexp is not properly differentiable, so its gradients will be nan.
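For example, you can detect such rows up front (a minimal check):

``````
>>> import torch
>>> x = torch.tensor([[1.0, -float("inf")], [-float("inf"), -float("inf")]])
>>> torch.isneginf(x).all(dim=1)  # True where a whole row is -inf
tensor([False,  True])
``````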

But in case I don't want to do gradient updates at the places where there are nans, what could be a possible workaround for this?
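One option is to mask the all-`-inf` slices yourself with `torch.where` so the nan never enters the graph: the masked slices take a constant branch on both the input and the output side, so exactly zero gradient flows to them. Here is a sketch (the helper `safe_logsumexp` is my own name, not a PyTorch API):

``````
import torch

def safe_logsumexp(x, dim):
    # Hypothetical helper: like torch.logsumexp, but returns -inf with
    # *zero* gradient for slices that are entirely -inf, instead of nan.
    mask = torch.isneginf(x).all(dim=dim, keepdim=True)  # whole slice is -inf?
    # Replace fully -inf slices by zeros so the reduction stays finite there;
    # torch.where routes no gradient to x at the masked positions.
    x_safe = torch.where(mask, torch.zeros_like(x), x)
    out = torch.logsumexp(x_safe, dim=dim)
    # Restore -inf in the forward value; the constant branch of this
    # where() blocks the gradient coming from the dummy slices.
    return torch.where(mask.squeeze(dim), torch.full_like(out, -float("inf")), out)

x = torch.nn.Parameter(torch.tensor([[1.0, -float("inf")],
                                     [-float("inf"), -float("inf")]]))
safe_logsumexp(x, dim=1).sum().backward()
print(x.grad)  # tensor([[1., 0.], [0., 0.]]) -- zeros instead of nan
``````

Alternatively, you could zero out the nans after the fact with a gradient hook, e.g. `x.register_hook(torch.nan_to_num)`, but masking before the `logsumexp` keeps the rest of the graph clean.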