# Softmax output has zero grad

I want to multiply two vectors `a` and `b` that have different dimensions, and then feed the resulting vector `c` into the objective function.
For example, the demo code is as follows:

```python
import torch

a = torch.rand(2, requires_grad=True)
b = torch.rand(4, requires_grad=True)

# b[2:] so the shapes match the printed gradients below
c = torch.cat((a * b[:2], b[2:]), dim=0)

d = torch.nn.functional.softmax(c, dim=0)
d.sum().backward(retain_graph=True)
print(a.grad, b.grad)

c.sum().backward()
print(a.grad, b.grad)
```

The outputs are as follows:

```
tensor([0., 0.])
tensor([0., 0., 0., 0.])
tensor([0.4431, 0.1512])
tensor([0.2158, 0.2565, 1.0000, 1.0000])
```

I don’t understand why the grads of `a` and `b` are both zero when there is a softmax function. As shown above, when I remove the softmax and call `c.sum().backward()` instead, the grads of `a` and `b` are both correct.

Because `d` is the output of a softmax, `d.sum()` is always 1 by definition, no matter what you feed in. The gradient of a constant function is 0. You could try `d[0].backward()` for something more interesting.
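This is easy to check numerically; here is a minimal standalone sketch (the values in `c` are made up for illustration):

```python
import torch

c = torch.tensor([0.5, -1.0, 2.0], requires_grad=True)
d = torch.nn.functional.softmax(c, dim=0)

# d.sum() equals 1 for every input, so its gradient w.r.t. c vanishes
d.sum().backward(retain_graph=True)
grad_of_sum = c.grad.clone()

# a single softmax entry does depend on c, so its gradient is nonzero
c.grad.zero_()
d[0].backward()
grad_of_entry = c.grad.clone()

print(grad_of_sum)    # (numerically) all zeros
print(grad_of_entry)  # nonzero values
```

Note the `c.grad.zero_()` in between: `backward()` accumulates gradients, so without it the second gradient would be added on top of the first.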

Best regards

Thomas
