I want to multiply two vectors `a` and `b` with different dimensions, and then send the product vector `c` into the objective function. For example, the demo code is as follows:

```
import torch

a = torch.rand(2, requires_grad=True)
b = torch.rand(4, requires_grad=True)
# elementwise product with the first half of b, concatenated with the rest of b
c = torch.cat((a * b[:2], b[2:]), dim=0)
d = torch.nn.functional.softmax(c, dim=0)
d.sum().backward(retain_graph=True)
print(a.grad)
print(b.grad)
c.sum().backward()  # gradients accumulate into .grad across backward calls
print(a.grad)
print(b.grad)
```

The outputs are as follows:

```
tensor([0., 0.])
tensor([0., 0., 0., 0.])
tensor([0.4431, 0.1512])
tensor([0.2158, 0.2565, 1.0000, 1.0000])
```

I don't know why the gradients of `a` and `b` are both zero when the softmax function is in the graph. As shown above, when I skip the softmax and backpropagate from `c.sum()` directly, the gradients of `a` and `b` are both correct.
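To narrow it down, a minimal check (same tensor shapes as above, seed added only for reproducibility) confirms that the softmax output always sums to 1, so `d.sum()` is a constant regardless of the inputs:

```
import torch

torch.manual_seed(0)
a = torch.rand(2, requires_grad=True)
b = torch.rand(4, requires_grad=True)
c = torch.cat((a * b[:2], b[2:]), dim=0)
d = torch.nn.functional.softmax(c, dim=0)
# softmax normalizes its input, so the sum of d is 1 for any a and b
print(d.sum().item())
```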