I’m trying to normalize an array (tensor) so that its elements sum to 1.

I did the following computation; however, the output gradient seems to be wrong. (Why are all the values the same? They should be different with respect to each element in a.)

code:

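import torch

a = torch.rand(10, requires_grad=True)
print(a)
x = torch.sum(a)  # scalar: sum of all elements of a
y = a / x  # normalized tensor; its elements sum to 1
y.backward(torch.ones_like(a))  # backpropagates the sum of all y[i]
print(a.grad)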

Yes but this is not what you get here.
Since you backward a Tensor full of 1, you get for the grad of a1: 1*d(y1)/d(a1) + 1*d(y2)/d(a1) + 1*d(y3)/d(a1) (written here for three elements).
The same kind of sum appears for a2 and a3. Because the elements of y always sum to 1, each of these sums is zero up to floating-point error, hence the constant result that you see.
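
To make the point above concrete, here is a minimal sketch (not from the original post) that compares the gradient obtained by backpropagating a tensor of ones with the full Jacobian of the normalization. The helper name `normalize` and the fixed seed are assumptions added for illustration; `torch.autograd.grad` and `torch.autograd.functional.jacobian` are used in place of `backward` so that gradients do not accumulate in `a.grad`.

```python
import torch
from torch.autograd.functional import jacobian


def normalize(a):
    # Hypothetical helper: same computation as in the question, y = a / sum(a).
    return a / a.sum()


torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable
a = torch.rand(10, requires_grad=True)
y = normalize(a)

# Equivalent to y.backward(torch.ones_like(a)): the gradient of sum(y) w.r.t. a.
(summed_grad,) = torch.autograd.grad(
    y, a, grad_outputs=torch.ones_like(y), retain_graph=True
)

# Gradient of a single output element, which does differ across elements of a.
(grad_y0,) = torch.autograd.grad(y[0], a)

# Full Jacobian: J[i, j] = d(y_i) / d(a_j).
J = jacobian(normalize, a.detach())

print(summed_grad)   # every entry is (approximately) zero
print(J.sum(dim=0))  # summing the Jacobian over outputs gives the same vector
print(grad_y0)       # matches the first row of J
```

If per-element derivatives are what you are after, inspect rows of the Jacobian (or call backward on a single element such as `y[0]`) rather than passing a tensor of ones.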