I am losing my mind a bit. I guess I missed something in the documentation somewhere, but I cannot figure it out. I am taking the derivative of the sum of distances from one point (0,0) to 9 other points ([-1,-1], [-1,0], …, [1,1], i.e. the positions of a 3x3 grid).

When I reshape the expanded variable from (9x2) to (9x2), which should be a no-op, the derivative seems way off. If I leave it as is and skip the reshape, the derivative seems correct to within float error.

I’ve looked but I can’t figure out why. Can anyone help, or point me in the right direction?

```
from __future__ import print_function
from torch.autograd import Variable
import torch
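# Make a 3x3 grid of (x, y) positions and flatten it to shape (9, 2)
t = torch.Tensor([-1, 0, 1])
a = t.reshape(1, 3, 1).expand(3, 3, 1)  # x coordinates
b = t.reshape(3, 1, 1).expand(3, 3, 1)  # y coordinates
grid = torch.cat((a, b), 2)
points = grid.reshape(-1, 2)

# The point we differentiate with respect to
point = Variable(torch.Tensor([0, 0]), requires_grad=True)

# Variant with the extra (9x2) -> (9x2) reshape: the derivative seems way off
#distances = torch.norm(points - point.expand(9, 2).reshape(9, 2), 2, 1)
# Variant without the reshape: the derivative seems correct
distances = torch.norm(points - point.expand(9, 2), 2, 1)

# Sum of the 9 Euclidean distances, then its gradient w.r.t. point
output = distances.sum()
d_output_to_point = torch.autograd.grad(output, point, create_graph=True)[0]
print(d_output_to_point)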
```
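For reference, here is a minimal sketch of what I would expect the derivative to be, computed in plain Python without autograd. The helper names (`total_distance`, `analytic_grad`, `numeric_grad`) are just made up for this check, and it assumes the grid point that coincides with (0,0), where the norm is not differentiable, contributes zero to the gradient.

```
import math

# Same 3x3 grid as in the snippet above, as a list of (x, y) pairs.
points = [(x, y) for y in (-1.0, 0.0, 1.0) for x in (-1.0, 0.0, 1.0)]

def total_distance(px, py):
    """Sum of Euclidean distances from (px, py) to every grid point."""
    return sum(math.hypot(x - px, y - py) for x, y in points)

def analytic_grad(px, py):
    """Gradient of total_distance: sum of (p - x_i) / ||p - x_i||."""
    gx = gy = 0.0
    for x, y in points:
        d = math.hypot(x - px, y - py)
        if d == 0.0:
            # Assumption: skip the non-differentiable zero-distance term.
            continue
        gx += (px - x) / d
        gy += (py - y) / d
    return gx, gy

def numeric_grad(px, py, eps=1e-6):
    """Central finite-difference estimate of the same gradient."""
    gx = (total_distance(px + eps, py) - total_distance(px - eps, py)) / (2 * eps)
    gy = (total_distance(px, py + eps) - total_distance(px, py - eps)) / (2 * eps)
    return gx, gy

# By symmetry of the grid, both should be close to (0, 0) at the origin.
print(analytic_grad(0.0, 0.0))
print(numeric_grad(0.0, 0.0))
```

Comparing the autograd result against these two values, for each variant of the `distances` line, shows which one is off.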

UPDATE:

I found the answer,