Grad changes after reshape

Hi,
I am losing my mind a bit. I guess I missed something in the documentation somewhere, but I cannot figure it out. I am taking the derivative of the sum of distances from one point (0,0) to 9 other points ([-1,-1], [-1,0], …, [1,1], i.e. the positions of a 3x3 grid).

When I reshape one of the variables from (9x2) to the same (9x2) shape, the derivative seems way off… If I leave it as is and skip that reshape, the derivative seems correct to within float error.

I’ve looked but I can’t figure out why. Can anyone help, or point me in the right direction?

Thank you!

from __future__ import print_function
from torch.autograd import Variable
import torch

# build a 3x3 grid of points with coordinates in {-1, 0, 1}
t = torch.Tensor([-1, 0, 1])
a = t.reshape(1, 3, 1).expand(3, 3, 1)
b = t.reshape(3, 1, 1).expand(3, 3, 1)
grid = torch.cat((a, b), 2)
points = grid.reshape(-1, 2)   # shape (9, 2)

# query point at the origin, with gradient tracking
point = Variable(torch.Tensor([0, 0]), requires_grad=True)

# reshaping the expanded point to the same (9, 2) shape gives a wrong gradient:
#distances = torch.norm(points - point.expand(9, 2).reshape(9, 2), 2, 1)
# without the extra reshape the gradient looks correct:
distances = torch.norm(points - point.expand(9, 2), 2, 1)

output = distances.sum()
d_output_to_point = torch.autograd.grad(output, point, create_graph=True)[0]
print(d_output_to_point)
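
For reference, the result can be sanity-checked against the closed-form gradient: the derivative of the summed distances with respect to point is the sum of unit vectors pointing from each grid point towards point, which should be roughly zero at (0,0) by symmetry. A minimal sketch of that check (the variable names below are just for illustration, and the degenerate zero-distance grid point at the origin is skipped):

import torch

# the 3x3 grid, built the same way as above
t = torch.Tensor([-1, 0, 1])
a = t.reshape(1, 3, 1).expand(3, 3, 1)
b = t.reshape(3, 1, 1).expand(3, 3, 1)
points = torch.cat((a, b), 2).reshape(-1, 2)

p = torch.zeros(2)                       # query point (0, 0)
diffs = p - points                       # vectors from each grid point to p, shape (9, 2)
norms = diffs.norm(2, 1)                 # distances, shape (9,)
mask = norms > 0                         # drop the zero-distance grid point at the origin
analytic_grad = (diffs[mask] / norms[mask].unsqueeze(1)).sum(0)
print(analytic_grad)                     # ~[0, 0] by the symmetry of the grid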

UPDATE:

I cannot post a reply because my account is on hold for some reason, but I found the answer.

Ah! I’m a fool! Makes sense.

Hi! This is actually a bug. I’m tracking it at https://github.com/pytorch/pytorch/issues/8626 and currently working on it. Sorry about that!
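
For anyone else trying to reproduce this, here is a minimal sketch that isolates the expand() followed by same-shape reshape() pattern. Mathematically both gradients should be 4; on versions affected by the issue above, the reshaped variant may come out different:

import torch

x = torch.ones(1, requires_grad=True)

# gradient through expand() alone
g_plain = torch.autograd.grad(x.expand(4).sum(), x)[0]

# gradient through expand() followed by a same-shape reshape()
g_reshape = torch.autograd.grad(x.expand(4).reshape(4).sum(), x)[0]

print(g_plain, g_reshape)   # both should be tensor([4.])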

Interesting! I assumed that since reshape copied the data over, it somehow lost the history in the process. With view the result seems much more reasonable… at least in my case. Would this bug also affect contiguous() and view() calls after expand(), or just reshape()?

In this case view and reshape do the same thing. As a workaround, you can replace the reshape with contiguous() or clone() here.
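
Applied to the snippet from the original post, that workaround would look something like this (a sketch; only the distances line changes, and contiguous() and clone() are interchangeable here):

from torch.autograd import Variable
import torch

t = torch.Tensor([-1, 0, 1])
a = t.reshape(1, 3, 1).expand(3, 3, 1)
b = t.reshape(3, 1, 1).expand(3, 3, 1)
points = torch.cat((a, b), 2).reshape(-1, 2)

point = Variable(torch.Tensor([0, 0]), requires_grad=True)

# force a real copy with contiguous() (or clone()) instead of reshape()
distances = torch.norm(points - point.expand(9, 2).contiguous(), 2, 1)

output = distances.sum()
print(torch.autograd.grad(output, point, create_graph=True)[0])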

Awesome! Thanks for the quick reply. Maybe this fix will make some of my existing code run a bit better; in the meantime I will use the workarounds. Thanks again!
