Hey, I don’t understand why the following bit of code doesn’t work:
```python
import torch
import torch.nn as nn

d1, d2 = 12, 24
timesteps = 3
FC1 = nn.Linear(d1, d2)
FC2 = nn.Linear(d2, d1)
L1 = nn.L1Loss()

a = torch.randn(d1)
b_storage = torch.zeros(timesteps, d2)
for t in range(timesteps):
    b_storage[t] = FC1(a)          # in-place write into b_storage
    a = FC2(b_storage[t])
    print(b_storage._version)      # <- prints 1, 2, 3

target = torch.randn_like(a)
loss = L1(a, target)
loss.backward()                    # <- expects version 2?!
```
[out]:
```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [1, 24]], which is output 0 of UnsqueezeBackward0, is at version 3; expected version 2 instead.
```
Whatever I set `timesteps` to, autograd expects the final version to be `timesteps - 1`. If I skip the `b_storage` tensor and feed `FC2` directly with the output of `FC1(a)`, the issue doesn’t come up. But this doesn’t make sense to me: from the code it seems obvious that autograd should expect `b_storage` to be at version `timesteps`, not `timesteps - 1`. Is there a bug in `UnsqueezeBackward0`?
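For completeness, here is a sketch of the variant that works for me, feeding `FC2` directly with the output of `FC1(a)`. The Python list `outputs` is my own stand-in for `b_storage` (so I can still keep the intermediate activations around), not part of the original snippet:

```python
import torch
import torch.nn as nn

d1, d2 = 12, 24
timesteps = 3
FC1 = nn.Linear(d1, d2)
FC2 = nn.Linear(d2, d1)
L1 = nn.L1Loss()

a = torch.randn(d1)
outputs = []  # keep references instead of writing in-place into a buffer
for t in range(timesteps):
    b = FC1(a)         # no in-place assignment into a pre-allocated tensor
    outputs.append(b)
    a = FC2(b)

target = torch.randn_like(a)
loss = L1(a, target)
loss.backward()        # <- no RuntimeError here
```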