import torch
from torch.autograd import Variable

x = Variable(torch.randn(1), requires_grad=True)
y = x + 2
x1 = Variable(torch.randn(2), requires_grad=True)  # e.g. two elements each
x2 = Variable(torch.randn(2), volatile=True)
y.data = torch.cat((y.data, x1.data, x2.data), 0)
# y.data = torch.cat((y.data, x1.data, x2.data), 0).contiguous()
z = y * y * 3
out = z.mean()
out.backward()
print(x, x.grad, x1.grad)

x: [torch.FloatTensor of size 1]
x.grad: [torch.FloatTensor of size 5]
Ideally, x.grad should have the same size as x (size 1), but here it comes out as size 5. If x is a non-leaf variable, backward() raises an error instead. Is there any workaround for using torch.cat?
First, you created x2 with volatile=True. This means that when PyTorch sees x2 used in any calculation, it does not store the computation graph. If any of your model's inputs is volatile, PyTorch won't be able to backpropagate through it.
Second, you can use torch.cat on Variables directly. If you operate on .data, PyTorch doesn't track the operations and can't backpropagate properly:
y = torch.cat((y, x1, x2), 0)
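As a sketch of the corrected computation, using the modern tensor API (where requires_grad on a plain tensor replaces Variable and volatile is gone; the sizes are illustrative):

```python
import torch

# Same computation as in the question, but cat is applied to the tensors
# themselves, so autograd records it and gradients flow back to x and x1.
x = torch.randn(1, requires_grad=True)
y = x + 2
x1 = torch.randn(2, requires_grad=True)
x2 = torch.randn(2)            # plays the old volatile role: no grad tracked

y = torch.cat((y, x1, x2), 0)  # cat on the variables, not on .data
z = y * y * 3
out = z.mean()
out.backward()

print(x.grad.shape)   # torch.Size([1]) -- matches x, as expected
print(x1.grad.shape)  # torch.Size([2])
print(x2.grad)        # None -- x2 never required grad
```

With this version, x.grad has the same size as x, which is exactly what the question asked for.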
Another potential problem is the use of .resize_(), which you can replace with a simple slice:
y = y[:5]
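A minimal sketch of why the slice is safe (modern tensor API, illustrative sizes): unlike an in-place resize, the slice is itself an operation that autograd records, so gradients still reach the original tensor.

```python
import torch

y = torch.randn(8, requires_grad=True)
v = y[:5]          # slice instead of resize_(): stays in the graph
v.sum().backward()

# Only the sliced entries receive gradient; the rest stay zero.
print(y.grad)      # tensor([1., 1., 1., 1., 1., 0., 0., 0.])
```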
x.grad must contain the same number of elements as x, but assigning the result of torch.cat to y.data confuses the autograd mechanism: you change the size of the underlying tensor without informing the computation graph of the change. With the modifications above, the example works fine.
The basic rule of backprop: never use .data if you want to backpropagate, and don't use volatile=True either, unless you are running in inference mode.
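In current PyTorch the volatile flag is gone; its inference-mode role is played by torch.no_grad(). A small sketch of both halves of the rule (variable names are illustrative):

```python
import torch

x = torch.randn(3, requires_grad=True)

# Inference mode: the modern replacement for volatile=True.
with torch.no_grad():
    y = x * 2
print(y.requires_grad)  # False -- no graph was recorded

# Training mode: operate on the tensor itself, never on .data,
# so autograd tracks the op and can backpropagate.
z = x * 2
z.sum().backward()
print(x.grad)           # tensor([2., 2., 2.])
```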