I am trying to run out.backward(torch.Tensor([2.0])) and I get a shape-mismatch error (which makes sense), but when I run the same thing in PyTorch 1.0.2 it works: if I print the grad after this operation, the elements in the matrix get multiplied by 2.0.
out.backward(torch.tensor([2.0])) throws an error:
RuntimeError: Mismatch in shape: grad_output has a shape of torch.Size([1]) and output has a shape of torch.Size([]).
out.backward(torch.tensor([1.0])) doesn’t work either in the latest version.
0-dimensional tensors have been introduced to represent tensors that have no dimensions and contain a single value.
You can use
torch.tensor(2.0) to get a 0-dimensional tensor that contains the value 2.
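For example, a minimal sketch (the toy function below is just for illustration, not your exact code):

import torch

print(torch.tensor([2.0]).shape)  # torch.Size([1]) -> 1-dimensional tensor with one element
print(torch.tensor(2.0).shape)    # torch.Size([])  -> 0-dimensional scalar tensor

x = torch.ones(2, 2, requires_grad=True)
out = (x * 3).mean()              # out is a 0-dimensional tensor
out.backward(torch.tensor(2.0))   # grad_output now matches out's shape
print(x.grad)                     # each element is 3/4 * 2.0 = 1.5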
Thanks @albanD, it works now, but I get different output for x.grad depending on which call (and PyTorch version) I use:
Output 1: out.backward(torch.tensor([2.0])): a 2x2 matrix where each value is 6
Output 2: out.backward(torch.tensor(2.0)): a 2x2 matrix where each value is 9
Could you please explain the difference in results?
Values of x, y, and z are as follows:
x = torch.ones(2, 2, requires_grad=True)  # requires_grad so that x.grad gets populated
y = x+2
z = y * y * 3
out = z.mean()
OUTPUT - tensor([[9., 9.], [9., 9.]])
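For reference, a complete runnable version of the snippet above, with the backward call from Output 2 and a print of x.grad added:

import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()
out.backward(torch.tensor(2.0))  # scale the gradient by 2.0
print(x.grad)                    # tensor([[9., 9.], [9., 9.]])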
Your full function here is:
f(x) = ( sum_i (x_i + 2)^2 * 3 ) / size(x) = ( sum_i (x_i + 2)^2 * 3 ) / 4
df(x)/dx_j = 2 * (x_j + 2) * 3 / 4
Given that you start with x_i = 1 for all i,
the gradient you expect with no scaling is: 2 * (1 + 2) * 3 / 4 = 4.5
So if you multiply the result by 2, you get 9 as expected.
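As a quick sanity check, a small sketch that reruns your snippet and compares autograd against the analytic formula above computed by hand:

import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()
out.backward(torch.tensor(2.0))

# analytic gradient from the formula above, scaled by the grad_output of 2.0
analytic = 2.0 * (2 * (x.detach() + 2) * 3 / 4)
print(x.grad)                            # tensor([[9., 9.], [9., 9.]])
print(torch.allclose(x.grad, analytic))  # True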
I am not sure why you get 6 for your PyTorch 1.2 code, but I would say this is most likely a small bug in your test code. Do you start with torch.zeros(), for example, instead of torch.ones()?
Sorry @albanD, it was a mistake on my end. That should return 9.0 as expected; I was using a different function whose derivative is 3.0, and hence when multiplied by 2.0 it gives 6.0.
Thanks again for your help!