Is there any difference between different kinds of <grad_fn>?

I would like to compute a function such as 1 - x with autograd, but I have some questions about the differences between the following snippets.

import torch
import torch.nn as nn

# Three ways to compute 1 - parameter; each records a different grad_fn
x = 1.0 - nn.Parameter(torch.zeros(1, 3, 1, 1))
print(x)
y = torch.ones(1, 3, 1, 1) - nn.Parameter(torch.zeros(1, 3, 1, 1))
print(y)
z = -nn.Parameter(torch.zeros(1, 3, 1, 1)) + 1.0
print(z)

Output:

tensor([[[[1.]],

         [[1.]],

         [[1.]]]], grad_fn=<RsubBackward1>)
tensor([[[[1.]],

         [[1.]],

         [[1.]]]], grad_fn=<SubBackward0>)
tensor([[[[1.]],

         [[1.]],

         [[1.]]]], grad_fn=<AddBackward0>)

The grad_fn attributes look different, but I don't know how to test them to understand the difference between them.

Does anyone have an idea? Thanks a lot.

Since different operations are used in the forward calculation, Autograd will create the corresponding grad_fns. How would you like to test them?
If you would like to see the gradients, you can access them via the .grad attribute after calling backward().
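For example, here is a quick sketch of such a check (the parameter names p1, p2, and p3 are just placeholders for illustration): call backward() on each version and compare the resulting .grad values, which should all be -1 everywhere:

import torch
import torch.nn as nn

p1 = nn.Parameter(torch.zeros(1, 3, 1, 1))
p2 = nn.Parameter(torch.zeros(1, 3, 1, 1))
p3 = nn.Parameter(torch.zeros(1, 3, 1, 1))

# Same math, different recorded backward ops
(1.0 - p1).sum().backward()                     # RsubBackward1
(torch.ones(1, 3, 1, 1) - p2).sum().backward()  # SubBackward0
(-p3 + 1.0).sum().backward()                    # AddBackward0

# All three gradients are -1 for every element
print(p1.grad, p2.grad, p3.grad)
print(torch.equal(p1.grad, p2.grad), torch.equal(p2.grad, p3.grad))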


Whatever I choose, will the output be the same?
Some people use torch.ones_like(x) and some use 1.0; I want to know whether the choice affects the output and which one is preferable in practice.

If they are the same, maybe we can just write 1.0 - x in models and call model.to(device) once at the end, instead of using torch.ones_like() everywhere.

I think there is no difference as far as the gradients are concerned.
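As a small sketch to back that up (the names a and b are just for illustration), subtracting from the Python scalar 1.0 and from torch.ones_like(...) produce the same forward values and the same gradients; only the recorded grad_fn differs:

import torch
import torch.nn as nn

a = nn.Parameter(torch.randn(1, 3, 1, 1))
b = nn.Parameter(a.detach().clone())

out_scalar = 1.0 - a                 # Python scalar   -> RsubBackward1
out_tensor = torch.ones_like(b) - b  # explicit tensor -> SubBackward0

print(torch.equal(out_scalar, out_tensor))  # same forward values

out_scalar.sum().backward()
out_tensor.sum().backward()
print(torch.equal(a.grad, b.grad))          # same gradients (-1 everywhere)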


Sometimes you need to explicitly create a tensor with a certain size, so there is nothing wrong with using torch.ones_like.
Do you have any issues with this method?
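For example (purely an illustration, not code from this thread), a case where you do need an explicit tensor of the right size: a loss such as nn.BCEWithLogitsLoss expects a target tensor with the same shape as the output, so a plain Python 1.0 is not enough and torch.ones_like is the natural choice:

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()
output = torch.randn(8, 1, requires_grad=True)  # e.g. some model's logits

# The target must be a tensor matching output's shape (and device/dtype),
# which torch.ones_like gives us directly.
target = torch.ones_like(output)

loss = criterion(output, target)
loss.backward()
print(output.grad.shape)  # torch.Size([8, 1])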
