Confusion about clone()

Hi,

The following

import torch
from torch.autograd import Variable

a = Variable(torch.zeros(10)).cuda()
b = a
c = a.clone()
a[0] = 1
print(a)
print(b)
print(c)

produces

tensor([ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.], device='cuda:0')
tensor([ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.], device='cuda:0')
tensor([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.], device='cuda:0')

which is expected, as a and b refer to the same tensor and therefore share the same memory. An update to a also updates b.
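
A quick way to verify this, as a minimal sketch (on CPU for simplicity; the behaviour should be the same on CUDA):

import torch

a = torch.zeros(10)
b = a
c = a.clone()
print(b is a)                        # True: b is the same Python object as a
print(b.data_ptr() == a.data_ptr())  # True: same underlying storage
print(c.data_ptr() == a.data_ptr())  # False: clone() copied into new memory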

However,

import torch
from torch.autograd import Variable

a = Variable(torch.ones(10)).cuda()
b = a
c = a.clone()
a = a * 10
print(a)
print(b)
print(c)

gives me

tensor([ 10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.], device='cuda:0')
tensor([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.], device='cuda:0')
tensor([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.], device='cuda:0')

In this case, b is not changed. Why?

Thanks

Because a = a * 10 is not an in-place operation: it creates a new tensor containing the result of a * 10 and then associates it with the Python variable a. If you want to make this change in-place, you can do a *= 10.
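
For example, a minimal sketch of the in-place version (on CPU for simplicity; the behaviour should be the same on CUDA):

import torch

a = torch.ones(10)
b = a
c = a.clone()
a *= 10      # in-place: modifies the tensor that both a and b refer to
print(a)     # tensor of 10s
print(b)     # tensor of 10s, since b is the same tensor as a
print(c)     # tensor of 1s, since clone() made an independent copy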


@albanD Thanks for the quick reply. Yes, you are right. How could I forget about this?

I have a question.
For a feed-forward network, suppose the formula is y = 10*x.
What is the difference if I write the computation graph as y *= 10 versus y = 10*y?

One will contain an in-place operation and the other an out-of-place one, so very little. It might be problematic in the in-place case if the original value of y is still needed elsewhere in the graph (that will raise an error saying that a tensor needed for gradient computation has been modified by an in-place operation).
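
For example, a minimal sketch of that error case (assuming a recent PyTorch; I use exp() rather than a multiply by 10 here, because exp() saves its output for the backward pass, which is what makes the in-place modification a problem):

import torch

x = torch.ones(3, requires_grad=True)
y = x.exp()   # exp() saves its output y for the backward pass
s = y.sum()
y *= 10       # in-place change to a tensor the backward pass still needs
try:
    s.backward()
except RuntimeError as e:
    print(e)  # "... has been modified by an inplace operation"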