Why is Tensor.clone() called clone and not copy?

Sybil · October 11, 2020, 3:49am

Hi, albanD.
Could you explain the difference between b = a.clone() and b.copy_(a)?

The docs say:

Unlike copy_(), clone() is recorded in the computation graph. Gradients propagating to the cloned tensor will propagate to the original tensor.

However, in the example below, which uses copy_(), the gradient was also backpropagated to the original tensor:

>>> x = torch.randn(2,2,requires_grad=True)
>>> x
tensor([[0.5113, 0.3028],
        [0.7036, 1.4417]], requires_grad=True)
>>> x.grad
>>> y = x*2+3
>>> y_copy = torch.zeros_like(y)
>>> y_copy.copy_(y)
tensor([[4.0227, 3.6057],
        [4.4072, 5.8835]], grad_fn=<CopyBackwards>)
>>> z = y_copy*3+3
>>> z
tensor([[15.0681, 13.8171],
        [16.2216, 20.6504]], grad_fn=<AddBackward0>)
>>> loss=torch.sum(z-15)
>>> loss.backward()
>>> x.grad
tensor([[6., 6.],
        [6., 6.]])

Replacing the copy_() step with y_clone = y.clone() showed the same behavior: x.grad came out the same. Could you explain the difference?
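
For anyone who wants to reproduce this, here is a minimal script that puts the two variants side by side. It is only a sketch of the observation above, not an explanation of the docs' wording. The helper names (grad_through, via_clone, via_copy_) are made up for illustration, and x is set to ones instead of random values so the run is repeatable.

```python
import torch


def grad_through(duplicate):
    """Run the same computation as in the session above, duplicating y with
    the given function, and return x.grad plus the duplicate's grad_fn."""
    x = torch.ones(2, 2, requires_grad=True)
    y = x * 2 + 3
    y_dup = duplicate(y)
    z = y_dup * 3 + 3
    loss = torch.sum(z - 15)
    loss.backward()
    return x.grad, y_dup.grad_fn


def via_clone(y):
    # Out-of-place: returns a new tensor.
    return y.clone()


def via_copy_(y):
    # In-place: fills a preallocated tensor with the values of y.
    out = torch.zeros_like(y)
    out.copy_(y)
    return out


grad_clone, fn_clone = grad_through(via_clone)
grad_copy, fn_copy = grad_through(via_copy_)

print(grad_clone)  # every entry is 6 (d loss / d x = 3 * 2)
print(grad_copy)   # every entry is 6 as well
print(fn_clone, fn_copy)
```

In both cases x.grad is filled with 6s and the duplicated tensor carries a grad_fn, which matches the session above: in this setup copy_() into a fresh tensor is tracked by autograd just like clone().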