Why Tensor.clone() is called clone and not copy?

Hi, albanD.
Could you explain the difference between b = a.clone() and b.copy_(a)?

The docs said that

Unlike copy_(), clone() is recorded in the computation graph. Gradients propagating to the cloned tensor will propagate to the original tensor.

However, in the example below, the gradient was also backpropagated to the original tensor:

>>> x = torch.randn(2,2,requires_grad=True)
>>> x
tensor([[0.5113, 0.3028],
        [0.7036, 1.4417]], requires_grad=True)
>>> x.grad
>>> y = x*2+3
>>> y_copy = torch.zeros_like(y)
>>> y_copy.copy_(y)
tensor([[4.0227, 3.6057],
        [4.4072, 5.8835]], grad_fn=<CopyBackwards>)
>>> z = y_copy*3+3
>>> z
tensor([[15.0681, 13.8171],
        [16.2216, 20.6504]], grad_fn=<AddBackward0>)
>>> loss=torch.sum(z-15)
>>> loss.backward()
>>> x.grad
tensor([[6., 6.],
        [6., 6.]])

And the y_clone = y.clone() operation showed the same behavior. Could you explain the difference?