What's the difference between Variable.detach() and Variable.clone()

Is it that detach() shares the same memory, but clone() allocates new memory?

That’s correct, that is the main difference: x and x.detach() share the same memory, but x.clone() allocates new memory.

There are also some subtleties around requires_grad (and possibly other things). If x has requires_grad=True, then x.clone() also has requires_grad=True, but x.detach() doesn’t.
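A small sketch of both points (memory sharing checked via data_ptr(), which returns the address of a tensor's underlying storage):

```python
import torch

x = torch.ones(3, requires_grad=True)

d = x.detach()  # shares storage with x, drops requires_grad
c = x.clone()   # new storage, keeps requires_grad and stays in the graph

print(d.requires_grad)               # False
print(c.requires_grad)               # True
print(d.data_ptr() == x.data_ptr())  # True  -- same underlying memory
print(c.data_ptr() == x.data_ptr())  # False -- freshly allocated memory
```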


Thank you.
I used detach() to prevent gradients from flowing back, like stop_gradient() in TensorFlow. Is this correct? If not, what should be done to block the gradient?

To my understanding, stop_gradient() in TensorFlow treats its argument as a constant. Passing x.detach() into an nn layer will prevent gradients from being computed for x, so I believe the behavior is the same.
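For example, a minimal check with an nn.Linear layer: gradients still reach the layer's parameters, but the detached input is cut out of the graph.

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 2)
x = torch.randn(1, 4, requires_grad=True)

# Feed the detached tensor through the layer: the layer's weights
# still get gradients, but nothing propagates back to x.
out = layer(x.detach()).sum()
out.backward()

print(x.grad)                          # None -- x was treated as a constant
print(layer.weight.grad is not None)   # True -- the layer still trains
```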


I was playing around with .clone() and had a little confusion with this piece of code:

x = torch.tensor([3.], requires_grad=True)  # must be a float tensor to require grad
y = x.clone()
z = y ** 2
z.backward()

print(y.grad)  # None
print(x.grad)  # tensor([6.])

From reading your reply, I expected a new leaf node, completely separate from the original graph, but with the values of x. But here it seems like y is part of the original graph.
Could you please explain what is going on? Also, if possible, could you explain where .clone() can be useful, with a simple example (possibly code)?

Thanks ,


y is computed by the clone operation on x, so it is not a leaf variable and does not get its own .grad. But backpropagation flows through it to x, so x receives the gradient. If you want a fully independent copy, use x.clone().detach() (or x.detach().clone()).
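As for where clone() is useful: one common case (my example, not from the thread above) is modifying some entries of a tensor in place without touching the original, while keeping the result differentiable. Modifying a leaf tensor that requires grad in place would raise an error, but modifying its clone is fine:

```python
import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)

# clone() gives a differentiable copy we can safely edit in place;
# writing into x itself would raise a RuntimeError (leaf in-place op).
y = x.clone()
y[0] = 0.0            # x is untouched; y is still in the graph
loss = (y ** 2).sum()
loss.backward()

print(x)       # tensor([1., 2., 3.], requires_grad=True) -- unchanged
print(x.grad)  # tensor([0., 4., 6.]) -- no grad through the overwritten slot
```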