copy.deepcopy() vs clone()

When copying modules/tensors around, which one should I use?
Are they interchangeable?
Thanks a lot


Hi @Shisho_Sama,

For Tensors in most cases, you should go for clone since this is a PyTorch operation that will be recorded by autograd.

>>> t = torch.rand(1, requires_grad=True)
>>> t.clone()
tensor([0.4847], grad_fn=<CloneBackward>) # <=== as you can see here

When it comes to Module, there is no clone method available, so you can either use copy.deepcopy or create a new instance of the model and just copy the parameters over, as proposed in this post: Deep copying PyTorch modules.
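A minimal sketch of both options, using an nn.Linear as a stand-in for the actual model:

```python
import copy

import torch
import torch.nn as nn

model = nn.Linear(4, 2)

# Option 1: deep-copy the whole module (parameters, buffers, attributes).
model_copy = copy.deepcopy(model)

# Option 2: build a fresh instance and copy the parameters over.
model_copy2 = nn.Linear(4, 2)
model_copy2.load_state_dict(model.state_dict())

# Each copy owns its own storage, so mutating the original
# leaves the copies untouched.
with torch.no_grad():
    model.weight.zero_()
print(torch.equal(model_copy.weight, model_copy2.weight))  # True
```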


Hi, Thanks a lot.
So this means that when I call clone() on a tensor, the clone will still be on the graph, and any operations on it will be reflected in the graph, right? For example, will changing its values or attributes also change the original tensor, or affect the graph computation during the backward pass?
In the case of the following two code snippets, what happens in each case?
I think deepcopy disregards any graph-related information and just copies the data as if it were a simple object, while clone creates a new tensor whose operations will be recorded in the graph, and to prevent this I need to use detach as well. Am I right?

weights_encoder = sae_model.encoder[0].weight.data.clone() 
weights_decoder = sae_model.decoder[0].weight.data.clone()

or

weights_encoder = copy.deepcopy(sae_model.encoder[0].weight.data) 
weights_decoder = copy.deepcopy(sae_model.decoder[0].weight.data)

When you use .data, you get a new Tensor with requires_grad=False, so cloning it won’t involve autograd. So both are equivalent, but there might be a (small) speed difference; I am not sure about that.
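To illustrate the point (just a sketch):

```python
import torch

t = torch.rand(3, requires_grad=True)

# .data returns a tensor that shares t's storage but has requires_grad=False...
d = t.data
print(d.requires_grad)               # False
print(d.data_ptr() == t.data_ptr())  # True: same underlying storage

# ...so cloning it is a plain data copy that autograd does not record.
c = d.clone()
print(c.grad_fn)                     # None
```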

Another use case is when you want to clone/copy a non-parameter Tensor without autograd. You should use .detach() (and not .data) before cloning:

>>> t = torch.rand(1, requires_grad=True)
>>> t.detach().clone()
tensor([0.4847])

Thank you very much. I really appreciate it 🙂


Is there any difference with t.clone().detach()?


Yes, there is. Both methods produce the same outcome, but t.clone().detach() is less efficient: t.clone() first creates a copy that is attached to the graph, and detach() then produces yet another tensor, so there is redundant work.
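A small check of both orderings (the end results match; only the intermediate differs):

```python
import torch

t = torch.rand(2, requires_grad=True)

a = t.detach().clone()  # detach first: the clone is never recorded by autograd
b = t.clone().detach()  # clone first: the intermediate clone carries a grad_fn

# Both end up as detached copies with the same values.
print(torch.equal(a, b))                 # True
print(a.requires_grad, b.requires_grad)  # False False
print(t.clone().grad_fn is not None)     # True: the clone alone is in the graph
```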


I never understood this. Why would one ever want to have clone be in the computation graph? It’s just the identity!

When I make a copy of something I usually expect a brand new object, with new memory allocation and new instance of the object class it belongs. Not just copying pointers/references around. Can you clarify?

Answered here and here.


Let me see if I understand (it seems the accepted answer here is outdated; .data is deprecated or going to be removed, according to what I’ve read in other answers from albanD).

.clone() produces a new tensor instance with a new memory allocation for the tensor data. In addition, it remembers the history of the original tensor and stays connected to the earlier graph, appearing as CloneBackward. The main advantage, it seems, is that it’s safer w.r.t. in-place ops, AFAIK.
deepcopy makes a deep copy of the original tensor, meaning it creates a new tensor instance with a new memory allocation for the tensor data (it definitely does this part correctly, from my tests). I assume it also does a complete copy of the history, either pointing to the old history or creating a brand-new deep copy of it. I’m unsure how to test this, but I believe that if it is to behave as a proper deep-copy method, it should create a new history that mirrors the earlier one (instead of just pointing to it).

Test I did wrt memory allocation:

def clone_vs_deepcopy():
    import copy
    import torch

    x = torch.tensor([1., 2., 3.])
    x_clone = x.clone()
    x_deep_copy = copy.deepcopy(x)

    # mutate the original in place and see whether the copies change
    x.mul_(-1)
    print(f'x = {x}')
    print(f'x_clone = {x_clone}')
    print(f'x_deep_copy = {x_deep_copy}')

output

x = tensor([-1., -2., -3.])
x_clone = tensor([1., 2., 3.])
x_deep_copy = tensor([1., 2., 3.])

Since neither copy changed, they must live in different memory. I just realized I could have checked it with id or something… alas.
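For the record, data_ptr() checks the storage address directly (id() would only compare the Python wrapper objects); a sketch:

```python
import copy
import torch

x = torch.tensor([1., 2., 3.])
x_clone = x.clone()
x_deep_copy = copy.deepcopy(x)

# data_ptr() returns the address of the underlying storage,
# so distinct addresses prove each copy owns its own memory.
print(x.data_ptr() != x_clone.data_ptr())      # True
print(x.data_ptr() != x_deep_copy.data_ptr())  # True
```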

I am still seeking clarification on the history part. Is it a deep copy of that or a pointer copy if we use deep copy?

I know that for clone it is a pointer copy to the original history and not a complete deep copy.
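That sharing is easy to verify: gradients flowing through a clone reach the original leaf, e.g.:

```python
import torch

t = torch.rand(3, requires_grad=True)
c = t.clone()       # has grad_fn=<CloneBackward0>: still part of t's graph

c.sum().backward()  # backprop through the clone...
print(t.grad)       # ...populates t.grad with ones
```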




The history will not be copied, as you cannot call copy.deepcopy on a non-leaf tensor:

x = torch.randn(1, requires_grad=True)
y = x + 1
copy.deepcopy(y)
> RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment

While y will be attached to the computation graph and will have a valid .grad_fn, you can only deep-copy leaves, as stated in the error message.

If you want to keep the history, use .clone(); otherwise, call .detach() in addition to clone().
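And, for completeness, deepcopy does work on leaves; as far as I can tell it also copies requires_grad and the accumulated .grad:

```python
import copy
import torch

x = torch.randn(1, requires_grad=True)
x.sum().backward()                    # populate x.grad

x2 = copy.deepcopy(x)                 # a leaf, so deepcopy is supported
print(x2.requires_grad)               # True: the flag is carried over
print(torch.equal(x2.grad, x.grad))   # True: the .grad tensor is copied too
print(x2.data_ptr() != x.data_ptr())  # True: separate storage
```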


I always wondered why that error appeared!

What’s the reason for that choice of semantics?

I don’t know why deepcopy isn’t supported (it also wasn’t supported on Variables); my best guess is that clone() or detach().clone() are valid workarounds and are also more explicit.


Who would know why it’s not supported?