How to copy a Variable in a network graph

If I need to copy a variable that was created by an operation rather than by the user, and let the copy have its own independent memory, how can I do that?
Thank you!

5 Likes

Hi,

You can use the .clone() function directly on the Variable to create a copy.

3 Likes

Indeed, but I am not sure whether the gradient will get through correctly during backpropagation that way. :pensive:

Yes, gradients will get through properly.
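
For illustration, a minimal sketch (assuming x is a leaf Variable created with requires_grad=True) showing that a backward pass through a clone accumulates the gradient in the original x:

import torch
from torch.autograd import Variable

x = Variable(torch.FloatTensor([2]), requires_grad=True)
y = x.clone()           # copy with its own memory, still linked to x in the graph
z = (3 * y * y).sum()   # any differentiable function of the clone
z.backward()
print(x.grad)           # gradient lands in the leaf x: d(3*y^2)/dx = 6*x = 12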

2 Likes

OK, I got it, thank you!

@albanD I came across a behavior that I don’t really understand, whereby a cloned variable ends up having no gradient. Could you clarify what is going on here? This is the example:

import torch
from torch.autograd import Variable

def basic_fun(x):
    return 3*(x*x)

def get_grad(x):
    A = basic_fun(x)
    A.backward()
    return x.grad

x = Variable(torch.FloatTensor([1]), requires_grad=True)
xx = x.clone()

# this works fine 
print(get_grad(x))

# this doesn't
print(get_grad(xx))

The clone operation makes a copy of the Tensor contained in this Variable.
That means that xx will be a new Variable whose history links it back to x.
When you perform the backward pass, gradients are only accumulated in the Variables that you created yourself (we call them leaf Variables) and for which you set requires_grad=True:

import torch
from torch.autograd import Variable

def basic_fun(x):
    return 3*(x*x)

def get_grad(inp, grad_var):
    A = basic_fun(inp)
    A.backward()
    return grad_var.grad

x = Variable(torch.FloatTensor([1]), requires_grad=True)
xx = x.clone()

# Grad wrt x will work
print(x.creator is None) # is it a leaf? Yes
print(get_grad(x, x))
print(get_grad(xx, x))

# Grad wrt xx won't work
print(xx.creator is None) # is it a leaf? No
print(get_grad(xx, xx))
print(get_grad(x, xx))
3 Likes

Thanks a lot for your answer. So if I understand correctly, a Variable needs to have a creator which is None to be considered a leaf, and only in that case will the gradient be accumulated in it.

So then, if I want to initialize a new Variable using the values from another (effectively making a copy), I would go for xx = Variable(x.data, requires_grad=True), or is there a different option?

Yes, this is how you want to do it.

1 Like

It seems that by doing this, xx and x still share the same memory.
Try xx = Variable(x.data.clone(), requires_grad=True).
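
A quick sketch that illustrates the difference (the values here are just an example): an in-place change to the copy built from x.data is visible in x, while the copy built from x.data.clone() has its own memory.

import torch
from torch.autograd import Variable

x = Variable(torch.FloatTensor([1, 2, 3]), requires_grad=True)

shared = Variable(x.data, requires_grad=True)                # wraps the same storage as x
independent = Variable(x.data.clone(), requires_grad=True)   # new storage, copied values

shared.data[0] = 100
independent.data[1] = 200

print(x.data)   # first element is now 100 (shared memory); second element is unchanged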

@percqdeng No, it won’t: calling .clone() creates a new storage with new memory and copies the content of the original into this new memory.

2 Likes

That’s right. Assuming that the goal is to “let the copy have an independent memory” (alan_ayu) and to “initialize a new Variable using values from another” (pietromarchesi), we should use x.data.clone().
Otherwise, xx is just a reference to the values in x.

1 Like

I don’t understand; what are you trying to do?

Let’s say you want to do the ResNet bypass trick (a skip connection).

At some point you want to create a copy of x and add it to a later result.

How would you do it?

1 Like
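
For reference, a minimal sketch of the bypass described above (the layer shapes here are hypothetical placeholders): you usually don’t need an explicit copy at all; keeping a reference to x and adding it back to a later result is enough, and autograd routes the gradient through both paths.

import torch
import torch.nn as nn
from torch.autograd import Variable

class BypassBlock(nn.Module):
    # A hypothetical residual block: the input is kept and added to a later result.
    def __init__(self, channels):
        super(BypassBlock, self).__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        identity = x                        # keep a reference; no clone needed
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + identity)    # gradients flow through both paths

block = BypassBlock(16)
y = block(Variable(torch.randn(1, 16, 8, 8)))

A clone would only be needed if one of the branches modified x in place.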

Hey guys.

I am in a situation where I want to access the values in a 1D tensor by means of an integer index. So, perhaps something like this:

def basic_fun(x_cloned):
    res = []
    for i in range(len(x_cloned)):
        res.append(x_cloned[i] * x_cloned[i])
    print(res)
    return Variable(torch.FloatTensor(res))


def get_grad(inp, grad_var):
    A = basic_fun(inp)
    A.backward()
    return grad_var.grad


x = Variable(torch.FloatTensor([1, 2, 3, 4, 5]), requires_grad=True)
x_cloned = x.clone()
print(get_grad(x_cloned, x))

I am getting the following error message:

[tensor(1., grad_fn=<ThMulBackward>), tensor(4., grad_fn=<ThMulBackward>), tensor(9., grad_fn=<ThMulBackward>), tensor(16., grad_fn=<ThMulBackward>), tensor(25., grad_fn=<ThMulBackward>)]
Traceback (most recent call last):
  File "/home/mhy/projects/pytorch-optim/predict.py", line 74, in <module>
    print(get_grad(x_cloned, x))
  File "/home/mhy/projects/pytorch-optim/predict.py", line 68, in get_grad
    A.backward()
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/mhy/.local/lib/python3.5/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

And I don’t understand how using the cloned version of a variable is supposed to keep that variable in the gradient computation. The variable itself is effectively not used in the computation of A, so when you call A.backward(), it should not be part of that operation.

I appreciate your help!

And if I change my basic_fun function to return torch.cat(res), I get the error:

RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated
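
For what it’s worth, a sketch of one way to keep the graph intact here (the .sum() is an assumption, added so that backward() can be called without an argument): combine the per-element results with torch.stack instead of wrapping them in a fresh Variable, which detaches them from the graph.

import torch
from torch.autograd import Variable

def basic_fun(x_cloned):
    # torch.stack keeps each element's grad_fn, so the graph stays connected
    res = [x_cloned[i] * x_cloned[i] for i in range(len(x_cloned))]
    return torch.stack(res).sum()   # scalar output, so backward() needs no argument

x = Variable(torch.FloatTensor([1, 2, 3, 4, 5]), requires_grad=True)
x_cloned = x.clone()
basic_fun(x_cloned).backward()
print(x.grad)   # [2, 4, 6, 8, 10], i.e. 2 * x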