In-place operations vs. reusing the same variable name

Alpha · March 30, 2018, 12:52pm

Hi,

From this question: "RuntimeError when using a += b but not when doing a = a + b"

I got this answer:

The difference between a += b and a = a + b is that in the first case, b is added to a in place (so the content of a is changed to now contain a+b). In the second case, a brand new tensor is created that contains a+b and then you assign this new tensor to the name a.
To be able to compute gradients, you sometimes need to keep the original value of a, and so we prevent in-place operations from being done because otherwise, we won’t be able to compute gradients.

from @albanD
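The difference described in the quote can be checked directly. This is only a minimal sketch, assuming a recent PyTorch release; the names original and alias are made up for illustration and are not from the linked thread.

```python
import torch

b = torch.ones(3)

# Out of place: a + b builds a new tensor, then the name a is rebound to it.
a = torch.ones(3)
original = a                      # second name for the same tensor object
a = a + b
rebound_to_new_tensor = a is not original
original_untouched = torch.equal(original, torch.ones(3))

# In place: c += b writes into the existing tensor, so every name sees it.
c = torch.ones(3)
alias = c
c += b
still_same_object = c is alias
alias_sees_change = torch.equal(alias, torch.full((3,), 2.0))

print(rebound_to_new_tensor, original_untouched)
print(still_same_object, alias_sees_change)
```

If that is right, the out-of-place version leaves the old values intact under any other name, while the in-place version overwrites them.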

However, I have some other questions:

For a = a + b, a brand new tensor is created that contains a+b, and then this new tensor is assigned to the name a. So where does the original tensor a go after the new tensor is assigned to the name a?
Can PyTorch ‘remember’ the original tensor a for backpropagation?
If I do this:

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # stage ①
        x = self.pool(F.relu(self.conv2(x)))   # stage ②
        x = x.view(-1, 16 * 5 * 5)             # stage ③
        x = F.relu(self.fc1(x))                # stage ④
        x = F.relu(self.fc2(x))                # stage ⑤
        x = self.fc3(x)
        return x

Even though I use the same variable name x at every stage, PyTorch can remember the x from each stage in order to calculate gradients during backpropagation. Am I right?

Thank you in advance!

jpeg729 · March 31, 2018, 7:59am

If no other names point to the original tensor named a, then the original tensor will be reclaimed by the garbage collector. On the other hand, if a was a Variable, then the original Variable a is stored in the computation graph for the new Variable named a. When you operate on Variables, PyTorch automatically stores everything it needs for backpropagation.
Your code is fine. PyTorch will store all of the intermediate values that it needs.
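A small check of this, as a sketch rather than anything definitive: it assumes a recent PyTorch where plain tensors with requires_grad=True take the place of Variables, and the name leaf is only there so the gradient can be read back afterwards.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
leaf = x          # keep a handle on the input so its gradient can be inspected

# The name x is rebound at every step, as in the forward() above.
x = x * 2         # x now names a new tensor holding 2 * leaf
x = x ** 2        # the backward pass of ** needs the previous x; autograd keeps it
x = x.sum()

x.backward()

# d/d(leaf) of sum((2 * leaf) ** 2) is 8 * leaf
expected = 8 * leaf.detach()
grad_is_correct = torch.allclose(leaf.grad, expected)
print(leaf.grad, grad_is_correct)
```

The gradient comes out right even though no Python name refers to the intermediate tensors any more, which suggests the graph itself is what keeps them alive.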

Alpha · March 31, 2018, 8:38am

Thank you very much.

One more question: since PyTorch stores all of the intermediate values that it needs, why does using an in-place operation sometimes raise an error?

jpeg729 · March 31, 2018, 8:40am

In-place operations modify the original data, which often makes it tricky or impossible to keep the intermediate values needed for backpropagation.
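Here is one way to reproduce the problem, as a hedged sketch assuming a recent PyTorch. exp is picked only because its backward pass reuses its own output, so overwriting that output in place destroys a value autograd still needs; other ops may behave differently.

```python
import torch

# In-place version: y is saved by exp for its backward pass, then overwritten.
a = torch.tensor([1.0, 2.0], requires_grad=True)
y = a.exp()
y += 1                      # modifies the saved tensor in place
try:
    y.sum().backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)

# Out-of-place version: y2 + 1 makes a new tensor, the saved output stays intact.
a2 = torch.tensor([1.0, 2.0], requires_grad=True)
y2 = a2.exp()
y2 = y2 + 1
y2.sum().backward()
out_of_place_grad_ok = torch.allclose(a2.grad, a2.detach().exp())

print(error_message)
print(out_of_place_grad_ok)
```

The first backward call should fail with a RuntimeError about a tensor needed for gradient computation having been modified, while the second one goes through.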

Alpha · March 31, 2018, 8:45am

Thank you very much! I see.