From the question "RuntimeError when using a += b but not when doing a = a + b",
I got this answer:
The difference between a += b and a = a + b is that in the first case, b is added to a in place (so the content of a is changed to now contain a + b). In the second case, a brand-new tensor is created that contains a + b, and then this new tensor is assigned to the name a.
To be able to compute gradients, you sometimes need to keep the original value of a, so the in-place operation is prevented because otherwise we wouldn't be able to compute gradients.
However, I have some other questions:
For a = a + b, a brand-new tensor is created that contains a + b, and this new tensor is assigned to the name a. So where is the original tensor of a after this new tensor is assigned to the name a?
Can PyTorch "remember" the original tensor of a for backpropagation?
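A small sketch of what happens (the variable names here are just for illustration): rebinding the name a does not destroy the original tensor, because the autograd graph still holds a reference to it, so gradients still reach it:

```python
import torch

# a starts as a leaf tensor that requires gradients.
a = torch.ones(3, requires_grad=True)
b = torch.full((3,), 2.0)

original_a = a  # keep a Python reference to the original tensor
a = a + b       # rebinds the *name* a; the original tensor still exists

# The autograd graph still points at the original tensor, so gradients
# flow back to it even though the name `a` now refers to a new tensor.
a.sum().backward()
print(original_a.grad)  # gradient of sum(a + b) w.r.t. the original a
```

Even without the `original_a` reference, the graph built by `a + b` would keep the original tensor alive until backward is done.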
If I do this:

```python
def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))  # stage ①
    x = self.pool(F.relu(self.conv2(x)))  # stage ②
    x = x.view(-1, 16 * 5 * 5)            # stage ③
    x = F.relu(self.fc1(x))               # stage ④
    x = F.relu(self.fc2(x))               # stage ⑤
    x = self.fc3(x)
    return x
```
Even though I use the same variable name x at every stage, PyTorch can remember the tensor from each stage in order to calculate gradients during backpropagation. Am I right?
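A minimal standalone check of this (using made-up layer sizes, not the network above): reuse the name x across stages, then call backward and confirm that gradients reach the earliest layer, which is only possible if the intermediate tensors were kept by the graph:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A toy two-layer net; the name x is rebound at each stage,
# just like in the forward() in question.
fc1 = nn.Linear(4, 8)
fc2 = nn.Linear(8, 2)

x = torch.randn(1, 4)
x = F.relu(fc1(x))  # x is rebound; the stage-1 activation survives in the graph
x = fc2(x)          # the graph holds references to every intermediate tensor

x.sum().backward()  # backprop works: every stage was saved by the graph
print(fc1.weight.grad is not None)  # gradients reached the first layer
```

The intermediate tensors are kept alive by the autograd graph itself, not by the Python name, so reusing x is fine.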
Thank you in advance!