What's the difference between nn.ReLU() and nn.ReLU(inplace=True)?

I implemented a generative adversarial network using both nn.ReLU() and nn.ReLU(inplace=True). It seems that nn.ReLU(inplace=True) saved only a very small amount of memory.

What's the purpose of using inplace=True?
Is the behavior different in backpropagation?


inplace=True means that it will modify the input directly, without allocating any additional output. It can sometimes slightly decrease the memory usage, but may not always be a valid operation (because the original input is destroyed). However, if you don’t see an error, it means that your use case is valid.
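Here is a minimal sketch of both points, assuming a recent PyTorch (the exact error message may vary between versions):

import torch
import torch.nn as nn

x = torch.randn(4)
out = nn.ReLU(inplace=True)(x)
print(out.data_ptr() == x.data_ptr())  # True: the output reuses x's storage, nothing new is allocated

# An invalid use case: sigmoid saves its output for the backward pass, so overwriting
# that output in place breaks the gradient computation and autograd raises an error.
a = torch.randn(4, requires_grad=True)
b = torch.sigmoid(a)
c = nn.ReLU(inplace=True)(b)
c.sum().backward()  # RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation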


The autograd notes (http://pytorch.org/docs/master/notes/autograd.html#in-place-operations-on-variables) discourage in-place operations. So why do most official examples in torchvision (e.g. ResNet) use nn.ReLU(inplace=True)?

Also, would

x = self.conv1(x)
x = self.conv2(x)

be considered an in-place operation (since they reuse the same variable name) or not?


x = self.conv1(x)
x = self.conv2(x)

is not an in-place operation. You reuse the same variable name, but it's not the same tensor underneath: you just point the name x at a new tensor, while the old one is still in memory (because it's referenced by the autograd graph).
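A quick way to see the rebinding, using data_ptr() to compare the underlying storage (a rough check with made-up layer sizes):

import torch
import torch.nn as nn

conv1 = nn.Conv2d(3, 8, kernel_size=3)
x = torch.randn(1, 3, 16, 16)

before = x.data_ptr()
x = conv1(x)                    # the name x is rebound to a brand-new output tensor
print(x.data_ptr() == before)   # False: different storage; the old tensor still exists in the graph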


Thanks, that makes sense!
But if in-place operations are discouraged, why do most official examples use nn.ReLU(inplace=True)?


I'm a PyTorch newbie, so I wonder: does nn.ReLU(inplace=True) do any harm to backprop? And what about F.relu(inplace=True)?


That's a good question! I think this was an initial limitation described in the PyTorch whitepaper manuscript on arXiv. But based on what I've seen (e.g., this gist by @soumith: https://gist.github.com/soumith/71995cecc5b99cda38106ad64503cee3), it seems that in-place ops like nn.ReLU(inplace=True) are supported by the autodiff engine now. I'm not sure, but I'd guess the same is true for the functional one, since the nn module just wraps the functional call.
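A rough sanity check I'd run to convince myself (both forms below go through backward without complaint, as long as the in-place ReLU is applied to an intermediate tensor rather than a leaf):

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(5, requires_grad=True)
h = x * 2                           # intermediate tensor; safe to overwrite, since mul's
y = nn.ReLU(inplace=True)(h)        # backward needs its inputs, not its output
y.sum().backward()
print(x.grad)                       # 2 where x > 0, else 0

x2 = torch.randn(5, requires_grad=True)
y2 = F.relu(x2 * 2, inplace=True)   # functional form behaves the same way
y2.sum().backward()
print(x2.grad)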


@cdancette In that case, is the relu(inplace=True) in vision/resnet.py actually not in-place, since it is used as x = relu(x) in forward()?

For relu, when the input is negative, both the output and the grad are zero, and gradients stop propagating from there, so inplace doesn't hurt anything while saving memory.
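To illustrate (a small check; the * 1.0 just creates an intermediate tensor so the in-place op doesn't touch the leaf directly):

import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.5, 2.0], requires_grad=True)
F.relu(x * 1.0, inplace=True).sum().backward()
print(x.grad)                        # tensor([0., 0., 1., 1.]): zero grad wherever the input was negative

x2 = torch.tensor([-2.0, -0.5, 0.5, 2.0], requires_grad=True)
F.relu(x2 * 1.0).sum().backward()    # out-of-place version
print(torch.equal(x.grad, x2.grad))  # True: same gradients either way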


Are these in-place operations?

b = torch.tensor(5.0)
y = torch.sigmoid_(torch.tensor(4.0))
y = torch.sigmoid(b)

Thanks!

Even if you use x = relu(x), it is still in-place; reassigning x to the output of relu(x) changes nothing here.

You can check: after relu(x) and after x = relu(x), x has the same value in both cases.
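For example (a quick check with data_ptr(), using an inplace ReLU module like the one in the ResNet code):

import torch
import torch.nn as nn

relu = nn.ReLU(inplace=True)
x = torch.randn(4)
before = x.data_ptr()
x = relu(x)                      # the name is rebound, but to the very same tensor
print(x.data_ptr() == before)    # True: same storage, so the operation was still in-place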

torch.sigmoid_ is an in-place operation;

torch.sigmoid is not.

You can check in PyTorch:

>>> a = torch.tensor(1.0)
>>> torch.sigmoid(a)
tensor(0.7311)
>>> print(a)
tensor(1.0)
>>> a = torch.tensor(1.0)
>>> torch.sigmoid_(a)
tensor(0.7311)
>>> print(a)
tensor(0.7311)

In the first case, a keeps its original value, while in the second case, a has been modified in place.


In the case of y = F.relu(x, inplace=True), it won't hurt anything if x only ever takes positive values in your computational graph. However, if some other node shares x as input and needs both its positive and negative values, then your network may malfunction.

For example, in the following situation:

y = F.relu(x, inplace=True)   # (1)
z = network(x)                # (2)

if (1) is declared and executed first, the value of x has already been changed, so z may not have the expected value.
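A runnable version of that scenario (network here is just a hypothetical stand-in for step (2)):

import torch
import torch.nn.functional as F

def network(t):                    # hypothetical stand-in for (2); it expects the original x
    return t * 2

x = torch.tensor([-2.0, -1.0, 3.0])
expected = network(x.clone())      # tensor([-4., -2., 6.]): what (2) should produce

y = F.relu(x, inplace=True)        # (1) runs first and overwrites x with [0., 0., 3.]
z = network(x)                     # (2) now sees the clobbered x
print(z)                           # tensor([0., 0., 6.]) -- not the expected value
print(expected)                    # tensor([-4., -2., 6.])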