I implemented a generative adversarial network using both nn.ReLU() and nn.ReLU(inplace=True). It seems that nn.ReLU(inplace=True) saved only a very small amount of memory.
What’s the purpose of using inplace=True?
Is the behavior different in backpropagation?
inplace=True means that the operation modifies its input tensor directly, without allocating any additional memory for the output. It can sometimes slightly decrease memory usage, but it may not always be a valid operation (because the original input is destroyed). However, if you don’t see an error, it means that your use case is valid.
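A minimal sketch of that valid/invalid distinction (the shapes and the tanh example here are just illustrative, not from the original post):

```python
import torch
import torch.nn.functional as F

# Valid: the pre-activation tensor (x * 2) is not needed by any other
# op's backward, so overwriting it in place is fine.
x = torch.randn(4, requires_grad=True)
y = F.relu(x * 2, inplace=True)
y.sum().backward()  # works

# Invalid: tanh saves its output for its own backward pass, and the
# in-place relu overwrites that saved tensor, so autograd refuses.
x2 = torch.randn(4, requires_grad=True)
t = torch.tanh(x2)
F.relu(t, inplace=True)
try:
    t.sum().backward()
except RuntimeError as err:
    print("autograd error:", err)  # "... modified by an inplace operation"
```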
Reassigning the result to the same name, e.g. x = F.relu(x), is not an in-place operation: you reuse the same variable name, but it’s not the same tensor underneath. The name x just gets pointed at a new tensor, and the old one is still in memory (because it’s referenced by the PyTorch graph).
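A quick way to see this (the variable names are just for illustration) is to compare the underlying storage pointers before and after the call:

```python
import torch
import torch.nn.functional as F

x = torch.randn(4, requires_grad=True)

h = x * 2                          # intermediate tensor, kept alive by the graph
before = h.data_ptr()
h = F.relu(h)                      # out-of-place: the name 'h' now points to a new tensor
print(h.data_ptr() == before)      # False -> new storage was allocated

h2 = x * 2
before2 = h2.data_ptr()
h2 = F.relu(h2, inplace=True)      # in-place: the same storage is reused
print(h2.data_ptr() == before2)    # True
```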
That’s a good question! I think this was an initial limitation based on the PyTorch (whitepaper) manuscript on arXiv. But based on what I’ve seen (e.g., by @soumith: https://gist.github.com/soumith/71995cecc5b99cda38106ad64503cee3), it seems that in-place ops like nn.ReLU(inplace=True) are supported in the autodiff engine now. Not sure, but I guess the same should be true for the functional one, as it references the nn one.
For ReLU, when the input is negative, both the gradient and the output are zero, so gradients stop propagating from there. The backward pass only needs to know where the output is positive, and that information survives the in-place write, so inplace doesn’t hurt anything while saving memory.
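A small check (with arbitrary values, not from the original post) that the in-place and out-of-place versions backpropagate the same gradients, and that the gradient is zero wherever the pre-activation was negative. Note the in-place relu is applied to an intermediate (x2 * 3) rather than to the leaf itself, since in-place ops on a leaf that requires grad are rejected:

```python
import torch
import torch.nn.functional as F

vals = torch.tensor([-2.0, -0.5, 0.5, 2.0])

# Out-of-place ReLU
x1 = vals.clone().requires_grad_(True)
F.relu(x1 * 3).sum().backward()

# In-place ReLU, applied to the intermediate x2 * 3
x2 = vals.clone().requires_grad_(True)
F.relu(x2 * 3, inplace=True).sum().backward()

print(x1.grad)                           # tensor([0., 0., 3., 3.])
print(x2.grad)                           # identical gradients
print(torch.allclose(x1.grad, x2.grad))  # True
```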
In the case of y = F.relu(x, inplace=True), it won’t hurt anything if x is always positive in your computational graph. However, if some other node shares x as an input and needs x with both its positive and negative values, then your network may malfunction.
For example, in the following situation,
y = F.relu(x, inplace=True) (1)
z = network(x) (2)
If (1) is declared first and executed first, the value of x is changed before (2) runs, so z may not have the expected value.
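A concrete version of that situation (the Linear layer here just stands in for network, and the values are arbitrary): after (1) runs, the negative entries of x are already gone, so (2) operates on the clipped values.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-1.0, 2.0, -3.0, 4.0])
net = torch.nn.Linear(4, 1)    # stand-in for 'network'

y = F.relu(x, inplace=True)    # (1) overwrites x: negative entries become 0
z = net(x)                     # (2) now sees [0., 2., 0., 4.],
                               #     not the original [-1., 2., -3., 4.]
print(x)                       # tensor([0., 2., 0., 4.])
```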