ELU + Residual Unit = Not InPlace Computation

I have a problem using ELU, SELU, Sigmoid, or Tanh with a residual connection (ReLU and PReLU work fine).
Here is the code:

import torch.nn as nn
import math
import torch.nn.init as init

# Residual block
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1):
        super(BasicBlock, self).__init__()
        self.conv1      = nn.Conv2d(inplanes, planes, kernel_size=3, stride=1,
                                                padding=1, bias=False)
        self.relu1      = nn.ELU(1.0, False)
        self.conv2      = nn.Conv2d(planes, inplanes, kernel_size=3, stride=1,
                                                padding=1, bias=False)
        self.relu2      = nn.ELU(1.0, False)

    def forward(self, x):
        residual = x
        out      = self.conv1(x)
        out      = self.relu1(out)
        out      = self.conv2(out)
        out      = self.relu2(out)

        out      += residual

        return out

if __name__ == "__main__":
    import torch
    from torch.autograd import Variable
    m    = BasicBlock(3,32).float()
    data = Variable(torch.Tensor(16,3,112,96).float())
    feat = m(data)
    loss = feat.sum()
    loss.backward()  # the traceback below comes from this backward call
    print(loss)


Traceback (most recent call last):
  File "grad_problem.py", line 42, in <module>
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 156, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

When I remove the line out += residual, everything works again. I can't figure out what is wrong; ELU, for example, is explicitly set to non-in-place computation.

Ubuntu 16.04
PyTorch 0.3.1

PS. I found that using torch.add instead of += makes it work again, so maybe the issue is in how the residual is added.
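For reference, the workaround can be sketched as below. This is a minimal illustration, not the original model: TanhBlock, inplace_add, and backward_succeeds are hypothetical names, the block is written against a recent PyTorch release (no Variable wrapper), and Tanh is used because it is one of the activations listed in the question. Which activations keep their output for the backward pass may differ between PyTorch versions.

```python
import torch
import torch.nn as nn


class TanhBlock(nn.Module):
    """Hypothetical residual block; inplace_add selects how the skip is added."""

    def __init__(self, channels, inplace_add):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3,
                              padding=1, bias=False)
        self.act = nn.Tanh()
        self.inplace_add = inplace_add

    def forward(self, x):
        out = self.act(self.conv(x))
        if self.inplace_add:
            out += x        # overwrites the tensor Tanh saved for its backward
        else:
            out = out + x   # same result as torch.add(out, x), in a new tensor
        return out


def backward_succeeds(block, x):
    """Return True if backward() runs, False if autograd raises RuntimeError."""
    try:
        block(x).sum().backward()
    except RuntimeError:
        return False
    return True


x = torch.randn(2, 3, 8, 8)
inplace_ok = backward_succeeds(TanhBlock(3, inplace_add=True), x)
outofplace_ok = backward_succeeds(TanhBlock(3, inplace_add=False), x)
print(inplace_ok, outofplace_ok)
```

The out-of-place form costs one extra tensor allocation per block, but it leaves the activation's output untouched for the backward pass.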

Indeed, += is an in-place operation; you can't use it with variables requiring gradients:

v1 = Variable(torch.rand(5,5), requires_grad=True)
v2 = Variable(torch.rand(5,5), requires_grad=True)
v1 += v2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: a leaf Variable that requires grad has been used in an in-place operation.
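The error above is the leaf-variable case. As a rough sketch, assuming a recent PyTorch release where tensors take requires_grad directly, the snippet below contrasts it with an in-place update on a non-leaf tensor, which autograd accepts as long as the overwritten value is not needed to compute a gradient. The names leaf_error, x, and y are illustrative only.

```python
import torch

# Leaf tensor that requires grad: the in-place update is rejected immediately.
v1 = torch.rand(5, 5, requires_grad=True)
v2 = torch.rand(5, 5, requires_grad=True)
try:
    v1 += v2
    leaf_error = None
except RuntimeError as err:
    leaf_error = str(err)

# Non-leaf tensor: the in-place update is allowed, and backward works here
# because the gradient of the multiply does not depend on the value of y.
x = torch.ones(3, requires_grad=True)
y = x * 2   # y is a non-leaf result
y += 1      # modifies y in place; x is untouched
y.sum().backward()

print(leaf_error is not None)
print(x.grad)
```

In the residual block from the question, out is a non-leaf tensor, so the += itself is accepted and the failure only shows up later, during backward.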

OK, I agree.
But I still don't understand why nn.ReLU and nn.PReLU work with the in-place operation. Is it because nn.Threshold doesn't need the in-place operation?