Why does relu(inplace=True) not give an error in the official resnet.py, but gives an error in my code?

The following is my code. If I remove inplace=True from the ReLU, I get no error. However, if I leave it in, I get: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation.

class BasicBlock(nn.Module):

    def __init__(self, inplanes, planes, use_prelu, use_se_block, anchor=None):
        super(BasicBlock, self).__init__()
        self.anchor = anchor
        self.conv1 = conv3x3(inplanes, planes, stride=1)
        nn.init.xavier_normal_(self.conv1.weight)
        self.conv2 = conv3x3(planes, planes)
        nn.init.normal_(self.conv2.weight, mean=0, std=0.01)

        self.use_prelu = use_prelu
        if self.use_prelu:
            self.prelu1 = nn.PReLU(planes)
            self.prelu2 = nn.PReLU(planes)
        else:
            self.relu = nn.ReLU(inplace=True)

        self.use_se_block = use_se_block
        if self.use_se_block:
            self.se = SEBlock(planes)

    def forward(self, x):
        if self.anchor is not None:
            x = self.anchor(x)

        residual = x

        x = self.conv1(x)
        if self.use_prelu:
            x = self.prelu1(x)
        else:
            x = self.relu(x)

        x = self.conv2(x)
        if self.use_prelu:
            x = self.prelu2(x)
        else:
            x = self.relu(x)

        if self.use_se_block:
            x = self.se(x)

        x += residual

        return x

I consulted the usage in https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py and cannot understand why there is no such error in resnet, while I encounter it in my code.

The usage in resnet.py also looks odd to me. The ReLU is constructed with inplace=True, yet it is still used as x = self.relu(x) in forward. Doesn’t the reassignment defeat the purpose of inplace=True? Is it equivalent to just writing self.relu(x) without the assignment?

Hi,

The difference is that in resnet there is a batchnorm between the conv and the relu.
The conv operation needs its output to be able to compute the backward pass, while the batchnorm operation does not need its output for its backward pass.
So an operation that comes just after a batchnorm is allowed to make changes in place, while an operation coming just after a conv is not.
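
Here is a minimal sketch of that rule (a rough illustration, assuming a reasonably recent PyTorch; exactly which tensors an op saves for backward can vary between versions). Sigmoid stands in as an op whose backward definitely needs its own output, and BatchNorm2d as one that does not:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(2, 3, 8, 8, requires_grad=True)

# In-place op right after batchnorm: allowed, because batchnorm's
# backward does not need its output tensor.
bn = nn.BatchNorm2d(3)
out = bn(x)
out = F.relu(out, inplace=True)   # overwrites batchnorm's output in place
out.sum().backward()              # backward runs fine

# In-place op right after an op that saved its output (sigmoid here):
# not allowed, because the saved output has been overwritten.
y = torch.sigmoid(x)
y.relu_()                         # overwrites sigmoid's saved output in place
try:
    y.sum().backward()
except RuntimeError as err:
    print(err)  # "... has been modified by an inplace operation"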

When you do x = self.relu(x), you simply bind the Python variable x to the tensor returned by the self.relu operation. It just happens that, sometimes, this returned tensor is the same as the input one (when the inplace option is enabled).
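
For example, a quick check (a sketch; nn.ReLU just dispatches to torch.relu_ or torch.relu under the hood):

import torch
import torch.nn as nn

relu_inplace = nn.ReLU(inplace=True)
relu_copy = nn.ReLU(inplace=False)

a = torch.randn(4)
b = relu_inplace(a)
print(b is a)                        # True: the very same tensor comes back
print(b.data_ptr() == a.data_ptr())  # True: same underlying memory

c = torch.randn(4)
d = relu_copy(c)
print(d is c)                        # False: a new tensor was allocated
print(d.data_ptr() == c.data_ptr())  # False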


Correct me if I am wrong: is writing self.relu(x) when the ReLU is in-place the same as writing x = self.relu(x), so that in the resnet code the assignment is unnecessary? That’s my understanding of inplace.

inplace means that the operation will not allocate new memory and will modify the tensor in place. But from the autograd point of view, you still have two different tensors (even though they actually share the same memory): one is the output of the conv (or of the batchnorm, in resnet) and the other is the output of the relu.
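
You can see both halves of that by inspecting grad_fn and data_ptr() before and after the in-place relu (a sketch; the exact grad_fn names depend on the PyTorch version):

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)
x = torch.randn(2, 3, 8, 8, requires_grad=True)

out = bn(x)
print(out.grad_fn)                 # a BatchNormBackward-style node
print(out.data_ptr())

out = torch.relu_(out)             # in place: no new memory is allocated
print(out.grad_fn)                 # now a ReluBackward-style node
print(out.grad_fn.next_functions)  # the batchnorm node is still in the graph,
                                   # feeding the relu node
print(out.data_ptr())              # same pointer as before the relu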


I am still confused: if I write x = self.relu(x), am I not also avoiding a new allocation, since I replace the original x?

No, you are allocating new memory, because the relu operation creates a new tensor, fills it with the result of the operation, and then that new tensor is bound to the Python variable x.
If x is a tensor, y = x does not make any copy; it just makes both Python variables refer to the same tensor.
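
Concretely, a small sketch of the difference between rebinding a name and allocating a new tensor:

import torch
import torch.nn as nn

relu = nn.ReLU()                       # inplace=False, the default
x = torch.randn(4)
original_ptr = x.data_ptr()

y = x                                  # plain assignment: no copy at all
print(y.data_ptr() == original_ptr)    # True: y and x are the same tensor

x = relu(x)                            # relu allocates a fresh tensor...
print(x.data_ptr() == original_ptr)    # False: x is now bound to new memory
print(y.data_ptr() == original_ptr)    # True: the old tensor still exists via y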
