Hi,
The difference is that in the ResNet
there is a batchnorm between the conv and the relu.
The conv operation needs its output to be able to compute the backward pass. The batchnorm operation does not need its output to compute the backward pass.
So the operation that comes just after a batchnorm is allowed to make changes inplace, while an operation coming just after a conv is not.
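As a small sketch of the general rule: autograd raises an error at backward time if a tensor that was saved for the backward pass has been modified inplace. Here `sigmoid` (whose backward needs its own output) stands in for any op that saves its output — it is just an illustration, not the exact ResNet layer from the question:

```python
import torch

x = torch.randn(4, requires_grad=True)
y = torch.sigmoid(x)  # sigmoid saves its output y for the backward pass
y.add_(1)             # inplace modification of that saved tensor

failed = False
try:
    y.sum().backward()
except RuntimeError:
    # autograd detects the inplace change and refuses to backprop
    failed = True
print(failed)  # True
```

If you remove the `y.add_(1)` line, the backward pass runs without complaint, because nothing that autograd saved was touched.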
When you do x = self.relu(x)
you basically assign to the Python variable x
the tensor returned by the self.relu
operation. It happens that sometimes this tensor is the same object as the input one (if you have the inplace option enabled).
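You can check this identity directly — with `inplace=True`, `nn.ReLU` returns the very same tensor object it was given, while the default returns a new one:

```python
import torch
import torch.nn as nn

relu_inplace = nn.ReLU(inplace=True)
relu_default = nn.ReLU(inplace=False)

t1 = torch.randn(4)
out1 = relu_inplace(t1)
print(out1 is t1)  # True: same tensor object, modified in place

t2 = torch.randn(4)
out2 = relu_default(t2)
print(out2 is t2)  # False: a freshly allocated tensor
```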