I know that `detach()` is used for detaching a variable from the computational graph. In that context, are the expressions `x = x - torch.mean(x, dim=0).detach()` and `x = x - torch.mean(x, dim=0)` equivalent? I just want to subtract the mean out and don't want to pass gradients through the average calculation.

No. The forward values are the same, but in the second case the mean stays in the graph, so you'll have an additional shared term that gradients flow through.

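E.g. for a two-element vector, you'll have xout[0] = x[0] - (x[0] + x[1])/2, so the gradient of xout[0] will also reach x[1]. With `detach()`, the mean is treated as a constant and the gradient of xout[0] reaches only x[0].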
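
To check this yourself, here is a minimal sketch of the two-element case. The helper name `grad_of_first_output` and the input values `[1.0, 3.0]` are made up for illustration; it backpropagates from xout[0] only and returns the gradient that lands on x.

```python
import torch


def grad_of_first_output(use_detach: bool) -> torch.Tensor:
    """Return d xout[0] / d x for a two-element x, with or without detach()."""
    # Illustrative input values, not taken from the original question.
    x = torch.tensor([1.0, 3.0], requires_grad=True)
    mean = torch.mean(x, dim=0)
    if use_detach:
        # Treat the mean as a constant: no gradient flows through it.
        mean = mean.detach()
    xout = x - mean
    # Backpropagate from the first output element only.
    xout[0].backward()
    return x.grad


grad_detached = grad_of_first_output(use_detach=True)
grad_attached = grad_of_first_output(use_detach=False)

print(grad_detached)  # tensor([1., 0.])
print(grad_attached)  # tensor([ 0.5000, -0.5000])
```

With `detach()` the gradient for x[1] is zero, while without it x[1] receives -0.5 from the shared mean term. So if you really don't want gradients through the average, the `detach()` version is the one that does that.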