Usage of register_buffer gives RuntimeError

Hello,
I know that the register_buffer method creates a variable that won't be updated during backpropagation, so it shouldn't be a problem to apply an in-place operation on that variable, right?
But when I did that, I got this error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 5]] is at version 3; expected version 2 instead.
I was using a custom layer:

import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyLinear(nn.Linear):

    def __init__(self, in_features, out_features, sigma_init=0.017, bias=True):
        super(NoisyLinear, self).__init__(in_features, out_features, bias=bias)
        self.sigma_weight = nn.Parameter(torch.full((out_features, in_features), sigma_init))
        self.register_buffer("epsilon_weight", torch.zeros(out_features, in_features))
        if bias:
            self.sigma_bias = nn.Parameter(torch.full((out_features,), sigma_init))
            self.register_buffer("epsilon_bias", torch.zeros(out_features))
        self.reset_parameters()

    def reset_parameters(self):
        std = math.sqrt(3 / self.in_features)
        self.weight.data.uniform_(-std, std)
        self.bias.data.uniform_(-std, std)

    def forward(self, input):
        self.epsilon_weight.normal_()  # in-place resampling of the registered buffer (the line that triggers the error)
        bias = self.bias
        if bias is not None:
            self.epsilon_bias.normal_()
            bias = bias + self.sigma_bias * self.epsilon_bias
        return F.linear(input, self.weight + self.sigma_weight * self.epsilon_weight, bias)

So in the forward method I wanted to change epsilon_weight in-place using the normal_ method, but it gave me the error mentioned above.
But when I used torch.normal like this:

self.epsilon_weight = torch.normal(self.epsilon_weight)

it worked fine. So why?

Hi,

When you do self.sigma_weight * self.epsilon_weight, then to compute the gradient of self.sigma_weight, which is your Parameter, autograd needs the value of self.epsilon_weight.
So if you modify it in-place after it was used in this computation, you will see this error.
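
Here is a minimal repro of that mechanism, with plain tensors standing in for your sigma_weight and epsilon_weight (the names and the 3-element size are just placeholders):

import torch

sigma = torch.ones(3, requires_grad=True)  # plays the role of the sigma_weight Parameter
eps = torch.zeros(3)                       # plays the role of the epsilon_weight buffer

out = (sigma * eps).sum()  # the mul saves eps, since it is needed for d(out)/d(sigma)
eps.normal_()              # in-place: bumps the version counter of the saved tensor

try:
    out.backward()
except RuntimeError as e:
    print(e)  # "... modified by an inplace operation ... is at version 1; expected version 0 instead"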

The out-of-place version creates a new Tensor containing the new value, so the original value that was used in the forward is still there and the backward can work fine.
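
And the out-of-place version of the same repro: the name is rebound to a brand-new tensor, so the tensor that was saved for backward stays untouched:

import torch

sigma = torch.ones(3, requires_grad=True)
eps = torch.zeros(3)

out = (sigma * eps).sum()
eps = torch.normal(eps)  # out of place: returns a new tensor, the saved one keeps its version
out.backward()           # fine, the gradient uses the value eps had during the forward (here all zeros)
print(sigma.grad)

In your layer, self.epsilon_weight = torch.normal(self.epsilon_weight) behaves the same way, and as far as I can tell assigning a plain tensor to a name that is already registered as a buffer keeps it in the module's buffers, so state_dict() and .to(device) still pick it up.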