I am trying to manipulate the running mean of a batch-norm layer in a neural network as follows:
new_running_mean = running_mean * x + b
Here x and b are both trainable parameters, defined as:
nn.Parameter(x, requires_grad = True)
I understand that the batch-norm running mean is a non-trainable buffer, so setting requires_grad = True gives me this error:
the derivative for 'running_mean' is not implemented
However, I want x and b to remain trainable, so setting requires_grad = False would directly contradict that.
How can I work around this issue?
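For reference, here is a minimal sketch that reproduces the error (the feature size and the use of BatchNorm1d are illustrative assumptions, not the actual network):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)  # running_mean is a registered buffer, not a Parameter

# Trainable scale and shift for the running mean, as described above
x = nn.Parameter(torch.ones(4), requires_grad=True)
b = nn.Parameter(torch.zeros(4), requires_grad=True)

# Replacing the buffer with a tensor that tracks gradients...
bn.running_mean = bn.running_mean * x + b

# ...makes the in-place running-stats update inside the forward pass fail
err = None
try:
    bn(torch.randn(8, 4))  # training mode, so BatchNorm tries to update running_mean
except RuntimeError as e:
    err = e
print(err)  # RuntimeError complaining about the running_mean derivative
```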