Hello,
I am trying to manipulate the running mean of a batchnorm layer in a NN as follows:

`new_running_mean = running_mean * x + b`

where `x` and `b` are both trainable parameters defined as:

`nn.Parameter(x, requires_grad=True)`
I understand that the batchnorm running mean is non-trainable, so the `requires_grad=True` flag gives me an error: *the derivative for 'running_mean' is not implemented*. However, I want `x` and `b` to be trainable parameters, and setting `requires_grad=False` would directly contradict that.
How can I work around this issue?
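One workaround I have been considering is to leave the `running_mean` buffer itself non-trainable and instead apply the `running_mean * x + b` transform inside `forward`, normalizing manually at eval time so gradients flow to `x` and `b` but never through the buffer. A minimal sketch (the wrapper class and the manual normalization are my own, not part of PyTorch):

```python
import torch
import torch.nn as nn

class BNWithTrainableMeanShift(nn.Module):
    """BatchNorm1d whose effective running mean at eval time is
    running_mean * x + b, with x and b trainable.
    The running_mean buffer itself stays non-trainable."""
    def __init__(self, num_features):
        super().__init__()
        self.bn = nn.BatchNorm1d(num_features)
        # trainable scale and shift applied to the running mean
        self.x = nn.Parameter(torch.ones(num_features))
        self.b = nn.Parameter(torch.zeros(num_features))

    def forward(self, inp):
        if self.training:
            # let batchnorm update its buffers normally (no grad needed there)
            return self.bn(inp)
        # eval mode: normalize manually so gradients reach x and b
        # without requiring grad on the running_mean buffer
        mean = self.bn.running_mean * self.x + self.b
        var = self.bn.running_var
        out = (inp - mean) / torch.sqrt(var + self.bn.eps)
        return out * self.bn.weight + self.bn.bias
```

Would something along these lines be the right approach, or is there a cleaner way?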
Thanks