Batch_norm the derivative for 'running_mean' is not implemented

I’m trying to reproduce the Wide Residual Network 28-2 for a semi-supervised learning article I’m writing, but I’m having trouble using batch_norm.

I keep getting this error:
File "C:\Anaconda3\lib\site-packages\torch\nn\", line 1708, in batch_norm
training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: the derivative for 'running_mean' is not implemented

Currently I’m using it like this:

where weight, bias, running_mean, and running_var have all been instantiated as:
nn.Parameter((torch.rand(16) - 0.5) * 1e-1)

Is batch_norm currently not working in training mode, or am I just doing something wrong here?

Hi tueboesen,
running_mean and running_var should be registered as buffers, not parameters. These tensors are updated in place inside the function call, and gradients are not available for them; hence the error.
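To illustrate, here is a minimal sketch of a module that calls F.batch_norm the way described above: the affine weight and bias stay learnable parameters, while the running statistics are registered as buffers so autograd never tries to differentiate through their in-place update (the module and variable names are just for this example):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BNLayer(nn.Module):
    def __init__(self, num_features=16):
        super().__init__()
        # Learnable affine parameters: gradients are fine here.
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        # Running statistics must be buffers, not nn.Parameter:
        # batch_norm updates them in place during training, and
        # autograd cannot differentiate through that update.
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        return F.batch_norm(
            x, self.running_mean, self.running_var,
            self.weight, self.bias,
            training=self.training, momentum=0.1, eps=1e-5,
        )

layer = BNLayer()
x = torch.randn(8, 16, requires_grad=True)
out = layer(x)        # no RuntimeError: the buffers carry no grad
out.sum().backward()  # gradients flow to x, weight, and bias only
```

Had running_mean been created with nn.Parameter instead, the same forward call in training mode would raise the RuntimeError from the question.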


Okay, thank you for clarifying.

If there is no plan to implement that at some point, I would suggest making the error message more informative, because right now it just sounds like batch_norm isn’t working.