Use and Abuse of .register_buffer()


I have some trouble understanding the use of register_buffer().
I found just a little bit of explanation in the docs, mentioning “running_mean” in BatchNorm.

My questions are:

  1. When should I register a buffer? For which kinds of Variables is it appropriate, and for which not?
  2. Could someone provide me with a simple example and code snippet of using register_buffer()?
  3. At the moment, I’m running some tests on an implementation of a custom gradient, which I subsequently modify. I record the gradients before and after modification in two separate lists. Is that something I should consider register_buffer for, to make the code cleaner? I guess not, if the buffer only holds one state at a time…

Any help much appreciated.
Many thanks,


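You use register_buffer when you want a tensor to be part of the module's state (saved in state_dict and moved along with .to() / .cuda()) without it being a parameter, i.e. it is not returned by parameters() and not updated by the optimizer.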
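Since a simple example was asked for, here is a minimal sketch in the spirit of BatchNorm's running_mean. The module name RunningMean, its scale parameter, and the momentum value are made up for illustration and are not taken from any library code.

```python
import torch
import torch.nn as nn


class RunningMean(nn.Module):
    """Hypothetical module that tracks a moving average of its inputs."""

    def __init__(self, num_features, momentum=0.1):
        super().__init__()
        self.momentum = momentum
        # Learnable: returned by parameters() and updated by the optimizer.
        self.scale = nn.Parameter(torch.ones(num_features))
        # Not learnable, but still module state: it is saved in state_dict()
        # and moved together with the module by .to() / .cuda().
        self.register_buffer("running_mean", torch.zeros(num_features))

    def forward(self, x):
        if self.training:
            with torch.no_grad():
                batch_mean = x.mean(dim=0)
                # Update in place so the registered tensor itself is modified.
                self.running_mean.mul_(1 - self.momentum).add_(self.momentum * batch_mean)
        return (x - self.running_mean) * self.scale


m = RunningMean(3)
m(torch.ones(4, 3))  # one forward pass in training mode updates the buffer

param_names = [name for name, _ in m.named_parameters()]
buffer_names = [name for name, _ in m.named_buffers()]
state_keys = list(m.state_dict().keys())

print(param_names)   # ['scale']
print(buffer_names)  # ['running_mean']
print(state_keys)    # contains both 'scale' and 'running_mean'
```

So the buffer shows up in the state_dict but not among the parameters. For the two lists of recorded gradients, plain Python lists are probably the simpler choice, since a buffer holds a single tensor.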


Hi, when using DataParallel to train a model in multi-GPU mode, BN is computed per device. How are the running mean and running variance estimated? Also per device, or is there one master with multiple replicas? Thanks.

A question about register_buffer: can we delete a buffer after registering it, and if so, how?
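As far as I can tell, a plain del on the attribute is enough, because nn.Module's attribute deletion also checks the registered buffers. A minimal sketch, where the WithBuffer module and the buffer name mask are hypothetical:

```python
import torch
import torch.nn as nn


class WithBuffer(nn.Module):  # hypothetical example module
    def __init__(self):
        super().__init__()
        self.register_buffer("mask", torch.ones(4))


m = WithBuffer()
had_mask = "mask" in m.state_dict()

# Deleting the attribute unregisters the buffer from the module.
del m.mask

has_mask_after = "mask" in m.state_dict()
has_attr_after = hasattr(m, "mask")

print(had_mask, has_mask_after, has_attr_after)  # True False False
```

After the del, the name is free again, so it can be re-registered as a buffer or assigned as an ordinary attribute.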


I have the same question.