I have some buffers defined using self.register_buffer('x', ...), and I need them not to be broadcast. However, I do not want to set broadcast_buffers=False, because I want the other behaviour to stay the same (e.g. the running stats in BN still being broadcast).
I want to accumulate results separately in each process over n forward passes, then aggregate them across processes, do something with the aggregate, and repeat.
Let's say I store the result in self.x. If self.x is broadcast at the end of a forward pass, then in the second pass, when it is broadcast again, there will be double counting.
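To make the double counting concrete, here is a small plain-Python sketch. It only simulates the processes with lists and does not use torch.distributed at all. It also assumes that "broadcast" here means every process ends up with the cross-process sum. The function names naive_totals and windowed_totals are made up for illustration.

```python
def naive_totals(per_rank_values):
    """Sync the running total after every pass and write the sum back into x."""
    world_size = len(per_rank_values)
    num_steps = len(per_rank_values[0])
    x = [0] * world_size  # one simulated copy of self.x per process
    for step in range(num_steps):
        for rank in range(world_size):
            x[rank] += per_rank_values[rank][step]  # local accumulation
        total = sum(x)            # aggregate across the simulated processes
        x = [total] * world_size  # every process now holds the global sum
    return x[0]


def windowed_totals(per_rank_values, n):
    """Accumulate locally, aggregate only every n passes, then reset."""
    world_size = len(per_rank_values)
    num_steps = len(per_rank_values[0])
    local = [0] * world_size
    results = []
    for step in range(num_steps):
        for rank in range(world_size):
            local[rank] += per_rank_values[rank][step]
        if (step + 1) % n == 0:
            results.append(sum(local))  # one aggregate per window
            local = [0] * world_size    # reset so nothing is counted twice
    return results


# Two simulated processes, two forward passes each (made-up numbers).
values = [[1, 2], [10, 20]]
print(naive_totals(values))          # 44, although the true sum is 33
print(windowed_totals(values, n=2))  # [33]
```

In the naive version, the sum from the first pass is already sitting in every process's copy when the second sync happens, so it gets added once per process. What I am after is the second behaviour, but with x kept as a registered buffer.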