What is the benefit of using register_buffer compared to assigning a normal tensor in __init__?
class MyModule(nn.Module):
    def __init__(self, child):
        super().__init__()
        self.child = torch.as_tensor(child).int()
        # vs
        self.register_buffer('child', torch.from_numpy(np.array(child, dtype=np.int32)))
- The buffer is serialized along with the module's state_dict, but if we initialize the tensor in __init__ anyway, that alone would not matter.
- A buffer is also moved to CUDA when the model is pushed to CUDA, but we could check the model's device ourselves and move the tensor accordingly.
So what is the special case that only a buffer can handle?
It would also be nice to have a register_buffer variant that does not serialize the tensor and only moves it to CUDA with model.cuda().
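As far as I know, recent PyTorch (1.6+) actually supports this via the persistent argument of register_buffer: a non-persistent buffer still follows model.cuda()/model.to(...) but is excluded from state_dict(). A small sketch (the module and buffer names are made up for illustration):

```python
import torch
import torch.nn as nn

class NonPersistentExample(nn.Module):
    def __init__(self):
        super().__init__()
        # persistent=False: moved with the module, but NOT saved in state_dict()
        self.register_buffer('scale', torch.ones(3), persistent=False)

m = NonPersistentExample()
print('scale' in m.state_dict())  # False: excluded from serialization
print(m.scale.shape)              # the buffer itself is still usable
```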