Are tensors stored in nn.Module.register_buffer trainable?

I’m trying to implement a trainable hyperparameter as a member of an nn.Module subclass.

I want it to be moved to the same device as the rest of the module by child_module.to(my_device).

I have done this with

    self.log_alpha = torch.tensor(0., requires_grad=True)

and then overridden child_module.to() to move it manually after calling the parent method, like so:

    def to(self, *args, **kwargs):
        super(Trainer, self).to(*args, **kwargs)
        device, _, _ = torch._C._nn._parse_to(*args, **kwargs)
        # Tensor.to() is not in-place; swap the underlying data so the
        # tensor object (and anything referencing it) stays intact
        self.log_alpha.data = self.log_alpha.data.to(device)
        self.conf.trainer_device = device
        return self  # nn.Module.to() returns self, so keep that contract

This little hack works just fine, but I read that the correct way to do this is with register_buffer, like so:

    self.register_buffer('log_alpha', torch.tensor(0., requires_grad=True))

However, when I do it this way, log_alpha doesn’t get updated by its optimizer, whereas it does with my old hack.
Am I using register_buffer() incorrectly, or is it just not meant to be used with trainable parameters?
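
For what it’s worth, a quick check on a toy module (the Toy class here is just for illustration) shows that the buffer never appears in .parameters(), which is all the optimizer ever looks at:

    import torch
    import torch.nn as nn

    class Toy(nn.Module):
        def __init__(self):
            super().__init__()
            # registered as a buffer: moved by .to() and saved in the state_dict,
            # but not returned by .parameters()
            self.register_buffer('log_alpha', torch.tensor(0., requires_grad=True))

    toy = Toy()
    print([n for n, _ in toy.named_parameters()])  # [] -> nothing for the optimizer
    print([n for n, _ in toy.named_buffers()])     # ['log_alpha']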

If self.log_alpha is trainable (i.e. it requires gradients), you should wrap it in an nn.Parameter. Parameters are returned by module.parameters(), so the optimizer will update them, and they are also moved to the right device by module.to(). Buffers are part of the module’s state (they move with .to() and are saved in the state_dict), but they are not handed to the optimizer.
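
A minimal sketch of that approach, reusing the Trainer module from above (the Adam optimizer and learning rate are just examples):

    import torch
    import torch.nn as nn

    class Trainer(nn.Module):
        def __init__(self):
            super().__init__()
            # nn.Parameter is registered automatically: it shows up in
            # .parameters() and is moved together with the module by .to()
            self.log_alpha = nn.Parameter(torch.tensor(0.))

    trainer = Trainer().to('cuda' if torch.cuda.is_available() else 'cpu')  # log_alpha moves too
    optimizer = torch.optim.Adam(trainer.parameters(), lr=1e-3)             # log_alpha is optimized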
