I’m trying to implement a trainable hyperparameter as a member of an `nn.Module` subclass, and I want it to be moved to the same device as the rest of the module when `.to()` is called.
I have done this by creating the tensor in `__init__`:

```python
self.log_alpha = torch.tensor(0., requires_grad=True)
```

and overriding the child module's `to()` to move it manually, calling the parent method like so:
```python
def to(self, *args, **kwargs):
    super(Trainer, self).to(*args, **kwargs)
    # _parse_to is a private helper; how many values it returns varies by PyTorch version
    device, *_ = torch._C._nn._parse_to(*args, **kwargs)
    self.log_alpha = self.log_alpha.to(device)  # .to() is not in-place, so reassign
    self.conf.trainer_device = device           # my own bookkeeping
    return self                                 # Module.to() returns self, so keep that contract
```
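For context on why the override is needed at all: a plain tensor attribute is invisible to `Module.to()`, so without it `log_alpha` just gets left behind on the CPU. A minimal sketch of that (the `Plain` class is made up for illustration, and it assumes a CUDA device is available):

```python
import torch
import torch.nn as nn

class Plain(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(2, 2)                           # a real parameter, for comparison
        self.log_alpha = torch.tensor(0., requires_grad=True)  # plain tensor attribute

m = Plain().to('cuda')
print(m.layer.weight.device)  # cuda:0 -- registered parameters are moved
print(m.log_alpha.device)     # cpu    -- the plain attribute is not
```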
This little hack works just fine, but I read that the correct way to do this is with `register_buffer()`:

```python
self.register_buffer('log_alpha', torch.tensor(0., requires_grad=True))
```
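Registered like that, the buffer does move with the module, which is the part that works as advertised. A minimal sketch (`Demo` is a made-up stand-in for my actual class, again assuming a CUDA device):

```python
import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer('log_alpha', torch.tensor(0., requires_grad=True))

m = Demo().to('cuda')
print(m.log_alpha.device)             # cuda:0 -- moved together with the module
print('log_alpha' in m.state_dict())  # True   -- buffers are also saved/loaded
```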
However, doing it this way, `log_alpha` doesn't get updated by its optimizer, but it does with my old hack.
Am I using `register_buffer()` incorrectly, or is it simply not meant to be used with trainable parameters?
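For reference, here is a stripped-down repro of the behaviour I'm describing, reusing the `Demo` class from above (the optimizer and loss are simplified stand-ins for my real setup, and it assumes a CUDA device):

```python
m = Demo()
opt = torch.optim.SGD([m.log_alpha], lr=0.1)  # dedicated optimizer, as in my real code
m.to('cuda')                                  # device move happens after the optimizer is built

loss = (m.log_alpha - 1.0) ** 2
loss.backward()
opt.step()
print(m.log_alpha)  # still 0. for me, even though opt.step() ran
```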