I’m trying to implement a trainable hyperparameter as a member of an `nn.Module` subclass, and I want it to be moved to the same device as the rest of the module when `.to()` is called.
I have done this by creating the tensor in `__init__`:

```python
self.log_alpha = torch.tensor(0., requires_grad=True)
```

and overriding the child module's `to()` to move it manually, calling the parent method like so:
```python
def to(self, *args, **kwargs):
    super(Trainer, self).to(*args, **kwargs)
    # _parse_to is a private helper; how many values it returns varies by PyTorch version
    device, *_ = torch._C._nn._parse_to(*args, **kwargs)
    self.log_alpha = self.log_alpha.to(device)  # .to() is not in-place, so reassign
    self.conf.trainer_device = device           # my own bookkeeping
    return self                                 # Module.to() returns self, so keep that contract
```
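For context on why the override is needed at all: a plain tensor attribute is invisible to `Module.to()`, so without it `log_alpha` just gets left behind on the CPU. A minimal sketch of that (the `Plain` class is made up for illustration, and it assumes a CUDA device is available):

```python
import torch
import torch.nn as nn

class Plain(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(2, 2)                           # a real parameter, for comparison
        self.log_alpha = torch.tensor(0., requires_grad=True)  # plain tensor attribute

m = Plain().to('cuda')
print(m.layer.weight.device)  # cuda:0 -- registered parameters are moved
print(m.log_alpha.device)     # cpu    -- the plain attribute is not
```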
This little hack works just fine, but I read that the correct way to do this is with `register_buffer()`:

```python
self.register_buffer('log_alpha', torch.tensor(0., requires_grad=True))
```
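Registered like that, the buffer does move with the module, which is the part that works as advertised. A minimal sketch (`Demo` is a made-up stand-in for my actual class, again assuming a CUDA device):

```python
import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer('log_alpha', torch.tensor(0., requires_grad=True))

m = Demo().to('cuda')
print(m.log_alpha.device)             # cuda:0 -- moved together with the module
print('log_alpha' in m.state_dict())  # True   -- buffers are also saved/loaded
```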
However, doing it this way, `log_alpha` doesn't get updated by its optimizer, but it does with my old hack.
Am I using `register_buffer()` incorrectly, or is it simply not meant to be used with trainable parameters?
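For reference, here is a stripped-down repro of the behaviour I'm describing, reusing the `Demo` class from above (the optimizer and loss are simplified stand-ins for my real setup, and it assumes a CUDA device):

```python
m = Demo()
opt = torch.optim.SGD([m.log_alpha], lr=0.1)  # dedicated optimizer, as in my real code
m.to('cuda')                                  # device move happens after the optimizer is built

loss = (m.log_alpha - 1.0) ** 2
loss.backward()
opt.step()
print(m.log_alpha)  # still 0. for me, even though opt.step() ran
```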