Hi everyone,
I would like to restrict my optimisation to a subset of the trainable parameters and also impose a relation among them, in order to reduce the number of degrees of freedom (DoF) of my transform.
I have searched for “alias” and “parameter sharing”, but nothing seems to fit.
For example, using STN (spatial transformer network) components, I would like to learn just 3 variables (e.g. 2 offsets and 1 scale) instead of the full 6 affine parameters. I thought this could be done by building the parameter matrix by hand.
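Concretely, with one shared scale $s$ and two offsets $t_x, t_y$ (symbols chosen here for illustration only), and assuming no rotation or shear, the constrained matrix handed to the grid generator would have this form:

```latex
\theta =
\begin{pmatrix}
s & 0 & t_x \\
0 & s & t_y
\end{pmatrix}
```

The two zeros and the repeated $s$ are the relation being imposed on the parameters.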
To do so, in my model init I define:
    def __init__(self):
        # [...]
        self.mytensor = torch.rand(N, C, H, W).double().cuda()  # learn this
        self.offval = torch.randn(N, 2).double().cuda()  # learn this (2 offsets per sample)
        self.magfactor = torch.ones(1).double().cuda()  # learn this (single shared scale)
        self.mytensor.requires_grad = True
        self.magfactor.requires_grad = True
        self.offval.requires_grad = True
        # [...]
In my model's forward, I compose the temporary tensor from the current learned values:
    def forward(self, index):
        with torch.no_grad():
            # temporary affine transform matrix
            cm = torch.zeros(1, 2, 3, dtype=torch.double, device=self.mytensor.device, requires_grad=True)
            cm[0, 0, 0], cm[0, 1, 1] = self.magfactor[0], self.magfactor[0]
            cm[0, :, 2] = self.offval[index]
            cm.requires_grad = True
        # affine transform matrix is ready for actual forward pass
        grid = torch.nn.functional.affine_grid(cm, self.mytensor.size())
        y = torch.nn.functional.grid_sample(self.mytensor, grid)
        # some operations on y
        retval = y**2
        return retval
In my main script:
    model = myModel()
    modelparams = [model.mytensor, model.magfactor, model.offval]
    optim = torch.optim.Adam(modelparams, amsgrad=False)
    loss = #...
    [...]
    loss.backward()
    print(model.offval.grad, model.magfactor.grad)  # prints None None
But the gradients of those two parameters are None. If, instead of building a temporary tensor, I use a complete theta tensor (as in a plain STN), everything works fine.
I thought this was a legitimate alias, but it seems it is not.
So how can I optimise just model.offval and model.magfactor? Thank you in advance.
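One possible direction, offered as a sketch rather than a confirmed fix: the writes into cm happen inside torch.no_grad(), so autograd does not record them and cm ends up as a fresh leaf with no link back to the learnable tensors. Composing the matrix with ordinary differentiable ops (torch.cat and torch.stack here), outside no_grad, keeps both tensors in the graph. The class name ConstrainedAffine, the theta helper, the CPU float tensors, the small shapes, align_corners=False, and sampling one image per call are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConstrainedAffine(nn.Module):
    """Hypothetical module: one shared scale plus one (tx, ty) offset per image."""

    def __init__(self, n=4, c=1, h=8, w=8):
        super().__init__()
        # nn.Parameter registers each tensor, so model.parameters() finds it.
        self.mytensor = nn.Parameter(torch.rand(n, c, h, w))
        self.offval = nn.Parameter(0.1 * torch.randn(n, 2))
        self.magfactor = nn.Parameter(torch.ones(1))

    def theta(self, index):
        # Build [[s, 0, tx], [0, s, ty]] from differentiable ops only:
        # no torch.no_grad() and no in-place writes into a new leaf tensor.
        s = self.magfactor           # shape (1,)
        zero = torch.zeros_like(s)   # constant entries, no gradient needed
        t = self.offval[index]       # shape (2,)
        row0 = torch.cat([s, zero, t[0:1]])
        row1 = torch.cat([zero, s, t[1:2]])
        return torch.stack([row0, row1]).unsqueeze(0)  # shape (1, 2, 3)

    def forward(self, index):
        # Assumption: transform a single image so theta's batch size matches.
        img = self.mytensor[index:index + 1]           # shape (1, C, H, W)
        grid = F.affine_grid(self.theta(index), img.size(), align_corners=False)
        y = F.grid_sample(img, grid, align_corners=False)
        return y ** 2


torch.manual_seed(0)
model = ConstrainedAffine()
optim = torch.optim.Adam(model.parameters(), lr=1e-2)

loss = model(0).sum()  # placeholder loss, only to exercise backward()
optim.zero_grad()
loss.backward()
print(model.offval.grad is not None, model.magfactor.grad is not None)
optim.step()
```

With this layout the six matrix entries are always derived from the three learned quantities, so the constraint holds by construction and the optimizer only ever sees mytensor, offval and magfactor.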