Optimisation on aliased parameters

Hi everyone,
I would like to constrain my optimisation to a subset of the training parameters, also imposing a relation among them, in order to reduce the number of DoF of my transform.
I have searched for “alias” and “parameter sharing”, but nothing seems to fit.

For example, using STN components I would like to learn just 3 variables (e.g. 2 offsets and 1 scale) instead of the full 6 parameters. I thought it could be done by building the parameter matrix by hand.
To do so, in my model's __init__:

def __init__(self):
  # [...]
  self.mytensor = torch.rand(N, C, H, W).double().cuda() # learn this
  self.offval = torch.randn(N, 2).double().cuda()        # learn this (2 offsets per sample)
  self.magfactor = torch.ones(1).double().cuda()         # learn this (isotropic scale)
  self.mytensor.requires_grad = True
  self.magfactor.requires_grad = True
  self.offval.requires_grad = True
  # [...]
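
As an aside, a more idiomatic variant would be to register the three tensors as nn.Parameter, so they keep requires_grad and appear in model.parameters(). This is only a sketch, assuming myModel subclasses torch.nn.Module and with N, C, H, W as above:

  # inside __init__
  self.mytensor = torch.nn.Parameter(torch.rand(N, C, H, W, dtype=torch.double))
  self.offval = torch.nn.Parameter(torch.randn(N, 2, dtype=torch.double))
  self.magfactor = torch.nn.Parameter(torch.ones(1, dtype=torch.double))
  # move the whole module to the GPU afterwards with model.cuda()

The optimiser could then simply be built from torch.optim.Adam(model.parameters(), amsgrad=False).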

In my model's forward I then compose the temporary affine matrix from the current values of those learned parameters:

def forward(self, index):
  with torch.no_grad():
    # temporary affine transform matrix
    cm = torch.zeros(1,2,3, dtype=torch.double, device=self.mytensor.device, requires_grad=True)
    cm[0,0,0], cm[0,1,1] = self.magfactor[0], self.magfactor[0]
    cm[0,:,2] = self.offval[index]
  cm.requires_grad = True
  # affine transform matrix is ready for actual forward pass  
  grid = torch.nn.functional.affine_grid(cm, self.mytensor.size())
  y = torch.nn.functional.grid_sample(self.mytensor, grid)
  # some operations on y
  retval = y**2
  return retval

In my main script

model = myModel()
modelparams = [model.mytensor, model.magfactor, model.offval]
optim = torch.optim.Adam(modelparams, amsgrad=False)
loss = ...  # some loss computed from the model output
[...] 
loss.backward()
print(model.offval.grad, model.magfactor.grad) # prints None for both

But the gradients of those two parameters are None. If, instead of building a temporary tensor, I use a complete theta tensor (as in a plain STN), everything works fine.
I thought this was a legitimate alias, but apparently it is not.
So, how can I optimise just model.offval and model.magfactor? Thanks in advance.
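
For reference, this is roughly what the working "complete theta" variant looks like, a minimal sketch assuming the same attribute names as above (self.theta is illustrative, not from my actual code):

  # in __init__: one full 2x3 theta, all 6 entries learnable
  self.theta = torch.zeros(1, 2, 3, dtype=torch.double)
  self.theta[0, 0, 0] = 1.0  # identity initialisation
  self.theta[0, 1, 1] = 1.0
  self.theta = self.theta.cuda()
  self.theta.requires_grad = True

  # in forward: theta is used directly, so gradients reach it
  grid = torch.nn.functional.affine_grid(self.theta, self.mytensor.size())
  y = torch.nn.functional.grid_sample(self.mytensor, grid)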

I will reply to myself:

In the model forward, the correct code is below. The problem was twofold: the matrix was assembled inside torch.no_grad(), so autograd never recorded the copies from self.magfactor and self.offval; and once the no_grad block is removed, writing in place into a leaf tensor that requires grad is not allowed, which is why cm is first turned into a non-leaf with cm = cm + 0.0:

def forward(self, index):
  # temporary affine transform matrix, no longer inside no_grad
  cm = torch.zeros(1,2,3, dtype=torch.double, device=self.mytensor.device, requires_grad=True)
  cm = cm + 0.0 ### here is the change: cm is now a non-leaf, so the in-place writes below are tracked by autograd
  cm[0,0,0], cm[0,1,1] = self.magfactor[0], self.magfactor[0]
  cm[0,:,2] = self.offval[index]
  # affine transform matrix is ready for actual forward pass  
  grid = torch.nn.functional.affine_grid(cm, self.mytensor.size())
  y = torch.nn.functional.grid_sample(self.mytensor, grid)
  # some operations on y
  retval = y**2
  return retval
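
For completeness, an alternative sketch (under the same assumptions about attribute names and dtypes as above) builds cm entirely out of differentiable ops, with no in-place writes, so neither the requires_grad flag on the zeros nor the + 0.0 trick is needed:

def forward(self, index):
  # every entry of cm comes from a differentiable op, so gradients
  # flow back to self.magfactor and self.offval automatically
  zero = torch.zeros((), dtype=torch.double, device=self.mytensor.device)
  s = self.magfactor[0]
  tx, ty = self.offval[index, 0], self.offval[index, 1]
  row0 = torch.stack([s, zero, tx])
  row1 = torch.stack([zero, s, ty])
  cm = torch.stack([row0, row1]).unsqueeze(0)  # shape (1, 2, 3)
  grid = torch.nn.functional.affine_grid(cm, self.mytensor.size())
  y = torch.nn.functional.grid_sample(self.mytensor, grid)
  return y**2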