Applying an optimizer to a slice of a variable

There are a few approaches, such as:

  • Creating separate “sub-tensors” and e.g. torch.cat-ing them into the full parameter before using it via the functional API, as described here (see the first sketch after this list).
  • Restoring the “frozen” part of the parameter after each optimizer step (second sketch below).
  • Guaranteeing that the “frozen” part of the parameter was never and will never be updated, in which case even optimizers with running stats or momentum will not move it, since all past updates were zero (third sketch below).
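
For the first approach, here is a minimal sketch (shapes, names, and which rows are frozen are all made up for illustration). Only the trainable sub-tensor is passed to the optimizer, and the full weight is rebuilt via torch.cat before each functional call:

```python
import torch
import torch.nn.functional as F

in_features, out_features = 4, 3

# Rows 0-1 are trainable, row 2 is frozen (arbitrary choice for the example)
weight_trainable = torch.randn(2, in_features, requires_grad=True)
weight_frozen = torch.randn(out_features - 2, in_features)

# Only the trainable part is registered with the optimizer
optimizer = torch.optim.Adam([weight_trainable], lr=1e-3)

x = torch.randn(8, in_features)
weight = torch.cat([weight_trainable, weight_frozen], dim=0)  # full (3, 4) weight
out = F.linear(x, weight)                                     # functional API call
loss = out.mean()

optimizer.zero_grad()
loss.backward()    # gradients flow only into weight_trainable
optimizer.step()   # weight_frozen is untouched by construction
```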
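For the second approach, a sketch along these lines should work (again, the module and the frozen slice are hypothetical). The frozen slice is copied back after every step, which also undoes any movement caused by momentum or weight decay:

```python
import torch
import torch.nn as nn

lin = nn.Linear(4, 3)
frozen_rows = slice(2, 3)  # pretend row 2 of the weight is frozen
frozen_backup = lin.weight[frozen_rows].detach().clone()

optimizer = torch.optim.SGD(lin.parameters(), lr=0.1, momentum=0.9)

x = torch.randn(8, 4)
loss = lin(x).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()

# Restore the frozen slice after the update; this must happen after
# every step, since the optimizer state for this slice keeps changing.
with torch.no_grad():
    lin.weight[frozen_rows] = frozen_backup
```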
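For the third approach, the idea is to zero out the gradient of the frozen slice before every step. A sketch, assuming the optimizer's weight decay is 0 (otherwise the parameter would still be decayed): with an all-zero gradient history, Adam's running averages for that slice stay exactly zero, so the computed update is zero as well.

```python
import torch
import torch.nn as nn

lin = nn.Linear(4, 3)
frozen_rows = slice(2, 3)  # hypothetical frozen slice

# Note: this guarantee assumes weight_decay=0 (the default)
optimizer = torch.optim.Adam(lin.parameters(), lr=1e-3)

x = torch.randn(8, 4)
loss = lin(x).mean()

optimizer.zero_grad()
loss.backward()
lin.weight.grad[frozen_rows] = 0.0  # kill the gradient for the frozen slice
optimizer.step()                    # the frozen rows receive a zero update
```

This only holds if the slice's gradient is zeroed on every single step; once a nonzero gradient leaks through, the momentum buffers become nonzero and the slice will keep drifting even on later steps with zero gradients.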