Backward on iterative parameter updates

Dear all,

The core of my architecture encodes features from the input image and then iteratively regresses incremental updates to the output parameters: in each iteration, the regressor takes the features and the current parameter estimate and predicts a correction. The loss is backpropagated in every iteration, and the regressor optimizer is stepped in every iteration as well. The encoder optimizer is stepped only once, at the end of the last iteration, so that the encoder parameters are updated only after the incremental regression has finished.
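Schematically, writing E for the encoder, R for the regressor and θ* for the ground truth (notation chosen here just for this post):

    f = E(image)                       # computed once per batch
    θ_0 = 0
    θ_{n+1} = θ_n + R(f, θ_n)          # one regressor iteration
    loss_n  = loss_fn(θ_{n+1}, θ*)     # backpropagated every iteration

with the regressor stepped after every loss_n and the encoder stepped only after the last one.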
The main part of the code to do this is shown below:

current_parameters = torch.zeros(size, device=self.device, dtype=torch.float)
features = self.encoder(images)  # encoded once, reused in every iteration

for n in range(self.n_regresor_iterations):

    # Clear the gradients of both optimizers at the start of every iteration.
    self.optimizer_encoder.zero_grad()
    self.optimizer_param_regressor.zero_grad()

    # Predict an incremental update and add it to the running estimate.
    parameters = self.param_regressor(features, current_parameters)
    current_parameters += parameters

    loss_total = loss_fn(current_parameters, ground_truth_parameters)

    # The graph through the encoder is reused in later iterations,
    # so it is retained on all but the last backward pass.
    if n < self.n_regresor_iterations - 1:
        loss_total.backward(retain_graph=True)
    else:
        loss_total.backward()

    # The regressor is stepped every iteration; the encoder only once,
    # after the last iteration.
    self.optimizer_param_regressor.step()
    if n == self.n_regresor_iterations - 1:
        self.optimizer_encoder.step()

return loss_total.item()

This seems to train just fine; however, I have been looking, so far unsuccessfully, for possible optimizations:
Is it possible to modify this code so that I can get rid of retain_graph=True above? It increases memory consumption in proportion to the number of iterations (the closest variant I could come up with is sketched below, after my questions).
Do you see any other problems with this code?
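
For reference, the closest variant I could come up with for the first question is to accumulate the per-iteration losses and call backward() only once at the end. It does avoid retain_graph=True, but the regressor optimizer then steps only once per batch instead of once per iteration, which is not what I want, so I include it only to show what kind of change I am looking for:

current_parameters = torch.zeros(size, device=self.device, dtype=torch.float)
features = self.encoder(images)

self.optimizer_encoder.zero_grad()
self.optimizer_param_regressor.zero_grad()

loss_total = 0.0
for n in range(self.n_regresor_iterations):
    parameters = self.param_regressor(features, current_parameters)
    # out-of-place addition, so all iterations stay in a single graph
    current_parameters = current_parameters + parameters
    loss_total = loss_total + loss_fn(current_parameters, ground_truth_parameters)

# a single backward over all iterations, so no graph has to be retained
loss_total.backward()
self.optimizer_param_regressor.step()   # only one regressor step per batch here
self.optimizer_encoder.step()

return loss_total.item()                # sum of the per-iteration losses

If there is a way to keep the per-iteration regressor steps and still drop retain_graph=True, that is exactly what I am after.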

Thank you in advance