I have a setup with two networks, a `classifier` and a `predictor`. The `classifier` has two outputs, the class predictions and the intermediate features: `cls, f = classifier(x)`. The `predictor` simply outputs a scalar based on those intermediate features: `pred = predictor(f)`.
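For concreteness, here is a minimal sketch of the two networks. The architectures and dimensions are placeholders, not the ones I actually use; only the interface (`cls, f = classifier(x)`, `pred = predictor(f)`) matters:

```python
import torch
import torch.nn as nn

class Classifier(nn.Module):
    """Returns both the class logits and the intermediate features."""
    def __init__(self, in_dim=32, feat_dim=16, n_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        f = self.backbone(x)      # intermediate features
        cls = self.head(f)        # class predictions
        return cls, f

class Predictor(nn.Module):
    """Maps the intermediate features to one scalar per sample."""
    def __init__(self, feat_dim=16):
        super().__init__()
        self.net = nn.Linear(feat_dim, 1)

    def forward(self, f):
        return self.net(f).squeeze(-1)

classifier = Classifier()
predictor = Predictor()
cls, f = classifier(torch.randn(8, 32))
pred = predictor(f)
```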

Now, I would like to train both networks on the following objective: `obj = CE + MSE`, where `CE` is the cross-entropy of the `classifier` and `MSE` is the mean squared error of the `predictor`. The training is adversarial: the `classifier` attempts to both lower the `CE` and increase the `MSE`, while the `predictor` attempts to lower the `MSE`. The relevant snippet of the training procedure:

```
cls, f = classifier(x)

# Classifier update: the adversarial MSE term has to backpropagate *through*
# the predictor into f, so the predictor cannot run under torch.no_grad()
# (that would make the MSE term a constant). Freezing its parameters instead
# keeps the predictor's gradients clean while still letting gradients reach
# the classifier.
for p in predictor.parameters():
    p.requires_grad_(False)
pred = predictor(f)
classifier_loss = ce_loss(cls, class_target) - mse_loss(pred, prediction_target)
classifier_loss.backward()

# Predictor update: detach f so the MSE does not backpropagate into the classifier.
for p in predictor.parameters():
    p.requires_grad_(True)
pred = predictor(f.detach())
predictor_loss = mse_loss(pred, prediction_target)
predictor_loss.backward()
```

Needless to say, this requires computing `pred = predictor(f)` twice. GAN-style `detach`ing does not seem to be applicable here, since the `classifier` needs the gradients that flow through the `predictor` into `f`.

Is there any way to perform these two backward passes without recomputing `pred = predictor(f)`, or is it strictly necessary to construct two computation graphs, so that this part cannot be circumvented?
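To make the question reproducible, here is a self-contained version of the full training step as it currently stands, with the duplicated forward pass through the predictor. The tiny linear modules, dummy data, and the parameter-freezing scheme are stand-ins for the real setup:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins for the real networks; the architectures are placeholders.
backbone = nn.Linear(4, 3)    # classifier body: x -> intermediate features f
head = nn.Linear(3, 2)        # classifier head: f -> class logits cls
predictor = nn.Linear(3, 1)   # predictor: f -> scalar pred

ce_loss = nn.CrossEntropyLoss()
mse_loss = nn.MSELoss()

x = torch.randn(5, 4)
class_target = torch.randint(0, 2, (5,))
prediction_target = torch.randn(5, 1)

# --- Classifier update: lower CE, raise MSE ---
# The predictor's parameters are frozen so that the adversarial MSE term
# reaches the classifier through f without polluting the predictor's grads.
for p in predictor.parameters():
    p.requires_grad_(False)
f = backbone(x)
cls = head(f)
pred = predictor(f)                      # forward pass #1 through the predictor
classifier_loss = ce_loss(cls, class_target) - mse_loss(pred, prediction_target)
classifier_loss.backward()

# --- Predictor update: lower MSE only ---
for p in predictor.parameters():
    p.requires_grad_(True)
pred = predictor(f.detach())             # forward pass #2 through the predictor
predictor_loss = mse_loss(pred, prediction_target)
predictor_loss.backward()
```

This runs, but the second `predictor(...)` call is exactly the recomputation I would like to avoid.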