Derivative of two networks

Running the same code without torch.no_grad() doesn’t throw an error.

What is the difference between calling .eval(), running under with torch.no_grad(), and combining both?

EDIT: From 'model.eval()' vs 'with torch.no_grad()' - #2 by albanD

  • model.eval() will notify all your layers that you are in eval mode; that way, batchnorm or dropout layers will work in eval mode instead of training mode.
  • torch.no_grad() impacts the autograd engine and deactivates it. It will reduce memory usage and speed up computations, but you won’t be able to backprop (which you don’t want in an eval script). Both behaviors are illustrated in the sketch after this list.

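To make the distinction concrete, here is a minimal sketch with a toy model (the model, shapes, and prints are made up for illustration):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.randn(1, 4)

# model.eval() changes layer behavior (dropout becomes a no-op), but the
# autograd engine still builds a graph, so backprop works as usual:
model.eval()
out = model(x).sum()
out.backward()
print(model[0].weight.grad is not None)  # True: gradients were computed

# torch.no_grad() instead disables graph construction; layers keep whatever
# mode they are in, but the result cannot be backpropagated:
model.train()
with torch.no_grad():
    out = model(x).sum()
print(out.requires_grad)  # False: out.backward() here would raise an error
```

This also matches the observation above: running the forward pass without torch.no_grad() doesn’t error, it just keeps the autograd graph around and uses more memory.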
So, when I take a gradient step for the classifier+encoder, I want the adversary to be fixed: the combined loss from this step should not affect the adversary’s weights when I do the adversary’s own weight update later.
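
For concreteness, here is a minimal sketch of that setup; the module names encoder, classifier, and adversary are hypothetical stand-ins. One common pattern is to register only the encoder/classifier parameters with the optimizer used for this step, so its step() cannot move the adversary’s weights:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins for the actual networks
encoder = nn.Linear(16, 8)
classifier = nn.Linear(8, 2)
adversary = nn.Linear(8, 2)

# The main optimizer only knows about encoder+classifier parameters.
opt_main = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()))
opt_adv = torch.optim.Adam(adversary.parameters())

x = torch.randn(4, 16)
y = torch.randint(0, 2, (4,))  # task labels
s = torch.randint(0, 2, (4,))  # labels the adversary tries to predict

# Classifier+encoder step with the combined (adversarial) loss
z = encoder(x)
loss = F.cross_entropy(classifier(z), y) - F.cross_entropy(adversary(z), s)
opt_main.zero_grad()
loss.backward()  # this also writes .grad on the adversary's parameters...
opt_main.step()  # ...but only encoder/classifier weights are updated

# Before the later adversary update, clear the stale gradients that the
# combined loss accumulated on the adversary:
opt_adv.zero_grad()
```

Note that the backward pass still populates gradients on the adversary’s parameters; what keeps its weights fixed in this sketch is which parameters the main optimizer holds.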

Would changing the model from train to eval have that effect?