Retain_graph vs recompute

Hi,

  • You need retain_graph because .backward() goes through the whole graph (both the encoder and the decoder here) and frees its saved buffers afterwards. So if you want to be able to backward through the decoder again, you need to pass retain_graph=True to the first call.
  • You can only use retain_graph if you don’t change any value required by the backward. In particular here, the optimizer’s step() changes the parameters in place and might prevent you from backwarding a second time (make sure to run v1.5.0+, as the in-place check for this was fixed recently).
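
To make the two points above concrete, here is a minimal sketch. The tiny `encoder`/`decoder` modules, the SGD optimizer and the `try_backward` helper are hypothetical stand-ins of mine, not the code from the question, and I'm assuming v1.5.0+ for the behaviour after step().

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins for the encoder/decoder discussed in this thread.
encoder = nn.Linear(4, 3)
decoder = nn.Linear(3, 2)
optimizer = torch.optim.SGD(
    list(encoder.parameters()) + list(decoder.parameters()), lr=0.1
)


def try_backward(loss, retain_graph=False):
    """Illustrative helper: True if backward succeeds, False if autograd raises."""
    try:
        loss.backward(retain_graph=retain_graph)
        return True
    except RuntimeError:
        return False


x = torch.randn(5, 4)

# Case 1: without retain_graph, the saved buffers are freed by the first
# backward, so a second backward through the same graph raises.
loss = decoder(encoder(x)).pow(2).mean()
loss.backward()
second_backward_without_retain = try_backward(loss)

# Case 2: with retain_graph=True and untouched parameters, a second
# backward through the same graph works.
loss = decoder(encoder(x)).pow(2).mean()
loss.backward(retain_graph=True)
second_backward_with_retain = try_backward(loss, retain_graph=True)

# Case 3: step() updates the parameters in place, so the retained graph
# now holds stale saved values; on v1.5.0+ autograd is expected to detect
# this and raise instead of computing gradients from modified weights.
optimizer.step()
backward_after_step = try_backward(loss, retain_graph=True)

print(second_backward_without_retain, second_backward_with_retain, backward_after_step)
```

So retain_graph is only safe between the first backward and the next in-place parameter update.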

Both approaches will work very similarly. You will either do extra work during a backward pass that you don’t care about, or do an extra forward pass.
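
As a rough sketch of that trade-off, the snippet below computes the decoder gradient for a second loss in both ways and checks that they agree. The modules, the two losses and the function names are made up for illustration, and detaching the encoder output for the recompute path is my assumption about how you would skip the encoder's backward.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins for the encoder/decoder discussed in this thread.
encoder = nn.Linear(4, 3)
decoder = nn.Linear(3, 2)
x = torch.randn(5, 4)


def decoder_grad_with_retain_graph():
    encoder.zero_grad()
    decoder.zero_grad()
    out = decoder(encoder(x))
    out.pow(2).mean().backward(retain_graph=True)  # first loss, graph kept
    decoder.zero_grad()
    # Second backward reuses the graph: no extra forward, but it also
    # backpropagates through the encoder, which is work we don't need here.
    out.abs().mean().backward()
    return decoder.weight.grad.clone()


def decoder_grad_with_recompute():
    encoder.zero_grad()
    decoder.zero_grad()
    z = encoder(x)
    decoder(z).pow(2).mean().backward()  # first loss, graph freed
    decoder.zero_grad()
    # Extra decoder forward on a detached input: the backward stops at the
    # decoder and never touches the encoder.
    decoder(z.detach()).abs().mean().backward()
    return decoder.weight.grad.clone()


grad_retain = decoder_grad_with_retain_graph()
grad_recompute = decoder_grad_with_recompute()
grads_match = torch.allclose(grad_retain, grad_recompute)
print(grads_match)
```

Which one is cheaper depends on whether the encoder's backward or the decoder's forward is the more expensive piece in your model.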
