Memory usage balloons when saving output of forward pass

I am running into a big memory issue: when I save and append the output of my model, memory usage grows substantially on every pass. I have no idea whether this is intended behavior, a bug, or something silly I am doing. I modeled my code on this example: https://github.com/pytorch/examples/blob/master/reinforcement_learning/actor_critic.py

Here is where I am appending the values to backprop on after the episode:
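It follows the linked actor-critic example closely; roughly something like this (the SavedAction namedtuple comes from that example, and the surrounding names are approximate, not my exact code):

    from collections import namedtuple
    from torch.distributions import Categorical

    SavedAction = namedtuple('SavedAction', ['log_prob', 'value'])

    # inside the episode loop
    probs, state_value = self.model(state)
    m = Categorical(probs)
    action = m.sample()
    # both entries are Variables, so each one carries its computation graph
    self.saved_actions.append(SavedAction(m.log_prob(action), state_value))
    self.rewards.append(reward)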

It looks like you are storing output Variables, and that is the source of the problem: when you store a Variable, you force Python to keep the entire computation graph for that Variable in memory.

You should probably save the underlying tensors instead.

self.saved_actions.append(..., state_value.data, ...)
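To make that concrete, here is a toy illustration (not your code) of what keeping or dropping the graph means with the Variable API:

    import torch
    from torch.autograd import Variable

    w = Variable(torch.randn(4, 1), requires_grad=True)
    state = Variable(torch.randn(1, 4))
    state_value = state.mm(w)        # a Variable: carries the graph back to w

    saved_var = state_value          # keeps the whole graph (and its buffers) alive
    saved_data = state_value.data    # a plain tensor: just the values, no graph

    saved_var.sum().backward()       # gradients flow back to w
    print(w.grad)                    # populated
    # saved_data has no history, so nothing can flow back to w through it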

When I do backprop, don’t I need that computation graph? I got around this by taking the float values from the Variables, but then my network never actually learned :frowning:

EDIT: I just tested this by changing the PyTorch RL example and saving m.log_prob(action).data. It ended up not learning on the backprop and was stuck at an average length of 20-21. This leads me to believe I need the computation graph for backprop.

I see.
Well, either you save the log_prob of the action with its computation graph, or you recalculate the log_prob from the state before you calculate the losses.

So, either you accept ballooning memory, or you accept redoing computations.
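As a rough sketch of the second option (the names saved_states / saved_actions / returns are mine, just for illustration): store only plain data during the episode, then redo the forward pass when you build the losses, so the graph only exists around the single backward call.

    import torch
    from torch.autograd import Variable
    from torch.distributions import Categorical

    # during the episode: store plain tensors only, nothing holds a graph
    saved_states.append(state)           # raw observation tensor
    saved_actions.append(action.data)    # detached action index

    # when building the losses: rerun the model so a fresh graph is created
    # (returns = list of discounted returns computed from the saved rewards)
    policy_losses = []
    for state, action, R in zip(saved_states, saved_actions, returns):
        probs, state_value = model(Variable(state))      # fresh forward pass
        m = Categorical(probs)
        policy_losses.append(-m.log_prob(Variable(action)) * R)
    loss = torch.cat(policy_losses).sum()   # value loss omitted for brevity
    optimizer.zero_grad()
    loss.backward()      # the graph lives only for this call, then is freed
    optimizer.step()

The tradeoff is exactly the one above: memory stays flat, but every stored state is pushed through the network twice per episode.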