As you said, your script keeps the computation graph alive until `output = ...` is executed in the second iteration. In the usual case, where a loss is backpropagated in each iteration, the graph is freed by the call to `backward()`, so the OOM doesn't happen.
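For concreteness, here is a minimal sketch of the two patterns (the model, shapes, and loop are made up for illustration):

```python
import torch

model = torch.nn.Linear(1024, 1024)  # hypothetical model, just for illustration

# Pattern without backward: `output` keeps its graph alive until it is
# reassigned in the next iteration, so two graphs can briefly coexist.
for _ in range(10):
    x = torch.randn(64, 1024)
    output = model(x)  # the previous graph is only released on this rebinding

# Usual training pattern: backward() frees the graph every iteration.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(10):
    x = torch.randn(64, 1024)
    loss = model(x).sum()
    optimizer.zero_grad()
    loss.backward()  # graph is freed here
    optimizer.step()
```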
The other way to do this is to wrap each iteration in a closure, so the graph goes out of scope when the function returns (see the sketch below).
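A minimal sketch of the closure approach, with made-up names (`run_iteration`, the model, and shapes are not from your code):

```python
import torch

model = torch.nn.Linear(1024, 1024)  # hypothetical model, just for illustration

def run_iteration(model, x):
    # Everything local to this function, including the graph hanging off
    # `output`, becomes collectible as soon as the function returns.
    output = model(x)
    return output.detach()  # return the values only, not the graph

for _ in range(10):
    x = torch.randn(64, 1024)
    result = run_iteration(model, x)
```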
Either way, if you are not using the computation graph to do backward, you shouldn't build it in the first place. So I would suggest using `Variable(..., volatile=True)` (or `torch.no_grad()` on master).
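A minimal sketch of running the loop without building the graph at all, using `torch.no_grad()` (same hypothetical model as above; on the old `Variable` API the equivalent was `volatile=True`):

```python
import torch

model = torch.nn.Linear(1024, 1024)  # hypothetical model, just for illustration

with torch.no_grad():
    for _ in range(10):
        x = torch.randn(64, 1024)
        output = model(x)  # no graph is recorded, so memory stays flat
```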