Computational graph is only built once

I’m using PyTorch for autograd without any neural network involved.

I can’t post fully reproducible code without it being too many lines. From a high level, it looks like the following:

class MarkovModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.potentials = nn.ParameterDict()

    def set_potentials(self, potentials):
        ...

class BeliefProb():
    def __init__(self, model):
        self.model = model
        self.beliefs = ...

    def update_beliefs(self):
        self.beliefs = ...

    def inference(self):
        for i in range(SOME_CONSTANT):
            ...

model = MarkovModel()
potentials = ...
bp = BeliefProb(model)
optimizer = Adam(bp.model.parameters(), lr=0.01)
target = ...
for i in range(iters):
    bp.inference()
    loss = torch.abs(target - bp.beliefs).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

If I set iters = 1, then it works fine. But if iters > 1, I get the complaint

RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

It seems that the first call to bp.inference() builds the computational graph, but when I call bp.inference() again, the graph is not rebuilt for the second iteration.
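The behavior can be reproduced without the model at all. In this toy sketch (hypothetical tensors, not the author's code), a value is computed once outside the loop, so every iteration's loss shares that piece of the graph, and the second backward() hits the freed intermediates:

```python
import torch

# `pre` is built once outside the loop, so every iteration's loss
# shares `pre`'s graph nodes.
w = torch.tensor(2.0, requires_grad=True)
pre = w * w  # pre-computed once; its saved intermediates are shared

errors = []
for i in range(2):
    loss = torch.abs(pre - 1.0).sum()
    try:
        loss.backward()  # the first call frees pre's saved intermediates
    except RuntimeError as e:
        errors.append(str(e))

# only the second iteration fails, with "Trying to backward through
# the graph a second time ..."
print(len(errors))
```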

I think avoiding multiple calls to backward() resolves this.
For example, initialize loss with None, accumulate the loss computed in each iteration, and call backward() once after the for loop, like

loss = None
for _ in range(iters):
    cur_loss = torch.abs(target - bp.beliefs).sum()
    if loss is None: loss = cur_loss
    else: loss += cur_loss
loss.backward()
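A runnable sketch of this accumulate-then-single-backward pattern, with toy tensors standing in for the model (the names here are illustrative, not the author's):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
pre = w * 3.0           # even a sub-graph shared across iterations is fine
target = torch.tensor(1.0)
iters = 3

loss = None
for _ in range(iters):
    cur_loss = torch.abs(target - pre).sum()
    loss = cur_loss if loss is None else loss + cur_loss
loss.backward()         # one backward: the shared graph is traversed once
print(w.grad)           # tensor(9.)
```

Because backward() runs only once, the shared pre-computation never has its intermediates freed mid-run.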

A toy example is at computational-graph-is-only-built-once.ipynb · GitHub

Thanks for your reply. I think I forgot to include the optimizer in the code snippet. Please see the last block, where I call optimizer.step() during each iteration. Because of that, I can’t call loss.backward() once after the loop.

The updated code snippet looks basically fine.


What this error means is that some part of the graph is shared between iterations. This is most likely because you pre-compute something outside of the for loop and re-use it at each iteration inside it.
If you want gradients to flow through these computations, you should move them inside the for loop.
If you don’t want gradients to flow, you should .detach() the result of the pre-computations so that gradients won’t flow back there.
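As a sketch of the first option (toy stand-ins for the model, not the author's code): recomputing the former pre-computation inside the loop gives each backward() a fresh graph, so per-iteration optimizer.step() calls work without retain_graph=True.

```python
import torch

w = torch.nn.Parameter(torch.tensor(2.0))
optimizer = torch.optim.Adam([w], lr=0.01)
target = torch.tensor(1.0)

for _ in range(3):
    pre = w * 3.0                       # rebuilt each iteration -> fresh graph
    loss = torch.abs(target - pre).sum()
    optimizer.zero_grad()
    loss.backward()                     # no retain_graph needed
    optimizer.step()

# Alternative, when gradients should NOT flow through the shared part:
# pre = (w * 3.0).detach()  # a one-time pre-computation is then safe to reuse
```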


Thanks for the reply! You are right: after I moved the pre-computation inside the loop, gradients propagate smoothly.