What could cause this? The autograd graph grows with every backprop step and is severely slowing down my code.
I found out the reason. In my model, I have this tensor:
self.log_alpha = torch.zeros(1, requires_grad=True)
Using it in my loss is fine:
alpha_loss = -(self.log_alpha * (log_pi + self.target_entropy).detach()).mean()
self.alpha_optimizer.zero_grad()
alpha_loss.backward()
But calling a view operation in my training loop, even with detach(), causes the ViewBackward recursion. Why?
def _do_training(self):
    # self.log_alpha = self.log_alpha.view(1).detach()
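A minimal sketch of the mechanism, under the assumption that the attribute is re-assigned to a view of itself on every training step: each view call appends a fresh ViewBackward node on top of the previous one, so the grad_fn chain (and the graph PyTorch must walk) grows with the number of steps. The names below are illustrative, not taken from the original model.

```python
import torch

# Leaf parameter, as in the original model
log_alpha = torch.zeros(1, requires_grad=True)

# Re-assigning to a view each "step" chains one ViewBackward per call
x = log_alpha
for step in range(3):
    x = x.view(1)

# Walk the grad_fn chain to count autograd nodes:
# 3 ViewBackward nodes plus the leaf's AccumulateGrad node.
depth = 0
fn = x.grad_fn
while fn is not None:
    depth += 1
    fn = fn.next_functions[0][0] if fn.next_functions else None
print(depth)  # -> 4
```

With detach() the chain is cut, but re-assigning the attribute still replaces the leaf the optimizer was tracking, which is why the pattern is problematic either way.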
The graph above seems to be referencing the layer_norms layer, the first module in it, and a parameter called bias in that module. Are you sure it is this tensor?
The recursion occurred in multiple scenarios. In the layer_norms case, I was passing an attention module's final probability distribution around outside of the training loop; log_alpha was just another concrete way to trigger the same behavior.
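One way to avoid the growth, sketched under the assumption that only the per-step loss needs the graph: derive everything fresh from the leaf tensor inside the step instead of re-assigning the attribute, so the leaf itself never picks up a grad_fn and each step's graph is freed by backward(). Names are illustrative.

```python
import torch

log_alpha = torch.zeros(1, requires_grad=True)
optimizer = torch.optim.Adam([log_alpha], lr=1e-3)

for step in range(3):
    # Build a short, fresh graph each step, rooted directly at the leaf
    alpha = log_alpha.exp()
    loss = (alpha * 1.0).mean()  # stand-in for the real alpha loss

    optimizer.zero_grad()
    loss.backward()  # graph is freed here; nothing accumulates
    optimizer.step()

# The leaf stays a leaf: no grad_fn, no growing chain
print(log_alpha.grad_fn)  # -> None
```

The key design point is that intermediate tensors (views, detached copies, distributions) are kept local to the step rather than stored back onto the module, so no reference to an old graph survives into the next iteration.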