ViewBackward recursion

What could cause this ViewBackward recursion in the autograd graph? The chain grows with every backprop step and is severely slowing down my code.

I found the cause. In my model, I have this tensor:

    self.log_alpha = torch.zeros(1, requires_grad=True)

Using it in my loss is fine:

            alpha_loss = -(self.log_alpha * (log_pi + self.target_entropy).detach()).mean()

But calling a view operation on it in my training loop and reassigning the result, even with detach(), causes the ViewBackward recursion. Why?

    def _do_training(self):
        # self.log_alpha = self.log_alpha.view(1).detach()
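
For anyone trying to reproduce this, here is a minimal sketch of how reassigning a view result can chain ViewBackward nodes. It is not the original training code: `count_view_nodes` is a hypothetical helper, and the loop stands in for repeated training steps. It assumes a recent PyTorch where the node type is named `ViewBackward0`. In this sketch only the non-detached reassignment grows a chain; the detached version ends up with no `grad_fn` at all, but it also becomes a new tensor that no longer requires grad, so an optimizer built on the original leaf would not see it.

```python
import torch

def count_view_nodes(t):
    """Walk the autograd chain from t.grad_fn and count ViewBackward nodes.

    Hypothetical helper for illustration; it follows only the first input
    edge of each node, which is enough for a simple linear chain.
    """
    count, node = 0, t.grad_fn
    while node is not None:
        if "ViewBackward" in type(node).__name__:
            count += 1
        edges = node.next_functions
        # Leaf accumulator nodes have no input edges, which ends the walk.
        node = edges[0][0] if edges else None
    return count

log_alpha = torch.zeros(1, requires_grad=True)

# Reassigning the view result without detach adds one node per step.
chained = log_alpha
for _ in range(5):  # stands in for five training steps
    chained = chained.view(1)
chain_length = count_view_nodes(chained)

# With detach, the result is cut from the graph entirely, but it is
# a different tensor from the original leaf and has requires_grad=False.
detached = log_alpha.view(1).detach()

print("view nodes after 5 reassignments:", chain_length)
print("detached grad_fn:", detached.grad_fn)
print("detached requires_grad:", detached.requires_grad)
```

If the goal is only to read the current value, leaving `self.log_alpha` untouched as a leaf and using `self.log_alpha.detach()` in a local variable avoids both problems.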


The graph above seems to reference the layer_norms layer, specifically the bias parameter of its first module. Are you sure log_alpha is the cause?

The recursion occurred in multiple scenarios. In the layer_norms case, I was passing the final probability distribution of an attention module outside of the training loop. The log_alpha case is another concrete way to trigger the same behavior.
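
A rough sketch of the layer_norms scenario, under assumptions: the real attention module is not shown in this thread, so `layer_norm`, `run`, and `count_nodes` below are hypothetical stand-ins. The idea is that a tensor kept across iterations still carries its `grad_fn`, so every new step links back to all previous ones unless the stored value is detached.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer_norm = nn.LayerNorm(4)  # hypothetical stand-in for the layer_norms module

def count_nodes(t):
    """Count all autograd nodes reachable from t.grad_fn (hypothetical helper)."""
    seen, stack = set(), [t.grad_fn]
    while stack:
        node = stack.pop()
        if node is None or node in seen:
            continue
        seen.add(node)
        stack.extend(fn for fn, _ in node.next_functions)
    return len(seen)

def run(steps, detach_state):
    """Carry a probability tensor across steps and record the graph size."""
    state = torch.zeros(1, 4)
    sizes = []
    for _ in range(steps):
        x = torch.randn(1, 4)
        probs = torch.softmax(layer_norm(x + state), dim=-1)
        sizes.append(count_nodes(probs))
        # Keeping probs as-is keeps its whole history alive;
        # detach() stores only the values.
        state = probs.detach() if detach_state else probs
    return sizes

growing = run(4, detach_state=False)
flat = run(4, detach_state=True)
print("graph size per step, state kept attached:", growing)
print("graph size per step, state detached:", flat)
```

In this sketch the attached version reports a larger graph on every step, while the detached version stays constant, which matches the slowdown described above.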