Hi Folks,

I have never done this before. Suppose you call another torch module, and that module computes its own forward and backward passes and has its own optimization sub-routine. Suppose this task can be treated as an independent sub-task. How can you do that? I think the secondary backward step will trigger the entire backward pass.

For example, in my case, inside the training loop I have a secondary computation task: solving Ax = b (an oversimplified example). So alongside the main optimization, i.e., inside the training loop, we (for the sake of simplicity) solve Ax = b, and that solve should not influence the gradient flow back to the main optimizer.

Pseudocode:

y = model(x)

loss = loss_fn(y) → let's say somewhere inside the loss, the inner solver runs and calls its own backward(); the output of that inner step should not feed gradients back into the main graph

loss.backward()
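One way I could imagine doing this (a sketch, not a definitive answer): detach the tensors entering the inner solver so its graph is disconnected from the main one, let the inner optimizer run its own backward() calls on that isolated graph, and detach the result before handing it back. The `inner_solve` function below is hypothetical, just to illustrate the pattern with the Ax = b example.

```python
import torch

def inner_solve(A, b, steps=200, lr=0.1):
    """Hypothetical inner solver for Ax = b via gradient descent,
    with its own optimizer and its own backward passes."""
    # Detach inputs: the inner graph starts from fresh leaves, so the
    # inner backward() calls can never reach the outer model's graph.
    A = A.detach()
    b = b.detach()
    x = torch.zeros_like(b, requires_grad=True)
    opt = torch.optim.SGD([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        residual = A @ x - b
        loss = (residual ** 2).sum()
        loss.backward()  # only populates x.grad; A and b get no gradients
        opt.step()
    # Detach the result: it enters the outer loss as a constant,
    # so the outer backward() ignores the inner computation entirely.
    return x.detach()
```

With this, the outer loop can call `x = inner_solve(A, b)` inside its loss computation, and the outer `loss.backward()` treats `x` as a fixed tensor. (If you instead *wanted* gradients to flow through the solution x with respect to A and b, that would be implicit differentiation and a different setup, e.g. `torch.autograd.grad` on the optimality condition.)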