Flow control in PyTorch autograd

Hey there! :slight_smile:
Does anyone know whether PyTorch autograd also works with flow control (e.g. if-statements, for-loops, …)? If so, does the backward pass follow the same statements and constants decided by the forward pass, just in reverse and with their respective gradients?
Thank you for your answer!

PyTorch autograd is define-by-run, so you’re allowed to do arbitrary things in Python. Autograd (which sits at a lower level) only sees the operations that are performed on tensors and builds the graph from those, so yes, whatever is done in the forward pass is respected in the backward pass.
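A minimal sketch of what that means in practice (the values and operations here are just illustrative, not from the original post):

```python
import torch

x = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)

# Arbitrary Python control flow in the forward pass: autograd only records the
# tensor operations that actually ran, and those are what backward() differentiates.
y = torch.zeros(())
for i in range(x.shape[0]):        # ordinary Python for-loop
    if x[i].item() > 0:            # ordinary Python if-statement
        y = y + x[i] ** 2          # recorded only for positive entries
    else:
        y = y - x[i]               # recorded only for negative entries

y.backward()
print(x.grad)  # tensor([ 2., -1.,  6.]) -- gradients follow the branches that were taken
```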


Thank you for your reply. So if I understand correctly, everything you do in the forward pass (if the values are PyTorch tensors) is mirrored in the backward pass by PyTorch’s autograd, just with gradients flowing from the output to the input. The graph is created the first time the method is called (i.e. during the forward pass), based on whether requires_grad is set to True.

Would this mean that you can use autograd even with non-PyTorch functions (such as reshape, mean, max, …) that are not part of the PyTorch library? I ask because these functions also work on PyTorch tensors.
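Just to check the first part in code, is this the right picture (a small sketch of my understanding, not meant as anything official)?

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)   # leaf tensor, tracking enabled
w = torch.tensor([3.0, 4.0])                        # leaf tensor, requires_grad=False

y = (x * w).sum()       # graph nodes are created during this forward call
print(y.requires_grad)  # True -- inherited because x requires grad
print(y.grad_fn)        # a backward node recorded in the forward pass

y.backward()            # walks the recorded graph from output to input
print(x.grad)           # tensor([3., 4.])
print(w.grad)           # None -- w was never tracked
```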

Yes, that is correct.

Not exactly. Though it is possible for functions like max to call into torch operations underneath (in this case a select), generally only torch.* functions or methods on tensors are recorded by autograd, because PyTorch needs to know about these functions in order to interpose and properly create the graph. So when I talked about Python features, I was thinking more about things that relate to control flow: generators, context managers, etc.
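To illustrate the max case (a small sketch; the exact grad_fn name may differ between versions): Python’s built-in max() is not a torch function, but iterating over a tensor produces x[0], x[1], … which are tensor operations (selects), so those do get recorded.

```python
import torch

x = torch.tensor([0.5, 2.0, 1.5], requires_grad=True)

# Built-in max() iterates over the tensor; each element access is a select op
# that autograd records, so the result still carries a grad_fn.
m = max(x)
print(m.grad_fn)  # e.g. <SelectBackward0 ...> -- a select node, not a torch.max node

m.backward()
print(x.grad)     # tensor([0., 1., 0.]) -- gradient flows only to the selected element
```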


Ok thank you.
I was just wondering because in a project of my own I first used the mean() and reshape() methods on np.arrays and then switched everything to torch tensors without changing these calls. Surprisingly, backpropagation with autograd worked fine (SelectBackward was active and the results made sense), but I was wondering why. I know there are PyTorch equivalents of the aforementioned methods [torch.mean() and torch.reshape()], but why even use them in this case if the plain method calls work as well?

So what you mean is that autograd automatically uses torch.mean and torch.reshape if they are applied to tensors, and np.mean or np.reshape if they are applied to numpy arrays [assuming both libraries are imported]. That would make sense.

Hmm, I don’t think np functions will generally work directly on tensors that require grad (this should error), so it is interesting that your code is working.
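For reference, the usual failure mode (a small sketch; the exact message depends on the versions involved) is that a NumPy function needing a real ndarray tries to convert the tensor, and that conversion is refused while grad is required:

```python
import numpy as np
import torch

t = torch.ones(3, requires_grad=True)

try:
    np.asarray(t)  # conversion goes through Tensor.__array__ / numpy()
except RuntimeError as e:
    print(e)  # roughly: "Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead."
```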

numpy.reshape — NumPy v1.23 Manual
torch.reshape — PyTorch 1.13 documentation
If you import both numpy and torch, shouldn’t the method .reshape() work on both numpy arrays and tensors (depending on which datatype you have), because it has the same name in both libraries? Then you would be right that numpy functions/methods don’t work on tensors, as you said; in my code it was always the tensor’s own method being called.
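Something like this (just a small sketch to check the idea) behaves the same whether x is an ndarray or a tensor, and on tensors the method calls are torch ops that autograd records:

```python
import numpy as np
import torch

def feature(x):
    # .reshape() and .mean() are methods of whatever class x is:
    # np.ndarray methods on arrays, torch.Tensor methods on tensors.
    return (x.reshape(-1) ** 2).mean()

print(feature(np.arange(3.0)))      # NumPy path -> a NumPy scalar, no autograd involved

t = torch.arange(3.0, requires_grad=True)
out = feature(t)
print(out.grad_fn)                  # a mean backward node, recorded by autograd
out.backward()
print(t.grad)                       # tensor([0.0000, 0.6667, 1.3333]) = 2*t/3
```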

Oh ok I see, yup that should work
