Backwards pass using PyTorch/XLA

almeetb · July 9, 2020, 9:44pm

Hello, sorry if this is a repeat question, but how does the PyTorch autograd engine fit with dispatching operations to the XLA backend? I understand that functions save their outputs in the forward pass for use in the backward pass, but I don’t think the XLA backend makes use of these functions or sets the grad_fn.

albanD · July 9, 2020, 9:53pm

Hi,

Autograd should work just the same as with regular CPU Tensors.
Do you have a code sample where there is a discrepancy?
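
For anyone who wants to check this themselves, here is a minimal sketch of the kind of comparison meant above. It is an illustration only: it assumes `torch_xla` is installed and that `xm.xla_device()` is available in your version, and it falls back to the CPU otherwise. The tensor shapes and the expression are arbitrary.

```python
import torch

# Assumption: torch_xla is installed and exposes xm.xla_device().
# If it is not available, run the same code on the CPU instead.
try:
    import torch_xla.core.xla_model as xm
    device = xm.xla_device()
except (ImportError, RuntimeError):
    device = torch.device("cpu")

# A leaf tensor that autograd tracks, created directly on the chosen device.
x = torch.ones(2, 2, device=device, requires_grad=True)

# Forward pass: autograd records a grad_fn for every non-leaf result,
# regardless of which backend will execute the operations.
y = (x * x * 3).sum()
print("grad_fn:", y.grad_fn)

# Backward pass: d/dx of sum(3 * x^2) is 6 * x.
y.backward()

# Copying to the CPU is what forces the values to be materialized on XLA.
grad_on_cpu = x.grad.cpu()
print(grad_on_cpu)
```

If `grad_fn` prints as `None` on the XLA device but not on the CPU, that would be the kind of discrepancy worth posting.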

almeetb · July 9, 2020, 9:57pm

No discrepancy, I was just trying to get a better understanding of how the lazy execution flow works with backprop.

albanD · July 10, 2020, 5:40pm

My basic understanding is that the lazy execution is completely hidden from PyTorch by the XLA binding.
So PyTorch thinks execution is synchronous, and XLA only actually computes anything when we access the values on the CPU.

So it should have no impact on backprop.
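
To make that idea concrete, here is a toy, pure-Python sketch. It is not how torch_xla is implemented; `Lazy` and `Var` are hypothetical names invented for this illustration. `Lazy` stands in for a backend that defers work until a value is read, and `Var` stands in for an autograd layer that records `grad_fn` eagerly without knowing that the arithmetic underneath is deferred.

```python
class Lazy:
    """Toy lazy scalar (hypothetical): defers work until value() is called."""
    runs = 0  # how many deferred computations have actually executed

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = None

    def value(self):
        # Reading the value plays the role of copying a tensor to the CPU.
        if self._thunk is not None:
            self._value = self._thunk()
            self._thunk = None
            Lazy.runs += 1
        return self._value

    def __mul__(self, other):
        return Lazy(lambda: self.value() * other.value())

    def __add__(self, other):
        return Lazy(lambda: self.value() + other.value())


class Var:
    """Toy autograd node (hypothetical): records grad_fn as ops are called."""

    def __init__(self, data, grad_fn=None):
        self.data = data        # a Lazy value
        self.grad_fn = grad_fn  # None for leaves
        self.grad = None

    def __mul__(self, other):
        out = Var(self.data * other.data)

        def grad_fn(upstream):
            # Product rule, expressed with Lazy ops, so it is deferred too.
            self._accumulate(upstream * other.data)
            other._accumulate(upstream * self.data)

        out.grad_fn = grad_fn
        return out

    def _accumulate(self, upstream):
        self.grad = upstream if self.grad is None else self.grad + upstream
        if self.grad_fn is not None:
            self.grad_fn(upstream)

    def backward(self):
        self._accumulate(Lazy(lambda: 1.0))


x = Var(Lazy(lambda: 3.0))
y = Var(Lazy(lambda: 4.0))
z = x * y          # forward: grad_fn is recorded immediately
z.backward()       # backward: gradients are built as more deferred work

runs_before_read = Lazy.runs   # nothing has been computed yet
print("computations before reading:", runs_before_read)
print("dz/dx:", x.grad.value())  # reading the value triggers the work
print("computations after reading:", Lazy.runs)
```

The autograd layer here never checks whether a value has been computed; it only chains operations, which is why deferring the arithmetic does not change the gradients it produces.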