I have to call `torch.inverse()` on a tensor of size 4 x 240 x 320 x 3 x 3 at every iteration during training. Although `torch.inverse()` supports batched inverse, I guess the 3 x 3 matrices are too small to keep the GPU busy, so `torch.inverse()` is pretty slow on the GPU. As a workaround, I move the batch from GPU to CPU, run `torch.inverse()` there, and move the output back to the GPU for the remaining operations. I'm wondering whether the GPU -> CPU -> GPU round trip would break back-propagation? I haven't gotten any errors so far.
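
For reference, here's a minimal sketch of the workaround (variable names are placeholders, and it assumes a PyTorch version where `torch.inverse()` accepts arbitrary leading batch dimensions):

```python
import torch

# Dummy input with the shape from my use case; requires_grad so it's part of the graph.
x = torch.randn(4, 240, 320, 3, 3, device="cuda", requires_grad=True)

# GPU -> CPU, batched inverse on CPU, then back to GPU for the rest of the network.
inv = torch.inverse(x.cpu()).to(x.device)

loss = inv.sum()     # stand-in for the rest of the training step
loss.backward()      # runs without errors, and x.grad gets populated
print(x.grad.shape)  # torch.Size([4, 240, 320, 3, 3])
```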