I have to call `torch.inverse()` on a tensor of size 4 x 240 x 320 x 3 x 3 at every iteration during training. Although `torch.inverse()` supports batched inverse, I guess the 3 x 3 matrices are too small to keep the GPU busy, so `torch.inverse()` is pretty slow on the GPU. As a workaround, I move the batch from GPU to CPU, run `torch.inverse()` there, and move the output back to the GPU for the remaining operations. I'm wondering whether the GPU -> CPU -> GPU round trip would break back-propagation? I haven't gotten any errors so far.
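
For reference, here's a minimal sketch of the workaround (variable names are placeholders, and it assumes a PyTorch version where `torch.inverse()` accepts arbitrary leading batch dimensions):

```python
import torch

# Dummy input with the shape from my use case; requires_grad so it's part of the graph.
x = torch.randn(4, 240, 320, 3, 3, device="cuda", requires_grad=True)

# GPU -> CPU, batched inverse on CPU, then back to GPU for the rest of the network.
inv = torch.inverse(x.cpu()).to(x.device)

loss = inv.sum()     # stand-in for the rest of the training step
loss.backward()      # runs without errors, and x.grad gets populated
print(x.grad.shape)  # torch.Size([4, 240, 320, 3, 3])
```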