PyTorch tensor inverse is slower on GPU than on CPU

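A [3, 3] matrix is too small for the GPU to be faster. At that size the time is dominated by fixed costs (kernel launch, host-device synchronization), not by the arithmetic, so the CPU wins. The GPU only starts to pay off with much larger matrices or large batches of small ones.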
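A rough way to check this on your own machine is to time `torch.linalg.inv` at a small and a large size on each device. This is only a sketch: the helper name `time_inverse`, the sizes 3 and 1000, and the repeat count are arbitrary choices for illustration, and the actual numbers depend on your hardware. The `torch.cuda.synchronize()` calls matter because CUDA work is queued asynchronously; without them you may only measure the time to enqueue the kernels.

```python
import time
import torch

def time_inverse(n, device, repeats=20):
    """Average seconds per torch.linalg.inv call on an n x n matrix."""
    # Adding n * I makes the random matrix diagonally dominant, hence invertible
    # (an illustrative choice, not part of the original question).
    a = torch.rand(n, n, device=device) + n * torch.eye(n, device=device)
    torch.linalg.inv(a)  # warm-up call, excluded from the timing
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        torch.linalg.inv(a)
    if device == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
    return (time.perf_counter() - start) / repeats

# Only benchmark the GPU if one is actually available.
devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])
results = {}
for n in (3, 1000):
    for device in devices:
        results[(n, device)] = time_inverse(n, device)
        print(f"n={n:5d} {device:4s}: {results[(n, device)] * 1e6:.1f} us per call")
```

If you really have many 3x3 matrices, stacking them into one tensor of shape [batch, 3, 3] and calling `torch.linalg.inv` once is usually a better fit for the GPU than inverting them one at a time.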