After reading the PyTorch docs, I think it's not. But what about this situation? `op1` outputs a tensor `output1` (`dtype=torch.float16`). The next operation `op2` is an FP32-type op, so it needs a `torch.float32` input. Does `op2` then need to convert `output1` to `torch.float32`?
No, `autocast` does not convert everything to `float16`, as that isn't numerically stable enough for a lot of use cases. You could do that by calling `model.half()` directly and wouldn't need `autocast` for it.
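A minimal sketch of the difference (using CPU autocast with `bfloat16`, since `float16` autocast targets CUDA; the idea is the same — the `Linear` layer and sizes are made up for illustration):

```python
import torch

lin = torch.nn.Linear(4, 4)  # parameters are created as float32
x = torch.randn(2, 4)

# autocast runs eligible ops in the lower-precision dtype without
# touching the parameters themselves
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = lin(x)
print(y.dtype)           # torch.bfloat16
print(lin.weight.dtype)  # torch.float32 -- parameters were not converted

# model.half() instead converts every parameter to float16 unconditionally,
# which can be numerically unsafe for some ops
lin16 = torch.nn.Linear(4, 4).half()
print(lin16.weight.dtype)  # torch.float16
```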
Yes, the transformations are done when needed.
Similar to `Tensor.to(dtype=torch.float32)`?
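For reference, that kind of cast can be written manually with `Tensor.to` (note it takes `dtype=`, not `device=`), and it returns the tensor unchanged when the dtype already matches:

```python
import torch

a = torch.randn(3, 3, dtype=torch.float16)
b = a.to(dtype=torch.float32)  # explicit dtype cast
print(b.dtype)                 # torch.float32
# .to() returns self (no copy) if the dtype already matches
print(b.to(dtype=torch.float32) is b)  # True
```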
In this case, I think it will take a lot of time to switch between `torch.float16` and `torch.float32`, because there are many ops with different floating-point types.
Yes, you are right that the transformations are not free, but you would eventually still see an end-to-end speedup using AMP, and you should profile your models to check it.
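A rough way to compare the two paths end to end (a sketch using CPU autocast with `bfloat16`; the toy model, sizes, and iteration count are made up for illustration — real profiling should use `torch.profiler` on your actual model):

```python
import time
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(256, 256), torch.nn.ReLU(), torch.nn.Linear(256, 256)
)
x = torch.randn(64, 256)

def bench(use_amp: bool, iters: int = 20) -> float:
    # time a few forward passes with and without autocast
    start = time.perf_counter()
    with torch.no_grad():
        for _ in range(iters):
            if use_amp:
                with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
                    model(x)
            else:
                model(x)
    return time.perf_counter() - start

print(f"fp32: {bench(False):.4f} s")
print(f"amp : {bench(True):.4f} s")
```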
I get it, thank you very much.