Does torch.cuda.amp support O2 ("almost FP16") training now?

No, torch.cuda.amp does not provide an O2 mode. The deprecated apex implementation of O2 was too limiting in its flexibility and disabled a few important use cases. A workaround is to keep the model in FP16 and use a custom optimizer that holds the FP32 master states.
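To illustrate the workaround, here is a minimal sketch of that idea: the model parameters stay in FP16, while a toy custom optimizer keeps FP32 "master" copies, applies the update in FP32, and copies the result back to FP16. The class name `MasterWeightSGD` and the plain SGD rule are illustrative assumptions, not an apex or torch.cuda.amp API:

```python
import torch

class MasterWeightSGD:
    """Hypothetical sketch: SGD with FP32 master copies of FP16 params."""

    def __init__(self, params, lr=0.1):
        self.params = list(params)
        # FP32 master copies: updates accumulate here without FP16 rounding.
        self.masters = [p.detach().clone().float() for p in self.params]
        self.lr = lr

    def step(self):
        with torch.no_grad():
            for p, m in zip(self.params, self.masters):
                if p.grad is None:
                    continue
                m -= self.lr * p.grad.float()  # update in FP32
                p.copy_(m.half())              # write back to FP16 for the next forward

    def zero_grad(self):
        for p in self.params:
            p.grad = None

# FP16 parameter with a simple elementwise loss (runs on CPU as well).
w = torch.nn.Parameter(torch.ones(4, dtype=torch.float16))
opt = MasterWeightSGD([w])
x = torch.full((4,), 0.5, dtype=torch.float16)

loss = (w * x).sum()
loss.backward()
opt.step()
print(w.dtype, opt.masters[0].dtype)  # torch.float16 torch.float32
```

In a real setup you would typically also add gradient scaling to avoid FP16 gradient underflow; this sketch only shows the master-weight bookkeeping.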