I was going through the AMP docs and saw that autocast is applied only during the forward phase and not in the backward phase.
Can someone shed some light on why fp16 is not used during the backward phase?
The cast operations that autocast inserts are differentiable, so autograd applies them in the backward pass as well: each backward op runs in the same dtype that autocast chose for the corresponding forward op. You therefore only need to wrap the forward pass in autocast (unless you want to define a custom backward method).
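
To see this for yourself, here is a minimal sketch that records the dtype of the gradient flowing through an autocasted op. The toy `nn.Linear` model, the random input, and the `seen` dict are made up for illustration. It also assumes CPU autocast with bfloat16 instead of fp16, only so that it runs without a GPU; the same mechanism should apply with `device_type="cuda"` and float16.

```python
import torch
import torch.nn as nn

# Hypothetical toy model and input, used only for illustration.
model = nn.Linear(8, 4)
x = torch.randn(2, 8)

seen = {}  # filled in by the hook during backward

# Only the forward pass is wrapped in autocast.
# Assumption: CPU + bfloat16 so the sketch runs without a GPU.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(x)            # linear runs in the lower-precision dtype
    loss = out.float().sum()  # differentiable cast back to float32

# Record the dtype of the gradient that reaches `out` during backward.
out.register_hook(lambda grad: seen.update(grad_out_dtype=grad.dtype))

# backward() is called outside the autocast block.
loss.backward()

print(out.dtype)                # dtype autocast chose for the forward op
print(seen["grad_out_dtype"])   # dtype of the gradient arriving at that op
print(model.weight.grad.dtype)  # dtype of the parameter gradient
```

The gradient that reaches `out` should have the same low-precision dtype as `out` itself, even though `backward()` runs outside the autocast block, while `model.weight.grad` should match the float32 parameter because the cast on the weight is undone on the way back.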