How to handle the value outside the fp16 range when casting?

I know that fp32 and fp16 have different ranges.

How does PyTorch handle a tensor whose values are outside the fp16 range when casting?

For example, `x = torch.Tensor([66666])`.

If it casts x to inf, does that mean the gradient will be NaN and the training will fail?

Yes, a direct cast to float16 will overflow and create invalid values. During mixed-precision training with float16 this could happen if the loss scaling factor is too large, causing the gradients to overflow.
The scaler.step(optimizer) call skips the underlying optimizer.step() if invalid gradients are detected, and the scaler then decreases the scaling factor until the gradients contain valid values again.
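
To make this concrete, here is a minimal sketch (the toy model, values, and hyperparameters are just assumptions for illustration, and it needs a CUDA device) showing the overflow on a direct cast and the usual GradScaler loop in which invalid gradients cause a skipped step:

```python
import torch

# Values above the float16 maximum (~65504) overflow to inf on a direct cast.
x = torch.tensor([66666.0])
print(x.half())  # tensor([inf], dtype=torch.float16)

# Minimal mixed-precision loop: if the scaled gradients contain inf/NaN,
# scaler.step() skips optimizer.step() and scaler.update() lowers the scale
# for the next iteration.
model = torch.nn.Linear(10, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

data = torch.randn(8, 10, device="cuda")
target = torch.randn(8, 1, device="cuda")

for _ in range(3):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = torch.nn.functional.mse_loss(model(data), target)
    scaler.scale(loss).backward()
    scaler.step(optimizer)  # skipped internally if invalid gradients were found
    scaler.update()         # reduces the scale after a skipped step
```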


Hi @ptrblck, thanks for your reply. Is it possible to paste a link to the source code of the skip behavior? I want to explore the details. :heart:

Yes, you can take a look at `GradScaler.step` and `_maybe_opt_step` to see the implementation.
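
There is no dedicated flag exposed for this, but a common workaround to detect a skipped step from user code is to compare the scale before and after `scaler.update()`; with the default settings the scale only shrinks when invalid gradients were found (a small sketch, continuing the loop from the earlier example):

```python
scale_before = scaler.get_scale()
scaler.step(optimizer)   # may be skipped internally
scaler.update()
step_was_skipped = scaler.get_scale() < scale_before
```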


Got it, thanks a lot :blush:

Hello, I would like to ask: if I use fp16 for the convolution operation during inference, is the result of the multiply-accumulate fp32 or fp16? If it is fp32, how does PyTorch truncate it back to fp16? Is there any specific code for this part that I can refer to? Looking forward to your reply.

Take a look at the Automatic Mixed Precision Package to see how PyTorch applies amp.
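
As a quick sanity check (a minimal sketch, assuming a CUDA device is available), you can run a convolution under autocast and inspect the output dtype; autocast feeds the convolution float16 inputs and returns a float16 tensor, while the details of the internal accumulation are left to the backend kernels:

```python
import torch

conv = torch.nn.Conv2d(3, 8, kernel_size=3).cuda()
x = torch.randn(1, 3, 32, 32, device="cuda")

with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=torch.float16):
    out = conv(x)

print(out.dtype)  # torch.float16
```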