Is there a way to force some functions to be run with FP32 precision?

Currently I want to train my model using FP16. However, my model uses some CUDA ops borrowed from someone else's repo, and those ops only accept FP32 input. Since I'm not familiar with CUDA, I don't want to modify them. A workaround I've come up with is to always use FP32 in the forward function of these ops but FP16 in the rest of the model. So I'm wondering whether there is any way in PyTorch to do this. Something like a decorator would be extremely helpful:

class CUDAOpWrapper(nn.Module):

    ...

    @torch.amp.forcefp32()  # a hypothetical decorator indicating this function should run in full precision
    def forward(self, input):
        ...

Thanks!

Yes, you can disable autocast for a custom method as described in this example.
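
For reference, here is a minimal sketch of that approach: locally disable autocast and cast the (possibly FP16) input back to FP32 before calling the borrowed op. The wrapper class and the op attribute are just placeholders for your own module:

import torch
import torch.nn as nn

class FP32OpWrapper(nn.Module):
    def __init__(self, op):
        super().__init__()
        self.op = op  # the borrowed CUDA op that only accepts FP32

    def forward(self, x):
        # A nested autocast(enabled=False) region turns mixed precision off here,
        # and .float() restores FP32 in case the input arrived as FP16.
        with torch.autocast(device_type="cuda", enabled=False):
            return self.op(x.float())

The wrapper's output will be FP32; downstream ops that autocast to FP16 will cast it down again where appropriate.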

Ahhh yes. That’s exactly what I want! Thanks so much for the reply.

Interesting, but is it sufficient to disable autocast in the forward pass if I use common operators in the custom block (such as multiplying by an nn.Parameter)? Or is something required at the backward level?

The link shows two examples:

  1. The first one shows how to disable autocast and explicitly cast the tensors to the dtype expected by your custom function. No backward implementation is shown or needed.
  2. The second example shows the addition of the @custom_fwd and @custom_bwd decorators to an autograd.Function, which can be used if you are the author of this custom function.

Which approach to use depends on your workflow and your control over the custom autograd.Functions:

If you're the function's author (or can alter its definition), the better solution is to use the torch.amp.custom_fwd() and torch.amp.custom_bwd() decorators, as sketched below.
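
A minimal sketch of that decorator approach might look like the following, assuming a recent PyTorch where torch.amp.custom_fwd/custom_bwd take a device_type argument (older versions expose them as torch.cuda.amp.custom_fwd/custom_bwd without it). The squaring op here is just a placeholder for the real FP32-only CUDA kernel:

import torch
from torch.amp import custom_fwd, custom_bwd

class MyFP32Op(torch.autograd.Function):
    @staticmethod
    @custom_fwd(device_type="cuda", cast_inputs=torch.float32)
    def forward(ctx, x):
        # Under autocast, floating-point CUDA inputs are cast to FP32 before
        # entering here and autocast is disabled inside, so an FP32-only
        # kernel is safe to call.
        ctx.save_for_backward(x)
        return x * x  # placeholder for the real CUDA op

    @staticmethod
    @custom_bwd(device_type="cuda")
    def backward(ctx, grad_output):
        # Runs with the same autocast state as the forward pass.
        (x,) = ctx.saved_tensors
        return 2 * x * grad_output

You would then call MyFP32Op.apply(x) from the FP16 part of the model, and autocast takes care of the dtype transitions at the boundary.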