Is there a way to force some functions to be run with FP32 precision?

Currently I want to train my model in FP16. However, my model uses some CUDA ops borrowed from someone else's repo, and those ops only accept FP32 input. Since I'm not familiar with CUDA, I'd rather not modify them. A workaround I've come up with is to always run the forward function of these ops in FP32 while keeping FP16 for the rest of the model. So I'm wondering: is there any way in PyTorch to do this? Something like a decorator would be extremely helpful:

class CUDAOpWrapper(nn.Module):

    @torch.amp.forcefp32()  # hypothetical decorator indicating this function should run in full precision
    def forward(self, input):
        ...

Yes, you can disable autocast for a custom method as described in this example.
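A minimal sketch of that pattern: wrap the FP32-only op in a module whose `forward` enters a disabled-autocast region and casts its input up to FP32 explicitly (inputs may arrive as FP16 from the surrounding autocast region). The class and `op` callable here are placeholders, not part of the original post.

```python
import torch
import torch.nn as nn

class CUDAOpWrapper(nn.Module):
    """Runs an FP32-only op in full precision inside an autocast model."""

    def __init__(self, op):
        super().__init__()
        self.op = op  # placeholder for the borrowed CUDA op

    def forward(self, input):
        # Locally disable autocast and cast the input to FP32, since
        # tensors produced under the enclosing autocast region may be FP16.
        with torch.autocast(device_type="cuda", enabled=False):
            return self.op(input.float())
```

The rest of the model can then run under `torch.autocast(device_type="cuda")` as usual, and only this wrapped op executes in FP32.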

Ahhh yes. That’s exactly what I want! Thanks so much for the reply.