Disabling mixed precision in my own layers

Hi, after reading the docs about mixed precision, amp_example
I’m still confused about several points.

Let’s say I have two networks: one is the standard resnet50 and the other is a sparse conv layer.
Input images are first passed through resnet50 and then the sparse convs.

If I only want to use half precision for resnet50 and keep float32 for the sparse conv layer (so I don’t have to modify its code),
do I only need to wrap the model in the autocast context and disable it before the sparse conv layers?
like,

with autocast():
    out = resnet50(x)
    with autocast(enabled=False):
        out = sparseconv(out.float())

right?

And as far as I know, gradients are scaled during mixed-precision training.
If I have to write my own backward function for the sparse conv layers (wrapped in autocast(enabled=False)),
do I still need to account for the scale?

1 Like

Your autocast example looks correct.

Yes, the gradients are scaled if you are using a GradScaler object.

No, gradient scaling and autocasting work together but are independent of each other.
You don’t need to explicitly implement gradient scaling in the custom layer’s backward; GradScaler scales the loss and unscales the gradients before the optimizer step.
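
To make the second point concrete, here is a minimal sketch of a training step under these assumptions: MySparseConvFn is a made-up stand-in for a sparse conv op with a custom backward, and a small nn.Linear stands in for resnet50. The custom backward returns an ordinary gradient and never sees the loss scale; GradScaler applies the scale to the loss and removes it again inside scaler.step().

import torch
from torch import nn
from torch.cuda.amp import autocast, GradScaler

class MySparseConvFn(torch.autograd.Function):
    # placeholder op: identity forward / identity backward
    @staticmethod
    def forward(ctx, x):
        return x.clone()

    @staticmethod
    def backward(ctx, grad_out):
        # plain gradient; no knowledge of the GradScaler scale is needed here
        return grad_out

resnet50 = nn.Linear(16, 16).cuda()   # stand-in for the real resnet50
optimizer = torch.optim.SGD(resnet50.parameters(), lr=1e-3)
scaler = GradScaler()

x = torch.randn(8, 16, device="cuda")

with autocast():
    out = resnet50(x)                            # float16 where appropriate
    with autocast(enabled=False):
        out = MySparseConvFn.apply(out.float())  # stays in float32
    loss = out.sum()

scaler.scale(loss).backward()   # the scale is applied to the loss only
scaler.step(optimizer)          # gradients are unscaled before the step
scaler.update()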

2 Likes

Imagine we have an nn.Sequential model with a specific layer type that should not be autocast. Is there a way to “wrap” that layer so that it is ignored by autocast?

If you are using custom modules, you could use the @autocast decorator on the forward method and disable it there. If you are using built-in modules, you could wrap them in a custom class and disable autocast in the same way.
Alternatively, you could split the nn.Sequential container and only decorate the parts that should use autocast.
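
As a rough sketch of the wrapping approach (FP32Wrapper is a made-up name, and nn.Linear is just a placeholder for the layer you want to keep in float32):

import torch
from torch import nn
from torch.cuda.amp import autocast

class FP32Wrapper(nn.Module):
    def __init__(self, module):
        super().__init__()
        self.module = module

    @autocast(enabled=False)
    def forward(self, x):
        # cast back to float32, since the surrounding autocast region
        # may have produced float16 activations
        return self.module(x.float())

model = nn.Sequential(
    nn.Linear(16, 32),             # autocast picks the dtype here
    nn.ReLU(),
    FP32Wrapper(nn.Linear(32, 8))  # always runs in float32
).cuda()

x = torch.randn(4, 16, device="cuda")
with autocast():
    out = model(x)
print(out.dtype)  # torch.float32, produced by the wrapped layer

For a custom module you could apply the same @autocast(enabled=False) decorator to its own forward directly instead of wrapping it.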

1 Like