Autocast for DataParallel nn.Sequential?

I am trying to get automatic mixed precision’s autocast to work for an nn.Sequential model. The tutorial here:
https://pytorch.org/docs/master/notes/amp_examples.html#dataparallel-in-a-single-process

  • suggests overriding the model's forward() so that autocast is applied inside it when using multiple GPUs via DataParallel, since DataParallel runs each replica's forward in a side thread and the autocast state set in the main thread does not carry over.
    What is the best way to do this for the nn.Sequential case? A sketch of the pattern the tutorial describes is below.
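For context, as I understand the tutorial, the pattern it describes looks roughly like this for an ordinary custom module (MyModel and its single Linear layer are just placeholders I made up):

import torch.nn as nn
from torch.cuda.amp import autocast

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 8)  # placeholder layer

    def forward(self, x):
        # autocast is entered inside forward() so it is active in the
        # side threads DataParallel spawns for each replica.
        with autocast():
            return self.linear(x)

model = nn.DataParallel(MyModel().cuda())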

Here is how I currently define my model:

layers = [<bunch of layers>]
model = nn.Sequential(*layers)
model = nn.DataParallel(model)
model.to(device)

I came up with this solution: extend nn.Sequential and override its forward() like this:

import torch.nn as nn
from torch.cuda.amp import autocast

class AutocastSequential(nn.Sequential):
    # Applying autocast as a decorator on forward() ensures it is enabled
    # in the side threads DataParallel spawns for each GPU.
    @autocast()
    def forward(self, input):
        for module in self:
            input = module(input)
        return input

It executes, but I am unsure how to check whether mixed precision is actually enabled on all GPUs.
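One way I can think of to sanity-check it (a sketch with made-up layers and batch size) would be to assert torch.is_autocast_enabled() inside forward() and print the dtype each replica produces:

import torch
import torch.nn as nn
from torch.cuda.amp import autocast

class AutocastSequential(nn.Sequential):
    @autocast()
    def forward(self, input):
        # forward() runs once per replica thread under DataParallel,
        # so this assertion covers every GPU.
        assert torch.is_autocast_enabled()
        for module in self:
            input = module(input)
        print(f"{input.device}: output dtype {input.dtype}")
        return input

model = nn.DataParallel(AutocastSequential(nn.Linear(16, 16), nn.ReLU()).cuda())
out = model(torch.randn(8, 16, device="cuda"))
print(out.dtype)  # expect torch.float16 if the Linear layers ran under autocast

The gathered output dtype only reflects the last op on each replica, so the per-device print inside forward() seems like the more direct check. Is that a reasonable way to verify it?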