I am trying to get automatic mixed precision’s autocast to work for an nn.Sequential model. The tutorial here:
https://pytorch.org/docs/master/notes/amp_examples.html#dataparallel-in-a-single-process
- suggests overriding forward() so that autocast runs inside each replica when using multiple GPUs via DataParallel.
What is the best way to do this for the nn.Sequential case?
Here is how I currently define my model:
layers = [<bunch of layers>]
model = nn.Sequential(*layers)
model = nn.DataParallel(model)
model.to(device)
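For reference, this is the direction I was imagining based on the tutorial's pattern: wrap the nn.Sequential in a small nn.Module whose forward() runs under autocast, so each DataParallel replica enters autocast in its own thread. The AutocastSequential name and the placeholder layers below are just mine, not from the docs:

import torch
import torch.nn as nn

device = torch.device("cuda")

class AutocastSequential(nn.Module):
    """Hypothetical wrapper: runs the wrapped Sequential under autocast."""
    def __init__(self, *layers):
        super().__init__()
        self.seq = nn.Sequential(*layers)

    def forward(self, x):
        # autocast state is thread-local, so it is entered here, inside
        # the forward that each DataParallel replica executes
        with torch.cuda.amp.autocast():
            return self.seq(x)

# placeholder layers, standing in for my real ones
layers = [nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8)]
model = nn.DataParallel(AutocastSequential(*layers)).to(device)

Is a wrapper module like this the idiomatic approach, or is there a way to get the same effect without giving up the plain nn.Sequential?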