Autocast from DataParallel module

I’m trying to reduce memory usage by applying autocast in the forward method of only some modules inside the model.
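For context, it looks roughly like this (a minimal sketch; the module and layer names are placeholders):

```python
import torch
import torch.nn as nn

class MixedPrecisionBlock(nn.Module):
    """Only this block runs under autocast; the rest of the model stays in full precision."""
    def __init__(self, dim):
        super().__init__()
        self.fc1 = nn.Linear(dim, dim)
        self.fc2 = nn.Linear(dim, dim)

    def forward(self, x):
        # autocast is entered inside the forward of this submodule only
        with torch.autocast(device_type="cuda", dtype=torch.float16):
            return self.fc2(torch.relu(self.fc1(x)))
```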

However, when the model is wrapped in DataParallel, won’t performance suffer if I don’t tell autocast which specific device to use and just write torch.autocast("cuda", torch.float16)?

I want to know whether I can get better performance (memory or speed) by using something like torch.autocast(input.device, torch.float16) instead.

No, the device_type argument of autocast doesn’t care about the actual device or its ID (cuda:0, cuda:1, etc.), only about the type (cuda vs. cpu).
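For illustration, here is a minimal sketch (model, sizes, and names are arbitrary) showing that the plain type string is enough under nn.DataParallel: each replica’s forward runs on its own GPU, and the same torch.autocast("cuda", ...) context applies to whichever device the current replica is on.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(128, 128)

    def forward(self, x):
        # No device ID needed; "cuda" covers whichever GPU this replica runs on
        with torch.autocast(device_type="cuda", dtype=torch.float16):
            return self.fc(x)

if torch.cuda.is_available():
    model = nn.DataParallel(Net().cuda())  # replicas are created on all visible GPUs
    out = model(torch.randn(64, 128, device="cuda"))
    print(out.dtype)  # torch.float16 for the autocast-produced output
```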
