Possible to reweight GPU load with DataParallel?

Say I am training with 4 GPUs, but GPU 1 is also in charge of some other work for which I need to set aside memory. Is it possible to give GPUs 2-4 a bigger share of each batch and GPU 1 a smaller share, instead of splitting the batch evenly across all GPUs? (Currently the even split leads to an OOM error on GPU 1, as expected.)

I don’t think this is easily doable with nn.DataParallel, but it should be possible with DistributedDataParallel: each process loads its own data, so you can give the rank running on the busy GPU a smaller batch size. Note that nn.DataParallel is discouraged anyway; DistributedDataParallel is the recommended approach for multi-GPU training.
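A minimal sketch of that idea, assuming the GPU that needs spare memory is rank 0 and that the model, data, and batch sizes are placeholders. Two caveats: every rank must run the same number of steps, and DDP averages gradients across ranks regardless of batch size, so very uneven splits also reweight how much each sample contributes.

```python
# Launch with: torchrun --nproc_per_node=4 train.py
# Sketch only: model, data, and per-rank batch sizes are made up for illustration.
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)
    device = f"cuda:{rank}"

    # Smaller batch on the GPU that also does other work (assumed to be rank 0 here).
    per_rank_batch = {0: 8, 1: 16, 2: 16, 3: 16}[rank]

    model = DDP(nn.Linear(1024, 10).to(device), device_ids=[rank])  # placeholder model
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(100):  # same step count on every rank, or DDP will hang
        x = torch.randn(per_rank_batch, 1024, device=device)            # placeholder data
        y = torch.randint(0, 10, (per_rank_batch,), device=device)
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()  # DDP averages gradients across ranks here
        opt.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

If you are using a DistributedSampler, keep in mind it hands each rank the same number of samples, so with different per-rank batch sizes you would need to cap the number of iterations yourself to keep the ranks in step.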