DataParallel imbalanced memory usage

If you want the loss to be split among GPUs, just make your loss layer part of the module wrapped in DataParallel and add a sum or mean over what you get out of it. That way, if you use DataParallel on 4 devices, only 4 extra numbers are allocated on the output_device. Is that a good solution for you?
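
A minimal sketch of what that could look like (the `ModelWithLoss` wrapper name, the placeholder model/criterion, and the 4-GPU `device_ids` are just illustrative assumptions):

```python
import torch
import torch.nn as nn

class ModelWithLoss(nn.Module):
    """Illustrative wrapper: computes the loss inside forward(),
    so each DataParallel replica does it on its own GPU."""
    def __init__(self, model, criterion):
        super().__init__()
        self.model = model
        self.criterion = criterion

    def forward(self, inputs, targets):
        outputs = self.model(inputs)
        # Return a shape-[1] tensor so DataParallel can gather
        # one scalar loss per device on the output_device.
        return self.criterion(outputs, targets).unsqueeze(0)

model = nn.Linear(128, 10)            # placeholder model
criterion = nn.CrossEntropyLoss()     # placeholder loss
wrapped = nn.DataParallel(ModelWithLoss(model, criterion).cuda(),
                          device_ids=[0, 1, 2, 3])  # assumes 4 GPUs

inputs = torch.randn(64, 128).cuda()
targets = torch.randint(0, 10, (64,)).cuda()

# Gathered result has one loss value per device; reduce it to a single scalar.
loss = wrapped(inputs, targets).mean()
loss.backward()
```

Only the 4 per-device loss values end up on the output_device, instead of the full per-device outputs.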
