My model fits on a single GPU and trains fine, but it is slow. To speed up the computation, I want to parallelise the model across multiple GPUs.
Instead of calling model.cuda(), I now wrap the model like this:

model = torch.nn.DataParallel(model).cuda()
When I run this with 2 GPUs, it works with batch_size=1 but only uses a single GPU; as soon as I increase the batch size, I get a CUDA out-of-memory error.
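For reference, here is a minimal sketch of my setup. The model and input here are toy placeholders I made up for this post, not my actual code, but the DataParallel wrapping is exactly what I do:

```python
import torch
import torch.nn as nn

# Toy stand-in for my real model (placeholder architecture)
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(1024, 4096),
            nn.ReLU(),
            nn.Linear(4096, 10),
        )

    def forward(self, x):
        return self.fc(x)

model = Net()
# Wrap for multi-GPU data parallelism; by default DataParallel uses
# all visible GPUs and splits the input along dim 0 (the batch dim)
model = torch.nn.DataParallel(model).cuda()

batch_size = 1  # works; raising this is what triggers the OOM for me
x = torch.randn(batch_size, 1024).cuda()  # dummy input
out = model(x)
print(out.shape, "visible GPUs:", torch.cuda.device_count())
```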
Any pointers would be appreciated.