Multiple GPUs to run the same instance of a model with different inputs

Hi,

First, I want to thank you for your effort in maintaining this forum, which has made learning PyTorch much easier for me. I have a question about using two GPUs to run the same instance of a model with different inputs (think of them as different samples from a distribution). I am using two GPUs because my framework is computationally heavy and a single run of my network fills up the memory of one GPU.

My question is whether it is possible to update the same network, gen_net, based on different inputs. I ask because, if I use different GPUs, I thought PyTorch would update separate instances of my network stored on the different GPUs during backpropagation. What I want is to update a single network while using different GPUs.

Code snippet

gen_net = gen_net.to(torch.device(0))   # model and first input on GPU 0
by_gen_1 = gen_net(bx)                  # bx is my input sample

gen_net = gen_net.to(torch.device(1))   # move the same model to GPU 1
bx = bx.to(torch.device(1))
by_gen_2 = gen_net(bx)                  # effectively a different input, since noise is added to bx inside gen_net

loss = loss_func(by_gen_1.to(torch.device(1)), by_gen_2)  # both outputs need to live on the same device
loss.backward()

If I understand your use case correctly, you are trying to clone the same model onto GPU0 and GPU1, use different inputs, and calculate the loss based on both outputs.
Once this loss is calculated, you would like to calculate the gradients for both models.
How should these gradients be handled? Since the inputs were different, each model clone would get different gradients.

If you would like to reduce the gradients to a single model, I think you should be able to use a distributed approach, e.g. via DistributedDataParallel (DDP).
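A minimal DDP sketch of that setup might look roughly like the code below. GenNet, load_batch, and loss_func are placeholders for your model class, data loading, and a per-replica loss, and the init_method/port is just an example. Note that DDP assumes the loss can be computed per replica; a loss that compares the outputs from both GPUs, as in your snippet, would need the outputs gathered onto one device first.

import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def train(rank, world_size):
    # one process per GPU; NCCL backend for GPU-to-GPU communication
    dist.init_process_group("nccl", init_method="tcp://127.0.0.1:29500",
                            rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    gen_net = GenNet().to(rank)                 # GenNet: placeholder for your model class
    gen_net = DDP(gen_net, device_ids=[rank])
    optimizer = torch.optim.Adam(gen_net.parameters(), lr=1e-3)

    bx = load_batch(rank)                       # load_batch: placeholder, each rank gets its own sample
    by_gen = gen_net(bx.to(rank))
    loss = loss_func(by_gen)                    # loss_func: placeholder per-replica loss
    loss.backward()                             # DDP all-reduces (averages) the gradients across both GPUs here
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(train, args=(2,), nprocs=2)

Each process keeps its own replica, but after loss.backward() the gradients are averaged across all replicas, so every copy performs the same parameter update and the replicas stay in sync.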

Thank you for the reply. I would want the gradients to accumulate into the same model, but I do not know how that would happen if I have two copies of the model on different GPUs.
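To make it concrete, this is roughly what I imagined doing by hand (the deepcopy and the gradient-summing loop are just my guess, not tested code; optimizer is whatever optimizer I already have for gen_net):

import copy
import torch

dev0, dev1 = torch.device(0), torch.device(1)

gen_net = gen_net.to(dev0)                        # the copy whose parameters I want to update
gen_net_clone = copy.deepcopy(gen_net).to(dev1)   # temporary replica on the second GPU

by_gen_1 = gen_net(bx.to(dev0))
by_gen_2 = gen_net_clone(bx.to(dev1))

loss = loss_func(by_gen_1, by_gen_2.to(dev0))     # bring both outputs onto one device
loss.backward()                                   # fills .grad on both copies

# fold the clone's gradients back into the main model before the optimizer step
with torch.no_grad():
    for p, p_clone in zip(gen_net.parameters(), gen_net_clone.parameters()):
        if p_clone.grad is not None:
            if p.grad is None:
                p.grad = p_clone.grad.to(dev0)
            else:
                p.grad.add_(p_clone.grad.to(dev0))

optimizer.step()
optimizer.zero_grad()

That way only gen_net on GPU 0 ever gets optimizer updates, and the clone could be re-created (or have its weights copied from gen_net) before the next iteration. Would something like this work, or is DDP still the better approach?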