I implemented an architecture that handles multiple inputs, each processed by its own encoder. To speed things up, I want to train my model on multiple GPUs. This is my code:
```python
def forward(self, x):
    ''' x: list of input tensors '''
    h = list()
    for i, x_i in enumerate(x):
        h_i = self.encoders[i](x_i)
        h.append(h_i)
    z = torch.cat(h, dim=0)
    y_pred = self.classifier(z)
    return y_pred
```
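For context, the encoders and the classifier are set up roughly like this (the class and argument names here are just placeholders):

```python
import torch
import torch.nn as nn

class MultiInputModel(nn.Module):
    def __init__(self, encoders, classifier):
        super().__init__()
        # one encoder per input, kept in a ModuleList so their
        # parameters are registered with the model
        self.encoders = nn.ModuleList(encoders)
        self.classifier = classifier
```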
If I simply used the data_parallel class here, each encoder would get copied to every GPU. However, I think it would be faster if each encoder were trained on its own GPU. Is there any way to achieve this?
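What I have in mind is roughly the following sketch, where each encoder lives on its own GPU and the outputs are moved back to one device for the classifier (the device ids assume one GPU per encoder, and I don't know whether this is the right approach):

```python
def __init__(self, encoders, classifier):
    super().__init__()
    # place each encoder on its own GPU
    # (assumes len(encoders) <= torch.cuda.device_count())
    self.encoders = nn.ModuleList(
        enc.to(f'cuda:{i}') for i, enc in enumerate(encoders)
    )
    # keep the classifier on the first GPU
    self.classifier = classifier.to('cuda:0')

def forward(self, x):
    ''' x: list of input tensors '''
    h = list()
    for i, x_i in enumerate(x):
        # move each input to its encoder's GPU, then bring the
        # result back to cuda:0 where the classifier lives
        h_i = self.encoders[i](x_i.to(f'cuda:{i}'))
        h.append(h_i.to('cuda:0'))
    z = torch.cat(h, dim=0)
    y_pred = self.classifier(z)
    return y_pred
```

But with this the encoders run one after another instead of in parallel, which is why I am asking whether there is a proper way to do it.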