Hi everyone,
I implemented an architecture that handles multiple inputs, each processed by its own encoder. To speed things up, I want to train the model on multiple GPUs. This is my code:
def forward(self, x):
    '''x: list of input tensors, one per encoder'''
    h = []
    for i, x_i in enumerate(x):
        h_i = self.encoders[i](x_i)  # encode the i-th input with its own encoder
        h.append(h_i)
    z = torch.cat(h, dim=0)  # concatenate encoder outputs
    y_pred = self.classifier(z)
    return y_pred
If I simply wrapped the model in nn.DataParallel here, every encoder would be replicated on every GPU. I think it would be faster if each encoder instead lived and trained on its own GPU. Is there a way to achieve this?
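Roughly, this is what I have in mind (a sketch, not working training code: the class name `MultiEncoderModel`, the `devices` argument, and the choice to keep the classifier on the first device are all my own placeholders):

```python
import torch
import torch.nn as nn

class MultiEncoderModel(nn.Module):
    def __init__(self, encoders, classifier, devices):
        '''encoders: list of nn.Module, devices: one device per encoder'''
        super().__init__()
        # place each encoder on its own device (e.g. "cuda:0", "cuda:1", ...)
        self.encoders = nn.ModuleList(
            enc.to(dev) for enc, dev in zip(encoders, devices)
        )
        self.devices = list(devices)
        # keep the classifier on the first device; encoder outputs are moved there
        self.classifier = classifier.to(self.devices[0])

    def forward(self, x):
        '''x: list of input tensors, one per encoder'''
        h = []
        for x_i, enc, dev in zip(x, self.encoders, self.devices):
            # move each input to its encoder's device before encoding
            h.append(enc(x_i.to(dev)))
        # gather all encoder outputs onto the classifier's device
        z = torch.cat([h_i.to(self.devices[0]) for h_i in h], dim=0)
        return self.classifier(z)
```

Since CUDA kernels are launched asynchronously, I would expect the encoder forward passes to overlap across GPUs even though they are issued in a Python loop, but I am not sure this is the idiomatic way to do it.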
Thank you!