I have been having some success with DataParallel, but one issue still eludes me.
I want to use DataParallel on sequential data that gets passed to an RNN, so I have the following structure:

```python
self.model = nn.DataParallel(self.model, ...)
self.model(input, seq_lengths)
```
where `input` has shape `(batch_size, feature_length, max_seq_length)`
and `seq_lengths` has shape `(batch_size)`.
However, the problem I am running into is that `seq_lengths` does not need to go on the GPU; it is only used by the CPU part of the model's `forward()`. Because `seq_lengths` is a numpy array, it is not split during DataParallel's scatter step, so each replica receives the full-length array rather than the slice matching its chunk of the batch. Of course, I could wrap it in a Tensor, but then DataParallel will insist on sending it to the GPU, which is a waste since only the CPU processing needs it. What is the best way to deal with this?
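For what it's worth, here is a minimal sketch of the Tensor-wrapping workaround I described: the lengths are passed as a tensor so DataParallel scatters them along dim 0 together with the input, and `forward()` immediately moves its slice back to the CPU, which is where `pack_padded_sequence` wants the lengths anyway. The model class `SeqModel` and its sizes are made up for illustration, and I assume a recent PyTorch that supports `enforce_sorted=False`:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class SeqModel(nn.Module):
    """Hypothetical model: packs a padded batch before feeding a GRU."""
    def __init__(self, feat, hidden):
        super().__init__()
        self.rnn = nn.GRU(feat, hidden, batch_first=True)

    def forward(self, x, seq_lengths):
        # x: (batch, feature_length, max_seq_length), as in my setup;
        # seq_lengths: (batch,) tensor. Because seq_lengths is a tensor,
        # DataParallel scatters it along dim 0 together with x, so each
        # replica sees matching slices of both arguments.
        x = x.transpose(1, 2)             # -> (batch, seq, feat) for the RNN
        lengths = seq_lengths.cpu()       # CPU copy for the packing bookkeeping
        packed = pack_padded_sequence(x, lengths, batch_first=True,
                                      enforce_sorted=False)
        out, _ = self.rnn(packed)
        out, _ = pad_packed_sequence(out, batch_first=True)
        return out

model = nn.DataParallel(SeqModel(feat=8, hidden=16))
x = torch.randn(4, 8, 10)                 # (batch, feature_length, max_seq_length)
lengths = torch.tensor([10, 7, 5, 3])
y = model(x, lengths)
print(tuple(y.shape))                     # (4, 10, 16)
```

The round trip to the GPU and back is the waste I'd like to avoid, but at least each replica gets the correct per-chunk lengths this way.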