Multiple Models in DataParallel: RuntimeError: arguments are located on different GPUs

The following is a traceback. At the moment I think the error comes from having two models (a generator and a discriminator) wrapped with DataParallel separately, which leads to the failure at D(G(z)).

What’s the way to fix this?

  File "/home/jerin/code/fairseq/fairseq/models/lstm.py", line 207, in forward
    x = self.embed_tokens(src_tokens)
  File "/home/jerin/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jerin/.local/lib/python3.5/site-packages/torch/nn/modules/sparse.py", line 110, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/jerin/.local/lib/python3.5/site-packages/torch/nn/functional.py", line 1110, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: arguments are located on different GPUs at /pytorch/aten/src/THC/generic/THCTensorIndex.cu:513
Exception ignored in: <bound method tqdm.__del__ of | epoch 000:   0%| | 0/3750 [00:08<?, ?it/s]>

I met the same problem. Have you fixed it?

I was trying to switch models between GPUs. I solved this by putting all models in one module that inherits from nn.Module and applying DataParallel to that. In the forward method of that wrapper I call the individual models (generator and discriminator).
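
A minimal sketch of that approach, with placeholder sub-models standing in for the real generator and discriminator:

import torch
import torch.nn as nn

class GAN(nn.Module):
    """Wrapper holding both sub-models, so a single DataParallel
    replicates generator and discriminator together onto each GPU."""
    def __init__(self, generator, discriminator):
        super().__init__()
        self.generator = generator
        self.discriminator = discriminator

    def forward(self, z):
        # Both calls happen inside one replica, so the generator's
        # output and the discriminator's weights share a device.
        fake = self.generator(z)
        return self.discriminator(fake)

# Placeholder sub-models; substitute the real generator/discriminator.
G = nn.Sequential(nn.Linear(100, 128), nn.ReLU(), nn.Linear(128, 128))
D = nn.Sequential(nn.Linear(128, 1), nn.Sigmoid())

model = nn.DataParallel(GAN(G, D)).cuda()
scores = model(torch.randn(64, 100).cuda())   # no cross-GPU mismatch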

OK, I only have one model, but I get the same embedding error. Thank you anyway.

I have the same error when I try to use multiple embeddings.

I also use only one embedding and one model, but I still get the error. Did you fix the problem?
I load the embedding from a numpy array, but still hit the same error.
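
One thing worth checking (a hypothetical sketch, not your actual code): if the numpy weights are kept as a plain tensor attribute pinned to one device, DataParallel will not replicate them. Registering them through nn.Embedding.from_pretrained makes them an ordinary parameter that is copied to every GPU:

import numpy as np
import torch
import torch.nn as nn

weights = np.load("embeddings.npy")   # hypothetical path, shape (vocab, dim)

class TextModel(nn.Module):
    def __init__(self, weights):
        super().__init__()
        # from_pretrained registers the array as an nn.Parameter, so
        # DataParallel replicates it alongside the rest of the model.
        self.embed = nn.Embedding.from_pretrained(
            torch.from_numpy(weights).float(), freeze=False)
        self.fc = nn.Linear(weights.shape[1], 2)

    def forward(self, tokens):
        return self.fc(self.embed(tokens).mean(dim=1))

model = nn.DataParallel(TextModel(weights)).cuda()
logits = model(torch.randint(0, weights.shape[0], (32, 20)).cuda())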

When I apply DataParallel, I have the same problem with multiple models passing embedding weights to each other.
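
If the embedding really has to be shared between models, the same workaround applies: keep one nn.Embedding instance inside a single top-level module and wrap that, so each DataParallel replica preserves the sharing. A rough sketch with made-up names:

import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, embed):
        super().__init__()
        self.embed = embed                     # shared instance
        self.rnn = nn.GRU(embed.embedding_dim, 64, batch_first=True)

    def forward(self, tokens):
        out, _ = self.rnn(self.embed(tokens))
        return out

class Decoder(nn.Module):
    def __init__(self, embed):
        super().__init__()
        self.embed = embed                     # same shared instance
        self.out = nn.Linear(embed.embedding_dim, embed.num_embeddings)

    def forward(self, tokens):
        return self.out(self.embed(tokens))

class Seq2Seq(nn.Module):
    def __init__(self, vocab=1000, dim=32):
        super().__init__()
        shared = nn.Embedding(vocab, dim)      # one weight, used by both
        self.encoder = Encoder(shared)
        self.decoder = Decoder(shared)

    def forward(self, src, tgt):
        return self.encoder(src), self.decoder(tgt)

model = nn.DataParallel(Seq2Seq()).cuda()
src = torch.randint(0, 1000, (16, 10)).cuda()
tgt = torch.randint(0, 1000, (16, 10)).cuda()
enc_out, dec_out = model(src, tgt)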