Hello,
I have a simple (but large) encoder-decoder network. I have 2 GPUs, and I have put the encoder on one GPU and the decoder on the other. I have done this using .cuda(0) and .cuda(1) on the various modules and Variables. The forward pass works perfectly.
However, when I call loss.backward() it blows up with:
Traceback (most recent call last):
  File "/home/mpeyrard/Workspace/nmt/nmt_train_europarl.py", line 66, in <module>
    train_nmt(args.train, args.vocabulary)
  File "/home/mpeyrard/Workspace/nmt/nmt_train_europarl.py", line 40, in train_nmt
    loss.backward()
  File "/home/mpeyrard/anaconda3/envs/nmt/lib/python3.6/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/mpeyrard/anaconda3/envs/nmt/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
RuntimeError: arguments are located on different GPUs at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:269
Is splitting a model across GPUs like this supported by autograd, or am I missing a step (e.g. moving some tensor to the right device before computing the loss)?
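For context, here is a minimal sketch of the kind of split I'm attempting (not my actual model; the module names and sizes are illustrative, and it uses the `.to(device)` API with a CPU fallback so it runs even without two GPUs). The key points are that the intermediate activation is explicitly moved to the decoder's device, and the target lives on the same device as the output used in the loss:

```python
import torch
import torch.nn as nn

# Use two GPUs when available, otherwise fall back to CPU for both halves.
has_two_gpus = torch.cuda.device_count() >= 2
dev0 = torch.device("cuda:0") if has_two_gpus else torch.device("cpu")
dev1 = torch.device("cuda:1") if has_two_gpus else torch.device("cpu")

encoder = nn.Linear(8, 16).to(dev0)   # encoder parameters on GPU 0
decoder = nn.Linear(16, 4).to(dev1)   # decoder parameters on GPU 1

x = torch.randn(2, 8, device=dev0)       # input on the encoder's device
target = torch.randn(2, 4, device=dev1)  # target on the decoder's device

hidden = encoder(x)
hidden = hidden.to(dev1)   # explicit device transfer between the two halves
output = decoder(hidden)

# output and target are on the same device, so the loss is well-defined;
# backward() routes gradients back across the device boundary automatically.
loss = nn.functional.mse_loss(output, target)
loss.backward()
```

In my real code I believe every cross-device hop is handled like `hidden.to(dev1)` above, yet backward() still raises the error.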