After I wrapped my model with DataParallel, I get this error:
RuntimeError: Assertion `THCTensor_(checkGPU)(state, 5, input, gradOutput, gradWeight, sorted, indices)' failed. Some of weight/gradient/input tensors are located on different GPUs. Please move them to a single one. at /home/soumith/local/builder/wheel/pytorch-src/torch/lib/THCUNN/generic/LookupTable.cu:17
My model includes an nn.Embedding layer.
Is this error caused by the embedding layer?
If so, any suggestions on how to do multi-GPU training properly with embedding layers inside the model?
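For reference, here is the pattern I had expected to work: a minimal sketch (the model here is hypothetical, not my actual one) that moves the model's parameters to the default GPU *before* wrapping it in DataParallel, and sends the input indices to that same device. My understanding is that the error means the embedding weight and the index tensor ended up on different GPUs.

```python
import torch
import torch.nn as nn

# Hypothetical toy model containing an embedding layer.
class TextModel(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.fc = nn.Linear(embed_dim, 2)

    def forward(self, x):
        # x: LongTensor of token indices, shape (batch, seq_len)
        return self.fc(self.embed(x).mean(dim=1))

device = "cuda" if torch.cuda.is_available() else "cpu"

model = TextModel().to(device)    # parameters on the default device first
model = nn.DataParallel(model)    # then wrap; DataParallel scatters the batch

# Input indices must live on the same (default) device as the weights.
inputs = torch.randint(0, 1000, (8, 5), device=device)
out = model(inputs)               # shape: (8, 2)
```

Is this the right way to set it up, or does nn.Embedding need special handling under DataParallel?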
Has this bug been fixed? I am hitting the same error:
RuntimeError: Assertion `THCTensor_(checkGPU)(state, 5, input, gradOutput, gradWeight, sorted, indices)' failed. Some of weight/gradient/input tensors are located on different GPUs. Please move them to a single one. at /data/plat/peakzeng/solfware/pytorch/torch/lib/THCUNN/generic/LookupTable.cu:17