DataParallel fails in eval mode, a bug?

I am training a CNN with batch normalization layers, using a loss function based on L2 distance.
When I train the network with DataParallel, its evaluation results are dramatically worse in eval() mode than in train() mode.
I understand the difference between eval() and train(), and that results may differ somewhat between the two. However, I still think this issue is related to DataParallel: when I train the model without DataParallel, i.e., on a single GPU, the evaluation results in eval() and train() modes are almost identical.


Maybe try doubling the batch size?

Thanks for the response. In my case, I cannot increase the batch size due to memory limitations. However, I’ve noticed that when I evaluate the model trained with DataParallel, any change I make to, e.g., the batch size or the number of GPUs significantly degrades the performance, even when evaluating in train() mode.
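One possible explanation for this sensitivity: in train() mode a batch normalization layer normalizes with the statistics of whatever samples it sees, and DataParallel splits each batch across replicas, so each replica only sees its own chunk. Below is a minimal pure-Python sketch of that effect for a single feature. It is an illustration under assumptions, not the actual PyTorch implementation: `batch_norm_train`, `split_like_data_parallel`, and `batch_norm_replicated` are hypothetical helpers, and the numbers are made up.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # same default epsilon as PyTorch's BatchNorm layers


def batch_norm_train(values):
    """Normalize one feature using the statistics of the given values,
    which is what batch norm does in train() mode (affine part omitted)."""
    mean = fmean(values)
    var = pvariance(values, mu=mean)  # biased (population) variance
    return [(v - mean) / (var + EPS) ** 0.5 for v in values]


def split_like_data_parallel(values, num_replicas):
    """Hypothetical helper: split a batch into contiguous chunks,
    one per replica, along the batch dimension."""
    chunk = -(-len(values) // num_replicas)  # ceiling division
    return [values[i:i + chunk] for i in range(0, len(values), chunk)]


def batch_norm_replicated(values, num_replicas):
    """Each replica normalizes its own chunk with its own statistics."""
    out = []
    for chunk in split_like_data_parallel(values, num_replicas):
        out.extend(batch_norm_train(chunk))
    return out


# Made-up activations for one feature across a batch of 8 samples.
batch = [0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0]

single = batch_norm_train(batch)          # one device sees all 8 samples
two = batch_norm_replicated(batch, 2)     # two replicas see 4 samples each

max_diff = max(abs(a - b) for a, b in zip(single, two))
print("largest difference in normalized output:", max_diff)
```

With the same weights and the same input batch, the normalized activations change once the batch is split, so changing the batch size or the number of GPUs changes what the layers after batch norm receive.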

I ran into the same problem.
The error output:
File “/home/wen/PycharmProjects/Attention-Echino/”, line 141, in evaluate
output= model(input,target)
File “/home/wen/anaconda3/lib/python3.6/site-packages/torch/nn/modules/”, line 477, in __call__
result = self.forward(*input, **kwargs)
File “/home/wen/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/”, line 124, in forward
return self.gather(outputs, self.output_device)
File “/home/wen/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/”, line 136, in gather
return gather(outputs, output_device, dim=self.dim)
File “/home/wen/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/”, line 67, in gather
return gather_map(outputs)
File “/home/wen/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/”, line 54, in gather_map
return Gather.apply(target_device, dim, *outputs)
File “/home/wen/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/”, line 65, in forward
return comm.gather(inputs, ctx.dim, ctx.target_device)
File “/home/wen/anaconda3/lib/python3.6/site-packages/torch/cuda/”, line 160, in gather
return torch._C._gather(tensors, dim, destination)
RuntimeError: (gather at torch/csrc/cuda/comm.cpp:177)
frame #0: + 0xc48aea (0x7fc338656aea in /home/wen/anaconda3/lib/python3.6/site-packages/torch/
frame #1: + 0x39124b (0x7fc337d9f24b in /home/wen/anaconda3/lib/python3.6/site-packages/torch/
frame #2: _PyCFunction_FastCallDict + 0x154 (0x563b9f076744 in /home/wen/anaconda3/bin/python)
frame #3: + 0x19842c (0x563b9f0fd42c in /home/wen/anaconda3/bin/python)
frame #4: _PyEval_EvalFrameDefault + 0x30a (0x563b9f12238a in /home/wen/anaconda3/bin/python)
frame #5: + 0x1918e4 (0x563b9f0f68e4 in /home/wen/anaconda3/bin/python)
frame #6: + 0x192771 (0x563b9f0f7771 in /home/wen/anaconda3/bin/python)
frame #7: + 0x198505 (0x563b9f0fd505 in /home/wen/anaconda3/bin/python)