Error while using nn.DataParallel

Hi,

I am getting the error below when I try to use DataParallel in my code. Has anyone else seen a similar issue? I am not sure why it occurs: the code runs fine on a single GPU card, and the error only appears when I parallelize it over multiple GPU cards.

outps = model(event_reprs)

  File "/home/min/a/abcd/anaconda3/envs/pt17/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/min/a/abcd/anaconda3/envs/pt17/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 169, in forward
    return self.gather(outputs, self.output_device)
  File "/home/min/a/abcd/anaconda3/envs/pt17/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 181, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/min/a/abcd/anaconda3/envs/pt17/lib/python3.9/site-packages/torch/nn/parallel/scatter_gather.py", line 78, in gather
    res = gather_map(outputs)
  File "/home/min/a/abcd/anaconda3/envs/pt17/lib/python3.9/site-packages/torch/nn/parallel/scatter_gather.py", line 73, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/min/a/abcd/anaconda3/envs/pt17/lib/python3.9/site-packages/torch/nn/parallel/scatter_gather.py", line 73, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/min/a/abcd/anaconda3/envs/pt17/lib/python3.9/site-packages/torch/nn/parallel/scatter_gather.py", line 73, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
TypeError: 'float' object is not iterable
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program

> /home/min/a/abcd/anaconda3/envs/pt17/lib/python3.9/site-packages/torch/nn/parallel/scatter_gather.py(73)gather_map()
-> return type(out)(map(gather_map, zip(*outputs)))

I encountered the same error. The cause is that the model's forward returns the loss as a plain Python float; DataParallel can only gather tensors from the GPU replicas, so gather_map fails with this TypeError when you train on multiple GPUs. Returning the loss as a tensor (i.e. not converting it with .item() or float() inside forward) fixes it, as in the sketch below.
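To make the cause concrete, here is a minimal sketch of the failure and the fix. The model, shapes, and names (ToyModel, x, target) are hypothetical, not taken from the code above: if forward converts the loss to a Python float (e.g. via .item()), DataParallel's gather step fails exactly as in the traceback; returning the loss as a tensor lets it be gathered.

import torch
import torch.nn as nn

class ToyModel(nn.Module):  # hypothetical stand-in for the real model
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 1)

    def forward(self, x, target):
        # compute the loss inside forward, as in the failing setup
        loss = ((self.fc(x) - target) ** 2).mean()
        # return loss.item()  # BUG: a Python float cannot be gathered across GPUs
        return loss           # FIX: keep the loss as a (0-dim) tensor

model = nn.DataParallel(ToyModel().cuda())
x = torch.randn(8, 10).cuda()
target = torch.randn(8, 1).cuda()
losses = model(x, target)   # per-replica scalar losses, gathered into a 1-D tensor
loss = losses.mean()        # reduce to a single scalar before backward()
loss.backward()

When each replica returns a 0-dim tensor, DataParallel unsqueezes the scalars and gathers them into a vector with one entry per GPU (it prints a warning about this), so you still need the .mean() or .sum() before calling backward().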