I am trying to train my CNN on multiple GPUs using `nn.DataParallel`. However, I have run into the following issue: when the input batch `[a1, a2, a3, a4]` is distributed across two GPUs (`cuda:0` and `cuda:1`), I expected the GPUs to receive `[a1, a2]` and `[a3, a4]`, respectively. In reality, `cuda:0` gets `[a1, a2]`, but `cuda:1` gets zeros (or data from an unknown source).
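For context, here is a minimal sketch of my setup (the model is a placeholder standing in for my actual CNN, not the real network):

```python
import torch
import torch.nn as nn

# Placeholder model; my real network is a CNN, but the scatter
# behaviour of DataParallel should be the same for any nn.Module.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4).double()  # double() to match the float64 input

    def forward(self, x):
        return self.fc(x)

model = nn.DataParallel(Net().cuda(), device_ids=[0, 1])
inp = torch.tensor([[1., 0., 0., 1.],
                    [0., 0., 1., 1.],
                    [0., 1., 0., 1.],
                    [0., 1., 1., 0.]], device='cuda:0', dtype=torch.float64)
out = model(inp)  # I expect rows 0-1 to go to cuda:0 and rows 2-3 to cuda:1
```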
Original input:

```
tensor([[1., 0., 0., 1.],
        [0., 0., 1., 1.],
        [0., 1., 0., 1.],
        [0., 1., 1., 0.]], device='cuda:0', dtype=torch.float64)
```
I used a forward hook to capture and print the data flowing through the parallel GPUs, and this is what it printed:
```
tensor([[1., 0., 0., 1.],
        [0., 0., 1., 1.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]], device='cuda:0', dtype=torch.float64)
```
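The hook is registered roughly like this (a sketch continuing the setup above; `fc` stands in for the layer I actually hooked):

```python
def print_input_hook(module, inputs, output):
    # Forward hooks fire once per replica under DataParallel, so this
    # should print each GPU's shard from its own device.
    print(inputs[0])

handle = model.module.fc.register_forward_hook(print_input_hook)
out = model(inp)
handle.remove()
```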
Can someone tell me why this happens?
P.S.: The training data for the network is generated on the fly from the latest model, so there is no pre-existing dataset.
Another question: how can I obtain the output of an intermediate layer when the model is trained on multiple GPUs?
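For example, something like the following sketch is what I had in mind (again continuing the setup above, with `fc` as a placeholder layer), but I am not sure the per-replica results are gathered correctly:

```python
feats = []  # appended to once per replica per forward pass

def save_output_hook(module, inputs, output):
    # Move each shard to CPU so activations from different devices can
    # be combined; note the append order across replicas is not guaranteed.
    feats.append(output.detach().cpu())

handle = model.module.fc.register_forward_hook(save_output_hook)
out = model(inp)
handle.remove()

intermediate = torch.cat(feats, dim=0)  # shards in replica-completion order
```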