When using DataParallel for training, how do I know the ground truths of the data batches on different GPUs?

In my code, the output of every hidden layer and the ground truths of the corresponding mini-batch are used to estimate a quantity in every iteration. This works fine when a single GPU is used (say batch_size=32), since every hidden layer output and the ground truths then have the same batch size, i.e., 32.

But when I use DataParallel with batch_size=64 on 2 GPUs, I have ground truths of size 64, while the outputs from the hidden layers are of size 32 on each of the 2 GPUs.

How can I find out how the 64 images are divided between the 2 GPUs, so that I can get ground truths of size 32 to use with the outputs of every hidden layer?

(Previously I was passing the entire ground-truth tensor to forward hooks registered on each layer.)
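For reference, DataParallel scatters the input batch along dim 0 using torch.chunk, so the split is contiguous: with a batch of 64 on 2 GPUs, samples 0–31 go to the first replica and 32–63 to the second. A minimal sketch (the number of GPUs and the label tensor here are illustrative) of splitting the labels the same way, so each chunk lines up with one replica's hidden-layer outputs:

```python
import torch

# Assumed setup for illustration: 2 GPUs, batch of 64.
n_gpus = 2
batch_size = 64

labels = torch.arange(batch_size)  # stand-in ground-truth labels

# Split the labels exactly the way DataParallel splits the inputs
# (contiguous chunks along dim 0).
label_chunks = labels.chunk(n_gpus, dim=0)

for i, chunk in enumerate(label_chunks):
    # chunk i matches the mini-batch processed by replica i
    print(i, chunk.shape[0], chunk[0].item(), chunk[-1].item())
# prints: 0 32 0 31
#         1 32 32 63
```

Inside a forward hook you could then use the chunk whose device matches the hooked output's device to pair labels with activations.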

What do you mean by “ground truths”?

Also, have you tried using the Model Parallel method? That will divide the model layers between GPUs instead of batches.
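To illustrate the idea: with model parallelism each replica sees the full batch, so hidden outputs and ground truths stay the same size. A minimal sketch (the layer sizes and module name are made up for illustration; with 2 GPUs the devices would be "cuda:0" and "cuda:1", falling back to CPU here so the snippet runs anywhere):

```python
import torch
import torch.nn as nn

# Hypothetical two-stage split: layers are divided across devices
# instead of dividing the batch.
dev0 = torch.device("cuda:0" if torch.cuda.device_count() >= 2 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() >= 2 else "cpu")

class TwoStageNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Linear(128, 64).to(dev0)
        self.stage2 = nn.Linear(64, 10).to(dev1)

    def forward(self, x):
        x = self.stage1(x.to(dev0))
        # Activations are moved between devices by hand; the full
        # batch of 32 passes through every layer.
        return self.stage2(x.to(dev1))

out = TwoStageNet()(torch.randn(32, 128))
print(out.shape)  # torch.Size([32, 10])
```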


Ground truths mean the original labels for a given mini-batch of images.
I did not try model parallel; I will try it.

Ah. Gotcha. Can you post your model here? It’s tough to identify the exact issue with your hidden-layer output sizes without seeing the structure.