ParameterList assigned to 1 GPU only (?)

Hey folks,

I am new to PyTorch and I am trying to parallelize my network. Using nn.DataParallel seems to work as expected for the nn.Module members of my class; however, when I print out the module's parameters, the nn.ParameterLists I define as class members are all listed as sitting on GPU 0 only.

Is this expected behaviour and why are they not listed on both of the GPUs I’m using? Could somebody please explain what is going on here?


  • torch.cuda.device_count() returns 2 as expected.

My code looks something like the following:

class Network(nn.Module):
    def __init__(self):
        ...
        self.templates = nn.ModuleList([
            nn.ParameterList([
                nn.Parameter(template_init, requires_grad=True) for i in range(n)
            ])
            for n in self.num_t
        ])

...

self.Network = nn.DataParallel(self.Network)
self.Network.to(self.device)
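
For reference, the check I'm doing is roughly along these lines (simplified; the loop is just an illustrative way of printing where each parameter lives):

# Print which device each parameter currently lives on (illustrative sketch).
for name, param in self.Network.named_parameters():
    print(name, param.device)   # every entry here shows cuda:0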

Hi @ortho-stice

This is expected behavior. Here is the source code of DataParallel: https://github.com/pytorch/pytorch/blob/46539eee0363e25ce5eb408c85cefd808cd6f878/torch/nn/parallel/data_parallel.py#L148-L153

What happens is that, on every forward pass, DataParallel will:

  1. scatter the input across all GPUs,
  2. replicate the model on all GPUs,
  3. launch parallel_apply, so that every GPU runs its own forward pass on its own split of the input data, in parallel, and
  4. gather all the outputs back to the output device.

So the model replication only happens inside the forward pass, which is why you won't see those replicas outside the forward function; the parameters you print outside forward belong to the single master copy sitting on GPU 0.
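
For intuition, the same four steps can be written out by hand with the primitives DataParallel uses internally (a minimal sketch; the device ids, the model placement, and the batch shape are illustrative assumptions):

import torch
from torch.nn.parallel import replicate, scatter, parallel_apply, gather

device_ids = [0, 1]
model = Network().to('cuda:0')                        # plain nn.Module, not wrapped; master copy on device_ids[0]
batch = torch.randn(8, 3, 32, 32, device='cuda:0')    # illustrative input

inputs = scatter(batch, device_ids)                    # 1. split the batch across GPUs
replicas = replicate(model, device_ids[:len(inputs)])  # 2. copy the model (ParameterLists included) onto each GPU
outputs = parallel_apply(replicas, inputs)             # 3. each replica runs forward on its chunk in parallel
result = gather(outputs, device_ids[0])                # 4. collect the outputs back on the output device

The replicas (and the copies of your ParameterLists inside them) are created fresh on every such call and discarded afterwards, so they never show up when you inspect the top-level module.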

BTW, we do recommend using DistributedDataParallel, which replicates the model only once, in its constructor, instead of on every forward invocation.
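
A minimal single-node sketch of that setup, assuming one process per GPU (Network, the port number, and the training loop are placeholders for your own code):

import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def run(rank, world_size):
    # One process per GPU; rank identifies this process.
    os.environ['MASTER_ADDR'] = 'localhost'
    os.environ['MASTER_PORT'] = '29500'
    dist.init_process_group('nccl', rank=rank, world_size=world_size)

    torch.cuda.set_device(rank)
    model = Network().to(rank)
    ddp_model = DDP(model, device_ids=[rank])  # model state is broadcast once, here

    # ... training loop: each process consumes its own shard of the data ...

    dist.destroy_process_group()

if __name__ == '__main__':
    world_size = torch.cuda.device_count()
    mp.spawn(run, args=(world_size,), nprocs=world_size)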