Failing to use multiple GPUs in PyTorch 0.4.1

Hello, I tried to run my CNN on multiple GPUs in PyTorch 0.4.1 using the torch.nn.DataParallel function. Here is the code that uses it:

if args.gpus and len(args.gpus) > 1:
    model = torch.nn.DataParallel(model, device_ids=args.gpus)

It gave me this error message:
RuntimeError: torch/csrc/autograd/variable.cpp:166: get_grad_fn: Assertion output_nr_ == 0 failed.

The line that triggered this error is:
out += self.bias.view(1, -1, 1, 1).expand_as(out)
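
(One guess on my part, which I haven't confirmed: since the failing line does an in-place +=, maybe an out-of-place add would sidestep the assertion?)

out = out + self.bias.view(1, -1, 1, 1).expand_as(out)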

Some people on GitHub suggested using DistributedDataParallel instead of DataParallel, since there is a bug with multiple GPUs in PyTorch 0.4.x. If that's true, could anyone show an example of how to use DistributedDataParallel? I'm not familiar with its settings…
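
From the 0.4.1 docs, my best guess at a minimal single-process, single-node setup is something like the sketch below, but I don't know if it's right. The TCP address/port, world_size=1, rank=0, and the toy Conv2d model are placeholders I made up, not values from my project:

import torch
import torch.distributed as dist
import torch.nn as nn

# Single-node, single-process setup: one process drives all visible GPUs.
# The address/port, world_size=1, and rank=0 are placeholders for a local
# run; a real multi-machine job would use different values per process.
dist.init_process_group(
    backend='nccl',
    init_method='tcp://127.0.0.1:23456',
    world_size=1,
    rank=0,
)

gpus = list(range(torch.cuda.device_count()))  # e.g. [0, 1]

# Toy stand-in for the actual CNN; the model has to live on gpus[0].
model = nn.Conv2d(3, 16, kernel_size=3).cuda(gpus[0])
model = torch.nn.parallel.DistributedDataParallel(model, device_ids=gpus)

# Inputs also start on gpus[0]; DDP scatters them across device_ids.
x = torch.randn(8, 3, 32, 32).cuda(gpus[0])
out = model(x)

If I understand the docs correctly, with world_size=1 this behaves much like DataParallel but uses a different gradient-reduction path, which is presumably why people say it avoids the bug; I'd appreciate confirmation.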
Thanks

Hi, are you also using an RNN with multiple GPUs, as in https://github.com/pytorch/pytorch/issues/7092?