Hello, I tried to run my CNN on multiple GPUs with PyTorch 0.4.1 using the torch.nn.DataParallel function. Here is the code that uses this function:
if args.gpus and len(args.gpus) > 1:
    model = torch.nn.DataParallel(model, device_ids=args.gpus)
It gave me this error message:
RuntimeError: torch/csrc/autograd/variable.cpp:166: get_grad_fn: Assertion output_nr_ == 0 failed.
The line that raised this error is:
out += self.bias.view(1, -1, 1, 1).expand_as(out)
Some people on GitHub suggested using DistributedDataParallel instead of DataParallel, since there's a bug with multiple GPUs in PyTorch 0.4.x. If that's true, could anyone show an example of how to use this DistributedDataParallel function? I'm not familiar with its settings…
Thanks