Unable to fine-tune pretrained inception_v3 in multi-GPU training

PyTorch Version: 1.1.0
Torchvision Version: 0.3.0

I'm trying to fine-tune inception_v3, but I hit the following error:

Traceback (most recent call last):
  File "train.py", line 133, in
    outputs, aux = model(images)
  File "/home/yjh/.conda/envs/phone/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yjh/.conda/envs/phone/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 153, in forward
    return self.gather(outputs, self.output_device)
  File "/home/yjh/.conda/envs/phone/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/yjh/.conda/envs/phone/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 67, in gather
    return gather_map(outputs)
  File "/home/yjh/.conda/envs/phone/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 62, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
TypeError: __new__() missing 1 required positional argument: 'aux_logits'
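If I read the last frame right, the failure seems to come from the model's output type: in training mode, torchvision 0.3's inception_v3 returns a namedtuple of (logits, aux_logits), and `gather_map` rebuilds each output with `type(out)(map(...))`, i.e. a single iterable argument. A plain tuple accepts that, but a namedtuple constructor wants one positional argument per field. A minimal sketch of the mechanism, with no torch involved (the field names follow torchvision's namedtuple; the values are placeholders):

```python
from collections import namedtuple

# Stand-in for torchvision 0.3's training-mode inception_v3 output.
InceptionOutputs = namedtuple("InceptionOutputs", ["logits", "aux_logits"])

out = InceptionOutputs("logits_gpu0", "aux_gpu0")

# DataParallel's gather_map does type(out)(map(...)): one iterable argument.
# For a namedtuple this leaves 'aux_logits' unfilled, reproducing the error.
try:
    type(out)(map(lambda g: g, zip(out, out)))
except TypeError as e:
    print(e)  # __new__() missing 1 required positional argument: 'aux_logits'

# A plain tuple, by contrast, happily consumes a single iterable:
gathered = tuple(map(lambda g: g, zip(out, out)))
```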

Eventually I figured out that something goes wrong when the model is distributed across GPUs by model = nn.DataParallel(model): after commenting out that line, everything seems fine and training starts.
However, that means I can't use the other GPUs, which increases my training time.
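One workaround that might let DataParallel stay in place (a sketch, not tested against this exact setup; `TupleWrapper` is a name I made up): wrap the model so that its forward converts the namedtuple into a plain tuple, which gather can reconstruct.

```python
import torch
import torch.nn as nn


class TupleWrapper(nn.Module):
    """Hypothetical wrapper: flattens inception_v3's namedtuple output
    into a plain tuple so DataParallel's gather can rebuild it."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        out = self.model(x)
        # Training-mode inception_v3 returns a namedtuple of
        # (logits, aux_logits); a namedtuple is a tuple subclass,
        # so tuple(out) demotes it to a plain tuple.
        if isinstance(out, tuple):
            return tuple(out)
        return out  # eval mode returns a plain tensor


# usage sketch:
# model = TupleWrapper(torchvision.models.inception_v3(pretrained=True))
# model = nn.DataParallel(model).cuda()
# outputs, aux = model(images)
```

Alternatively, if you don't need the auxiliary loss, constructing the model with aux_logits=False should make forward return a plain tensor and sidestep the gather issue entirely.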

Is this a bug, or am I using it incorrectly?