Fasterrcnn_resnet50_fpn parallelization warning

When training the PyTorch built-in model fasterrcnn_resnet50_fpn on a custom dataset with nn.DataParallel, I received the following warning:

```
UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
  warnings.warn('Was asked to gather along dimension 0, but all '
```

From this post, I realized that the problem is that fasterrcnn_resnet50_fpn computes its losses inside the forward function.
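
For context, a minimal sketch of what that means (the num_classes, image sizes, and boxes below are just illustrative): in training mode the torchvision detection models return a dict of losses directly from forward() instead of predictions, and those scalars are what DataParallel then tries to gather.

```python
import torch
import torchvision

# In training mode, forward() returns the losses themselves rather than predictions,
# so under nn.DataParallel each replica hands back scalar losses to be gathered.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(num_classes=2)
model.train()

images = [torch.rand(3, 300, 400), torch.rand(3, 300, 400)]
targets = [
    {"boxes": torch.tensor([[10.0, 10.0, 100.0, 100.0]]), "labels": torch.tensor([1])},
    {"boxes": torch.tensor([[20.0, 20.0, 120.0, 150.0]]), "labels": torch.tensor([1])},
]

loss_dict = model(images, targets)
print(loss_dict.keys())
# dict_keys(['loss_classifier', 'loss_box_reg', 'loss_objectness', 'loss_rpn_box_reg'])
```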

So what should I do to fix this? I tried to ignore the warning and divide the final loss by num_workers, but it seems the model is not actually being parallelized and all the work is done by GPU 0 (which may be a separate, unrelated problem).

You probably already found it, but this model won't work with nn.DataParallel. Train with DistributedDataParallel instead, as in the reference script here: https://github.com/pytorch/vision/blob/master/references/detection/train.py
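
A minimal sketch of that setup, heavily simplified from the reference script (one process per GPU; the random dataset is only a stand-in for your custom dataset):

```python
import os
import torch
import torchvision

class RandomDetectionDataset(torch.utils.data.Dataset):
    """Stand-in for a custom detection dataset: random images with one box each."""
    def __len__(self):
        return 16
    def __getitem__(self, idx):
        image = torch.rand(3, 300, 400)
        target = {"boxes": torch.tensor([[10.0, 10.0, 100.0, 100.0]]),
                  "labels": torch.tensor([1])}
        return image, target

def main():
    # One process per GPU; torchrun sets LOCAL_RANK for each process.
    torch.distributed.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(num_classes=2)
    model.cuda(local_rank)
    model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])

    dataset = RandomDetectionDataset()
    sampler = torch.utils.data.distributed.DistributedSampler(dataset)
    loader = torch.utils.data.DataLoader(
        dataset, batch_size=2, sampler=sampler,
        collate_fn=lambda batch: tuple(zip(*batch)))  # detection-style collate

    optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

    model.train()
    for epoch in range(10):
        sampler.set_epoch(epoch)  # reshuffle across processes each epoch
        for images, targets in loader:
            images = [img.cuda(local_rank) for img in images]
            targets = [{k: v.cuda(local_rank) for k, v in t.items()} for t in targets]
            loss_dict = model(images, targets)   # dict of scalar losses in this process
            loss = sum(loss_dict.values())       # no gather needed with DDP
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

if __name__ == "__main__":
    main()
```

Launch it with something like `torchrun --nproc_per_node=<num_gpus> train.py` (the script name is just an example); each process computes its own scalar losses and DDP averages the gradients, so the gather warning never comes up.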
