Run inference with model trained using nn.DataParallel

I am working on lung segmentation using a Fully Convolutional DenseNet that has both batchnorm and dropout layers. The model was trained with nn.DataParallel on two GPUs. For inference, I first wrap the model in nn.DataParallel and then load the weights. Inference results without calling model.eval() look as expected. However, when I run inference with model.eval(), the segmentation fails and I get random clusters of pixels inside the lungs. I have checked the .training properties of the nn.BatchNorm2d and nn.Dropout2d layers in my parallel model after calling model.train() and model.eval(), and they look correct (True after model.train(), False after model.eval()):

import torch.nn as nn

model = nn.DataParallel(FCDN()).cuda()  # FCDN is my Fully Convolutional DenseNet
model.eval()
for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        print('bn', m.training)  # bn False
    if isinstance(m, nn.Dropout2d):
        print('do', m.training)  # do False

It seems the problem could be related to this issue: https://github.com/pytorch/pytorch/issues/1051, but I’m not sure I fully understand the discussion there.
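For reference, here is a minimal sketch of how the BatchNorm running statistics could be inspected after loading the weights (the checkpoint path below is a placeholder for my own file); if running_mean and running_var still look like the defaults of 0 and 1, eval() would normalize with untrained statistics, which could explain this kind of failure:

import torch
import torch.nn as nn

model = nn.DataParallel(FCDN()).cuda()
model.load_state_dict(torch.load('checkpoint.pth'))  # placeholder path
model.eval()

for name, m in model.named_modules():
    if isinstance(m, nn.BatchNorm2d):
        # freshly initialized layers have running_mean == 0 and running_var == 1
        print(name,
              'mean', m.running_mean.abs().mean().item(),
              'var', m.running_var.mean().item())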

Since the segmentations look acceptable without model.eval(), this is not a major issue for me, but I wonder whether the results could be better if I knew how to use model.eval() correctly.


I stumbled upon exactly the same problem: there is a dramatic drop in performance when using .eval() on a DataParallel model. I do not understand why, though.

@jpcenteno, @hristo-vrigazov
Did you observe the same accuracy drop on your validation set during training, or when using a single GPU?

I am also stuck on this problem… Has anybody solved it?

When I do not call model.eval() and use the same batch size as during training, the predictions seem reasonable.
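Roughly, the workaround looks like this (model and test_loader are placeholders for my own setup). Note that dropout also stays active in train mode and the BatchNorm running stats keep updating during the forward passes, so this is a workaround rather than a fix:

import torch

model.train()                   # intentionally skip model.eval()
with torch.no_grad():           # still disable gradients for inference
    for images in test_loader:  # loader built with the training batch size
        preds = model(images.cuda())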

Have you solved this issue?