Output changes with the number of images per GPU

ASMIftekhar · July 8, 2019, 7:26pm

I am facing a very weird issue. I am trying to train a pretrained ResNeXt model for a specific task. I removed the last few layers from ResNeXt and am feeding my images to it. Say the output size of my truncated ResNeXt is (batch size, 1024, 14, 14). I am printing output[0][0][13]. This output changes with the batch size per GPU, even though I am feeding exactly the same images.

Batch size 1 (using one GPU):
Output: tensor([0.0000, 0.0000, 0.0931, 0.0179, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
                0.0000, 0.0000, 0.0000, 0.0000, 0.0000])
Batch size 2 (using one GPU):
Output: tensor([0.0000, 0.0000, 0.2732, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
                0.0000, 0.0000, 0.0000, 0.0000, 0.0000])
Batch size 2 (using two GPUs):
Output: tensor([0.0000, 0.0000, 0.0931, 0.0179, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
                0.0000, 0.0000, 0.0000, 0.0000, 0.0000])

So the result changes with the number of images per GPU. Is there any explanation for this? This is just the first iteration; I have not updated any weights.

ptrblck · July 8, 2019, 9:23pm

Did you set your model to evaluation mode using model.eval()?
Since ResNeXt uses batch norm layers, which in training mode normalize each batch with its own statistics, the output will change depending on the batch size.
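A minimal sketch of the effect, for anyone who wants to reproduce it without the full model. The small `nn.Sequential` network and the random input below are hypothetical stand-ins for the truncated ResNeXt and the real images, not the original code. In training mode the output for image 0 depends on what else is in the batch; in evaluation mode the batch norm layer uses its stored running statistics, so the output for image 0 should no longer depend on the batch size. This would also explain the two-GPU result if the batch is split across devices (as `nn.DataParallel` does), since each GPU then sees a single image, just like the batch size 1 run.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the truncated ResNeXt: conv + batch norm + ReLU.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

# Two random "images"; the real inputs are not available here.
images = torch.randn(2, 3, 14, 14)


def first_image_output(net, batch):
    """Run a forward pass without gradients and return the output for image 0."""
    with torch.no_grad():
        return net(batch)[0]


# Training mode: batch norm normalizes with the statistics of the current batch.
model.train()
train_bs1 = first_image_output(model, images[:1])  # image 0 alone
train_bs2 = first_image_output(model, images)      # image 0 together with image 1

# Evaluation mode: batch norm uses its stored running statistics instead.
model.eval()
eval_bs1 = first_image_output(model, images[:1])
eval_bs2 = first_image_output(model, images)

train_outputs_match = torch.allclose(train_bs1, train_bs2, atol=1e-4)
eval_outputs_match = torch.allclose(eval_bs1, eval_bs2, atol=1e-4)

print("train mode, image 0 output independent of batch size:", train_outputs_match)
print("eval mode,  image 0 output independent of batch size:", eval_outputs_match)
```

Note that calling `model.eval()` only makes sense here for inspecting or comparing outputs; for actual training the model has to be switched back with `model.train()`.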

ASMIftekhar · July 8, 2019, 9:54pm

Yeah, I realized that. Thanks a lot for clearing it up.