Model.cuda() results in a different output compared to when not used

Hi,
I’ve been scratching my head for a while now.
Env: Python 2.7; PyTorch 1.8 + CUDA

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class Model(nn.Module):
    def __init__(self, feature_extractor, dropout=0, pretrained=True, feat_dim=2048):
        super().__init__()
        self.dropout = dropout
        self.feature_extractor = feature_extractor
        self.feature_extractor.avgpool = nn.AdaptiveAvgPool2d(1)
        fe_out_planes = self.feature_extractor.fc.in_features
        self.feature_extractor.fc = nn.Linear(fe_out_planes, feat_dim)
        self.fc_t = nn.Linear(feat_dim, 3)
        self.fc_q = nn.Linear(feat_dim, 3)

        # initialize the model
        if pretrained:
            init_modules = [self.feature_extractor.fc, self.fc_t, self.fc_q]
        else:
            init_modules = self.modules()
        for m in init_modules:
            if isinstance(m, nn.Conv2d) or isinstance(m, nn.Linear):
                nn.init.constant_(m.weight.data, 0.01) # constant weights
                if m.bias is not None:
                    nn.init.constant_(m.bias.data, 0)

    def forward(self, x):
        s = x.size()
        x = x.view(-1, *s[2:])
        x = self.feature_extractor(x)
        x = F.relu(x)
        """if self.dropout > 0:
            x = F.dropout(x, p=self.dropout)"""
        t = self.fc_t(x)
        q = self.fc_q(x)
        out = torch.cat((t, q), 1)
        out = out.view(s[0], s[1], -1)
        return out

seed = 0  # any fixed seed
torch.manual_seed(seed)
if torch.cuda.is_available():
    torch.cuda.manual_seed(seed)

feature_extractor = models.resnet34(pretrained=True)
model = Model(feature_extractor, dropout=0, feat_dim=2048)
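
Side note: the two flags below are not in my script; they are just a sketch of what one could additionally set to rule out non-deterministic cuDNN kernels when comparing CPU and GPU outputs.

torch.backends.cudnn.deterministic = True  # force deterministic cuDNN kernels
torch.backends.cudnn.benchmark = False     # disable cuDNN autotuning of conv algorithms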

Now, the interesting part:
When I run,

model.cuda()
print('Feed a random batch to test the model: ')
input = torch.ones(1, 64, 3, 7, 7)*0.3
input = input.cuda()
model.eval()
output = model(input)
print(output)

tensor([[106.7721, 106.7721, 106.7721, 106.7721, 106.7721, 106.7721], …]], device='cuda:0')

Compared to when I run without model.cuda(), i.e.:

print('Feed a random batch to test the model: ')
input = torch.ones(1, 64, 3, 7, 7)*0.3
model.eval()
output = model(input)
print(output)

tensor([[53.3860, 53.3860, 53.3860, 53.3860, 53.3860, 53.3860],…])

The values are roughly halved, and this is consistent across different inputs.

I actually discovered this while porting the original repo to Python 3 (v3.8, same PyTorch version). I compared the outputs of the two versions with the same input data, the same constant weight init, no shuffling, no dropout, and model.eval(), but the outputs were different.
After many print statements, I narrowed it down to this. Interestingly, the Python 3 version does not show this behaviour: putting the model and data on the GPU with model.cuda() and input = input.cuda() gives the same output as running without .cuda().
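
For reference, this is roughly how I compare the two runs (just a sketch, not code from the repo):

model.eval()
x = torch.ones(1, 64, 3, 7, 7) * 0.3
with torch.no_grad():
    out_cpu = model(x)                       # CPU forward pass
    out_gpu = model.cuda()(x.cuda()).cpu()   # GPU forward pass, moved back to CPU
print(torch.allclose(out_cpu, out_gpu, atol=1e-4))  # I would expect True here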

I am really confused as to what the issue is and I couldn’t find any relevant documentation regarding this.
Please help.

Thank you.

That’s an interesting finding, but note that PyTorch dropped Python 2.x support when it reached its end of life (January 2020), so I would recommend updating to Python 3.x.
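
As a quick sanity check, it might also be worth printing the exact environment on both sides before comparing outputs, to make sure the same PyTorch/CUDA builds are involved (just a sketch):

import sys
import torch

print(sys.version)                     # Python interpreter version
print(torch.__version__)               # PyTorch version
print(torch.version.cuda)              # CUDA version PyTorch was built against
print(torch.backends.cudnn.version())  # cuDNN version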