Splitting ResNet into two modules yields different results

Hello all,

I’m splitting ResNet into an ‘activations’ module and an ‘ImageNet output’ module for a low-memory application that would otherwise need two copies of the model. When I apply the two halves sequentially, the results differ slightly from the original network. Could someone shed some light on what is happening beneath the surface, or on whether my splitting technique is somehow inaccurate?

import torch
import torch.nn as nn
from torch.autograd import Variable
from torchvision.models import resnet18

class skip(nn.Module):
    """Pass-through placeholder for the removed fc layer."""
    def __init__(self):
        super(skip, self).__init__()

    def forward(self, x):
        return x

def split_resnet():
    resnet = resnet18(pretrained=True)
    fc = nn.Linear(512,1000)
    fc.load_state_dict(resnet.fc.state_dict())
    resnet.fc = skip()
    return resnet, fc

inp = Variable(torch.randn(1, 3, 256, 256))
res1 = resnet18(pretrained=True)
res2, fc = split_resnet()

print(res1(inp))
print(fc(res2(inp)))

Which yields:

Variable containing:
-0.3892 -0.2363 -0.7533 … -0.1491 1.8596 0.9131
[torch.FloatTensor of size 1x1000]

Variable containing:
-0.4197 -0.2726 -0.7471 … -0.1189 1.8350 0.9147
[torch.FloatTensor of size 1x1000]

Hello,
I executed your code multiple times and found both results to be the same; I compared every entry of both outputs.

Variable containing:
-0.6792 -0.0576 -0.5130  ...  -0.5083  1.4377  0.9571
[torch.FloatTensor of size 1x1000]

Variable containing:
-0.6792 -0.0576 -0.5130  ...  -0.5083  1.4377  0.9571
[torch.FloatTensor of size 1x1000]

are split-resnet and resnet equivalent:  True

So the only pointer I have for you is to check your torch version; mine is 0.1.12_2.

Hmm. I am using the same version.

Found it: I made an edit to the code in my first post,

fc.load_state_dict(resnet.fc.state_dict())

used to be

fc.weight = resnet.fc.weight

and I must have still been running the old version in the notebook. Assigning only `weight` leaves the freshly constructed layer’s `bias` at its random initialization, which explains the small differences. Learning opportunity; thank you anyway.
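To make the lesson concrete, here is a minimal sketch with a standalone `nn.Linear` standing in for `resnet.fc` (modern tensor API, no `Variable`): assigning only `weight` shares the weight matrix but not the bias, while `load_state_dict` copies both tensors.

```python
import torch
import torch.nn as nn

src = nn.Linear(512, 1000)  # stands in for resnet.fc

# Buggy copy: only the weight parameter is shared.
fc_weight_only = nn.Linear(512, 1000)
fc_weight_only.weight = src.weight  # bias keeps its own random init

# Correct copy: the state_dict carries weight *and* bias.
fc_full = nn.Linear(512, 1000)
fc_full.load_state_dict(src.state_dict())

x = torch.randn(1, 512)
print(torch.allclose(src(x), fc_full(x)))         # True
print(torch.allclose(src(x), fc_weight_only(x)))  # False: outputs differ by the bias gap
```

The mismatched bias shifts every output logit slightly, which matches the small elementwise differences in the original post.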