Hello all,
I’m separating ResNet into an ‘activations’ module and an ‘ImageNet output’ module for a low-memory application that would otherwise need duplicate models. When I split the network and apply the two halves sequentially, the results differ slightly from the unsplit model. Could someone shed some light on what is happening beneath the surface, or on whether my splitting technique is somehow inaccurate?
import torch
import torch.nn as nn
from torch.autograd import Variable
from torchvision.models import resnet18

class skip(nn.Module):
    """Identity module used to replace resnet.fc."""
    def __init__(self):
        super(skip, self).__init__()

    def forward(self, x):
        return x

def split_resnet():
    resnet = resnet18(pretrained=True)
    fc = nn.Linear(512, 1000)                   # same shape as resnet.fc
    fc.load_state_dict(resnet.fc.state_dict())  # copy the pretrained weights
    resnet.fc = skip()                          # truncate the network at the features
    return resnet, fc
inp = Variable(torch.randn(1, 3, 256, 256))
res1 = resnet18(pretrained=True)
res2, fc = split_resnet()
print(res1(inp))
print(fc(res2(inp)))
which yields:
Variable containing:
-0.3892 -0.2363 -0.7533 … -0.1491 1.8596 0.9131
[torch.FloatTensor of size 1x1000]
Variable containing:
-0.4197 -0.2726 -0.7471 … -0.1189 1.8350 0.9147
[torch.FloatTensor of size 1x1000]