Hi,
I am running into the following situation: I trained a model named src_model based on resnet18, and I want to reuse its first four layers, together with their weights, as-is in another model called dest_model.
I saved src_model using torch.save().
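For reference, the save step was roughly like this (a minimal sketch; the filename is just a placeholder):

import torch

torch.save(src_model, 'src_model.pth')   # save the full model object
src_model = torch.load('src_model.pth')  # load it back later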
I created dest_model by taking the first four layers from src_model, which is a resnet18 with fc layers at the end:
import torch.nn as nn
from torchvision import models

class dest_model(nn.Module):
    def __init__(self):
        super(dest_model, self).__init__()
        # note: this builds a fresh pretrained resnet18; the weights
        # from src_model are copied over afterwards (see below)
        resnet = models.resnet18(pretrained=True)
        layers = list(resnet.children())
        self.features = nn.Sequential(*layers[:4])

    def forward(self, x):
        return self.features(x)
print(dest_model)
dest_model(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    …
The first four layers of src_model and dest_model are the same, and I transferred the weights like this:
dest_model.features[0].weight.data = src_model.features[0].weight.data.to(device).type(torch.cuda.FloatTensor)
dest_model.features[1].weight.data = src_model.features[1].weight.data.to(device).type(torch.cuda.FloatTensor)
dest_model.features[1].bias.data = src_model.features[1].bias.data.to(device).type(torch.cuda.FloatTensor)
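Equivalently, the same transfer can be written with in-place copies under torch.no_grad(); this is just a sketch of the same operation, assuming both models are already on the same device:

with torch.no_grad():
    dest_model.features[0].weight.copy_(src_model.features[0].weight)
    dest_model.features[1].weight.copy_(src_model.features[1].weight)
    dest_model.features[1].bias.copy_(src_model.features[1].bias)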
Then I saved dest_model using torch.save(), and everything works smoothly.
But the results from dest_model are different. So I manually checked the outputs of the intermediate layers using forward hooks, referring to this link. I gave the same input sample to src_model and dest_model and captured the outputs of the first four layers from both models with forward hooks, to find out what is happening.
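The hook setup was along these lines (a minimal sketch in the spirit of the linked post; get_activation is my shorthand and sample stands for the shared input tensor):

from collections import OrderedDict

src_activation = OrderedDict()
dest_activation = OrderedDict()

def get_activation(store, name):
    # store a detached copy of the layer's output under the given key
    def hook(module, inp, out):
        store[name] = out.detach().clone()
    return hook

for name, layer in src_model.features.named_children():
    layer.register_forward_hook(get_activation(src_activation, 'features.' + name))
for name, layer in dest_model.features.named_children():
    layer.register_forward_hook(get_activation(dest_activation, 'features.' + name))

_ = src_model(sample)   # same input sample through both models
_ = dest_model(sample)

Then I compared the captured activations: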
for src_key, dest_key in zip(src_activation, dest_activation):
    print(src_key, dest_key)
    print(torch.equal(src_activation[src_key], dest_activation[dest_key]))
    print(src_activation[src_key].shape, dest_activation[dest_key].shape)
(src_activation and dest_activation are the OrderedDicts populated by the forward hooks.) Here is the result:
features.0 features.0
True
torch.Size([1, 64, 112, 112]) torch.Size([1, 64, 112, 112])
features.1 features.1
False
torch.Size([1, 64, 112, 112]) torch.Size([1, 64, 112, 112])
features.2 features.2
False
torch.Size([1, 64, 112, 112]) torch.Size([1, 64, 112, 112])
features.3 features.3
False
torch.Size([1, 64, 56, 56]) torch.Size([1, 64, 56, 56])
From these results I can see that the first layer produces identical outputs, while the remaining layers give different results even though they carry the same weights. Since I copied the weights, the outputs should match in every layer of both models, yet they differ for the same input.
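For what it's worth, a direct element-wise check of the learnable BatchNorm2d parameters can be done like this (a sketch, assuming both models sit on the same device):

print(torch.equal(src_model.features[1].weight, dest_model.features[1].weight))
print(torch.equal(src_model.features[1].bias, dest_model.features[1].bias))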
The outputs start to differ at the 2nd layer, which is BatchNorm2d. Could you please explain what is happening in the BatchNorm2d layer? Why does it behave differently in the two models even though it has the same parameters?
Thanks