Hi. Today I was trying to convert model weights to my own implementation, and I ran into a problem converting a Squeeze-Excitation block, which looks like this:
```python
import torch.nn as nn
import torch.nn.functional as F

class SEBlock(nn.Module):
    def __init__(self, in_channels, out_channels, reduction_ratio=4):
        super(SEBlock, self).__init__()
        # round_by and hard_sigmoid are helpers from my codebase
        red_channels = round_by(in_channels / reduction_ratio, 8)
        self.conv1 = nn.Conv2d(in_channels, red_channels, 1)
        self.activation = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(red_channels, out_channels, 1)

    def forward(self, input):
        x = F.adaptive_avg_pool2d(input, 1)
        x = self.conv1(x)
        x = self.activation(x)
        x = self.conv2(x)
        return input * hard_sigmoid(x)
```
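In case it helps to reproduce, here is a self-contained version of the block with stand-in definitions for `round_by` and `hard_sigmoid` (rounding to a multiple of 8, and the MobileNetV3-style hard sigmoid; both are assumptions, since the originals aren't shown above):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in helpers (assumed implementations, not the originals):
def round_by(value, divisor):
    # Round to the nearest multiple of `divisor`, never below `divisor`.
    return max(divisor, int(value + divisor / 2) // divisor * divisor)

def hard_sigmoid(x):
    # Piecewise-linear sigmoid approximation, as in MobileNetV3.
    return F.relu6(x + 3.0) / 6.0

class SEBlock(nn.Module):
    def __init__(self, in_channels, out_channels, reduction_ratio=4):
        super(SEBlock, self).__init__()
        red_channels = round_by(in_channels / reduction_ratio, 8)
        self.conv1 = nn.Conv2d(in_channels, red_channels, 1)
        self.activation = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(red_channels, out_channels, 1)

    def forward(self, input):
        x = F.adaptive_avg_pool2d(input, 1)
        x = self.conv1(x)
        x = self.activation(x)
        x = self.conv2(x)
        # Output has the same shape as the input (channel-wise gating).
        return input * hard_sigmoid(x)

se = SEBlock(64, 64)
out = se(torch.randn(2, 64, 7, 7))
print(out.shape)  # torch.Size([2, 64, 7, 7])
```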
However, in the original model the pointwise convolutions are nn.Linear layers. I thought they could easily be converted as such, since the input at that point is a 1x1 image:
```python
linear = nn.Linear(16, 8)
conv = nn.Conv2d(16, 8, 1)
conv.weight.copy_(linear.weight.reshape(8, 16, 1, 1));
conv.bias.copy_(linear.bias);
```
Unfortunately, this does not reproduce the same result!
```python
x = torch.randn((3, 16, 1, 1))
torch.norm(conv(x) - linear(x.view(3, 16)))
# >>> tensor(16.5712)
```
What am I missing here?
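For reference, here is a minimal sketch of the comparison that should reproduce parity between the two layers. Two details differ from the snippet above: the copy runs inside `torch.no_grad()` (an in-place `copy_` on a `Parameter` otherwise raises a leaf-variable error in recent PyTorch), and the conv output is flattened before subtracting, since `(3, 8, 1, 1)` minus `(3, 8)` would broadcast to `(3, 8, 3, 8)` and inflate the norm:

```python
import torch
import torch.nn as nn

linear = nn.Linear(16, 8)
conv = nn.Conv2d(16, 8, 1)

# Module parameters are leaf tensors with requires_grad=True, so the
# in-place copy has to happen under torch.no_grad():
with torch.no_grad():
    conv.weight.copy_(linear.weight.reshape(8, 16, 1, 1))
    conv.bias.copy_(linear.bias)

x = torch.randn(3, 16, 1, 1)
# conv(x) is (3, 8, 1, 1) while linear(...) is (3, 8); flatten the conv
# output so the subtraction compares matching shapes.
diff = torch.norm(conv(x).view(3, 8) - linear(x.view(3, 16)))
print(diff)  # ~0 (up to floating-point noise)
```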