I have found a strange behaviour in a model I was writing, so I coded this little snippet to replicate it.
Basically, I create a new linear layer, self.my_fc, by deep-copying another layer, self.fc1. I then change the input size of this new layer to 3200 (or whatever). When I pass the same tensor to both self.my_fc and self.fc1, I expect my new layer to fail, since the number of input features doesn’t match the expected one. Instead, they both work.
How is this possible?
import torch
import torch.nn as nn
import torch.nn.functional as F
import copy
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 20 * 20, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        self.my_fc = copy.deepcopy(self.fc1)
        self.my_fc.in_features = 3200
    def forward(self, x):
        x1 = F.relu(self.conv1(x))
        x2 = F.relu(self.conv2(x1))
        x3 = x2.view(-1, 16 * 20 * 20)  # x3 has shape [1, 6400]
        x4 = F.relu(self.fc1(x3))  # fc1 is Linear(in_features=6400, out_features=120, bias=True)
        # I feed the same tensor x3 to my layer, which expects a different number of features.
        # It should fail (x3 has shape [1, 6400]), but it works!
        xxx = self.my_fc(x3)  # <---- Linear(in_features=3200, out_features=120, bias=True)
        # xxx has shape [1, 120] as expected
        x5 = F.relu(self.fc2(x4))
        x6 = self.fc3(x5)
        return x6
net = Net()
net(torch.rand(1, 1, 28, 28))
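Here is an even smaller standalone sketch (just a bare nn.Linear, outside any model) that seems to reproduce the same behaviour. As far as I can tell, the weight tensor keeps its original shape after I overwrite in_features, so my guess is that the attribute is only metadata and the forward pass uses the weight itself:

import torch
import torch.nn as nn

fc = nn.Linear(6400, 120)
fc.in_features = 3200          # only rewrites the Python attribute
print(fc)                      # Linear(in_features=3200, out_features=120, bias=True)
print(fc.weight.shape)         # torch.Size([120, 6400]) -- the weight is untouched
out = fc(torch.rand(1, 6400))  # still works, so forward() apparently uses the weight, not in_features
print(out.shape)               # torch.Size([1, 120])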
I also found that, sometimes, changing the input features to THE SAME value as the deep-copied layer’s makes the operation FAIL, and it actually complains about a mismatch! I think the problems are related, but it would be a bit more tricky to create a minimal reproducible example for that one, so let’s start by investigating this one.
Of course, it is totally possible that I am doing something wrong!