I have found a strange behaviour in a model I was writing, so I coded this little snippet to replicate it.
Basically, I create a new linear layer, self.my_fc, by deep-copying another layer, self.fc1. I then change the input size of this new layer to 3200 (or whatever). When I pass the same tensor to both self.my_fc and self.fc1, I expect my new layer to fail, since the number of input features doesn’t match the expected one. Instead, they both work.
How is this possible?
import torch
import torch.nn as nn
import torch.nn.functional as F
import copy
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 20 * 20, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        self.my_fc = copy.deepcopy(self.fc1)
        self.my_fc.in_features = 3200
    def forward(self, x):
        x1 = F.relu(self.conv1(x))
        x2 = F.relu(self.conv2(x1))
        x3 = x2.view(-1, 16 * 20 * 20)  # x3 has shape [1, 6400]
        x4 = F.relu(self.fc1(x3))  # fc1 is Linear(in_features=6400, out_features=120, bias=True)
        # I feed the same tensor x3 to my layer, which expects a different number of features.
        # It should fail (x3 has shape [1, 6400]), but it works!
        xxx = self.my_fc(x3)  # <---- Linear(in_features=3200, out_features=120, bias=True)
        # xxx has shape [1, 120] as expected
        x5 = F.relu(self.fc2(x4))
        x6 = self.fc3(x5)
        return x6
net = Net()
net(torch.rand(1, 1, 28, 28))
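Here is an even smaller standalone sketch (just a bare nn.Linear, outside any model) that seems to reproduce the same behaviour. As far as I can tell, the weight tensor keeps its original shape after I overwrite in_features, so my guess is that the attribute is only metadata and the forward pass uses the weight itself:

import torch
import torch.nn as nn

fc = nn.Linear(6400, 120)
fc.in_features = 3200          # only rewrites the Python attribute
print(fc)                      # Linear(in_features=3200, out_features=120, bias=True)
print(fc.weight.shape)         # torch.Size([120, 6400]) -- the weight is untouched
out = fc(torch.rand(1, 6400))  # still works, so forward() apparently uses the weight, not in_features
print(out.shape)               # torch.Size([1, 120])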
I also found that, sometimes, changing the input features to THE SAME value as the deep-copied layer’s makes the operation FAIL, and it actually complains about a mismatch! I think the problems are related, but it would be a bit more tricky to create a minimal reproducible example for that one, so let’s start by investigating this one.
Of course, it is totally possible that I am doing something wrong!