Size mismatch after loading saved model. Please explain!

Hey guys, I am trying to load a trained CNN classifier that I saved so I can modify the linear layers, but I get a size mismatch error when performing a forward pass (train or eval, doesn’t matter). Here is the output:

Exception NameError: "global name 'FileNotFoundError' is not defined" in <bound method _DataLoaderIter.__del__ of <_DataLoaderIter object at 0x7fcbc02c3350>> ignored
Traceback (most recent call last):
  File "", line 169, in <module>
    outputs = F.softmax(old_model(images))
  File "/home/zswartz/.local/lib/python2.7/site-packages/torch/nn/modules/", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zswartz/.local/lib/python2.7/site-packages/torch/nn/modules/", line 91, in forward
    input = module(input)
  File "/home/zswartz/.local/lib/python2.7/site-packages/torch/nn/modules/", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zswartz/.local/lib/python2.7/site-packages/torch/nn/modules/", line 55, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/zswartz/.local/lib/python2.7/site-packages/torch/nn/", line 994, in linear
    output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [72 x 2], m2: [144 x 100] at /pytorch/aten/src/THC/generic/

If I completely strip away the linear layers, and just leave the conv layers, there is no size mismatch error. Keep in mind, I am using the same exact data loader that I used to train the network in the first place. This isn’t a huge deal because I plan to strip away the linear layers regardless, but I would like to verify that the frozen model performs as it was trained. I have a feeling that this may have something to do with the fact that I use “.view(-1, 144)” to flatten my final feature map in the forward method before the first linear layer, which is where this error is occurring.
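For context, here is a minimal sketch of the kind of mismatch in that error message, using only the shapes from the traceback (toy values, not the actual network):

```python
import torch
import torch.nn as nn

# The first linear layer expects 144 input features, as in m2: [144 x 100].
fc = nn.Linear(144, 100)

# A correctly flattened batch: 8 samples, 144 features each.
x = torch.randn(8, 144)
out = fc(x)  # works, shape [8, 100]

# An input whose trailing dimension is wrong reproduces the error,
# matching m1: [72 x 2] from the traceback.
bad = torch.randn(72, 2)
err = None
try:
    fc(bad)
except RuntimeError as e:
    err = e
print(type(err).__name__)  # RuntimeError
```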

Did you change the view() after loading the model?
If your model was fine before you saved it, it should also work after loading it.
The shape of m1 looks like it could be [144 x 1], but this is just a guess.

Could you explain a bit more how you saved and reloaded the model?
Also, the code would be interesting to see.

Hey! Thanks for the reply.

I use, filename) to save, and then I use:
model = train.Net()
in order to load the model.
I think the problem has something to do with the fact that I am using:
old_model = nn.Sequential(*list(model.children())).cuda()

after loading the model.

I have done it this way so I have the ability to create a sub-network by indexing model.children(), which works as long as I index up to but not including the first linear layer.
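A sketch of what I mean, with an assumed toy model standing in for my actual train.Net:

```python
import torch
import torch.nn as nn

# Assumed toy stand-in for the saved model (not the real architecture).
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 4, 3, padding=1),
    nn.Linear(4 * 6 * 6, 10),  # would need a flattened input to work
)

# Keep everything up to, but not including, the first linear layer.
children = list(model.children())
first_linear = next(i for i, m in enumerate(children) if isinstance(m, nn.Linear))
conv_part = nn.Sequential(*children[:first_linear])

# The conv-only sub-network runs fine on a 4D input.
x = torch.randn(2, 3, 6, 6)
features = conv_part(x)
print(features.shape)  # torch.Size([2, 4, 6, 6])
```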

I’m not sure how much liberty I have in sharing all of the code, but I can include snippets.

Thank you!

Could you check again, that the forward pass runs successfully:

x = torch.randn(YOUR_SIZE)
output = model(x), filename)
model = train.Net()
output = model(x)
old_model = nn.Sequential(...)
x ='cuda')
output = old_model(x)

So, the model runs perfectly as long as I don’t stick the layers together using nn.Sequential(…).

…Any ideas?

I’m under the impression that using sequential might mess with the flattening of the final feature map.

Thanks for the hint! You are absolutely right.

You are re-creating the model as an nn.Sequential module, so the view call you are probably using in forward will be missing.

You can fix this by inserting a Flatten module between your layers. Here is a small example:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Flatten(nn.Module):
    def __init__(self):
        super(Flatten, self).__init__()
    def forward(self, x):
        return x.view(x.size(0), -1)

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 3, 1, 1)
        self.fc1 = nn.Linear(6*24*24, 10)
        self.fc2 = nn.Linear(10, 2)
    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = MyModel()
x = torch.randn(1, 3, 24, 24)
output = model(x)

layers = list(model.children())[:1] + [Flatten()] + list(model.children())[1:]
model = nn.Sequential(*layers)
output = model(x)

Awesome! Sorry, did you mean to include an instance of your Flatten class somewhere in your model?

Sorry, I didn't read all of it. Thank you!

Hey, sorry to resurface this issue, but I have a lingering question. Does stitching things together using sequential ignore everything that occurs in the forward method, including functional relus?

Yes, it does. nn.Sequential simply calls its child modules in order, so anything done with the functional API inside your original forward (like F.relu) is lost. For almost every function you can simply wrap it inside a torch module (and for some, such as ReLU, a module version already exists).
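For functions without a built-in module version, a generic wrapper might look like this (a sketch; Lambda is an assumed helper name, not a built-in PyTorch class):

```python
import torch
import torch.nn as nn

class Lambda(nn.Module):
    """Wraps an arbitrary function so it can live inside nn.Sequential."""
    def __init__(self, fn):
        super(Lambda, self).__init__()
        self.fn = fn

    def forward(self, x):
        return self.fn(x)

model = nn.Sequential(
    nn.Linear(4, 4),
    Lambda(lambda x: x * 2),           # any functional op
    Lambda(lambda x: x.clamp(min=0)),  # behaves like a ReLU
)

out = model(torch.randn(3, 4))
print(out.shape)  # torch.Size([3, 4])
```

Note that wrapping a plain lambda like this is fine for experimentation, but such a model cannot be pickled; for saving, define the function at module level.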

So, if I'm removing and adding layers of a saved model by using nn.Sequential, how would I reintroduce the ReLUs?

You could simply add an nn.ReLU() module wherever your forward used F.relu inside your sequential model.
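Putting it together with a Flatten module, a rebuilt model with the ReLUs reintroduced might look like this (a sketch with assumed toy layer sizes):

```python
import torch
import torch.nn as nn

class Flatten(nn.Module):
    def forward(self, x):
        return x.view(x.size(0), -1)

# Assumed toy layers; in practice these would come from the loaded model.
conv1 = nn.Conv2d(3, 6, 3, 1, 1)
fc1 = nn.Linear(6 * 24 * 24, 10)
fc2 = nn.Linear(10, 2)

# Interleave nn.ReLU() wherever the original forward used F.relu.
model = nn.Sequential(
    conv1, nn.ReLU(),
    Flatten(),
    fc1, nn.ReLU(),
    fc2,
)

output = model(torch.randn(1, 3, 24, 24))
print(output.shape)  # torch.Size([1, 2])
```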
