I’m trying to apply SENet to a classification problem. As you can see in the image, the last Conv2d takes in 128 channels and outputs 2048 channels, so I made the last linear layer nn.Linear(2048, 1383).
I only changed the final 1000 to 1383, and I get the following error:
RuntimeError: size mismatch, m1: [32 x 4096], m2: [2048 x 1383] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:266
How can I solve this problem?
It looks like your spatial size is not 1x1 after the last pooling layer. Could you check that by adding a print statement showing the shape of your activation before flattening it in your forward method?
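If the model is hard to edit directly, a forward hook is a non-invasive way to print the shape. This is a minimal sketch on a tiny stand-in network (the layers here are hypothetical, not SENet itself); the same `register_forward_hook` call works on the real model's pooling layer:

```python
import torch
import torch.nn as nn

# Tiny stand-in model; the hook technique is the point, not the architecture.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)

# Record the pooling layer's output shape on every forward pass
shapes = []
model[1].register_forward_hook(lambda mod, inp, out: shapes.append(out.shape))

x = torch.randn(2, 3, 16, 16)
model(x)
print(shapes[0])  # torch.Size([2, 8, 1, 1])
```

On the real network you would hook the `avg_pool` module instead of `model[1]`.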
I printed out the shape after the avg_pool layer and I got (batch_size, 2048, 1, 1). Isn’t this correct?
If I set in_features to 4096 it works fine, but it makes me uncomfortable…
In that case, could you post your code? Maybe the reshaping is buggy.
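For reference, the usual flattening is `x.view(x.size(0), -1)`. A short sketch (with illustrative tensor shapes, not taken from your model) showing how a spatial size that isn’t 1x1 silently doubles the feature count:

```python
import torch

x_ok = torch.randn(32, 2048, 1, 1)   # expected pooled activation
x_bad = torch.randn(32, 2048, 1, 2)  # spatial size 1x2 instead of 1x1

flat_ok = x_ok.view(x_ok.size(0), -1)
flat_bad = x_bad.view(x_bad.size(0), -1)

print(flat_ok.shape)   # torch.Size([32, 2048]) -> matches Linear(2048, ...)
print(flat_bad.shape)  # torch.Size([32, 4096]) -> the size mismatch in your error
```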
I’m not sure if I’m doing the right thing.
I assume you are using @Cadene’s implementation of SENet.
Could you add print(x.shape) before this line of code and run the complete model again?
I changed the code as follows, generated x = torch.randn(1, 3, 224, 224), and got the shape (1, 2048, 1, 1).
I don’t see which part I’m doing wrong.
I’m not sure what’s going on, as I’ve just tested your approach and it’s working fine:
model = pretrainedmodels.senet154()

# Use vanilla model
x = torch.randn(1, 3, 224, 224)
output = model(x)
print(model.last_linear)
> Linear(in_features=2048, out_features=1000, bias=True)

# Change last_linear
model.last_linear = nn.Linear(2048, 1383)
output = model(x)
print(output.shape)
> torch.Size([1, 1383])
Could you check your code again for differences?
I’m so sorry… I checked the transforms and found that I resized the input images to (224, 244), not (224, 224).
Again, sorry for wasting your time!
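That resize also explains the 4096. Assuming senet154 halves the spatial size five times (one stride-2 conv in layer0, a ceil-mode max pool, and three stride-2 stages) and ends with AvgPool2d(7, stride=1), a width of 244 leaves a 1x2 map after pooling instead of 1x1, so flattening gives 2048 * 2 = 4096 features. A back-of-the-envelope sketch (verify the strides and padding against the actual model):

```python
import math

def conv_out(size, k, s, p=0, ceil_mode=False):
    """Output size of a conv/pool layer: (size + 2p - k) / s + 1, floored or ceiled."""
    f = math.ceil if ceil_mode else math.floor
    return f((size + 2 * p - k) / s) + 1

def senet_spatial(size):
    # Assumed senet154 downsampling path; check against the real implementation.
    size = conv_out(size, k=3, s=2, p=1)             # layer0 first conv, stride 2
    size = conv_out(size, k=3, s=2, ceil_mode=True)  # max pool, ceil_mode=True
    for _ in range(3):                               # layers 2-4 downsample convs
        size = conv_out(size, k=3, s=2, p=1)
    return conv_out(size, k=7, s=1)                  # final AvgPool2d(7, stride=1)

print(senet_spatial(224))  # 1
print(senet_spatial(244))  # 2  -> 2048 * 1 * 2 = 4096 flattened features
```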
No worries! Glad you figured it out!