Size mismatch error with SE-ResNeXt50

Hi,
I get the error below when I use the se_resnext50_32x4d model:

size mismatch, m1: [80 x 8192], m2: [2048 x 1] at /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THC/generic/THCTensorMathBlas.cu:268.

Below is the last part of the model:

(2): SEResNeXtBottleneck(
      (conv1): Conv2d(2048, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
      (bn2): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (se_module): SEModule(
        (avg_pool): AdaptiveAvgPool2d(output_size=1)
        (fc1): Conv2d(2048, 128, kernel_size=(1, 1), stride=(1, 1))
        (relu): ReLU(inplace)
        (fc2): Conv2d(128, 2048, kernel_size=(1, 1), stride=(1, 1))
        (sigmoid): Sigmoid()
      )
    )
  )
  (avg_pool): AvgPool2d(kernel_size=7, stride=1, padding=0)
  (last_linear): Sequential(
    (0): view() #x.view(x.size(0),-1)
    (1): Linear(in_features=2048, out_features=1, bias=True)
  )

Could you post the shape of your input and, if possible, a code snippet to reproduce this issue?

Hi,
the images are (3, 256, 256), passed with a batch size of 80.
I use fastai without making any changes to the original model, except the one above where I change the number of output classes from 1000 to one…

def se_resnext50(pretrained=False):
    pretrained = 'imagenet' if pretrained else None
    model = pt.se_resnext50_32x4d(pretrained=pretrained)
    model.load_state_dict(torch.load('/tmp/.cache/torch/checkpoints/se_resnext50_32x4d-a260b3a4.pth'))
    return model

def _resnext_split(m): return (m.layer3, m.avg_pool)
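(For completeness: `view` below is the small flatten module shown in the model print above, i.e. roughly:)

class view(nn.Module):
    # flattens (N, C, H, W) -> (N, C*H*W), per the x.view(x.size(0), -1) comment in the model print
    def forward(self, x):
        return x.view(x.size(0), -1)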

loss_func = MSELossFlat()

learn1 = Learner(data,
                 se_resnext50(),
                 loss_func=loss_func,
                 metrics=[qk, r2_score, exp_rmspe],
                 path='.',
                 callback_fns=[partial(SaveModelCallback, monitor='valid_loss', mode='min'),
                               partial(ReduceLROnPlateauCallback, min_delta=1e-5, patience=3)])

learn1.callback_fns.append(partial(OverSamplingCallback1, weights=[1, 1, 1, 1, 1], bn=len(learn1.data.train_dl)))

learn1.model.last_linear = nn.Sequential(view(), nn.Linear(2048, 1, bias=True))

apply_init(learn1.model.last_linear, nn.init.kaiming_uniform_)

learn1.split(_resnext_split)  # this is to form param groups for differential LRs
learn1.model.cuda()
learn1.to_fp16()

Any help here would be appreciated!

I’m not sure how Learner etc. are defined, but they seem to come from a high-level API.
Based on the error message, it looks like your last_linear gets the wrong number of input features: m1 is the activation arriving at the layer ([batch_size x features] = [80 x 8192]), while m2 is the layer’s weight ([in_features x out_features] = [2048 x 1]).
The 8192 also fits your input size: the backbone downsamples by a factor of 32, so a 256x256 image reaches avg_pool as an 8x8 map; AvgPool2d(kernel_size=7, stride=1) then leaves a 2x2 output, and flattening gives 2048 * 2 * 2 = 8192 features instead of the 2048 you would get with the 224x224 inputs the fixed pooling assumes.
Try to read the in_features attribute before replacing this layer with your custom one, or alternatively set it to 8192 (based on the error message).
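Here is a minimal sketch of both options, assuming `pt` is the pretrainedmodels package from your snippet and `view` is the flatten module defined above:

import torch
import torch.nn as nn
import pretrainedmodels as pt  # assumption: the 'pt' used in your snippet

class view(nn.Module):
    def forward(self, x):
        return x.view(x.size(0), -1)

model = pt.se_resnext50_32x4d(pretrained=None)
print(model.last_linear.in_features)  # 2048 for the stock model

# Why 8192: a 256x256 input reaches avg_pool as (N, 2048, 8, 8);
# AvgPool2d(kernel_size=7, stride=1) leaves (N, 2048, 2, 2), and
# flattening gives 2048 * 2 * 2 = 8192 features.

# Option 1: keep the fixed pooling and match the flattened size for 256x256 inputs
model.last_linear = nn.Sequential(view(), nn.Linear(8192, 1, bias=True))

# Option 2 (pick one of the two): swap the fixed 7x7 pool for an adaptive one,
# so the head keeps 2048 in_features for any input resolution
model.avg_pool = nn.AdaptiveAvgPool2d(1)
model.last_linear = nn.Sequential(view(), nn.Linear(2048, 1, bias=True))

x = torch.randn(2, 3, 256, 256)
print(model(x).shape)  # torch.Size([2, 1])

Note that the adaptive pool in option 2 is a swap, not something the stock model does; it just makes the head independent of the image size.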