Size mismatch problem

Zhang_Chi · February 10, 2018, 5:48am

Hi,
When I split resnet101 into two parts and pass an image through them one after the other, I get a size mismatch error.
Here are the code and the error:

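import torch
import torch.nn as nn
from torch.autograd import Variable

# resnet101 (the ResNet-101 model) and cuda_id (the GPU index) are defined earlier.
first_part = []
second_part = []
# Children 0-6 go into the first part, the rest into the second.
for index, (name, module) in enumerate(resnet101.named_children()):
    if index <= 6:
        first_part.append(module)
    else:
        second_part.append(module)
first_part = nn.Sequential(*first_part)
second_part = nn.Sequential(*second_part)
first_part.cuda(cuda_id)
second_part.cuda(cuda_id)

test_input = Variable(torch.FloatTensor(2, 3, 224, 224)).cuda(cuda_id)
out1 = first_part(test_input)
out2 = second_part(out1)  # this call raises the RuntimeError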

This is the error:

    out2=second_part(out1)
  File "/home/zhangchi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhangchi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/home/zhangchi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhangchi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 55, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/zhangchi/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 837, in linear
    output = input.matmul(weight.t())
  File "/home/zhangchi/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 386, in matmul
    return torch.matmul(self, other)
  File "/home/zhangchi/anaconda3/lib/python3.6/site-packages/torch/functional.py", line 191, in matmul
    output = torch.mm(tensor1, tensor2)
RuntimeError: size mismatch at /opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/generic/THCTensorMathBlas.cu:243

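Update:
I just found the reason. I cannot simply copy the modules from the ResNet into nn.Sequential, because the ResNet's own forward() flattens the pooled features with view() before the FC layer, and that step is not a module, so it is lost when only the children are copied. I have to call view() explicitly before the FC layer, which means splitting the network into three parts. Does anyone know a better solution?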