Not working for larger batch size

I am doing transfer learning from a large network based on ResNet, for an image classification task that outputs labels.

When I feed the input with one image and the corresponding target, as below:

torch.Size([1, 1, 2500, 1900]) 
tensor([[0., 1.]], device='cuda:1') torch.Size([1, 2])

I get the correct output:

outputs: tensor([[[-0.4243, -1.0620],
                  [-0.0913, -2.4389]]], device='cuda:1', grad_fn=<SliceBackward>) torch.Size([1, 2, 2])

When I input a batch of 2 images (using a DataLoader), the input/target look like this:

torch.Size([2, 1, 2500, 1900]) 
tensor([[1., 0.], [1., 0.]], device='cuda:1') torch.Size([2, 2])

And I get the following output:

outputs: tensor([[[-0.2792, -1.4123],
                  [-0.0257, -3.6729]],

                 [[-0.2532, -1.4974],
                  [-0.0518, -2.9860]]], device='cuda:1', grad_fn=<SliceBackward>) torch.Size([2, 2, 2])

-> Correct too.

But then I get an error when I try this input/target:

torch.Size([3, 1, 2500, 1900]) 
tensor([[0., 1.],
        [0., 1.],
        [1., 0.]], device='cuda:1') torch.Size([3, 2])

The output tensor becomes:

outputs: tensor([[[-0.1289, -2.1125],
                  [-0.0123, -4.4008]],

                 [[-0.2694, -1.4434],
                  [-0.0322, -3.4504]]], device='cuda:1', grad_fn=<SliceBackward>) torch.Size([2, 2, 2])

which does not have 3 in the first dimension as I expected, and the following error occurs:

/home/dpetrini/.local/lib/python3.6/site-packages/torch/nn/modules/ UserWarning: Using a target size (torch.Size([3, 2])) that is different to the input size (torch.Size([2, 2])) is deprecated. Please ensure they have the same size.
  return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
Traceback (most recent call last):
  File "src/modeling/", line 593, in
  File "src/modeling/", line 553, in main
    loss_func, optimizer, device, mini_batch, num_epochs)
  File "src/modeling/", line 97, in train_and_validate
    loss = loss_criterion(y_hat, labels)  # compute loss
  File "/home/dpetrini/.local/lib/python3.6/site-packages/torch/nn/modules/", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dpetrini/.local/lib/python3.6/site-packages/torch/nn/modules/", line 498, in forward
    return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
  File "/home/dpetrini/.local/lib/python3.6/site-packages/torch/nn/", line 2044, in binary_cross_entropy
    "!= input nelement ({})".format(target.numel(), input.numel()))
ValueError: Target and input must have the same number of elements. target nelement (6) != input nelement (4)
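For what it's worth, the final error can be reproduced in isolation: `F.binary_cross_entropy` requires the input and target to have the same number of elements. A minimal sketch with made-up tensors of the mismatched sizes from the traceback:

```python
import torch
import torch.nn.functional as F

inp = torch.rand(2, 2)           # model output for only 2 samples, values in [0, 1]
tgt = torch.rand(3, 2).round()   # 0/1 targets for a batch of 3

try:
    F.binary_cross_entropy(inp, tgt)
except ValueError as e:
    print(e)  # complains about the size mismatch between input and target
```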

The model is a ResNet plus the layers below:

  (fc1): Linear(in_features=256, out_features=256, bias=True)
  (output_layer): OutputLayer(
    (fc_layer): Linear(in_features=256, out_features=4, bias=True)

I appreciate any help or suggestions. This prevents me from making better use of the GPU by fitting one or two more samples into each batch and speeding things up.


I think you have some extra code between the model you give (where the last layer is a linear that outputs 4 features, so batch x 4) and the sizes you print (which are batch x 2 x 2).
You should make sure that this extra code has no hardcoded part that ignores part of the input.

I also see in the model declaration:

layers.OutputLayer(256, (2, 2))

That may explain the (batch x 2 x 2).
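That reshape can be checked in isolation. A minimal sketch (the (2, 2) output shape comes from the declaration above; the batch size of 3 is just an example):

```python
import torch

h = torch.randn(3, 4)         # what Linear(in_features=256, out_features=4) emits for a batch of 3
h = h.view(h.shape[0], 2, 2)  # OutputLayer reshapes each sample from 4 to (2, 2)
print(h.shape)                # torch.Size([3, 2, 2])
```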
I got the layer printout in my first question from print(model), and it is indeed not very precise, as you point out.

Is there any way to print the model with more detail?

The user can create any model they want, so there is no single way to print a model.
The best way is to look at the code of the forward function to see exactly what happens :slight_smile:
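If it helps, one way to inspect per-layer output shapes at runtime (assuming standard nn.Module submodules; the toy model below is just an illustration, not your actual network) is to register forward hooks:

```python
import torch
import torch.nn as nn

# Toy stand-in model; replace with your own
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 4))

shapes = []

def shape_hook(module, inputs, output):
    # Record and print the class name and output shape of each submodule
    shapes.append((module.__class__.__name__, tuple(output.shape)))
    print(shapes[-1])

hooks = [m.register_forward_hook(shape_hook) for m in model.modules() if m is not model]
model(torch.randn(3, 256))  # one dummy forward pass
for h in hooks:
    h.remove()              # clean up the hooks afterwards
```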

I see, I think I get the point:
The model itself has this forward:

    def forward(self, x):
        result = self.view_resnet(x)
        h = F.relu(self.fc1(result))
        h = self.output_layer(h)[:2]
        return h

And output_layer by its turn has the following forward:

    def forward(self, x):
        h = self.fc_layer(x)
        if len(self.output_shape) > 1:
            h = h.view(h.shape[0], *self.output_shape)  
        h = F.log_softmax(h, dim=-1)
        return h

So it seems that h = self.output_layer(h)[:2] in the top model was preventing me from getting 3 or more outputs. I changed it to h = self.output_layer(h) and it now works.
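That also explains why batch sizes 1 and 2 worked: slicing with [:2] clamps rather than errors, so it only bites once the batch exceeds 2. A quick check:

```python
import torch

for batch in (1, 2, 3):
    out = torch.randn(batch, 2, 2)
    print(batch, out[:2].shape[0])  # 1 -> 1, 2 -> 2, 3 -> 2 (silently truncated)
```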

Thank you!
