Target size (torch.Size([10])) must be the same as input size (torch.Size([2]))

Your use case mixes two different workflows for binary classification.
You could either:

  • use two output units + nn.CrossEntropyLoss and a target of shape [batch_size] containing the class indices
  • or a single output unit + nn.BCEWithLogitsLoss and a target of shape [batch_size, 1]

Neither approach uses a softmax activation at the end, as both criteria apply the activation internally (log-softmax and sigmoid, respectively), so you should remove the softmax, as shown in the sketch below.
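
Here is a minimal sketch of both setups, assuming a batch size of 10 and using random tensors in place of your actual model outputs and targets:

```python
import torch
import torch.nn as nn

batch_size = 10

# Option 1: two output units + nn.CrossEntropyLoss,
# target of shape [batch_size] containing class indices (0 or 1)
logits_two = torch.randn(batch_size, 2)              # raw logits, no softmax
target_indices = torch.randint(0, 2, (batch_size,))  # LongTensor of class indices
loss_ce = nn.CrossEntropyLoss()(logits_two, target_indices)

# Option 2: single output unit + nn.BCEWithLogitsLoss,
# target of shape [batch_size, 1] containing float labels (0. or 1.)
logits_one = torch.randn(batch_size, 1)              # raw logit, no sigmoid
target_float = torch.randint(0, 2, (batch_size, 1)).float()
loss_bce = nn.BCEWithLogitsLoss()(logits_one, target_float)
```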

That being said, the shape mismatch is probably created in:

x = x.view(-1, self._to_linear)

Use x = x.view(x.size(0), -1) instead to keep the batch dimension constant.
This might then raise a shape mismatch in the feature dimension, which you would need to fix by changing the in_features of the conflicting linear layer.
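
A small sketch of this pattern, using a hypothetical conv layer and input shape (not your actual model), so the numbers are only illustrative:

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3)
        # in_features must match the flattened activation size;
        # for a 1x28x28 input and this conv it is 8 * 26 * 26
        self.fc = nn.Linear(8 * 26 * 26, 2)

    def forward(self, x):
        x = self.conv(x)
        x = x.view(x.size(0), -1)  # keep the batch dimension, flatten the rest
        return self.fc(x)

model = Net()
out = model(torch.randn(10, 1, 28, 28))
print(out.shape)  # torch.Size([10, 2]) -> batch dimension is preserved
```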
