Hello,
First of all, let me say that I have seen different issues related to this problem; however, I cannot get it to work for my use case.
I am using a ResNet network with a single output and am therefore trying to use BCEWithLogitsLoss for binary classification (1 or 0). Here are some code snippets:
return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)
RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 0
Batch size = 3
Output and target/label sizes: torch.Size([3]) torch.Size([3])
The issue is clearly related to the weight being a tensor of size 2 while the target is a tensor of size 3, equal to the batch size. Things work for a batch size of 2 or with no weights.
What do I need to do to make this work with a generic batch size?
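For reference, here is a minimal sketch that reproduces the mismatch (the tensors and values are illustrative, not from my actual code):

import torch

# weight has one entry per batch element; size 2 here, but the batch is 3
criterion = torch.nn.BCEWithLogitsLoss(weight=torch.tensor([1.0, 2.0]))
logits = torch.randn(3)                 # one logit per sample
labels = torch.tensor([1.0, 0.0, 1.0])
loss = criterion(logits, labels)        # RuntimeError: size of tensor a (3) must match size of tensor b (2)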
You are right and I misread the weight argument as pos_weight.
Based on the docs, the weight argument rescales the loss of each batch element and should thus have the same size as the batch.
I am not sure, but you might want to use pos_weight instead, which is used to counter e.g. a class imbalance in the dataset and can be defined as pos_weight = nb_negative_examples / nb_positive_examples.
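Something like this sketch (the counts are illustrative; use the ones from your training set):

import torch

nb_positive_examples = 100
nb_negative_examples = 300

# one pos_weight value per output unit; a single logit here, so a 1-element tensor
pos_weight = torch.tensor([nb_negative_examples / nb_positive_examples])
criterion = torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.randn(3)                 # works for any batch size now
labels = torch.tensor([1.0, 0.0, 1.0])
loss = criterion(logits, labels)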
I had to change the logic when the batch size is 1, as the dimensions were not correct:
if datasize == 1:
    prediction_logist = torch.tensor([prediction_logist])
    predicted = torch.tensor([predicted])
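A note on that workaround: wrapping an existing tensor in torch.tensor([...]) copies the value and detaches it from the autograd graph, so gradients would not flow through it. Assuming prediction_logist and predicted are 0-dim tensors when the batch size is 1, a sketch that keeps the graph intact:

if datasize == 1:
    # unsqueeze adds the missing batch dimension (0-dim -> shape [1]) and keeps
    # the autograd history, unlike torch.tensor(...), which copies and detaches
    prediction_logist = prediction_logist.unsqueeze(0)
    predicted = predicted.unsqueeze(0)

Alternatively, prediction_logist.reshape(-1) gives the same result and also works for larger batches, which would make the size check unnecessary.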
I am impressed with the performance of this model compared to when I used cross entropy with 2 output labels.
It took 200 epochs before and I was not quite happy with the result, while now with the BCE modification I am getting good results even after 30 epochs. Why is that?