Hello guys!
I’m fine-tuning a MobileNetV3 for binary classification. My first attempt was to change the classification layer to output one feature, use BCEWithLogitsLoss as my loss function, and use torch.sigmoid for prediction.
Now I want to check whether the results improve if I switch to 2 output features and use BCELoss as my loss function, and this is where I’m stuck.
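For reference, a minimal sketch of the single-output setup described above. The `nn.Linear` head, its 16 input features, and the random batch are hypothetical stand-ins for the real MobileNetV3 classifier and data, so treat this as an illustration rather than the actual training code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the MobileNetV3 classification layer:
# 16 input features is an arbitrary choice, 1 output feature as described above.
head = nn.Linear(16, 1)
criterion = nn.BCEWithLogitsLoss()  # applies the sigmoid internally

features = torch.randn(64, 16)       # fake batch of 64 samples
labels = torch.randint(0, 2, (64,))  # integer class labels, 0 or 1

logits = head(features).squeeze(1)   # shape [64], raw unbounded scores
# The target must be float and have the same shape as the logits.
loss = criterion(logits, labels.float())

# Prediction: sigmoid maps logits to probabilities, then threshold at 0.5.
probs = torch.sigmoid(logits)
preds = (probs > 0.5).long()
```

Note that the sigmoid only appears at prediction time here, because BCEWithLogitsLoss takes the raw logits directly.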
Firstly, I only changed the code to set 2 output features, and with that I got this error:
Using a target size (torch.Size([64])) that is different to the input size (torch.Size([64, 2])) is deprecated. Please ensure they have the same size.
After researching a little, I learned that the loss needs one-hot labels with the same shape as the model output, so I changed the implementation to build them.
The implementation is this:
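# Start from a (batch_size, 2) float tensor of zeros on the target device
labels_one_hot = torch.zeros(labels.size(0), 2, device=device)
# For each row i, write 1 into the column given by labels[i]
labels_one_hot.scatter_(1, labels.unsqueeze(1), 1)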
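As a self-contained sketch of the same conversion, the block below runs the `scatter_` version on a small made-up batch and compares it with `torch.nn.functional.one_hot`, which should give the same result once cast to float. The five example labels are invented for illustration.

```python
import torch
import torch.nn.functional as F

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
labels = torch.tensor([0, 0, 1, 0, 1], device=device)  # int64 class indices

# scatter_ version, as in the post
labels_one_hot = torch.zeros(labels.size(0), 2, device=device)
labels_one_hot.scatter_(1, labels.unsqueeze(1), 1)

# Built-in alternative; one_hot returns int64, so cast to float for the loss
alt = F.one_hot(labels, num_classes=2).float()

print(labels_one_hot.dtype, alt.dtype)
print(torch.equal(labels_one_hot, alt))
```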
With this change, I got this error:
t./aten/src/ATen/native/cuda/Loss.cu:94: operator(): block: [0,0,0], thread: [2,0,0] Assertion `input_val >= zero && input_val <= one` failed.
Searching again, I found that this error is related to out-of-range values, and that the values in my labels tensor are probably not between 0 and num_classes-1.
To check that, I printed both my labels and one_hot_labels tensors and got this:
labels: tensor([0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1,
0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0,
1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0], device='cuda:0')
one hot: tensor([[1., 0.], [1., 0.], [1., 0.], [1., 0.], [1., 0.], [1., 0.], [0., 1.], [1., 0.], [0., 1.], [1., 0.], [0., 1.], [1., 0.], [0., 1.], [1., 0.], [0., 1.], [0., 1.], [1., 0.], [0., 1.], [0., 1.], [0., 1.], [1., 0.], [1., 0.], [0., 1.], [0., 1... [1., 0.], [0., 1.], [1., 0.], [1., 0.], [1., 0.], [0., 1.], [1., 0.], [1., 0.], [0., 1.], [0., 1.], [1., 0.], [0., 1.], [1., 0.], [1., 0.], [1., 0.], [0., 1.], [1., 0.], [1., 0.]], device='cuda:0')
So it seems that the range is right (between 0 and 1); the only difference is that the one_hot tensor is float and the labels are int. With that in mind, I decided to force the one_hot_labels to int type. The “out of range” error disappeared, but now I get this error:
Found dtype Int but expected Float
Long story short, if I use float I get an “out of range” error, but if I change to int that error disappears and I get an “I need float” error instead.
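One check that might narrow this down is sketched below. The failed assertion names `input_val`, and `nn.BCELoss` is documented to expect probabilities in [0, 1] as its input, so it may be worth printing the range of the model outputs as well as the labels. Whether that is the actual cause here is an assumption. The random `outputs` tensor is a hypothetical stand-in for the raw 2-feature model output, and `in_unit_range` is a helper made up for this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins: raw 2-feature outputs and float one-hot targets.
outputs = torch.randn(64, 2)
targets = F.one_hot(torch.randint(0, 2, (64,)), num_classes=2).float()

def in_unit_range(t):
    """Return True if every element of t lies in [0, 1]."""
    return bool(((t >= 0) & (t <= 1)).all())

print("targets in [0, 1]:", in_unit_range(targets))
print("outputs in [0, 1]:", in_unit_range(outputs))

# Squashing the outputs with a sigmoid puts them in the range BCELoss expects.
probs = torch.sigmoid(outputs)
loss = nn.BCELoss()(probs, targets)
print("loss:", loss.item())
```

If the real outputs turn out to fall outside [0, 1], applying torch.sigmoid before BCELoss, or going back to BCEWithLogitsLoss with the same float one-hot targets, would be the options to try.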
Do you guys know what I could be doing wrong?