Loss always equal to zero while training the model

Thanks for responding,

My output/label shape is [batch_size, 487]. Moreover, from the entire dataset of 1 million videos I’ve taken 1% of the total. Originally each label had around 5,000 videos associated with it, but now, since I do not know the resulting proportions of the labels, I’ve learned that this becomes an imbalanced dataset. Correct?
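To check the imbalance, I can inspect the per-label positive counts in the subsample (a sketch, assuming the same train_df layout used below, with one 0/1 column per label):

# The spread between the smallest and largest per-label counts shows the skew.
cols = list(map(str, range(len(LABELS))))
print(train_df[cols].sum(axis=0).describe())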

Then I’ve also viewed many other discussions regarding the choice of the class weights; the most useful was this one.

Now I’ve computed the class weights like so:

import torch

# One column per label; each column holds a 0/1 indicator per video.
cols = list(map(str, range(len(LABELS))))
count_train = train_df[cols].sum(axis=0).to_numpy()
count_test = test_df[cols].sum(axis=0).to_numpy()
count = count_train + count_test   # positives per label across both splits
total = count.sum()                # renamed from `sum` to avoid shadowing the built-in
weight = torch.from_numpy(count / total).to(device)
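For reference, the BCEWithLogitsLoss docs describe pos_weight as the per-class ratio of negative to positive examples (so rare labels get a weight greater than 1), while the code above computes the positive frequency instead. A minimal sketch of the documented variant, assuming count holds the per-class positive counts and that every row of the two dataframes is one video:

import torch.nn as nn

# pos_weight per the docs: negatives / positives for each class.
num_samples = len(train_df) + len(test_df)   # assumed total number of videos
pos_weight = torch.from_numpy((num_samples - count) / count).to(device)
loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)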

Passing the weight tensor as the pos_weight argument in loss_fn = nn.BCEWithLogitsLoss(pos_weight=weight), I get these outputs and loss values:

0 tensor([[ 0.0065, -0.0074,  0.0149,  ..., -0.0020,  0.0109, -0.0017],
        [ 0.0073, -0.0084,  0.0133,  ..., -0.0044,  0.0084, -0.0034],
        [ 0.0072, -0.0089,  0.0136,  ..., -0.0017,  0.0115, -0.0009],
        ...,
        [ 0.0093, -0.0110,  0.0162,  ..., -0.0021,  0.0105, -0.0038],
        [ 0.0086, -0.0083,  0.0104,  ..., -0.0017,  0.0125, -0.0029],
        [ 0.0067, -0.0084,  0.0136,  ..., -0.0027,  0.0110, -0.0014]],
       grad_fn=<AddmmBackward0>) torch.Size([16, 487]) torch.Size([487])
0 tensor(0.6916, dtype=torch.float64,
       grad_fn=<BinaryCrossEntropyWithLogitsBackward0>)
1 tensor([[ 0.0074, -0.0078,  0.0197,  ..., -0.0006,  0.0112, -0.0009],
        [ 0.0057, -0.0129,  0.0150,  ..., -0.0003,  0.0086, -0.0009],
        [ 0.0073, -0.0097,  0.0135,  ..., -0.0008,  0.0078, -0.0015],
        ...,
        [ 0.0050, -0.0091,  0.0133,  ..., -0.0008,  0.0112, -0.0020],
        [ 0.0092, -0.0140,  0.0168,  ..., -0.0062,  0.0103, -0.0022],
        [ 0.0057, -0.0084,  0.0142,  ..., -0.0048,  0.0138,  0.0008]],
       grad_fn=<AddmmBackward0>) torch.Size([16, 487]) torch.Size([487])
1 tensor(0.6916, dtype=torch.float64,
       grad_fn=<BinaryCrossEntropyWithLogitsBackward0>)
2 tensor([[ 0.0101, -0.0071,  0.0152,  ..., -0.0020,  0.0111,  0.0005],
        [ 0.0087, -0.0094,  0.0142,  ..., -0.0031,  0.0109, -0.0008],
        [ 0.0104, -0.0077,  0.0180,  ..., -0.0012,  0.0115,  0.0007],
        ...,
        [ 0.0079, -0.0111,  0.0159,  ..., -0.0010,  0.0057, -0.0004],
        [ 0.0052, -0.0127,  0.0168,  ..., -0.0032,  0.0102, -0.0029],
        [ 0.0066, -0.0112,  0.0165,  ..., -0.0032,  0.0066, -0.0034]],
       grad_fn=<AddmmBackward0>) torch.Size([16, 487]) torch.Size([487])
2 tensor(0.6916, dtype=torch.float64,
       grad_fn=<BinaryCrossEntropyWithLogitsBackward0>)
3 tensor([[ 0.0069, -0.0101,  0.0123,  ..., -0.0029,  0.0111, -0.0003],
        [ 0.0060, -0.0102,  0.0122,  ..., -0.0034,  0.0077, -0.0023],
        [ 0.0071, -0.0111,  0.0183,  ..., -0.0002,  0.0110, -0.0055],
        ...,
        [ 0.0066, -0.0107,  0.0157,  ..., -0.0004,  0.0086,  0.0005],
        [ 0.0049, -0.0101,  0.0152,  ..., -0.0004,  0.0097,  0.0004],
        [ 0.0065, -0.0115,  0.0159,  ..., -0.0004,  0.0117, -0.0004]],
       grad_fn=<AddmmBackward0>) torch.Size([16, 487]) torch.Size([487])
3 tensor(0.6916, dtype=torch.float64,
       grad_fn=<BinaryCrossEntropyWithLogitsBackward0>)
4 tensor([[ 0.0046, -0.0096,  0.0147,  ..., -0.0037,  0.0100,  0.0012],
        [ 0.0081, -0.0164,  0.0193,  ..., -0.0046,  0.0075,  0.0038],
        [ 0.0095, -0.0094,  0.0169,  ..., -0.0005,  0.0124, -0.0018],
        ...,
        [ 0.0064, -0.0095,  0.0131,  ..., -0.0037,  0.0100, -0.0003],
        [ 0.0070, -0.0112,  0.0159,  ..., -0.0006,  0.0081, -0.0010],
        [ 0.0061, -0.0111,  0.0124,  ..., -0.0052,  0.0105, -0.0026]],
       grad_fn=<AddmmBackward0>) torch.Size([16, 487]) torch.Size([487])

Now I still have to go through all the batches and train for many epochs, but it seems like the loss is stuck at 0.6916, just like the F1-score (0.00205).
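One thing I noticed: 0.6916 is almost exactly the BCE value for logits near zero. With sigmoid(0) = 0.5, every negative label contributes ln 2 ≈ 0.6931, and since my pos_weight values are tiny fractions, the roughly one positive out of 487 labels barely counts. A quick arithmetic check (assuming about one positive label per video):

import math

# Expected starting loss: 486 of 487 labels are negative and each contributes
# ln 2 at sigmoid(0) = 0.5; the down-weighted positive term is negligible.
print(486 / 487 * math.log(2))  # ~0.69168, i.e. the 0.6916 printed above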