Hi, I am working on a multi-label classification problem. My ground-truth (gt) labels have shape 14 x 10 x 128, where 14 is the batch_size, 10 is the sequence_length, and 128 is the label vector, whose entries are 1 if the sequence item belongs to the corresponding object and 0 otherwise.
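To make the labels concrete, each sequence item gets a 128-dim multi-hot vector; the indices below are made up purely for illustration:

import torch

label = torch.zeros(128)
label[[4, 17, 90]] = 1.0  # this item belongs to objects 4, 17 and 90 (arbitrary example indices)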
My output has the same shape, 14 x 10 x 128. Since my input sequences were of varying length, I had to pad them to a fixed length of 10.
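For context, this is roughly how I pad each sequence to length 10 (just a sketch with dummy data; the variable names are mine):

import torch
import torch.nn.functional as F

MAX_LEN = 10
seq = torch.randint(0, 2, (3, 128)).float()            # a sequence of true length 3
padded = F.pad(seq, (0, 0, 0, MAX_LEN - seq.size(0)))  # zero-pad the time dim -> 10 x 128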
I’m trying to compute the model’s loss as follows:
import torch
import torch.nn as nn

total_loss = 0.0
unpadded_seq_lengths = [3, 4, 5, 7, 9, 3, 2, 8, 5, 3, 5, 7, 7, ...]  # true lengths of the 14 sequences
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()

for data in training_dataloader:
    optimizer.zero_grad()
    # shape of input: 14 x 10 x 128
    output = model(data)
    batch_loss = 0.0
    for batch_idx, sequence in enumerate(output):
        # sequence shape is 10 x 128
        true_seq_len = unpadded_seq_lengths[batch_idx]
        # only keep the unpadded gt and predicted labels, since we don't want
        # the loss to be influenced by padded values
        predicted_labels = sequence[:true_seq_len, :]  # for example, 3 x 128
        gt_labels = gt_labels_padded[batch_idx, :true_seq_len, :]  # same shape as above; gt_labels_padded has shape 14 x 10 x 128
        # loop through the unpadded predicted and gt labels and accumulate the loss
        for item_idx, predicted_labels_seq_item in enumerate(predicted_labels):
            # predicted_labels_seq_item and gt_labels_seq_item are 1D vectors of length 128
            gt_labels_seq_item = gt_labels[item_idx]
            current_loss = criterion(predicted_labels_seq_item, gt_labels_seq_item)
            total_loss += current_loss.item()  # .item() so the running total doesn't keep the autograd graph alive
            batch_loss += current_loss
    batch_loss.backward()
    optimizer.step()
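For comparison, I also sketched a vectorized version that builds a boolean mask from the true lengths and uses reduction='none', so there are no Python loops. Note it averages over the real (unpadded) items, whereas my loop above sums them, so the two differ by a constant scale factor. The tensors below are dummies standing in for my real data, and the last length is made up since my real list is longer:

import torch
import torch.nn as nn

B, T, C = 14, 10, 128
output = torch.randn(B, T, C)                # dummy logits in place of model(data)
gt = torch.randint(0, 2, (B, T, C)).float()  # dummy multi-hot ground truth
lengths = torch.tensor([3, 4, 5, 7, 9, 3, 2, 8, 5, 3, 5, 7, 7, 6])  # dummy true lengths, one per sequence

# mask[b, t] is True for real items, False for padding
mask = torch.arange(T)[None, :] < lengths[:, None]  # shape: B x T

criterion = nn.BCEWithLogitsLoss(reduction='none')
per_element = criterion(output, gt)          # B x T x C, elementwise loss
per_item = per_element.mean(dim=-1)          # B x T, mean over the 128 labels (matches the default reduction per item)
loss = (per_item * mask).sum() / mask.sum()  # mean over unpadded items only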
Can anybody please check whether I’m calculating the loss correctly? Thanks!