As in the code below, I’m trying to train my model with nn.CosineEmbeddingLoss(), and since the dataset has severe label imbalance (examples where the pair of sentences are dissimilar far outnumber the ones where they are similar), I need to train with class weights. From what I gathered in the documentation, nn.CosineEmbeddingLoss() does not directly support a weight argument? If so, is there a workaround? How can I obtain a weighted loss? I would greatly appreciate your advice. Thank you!
criterion = nn.CosineEmbeddingLoss()
for i, batch in enumerate(dataloader):
    # unpack the batch dict (key names shown as I have them in my collate_fn)
    sent_a_input_ids = batch["sent_a_input_ids"].to(device)
    sent_a_attention_mask = batch["sent_a_attention_mask"].to(device)
    sent_b_input_ids = batch["sent_b_input_ids"].to(device)
    sent_b_attention_mask = batch["sent_b_attention_mask"].to(device)
    labels = torch.where(batch["labels"] == 0, -1, 1).to(device)  # -1 if dissimilar, 1 if similar

    a_out = model(input_ids=sent_a_input_ids, attention_mask=sent_a_attention_mask).last_hidden_state
    b_out = model(input_ids=sent_b_input_ids, attention_mask=sent_b_attention_mask).last_hidden_state
    a_emb = _mean_pooling(a_out, sent_a_attention_mask)
    b_emb = _mean_pooling(b_out, sent_b_attention_mask)

    loss = criterion(a_emb, b_emb, labels)
    loss.backward()
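For context, the workaround I have been considering (a sketch, not my actual training code): since nn.CosineEmbeddingLoss does accept reduction='none', it can return a per-sample loss vector, which I could then multiply by per-sample weights and reduce manually. The embeddings, labels, and the weight values (3.0 for the rare "similar" class) below are made up for illustration.

```python
import torch
import torch.nn as nn

criterion = nn.CosineEmbeddingLoss(reduction="none")  # per-sample losses, no averaging

# dummy embedding pairs and labels (1 = similar, -1 = dissimilar)
a_emb = torch.randn(4, 8)
b_emb = torch.randn(4, 8)
labels = torch.tensor([1, -1, -1, -1])

# upweight the rare positive class (3.0 is an arbitrary example value)
weights = torch.where(labels == 1, 3.0, 1.0)

per_sample = criterion(a_emb, b_emb, labels)      # shape: (4,)
loss = (per_sample * weights).sum() / weights.sum()  # weighted mean
```

Would this be a reasonable way to attain a weighted loss, or is there a more standard approach?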