As in the code below, I’m trying to train my model with nn.CosineEmbeddingLoss(), and since the dataset has severe label imbalance (examples where the pair of sentences are dissimilar far outnumber the ones where they are similar), I need to train with class weights. From what I gathered in the documentation, nn.CosineEmbeddingLoss() does not directly support a weight argument? If so, is there a workaround? How can I obtain a weighted loss? I would greatly appreciate your advice. Thank you!
criterion = nn.CosineEmbeddingLoss()
for i, batch in enumerate(dataloader):
    # unpack the batch dict (key names shown as I have them in my collate_fn)
    sent_a_input_ids = batch["sent_a_input_ids"].to(device)
    sent_a_attention_mask = batch["sent_a_attention_mask"].to(device)
    sent_b_input_ids = batch["sent_b_input_ids"].to(device)
    sent_b_attention_mask = batch["sent_b_attention_mask"].to(device)
    labels = torch.where(batch["labels"] == 0, -1, 1).to(device)  # -1 if dissimilar, 1 if similar

    a_out = model(input_ids=sent_a_input_ids, attention_mask=sent_a_attention_mask).last_hidden_state
    b_out = model(input_ids=sent_b_input_ids, attention_mask=sent_b_attention_mask).last_hidden_state
    a_emb = _mean_pooling(a_out, sent_a_attention_mask)
    b_emb = _mean_pooling(b_out, sent_b_attention_mask)

    loss = criterion(a_emb, b_emb, labels)
    loss.backward()
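For context, the workaround I have been considering (a sketch, not my actual training code): since nn.CosineEmbeddingLoss does accept reduction='none', it can return a per-sample loss vector, which I could then multiply by per-sample weights and reduce manually. The embeddings, labels, and the weight values (3.0 for the rare "similar" class) below are made up for illustration.

```python
import torch
import torch.nn as nn

criterion = nn.CosineEmbeddingLoss(reduction="none")  # per-sample losses, no averaging

# dummy embedding pairs and labels (1 = similar, -1 = dissimilar)
a_emb = torch.randn(4, 8)
b_emb = torch.randn(4, 8)
labels = torch.tensor([1, -1, -1, -1])

# upweight the rare positive class (3.0 is an arbitrary example value)
weights = torch.where(labels == 1, 3.0, 1.0)

per_sample = criterion(a_emb, b_emb, labels)      # shape: (4,)
loss = (per_sample * weights).sum() / weights.sum()  # weighted mean
```

Would this be a reasonable way to attain a weighted loss, or is there a more standard approach?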