I have a CNN that returns three different outputs. Before running the backward pass, should I take the mean of the losses, or just sum them up and call backward on the sum? Something like this:
    criterion1 = nn.CrossEntropyLoss(weights_label_0)
    criterion2 = nn.CrossEntropyLoss(weights_label_1)
    criterion3 = nn.CrossEntropyLoss(weights_label_2)

    loss_1 = criterion1(output, label_0)
    loss_2 = criterion2(output, label_1)
    loss_3 = criterion3(output, label_2)

    loss = loss_1 + loss_2 + loss_3
    loss.backward()

    # or

    loss = (loss_1 + loss_2 + loss_3) / 3
    loss.backward()
Which of the two would be correct? Also, I am slightly confused about calculating the weights for the labels: should the weights be calculated per batch or per dataset?
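
For anyone comparing the two options, here is a minimal sketch, not a definitive answer. It checks numerically that the summed loss and the averaged loss give gradients that differ only by a constant factor of 3, so the choice mostly acts like a rescaling of the learning rate. Everything in it is hypothetical and not taken from the post: the `ThreeHeadNet` model standing in for the CNN, the random data, and the `class_weights` helper. The helper uses inverse class frequency, which is one common heuristic, and it is applied once to the full label tensor (standing in for the whole dataset) rather than per batch, on the assumption that fixed weights keep the objective the same from batch to batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the CNN: one shared layer feeding three heads.
class ThreeHeadNet(nn.Module):
    def __init__(self, in_features=8, num_classes=4):
        super().__init__()
        self.shared = nn.Linear(in_features, 16)
        self.heads = nn.ModuleList(
            [nn.Linear(16, num_classes) for _ in range(3)]
        )

    def forward(self, x):
        h = torch.relu(self.shared(x))
        return [head(h) for head in self.heads]

def class_weights(labels, num_classes):
    # Assumption: inverse-frequency weights computed over all labels at once.
    counts = torch.bincount(labels, minlength=num_classes).float()
    return counts.sum() / (num_classes * counts.clamp(min=1))

def shared_grad(model, x, labels, criteria, average):
    # Returns the gradient on the shared layer for the summed or averaged loss.
    model.zero_grad()
    outputs = model(x)
    losses = [c(o, y) for c, o, y in zip(criteria, outputs, labels)]
    loss = sum(losses)
    if average:
        loss = loss / len(losses)
    loss.backward()
    return model.shared.weight.grad.clone()

model = ThreeHeadNet()
x = torch.randn(32, 8)
labels = [torch.randint(0, 4, (32,)) for _ in range(3)]
criteria = [nn.CrossEntropyLoss(weight=class_weights(y, 4)) for y in labels]

grad_sum = shared_grad(model, x, labels, criteria, average=False)
grad_mean = shared_grad(model, x, labels, criteria, average=True)

# The two gradients should match up to the constant factor 3.
scale_matches = torch.allclose(grad_sum, 3 * grad_mean, atol=1e-6)
print(scale_matches)
```

If the three tasks should not contribute equally, a weighted sum such as `w1 * loss_1 + w2 * loss_2 + w3 * loss_3` (with hypothetical scalars `w1`, `w2`, `w3`) is a generalization of both variants.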