Binary classification using datasets with different importance


I have built a NN to separate signal from background events. For the signal training I have a single dataset of 20000 events. For the background training I instead have 5 different sets, with a different number of events in each set.
I have successfully trained and implemented a NN (based on MSELoss or BCELoss) when I concatenate the 5 background sets into a single set/tensor.

My problem is, however, that I need to treat the 5 different sets with 5 different weights associated with the "importance" (i.e. absolute normalization) of each set. To be clear: set #1 contains 1000 events, but they are 10 times more important than those from set #2, which contains 5000 events.

How can I take this “importance weight” into account using the standard torch losses?



You could use unreduced losses (reduction='none'), calculate the weight tensor for each element of your current batch, and multiply it by the per-sample loss before reducing.
Alternatively, nn.BCE(WithLogits)Loss provides a weight argument, which can be used for the same purpose.
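A minimal sketch of both approaches, assuming hypothetical per-set importance weights and that each sample carries the index of the background set it came from (`set_idx` and `set_weights` are made-up names for illustration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical importance weight for each of the 5 background sets
set_weights = torch.tensor([10.0, 1.0, 2.0, 0.5, 1.0])

# Dummy batch: model outputs (logits), binary targets, and each sample's set index
logits = torch.randn(8, 1, requires_grad=True)
targets = torch.randint(0, 2, (8, 1)).float()
set_idx = torch.randint(0, 5, (8,))

# Approach 1: unreduced loss, multiplied by per-sample weights, then reduced
criterion = nn.BCEWithLogitsLoss(reduction='none')
per_sample = criterion(logits, targets)        # shape [8, 1]: one loss per sample
weights = set_weights[set_idx].unsqueeze(1)    # look up each sample's set weight
loss = (per_sample * weights).mean()
loss.backward()

# Approach 2: pass the same per-sample weights via the functional API's
# weight argument; it rescales each element's loss before the mean reduction
loss2 = F.binary_cross_entropy_with_logits(logits, targets, weight=weights)
```

Both approaches yield the same weighted loss; the functional form is convenient when the weights change from batch to batch, since `nn.BCEWithLogitsLoss`'s `weight` is fixed at construction.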