Index filtering or vector product for filtered backpropagation


I’m working on a supervised deep learning model. The loss should differ depending on the class of the input.
For example: if X_0 is an input vector from class 0 and X_1 is one from class 1, the loss used for X_0 will be L_0 and the one for X_1 will be L_1.

Now suppose we are in the middle of training. The current batch contains D, our data of n samples, and Y, their associated labels (either 0 or 1).

What is the difference, if there is one, between:

  1. Filtering the computation through index selection
    LOSS = L_0(D[Y == 0]) + L_1(D[Y == 1])

  2. Forcing the unwanted terms to 0
    LOSS = (1-Y)L_0(D) + YL_1(D)


I would claim it depends on the reductions used in the loss functions and on how the losses are masked in the second approach.
E.g. if L_0 and L_1 each return a mean loss, I would expect to see a difference, since the number of samples each mean is taken over would differ. However, I also don’t know how you would mask the losses in the second approach if they are already reduced, since a per-sample mask such as Y cannot be applied to a single reduced value.
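
To make the reduction point concrete, here is a minimal framework-free sketch. The per-sample losses `l0` and `l1`, the toy batch `D`, and the labels `Y` are all made up for illustration and are not taken from the question. It assumes L_0 and L_1 can be evaluated per sample before any reduction.

```python
# Hypothetical per-sample losses, stand-ins for the unreduced L_0 and L_1.
def l0(x):
    return x ** 2          # loss applied to class-0 samples

def l1(x):
    return abs(x - 1.0)    # loss applied to class-1 samples

D = [0.5, 2.0, -1.0, 3.0]  # toy batch of n = 4 scalar samples
Y = [0, 1, 0, 0]           # labels: three class-0 samples, one class-1 sample

def mean(values):
    return sum(values) / len(values)

# Approach 1: select samples by index, then apply each loss.
sel0 = [l0(x) for x, y in zip(D, Y) if y == 0]
sel1 = [l1(x) for x, y in zip(D, Y) if y == 1]

# Approach 2: compute both losses on every sample, zero the unwanted one.
masked = [(1 - y) * l0(x) + y * l1(x) for x, y in zip(D, Y)]

# Sum reduction: both approaches add up exactly the same per-sample terms.
index_sum = sum(sel0) + sum(sel1)
mask_sum = sum(masked)

# Mean reduction: approach 1 divides by the per-class counts (3 and 1),
# approach 2 divides by the full batch size n = 4.
index_mean = mean(sel0) + mean(sel1)
mask_mean = mean(masked)

print(index_sum, mask_sum)    # same value
print(index_mean, mask_mean)  # different values
```

With a sum reduction the two approaches agree. With a mean reduction each class-0 sample is weighted by 1/3 and the class-1 sample by 1 in the first approach, while every sample is weighted by 1/4 in the second. Since differentiation is linear, the gradients are scaled by the same weights, so the two approaches would backpropagate differently under a mean. Note also that the first approach needs separate handling when a class is absent from the batch, because the mean over an empty selection is undefined.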

Thank you, I had not thought about the impact of the number of samples used in the subsequent calculation.