Selective masking when computing the gradient

Is it possible to compute the gradient of a model's loss with respect to only certain batch elements?

To be concrete, let's say we have some model with loss = model(x), where x has size [batch_size, dim]. There is also a boolean mask, mask, whose entries are True at the batch indices where the tensor should be evaluated and False otherwise. Is there a way to have the model compute the gradient only for these entries?

While it is possible to compute the loss on the selected elements only, loss = model(x[mask]), I would still like to make use of the loss values of the remaining elements in x; I just don't want those elements to affect the model.
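One way to get exactly this behavior is to compute a per-element loss for the whole batch and detach the entries you don't want to train on: detach blocks backpropagation through those entries while leaving their loss values intact. A minimal sketch, using a hypothetical linear model and a squared output as a stand-in per-element loss:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the model in the question: maps [batch_size, dim]
# to one scalar loss per batch element.
dim = 4
model = nn.Linear(dim, 1)

x = torch.randn(6, dim)
mask = torch.tensor([True, False, True, True, False, True])

per_elem = model(x).squeeze(-1).pow(2)  # per-element losses, shape [6]

# Keep every element's loss value, but cut the gradient at the masked-out
# entries: detach() blocks backprop through them without changing the value.
loss = torch.where(mask, per_elem, per_elem.detach()).mean()
loss.backward()
```

Here loss has the same numerical value as per_elem.mean(), but only the masked-in elements contribute to the gradients accumulated by backward().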


Suppose you have a simple classifier whose input size is [20, 5], with 20 being the batch_size. After you forward the input through the model, it will produce an output of size [20, num_classes].
Now, when you compute the loss with nn.CrossEntropyLoss() or nn.MSELoss(), there is a reduction argument; if you pass it the value 'none', the resulting loss tensor will be a vector of size 20, where each entry is the loss for the corresponding sample in the batch.
You can apply the boolean mask to this vector and then average or sum the remaining entries.
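The steps above can be sketched as follows; the linear classifier, the num_classes value, and the mask are illustrative assumptions, not part of the original post:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical classifier matching the sizes above: [20, 5] -> [20, num_classes]
num_classes = 3
model = nn.Linear(5, num_classes)

x = torch.randn(20, 5)
targets = torch.randint(0, num_classes, (20,))
mask = torch.zeros(20, dtype=torch.bool)
mask[:12] = True  # illustrative: the first 12 elements contribute to the gradient

criterion = nn.CrossEntropyLoss(reduction='none')
per_sample = criterion(model(x), targets)  # shape [20], one loss per element

# Average only over the masked-in entries for backprop...
loss = per_sample[mask].mean()
loss.backward()

# ...while the full per-sample losses remain available (detached) for logging.
all_losses = per_sample.detach()
```

Summing instead of averaging works the same way (per_sample[mask].sum()); the mean just keeps the gradient scale independent of how many entries are masked in.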