How to mask linear layer input to prevent invalid feature input from updating parameters

I have video features of shape (Batch, Time, Spatial, Feature_dim), e.g. (32, 15, 4, 2048),
and I first need to project the feature dimension with nn.Linear(2048, 512).
However, some features are masked according to an associated mask of shape (B, T, S) (True means yes, False means no).
How do I avoid updating this layer with the unwanted features?

How is the mask applied? How do you update the parameters of this layer? Is there a loss function?

For example:

  • in attention models, the mask is applied to the attention scores (multiplied in, or used to set masked scores to -inf before the softmax) so that masked positions are not taken into account when computing the output: everything is handled at the model level

  • in classification, it is also common to pass a weight parameter to the loss function (binary_cross_entropy_with_logits, cross_entropy…) so that some output neurons are not updated: everything is handled at the loss level.
    For regression, the loss itself can be masked, like this:

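import torch
import torch.nn.functional as F

B, n = 4, 6  # batch size, number of regression targets

torch.manual_seed(0)
y = torch.empty((B, n)).uniform_(-10, 10)       # targets
torch.manual_seed(1)
y_pred = torch.empty((B, n)).uniform_(-10, 10)  # predictions

batch_wise = True  # set to False for an element-wise mask

if batch_wise:
    # one mask value per sample: masked samples contribute zero loss
    mask = torch.empty(B).random_(2).bool()
    loss = F.mse_loss(y_pred, y, reduction='none').sum(dim=1) * mask
else:
    # one mask value per element: masked elements contribute zero loss
    mask = torch.empty((B, n)).random_(2).bool()
    loss = (F.mse_loss(y_pred, y, reduction='none') * mask).sum(dim=1)

# or loss.sum(); note that mean() still divides by B, masked samples included
loss = loss.mean()
print(loss)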
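
For the attention case in the first bullet, a minimal sketch could look like the following. It assumes a boolean key mask where True means "valid" and fills masked scores with -inf before the softmax rather than multiplying. The shapes and names (`scores`, `key_mask`) are illustrative and not taken from any particular model.

```python
import torch

torch.manual_seed(0)

B, L = 2, 5                       # batch size, sequence length (illustrative)
scores = torch.randn(B, L, L)     # raw attention scores, (batch, query, key)

# Hypothetical mask: True = valid key, False = masked key
key_mask = torch.tensor([[True, True, True, False, False],
                         [True, True, True, True, True]])

# Masked keys get -inf so that the softmax assigns them zero weight
masked_scores = scores.masked_fill(~key_mask[:, None, :], float("-inf"))
weights = torch.softmax(masked_scores, dim=-1)

print(weights[0, 0])
```

Since the masked keys receive zero weight, they do not contribute to the attention output or to the gradients flowing back through it.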

If possible provide more details about your task.
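
In the meantime, here is a rough sketch for the shapes in the question. It assumes that True in the mask marks a valid feature and uses a placeholder loss (sum of squared outputs) purely for illustration. The sizes are shrunk from (32, 15, 4, 2048) to keep it quick, and `proj` is just a name chosen here. The sketch compares two options: projecting everything and zeroing the masked outputs, or projecting only the valid features.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small stand-in sizes; the question uses (32, 15, 4, 2048) -> 512
B, T, S, D_in, D_out = 2, 3, 4, 16, 8

x = torch.randn(B, T, S, D_in)
mask = torch.rand(B, T, S) > 0.5   # assumption: True = valid feature
mask[0, 0, 0] = True               # make sure both cases occur
mask[0, 0, 1] = False

proj = nn.Linear(D_in, D_out)

# Option 1: project everything, then zero the masked outputs
out = proj(x) * mask.unsqueeze(-1)            # (B, T, S, D_out)
loss_full = out.pow(2).sum()                  # placeholder loss
grad_full = torch.autograd.grad(loss_full, proj.weight)[0]

# Option 2: project only the valid features
out_valid = proj(x[mask])                     # (N_valid, D_out)
loss_valid = out_valid.pow(2).sum()           # same placeholder loss
grad_valid = torch.autograd.grad(loss_valid, proj.weight)[0]

# Both options should give the same gradient for the layer
same_grad = torch.allclose(grad_full, grad_valid, rtol=1e-4, atol=1e-5)
print(same_grad)
```

In both options the masked features receive zero gradient, so they do not update the layer. This only holds as long as nothing after the projection feeds the masked positions back into the loss, so the mask usually has to be carried along to the loss as well.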