Dynamically update input mask

I’m implementing a binary mask at the input level of a logistic regression. This is how I defined it:

import torch
import torch.nn as nn

input_dim = 69                       # number of input features
new_mask = torch.ones(1, input_dim)  # binary mask kept outside the module

class DropFeatures(nn.Module):
    def __init__(self, input_dim, *args, **kwargs):
        super(DropFeatures, self).__init__()
        self.weights = new_mask  # plain tensor, not a trainable parameter

    def forward(self, x):
        # element-wise masking of the input features
        return x * self.weights

class drop_logit(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(drop_logit, self).__init__()
        self.drop = DropFeatures(input_dim)
        self.fc1 = nn.Linear(input_dim, 1)
        self.fc2 = nn.Sigmoid()

    def forward(self, x):
        x = self.drop(x)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

model = drop_logit(input_dim=69, output_dim=1)

Now I would like to be able to modify new_mask dynamically inside the training loop, to make the model more parsimonious over time.

But it is not clear to me how I could achieve that.
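One pattern that is often used for this (a sketch with assumed names, not code from the thread) is to register the mask as a buffer, so it travels with the model but is ignored by the optimizer, and to overwrite it in place from the training loop:

import torch
import torch.nn as nn

class MaskedInput(nn.Module):  # hypothetical module name, for illustration only
    def __init__(self, input_dim):
        super().__init__()
        # a buffer is saved in the state_dict and moved by .to(device),
        # but it is not returned by .parameters(), so the optimizer ignores it
        self.register_buffer("mask", torch.ones(1, input_dim))

    def forward(self, x):
        return x * self.mask

layer = MaskedInput(input_dim=69)

# inside the training loop, e.g. drop feature 10 once some criterion is met:
with torch.no_grad():
    layer.mask[0, 10] = 0.0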

Based on your code snippet, it seems self.weights is an int and will not be trained, so I’m unsure how the masking is performed; it seems you are scaling the activation instead.

If you would like to train the self.weights parameter, create it as an nn.Parameter and pass it to the optimizer.
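A minimal sketch of what that would look like (assuming a plain SGD optimizer; the learning rate is arbitrary):

import torch
import torch.nn as nn

class DropFeatures(nn.Module):
    def __init__(self, input_dim, *args, **kwargs):
        super(DropFeatures, self).__init__()
        # nn.Parameter registers the mask, so it appears in .parameters()
        self.weights = nn.Parameter(torch.ones(1, input_dim))

    def forward(self, x):
        return x * self.weights

drop = DropFeatures(input_dim=69)
# the mask is now part of .parameters() and will be updated by the optimizer
optimizer = torch.optim.SGD(drop.parameters(), lr=1e-2)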

OK, so in this configuration, is the mask applied only during evaluation? Because when I call model(X_test), the mask seems to be applied.

I tried what you suggested, namely:

class DropFeatures(nn.Module):
    def __init__(self, input_dim, *args, **kwargs):
        super(DropFeatures, self).__init__()
        # wrap the mask in nn.Parameter so it becomes trainable
        self.weights = nn.Parameter(new_mask)

    def forward(self, x):
        return x * self.weights

The problem is that if I train the model after setting new_mask = torch.zeros(1, 69), the loss still decreases. It should not, since all the features are zeroed out; correct me if I am wrong.
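As an aside, a quick toy check (not code from the thread): once the mask is an nn.Parameter that the optimizer sees, it receives gradients, so an all-zero initialization does not stay all zero after the first update.

import torch
import torch.nn as nn

mask = nn.Parameter(torch.zeros(1, 4))  # trainable mask, initialized to zero
fc = nn.Linear(4, 1)
optimizer = torch.optim.SGD([mask, *fc.parameters()], lr=0.1)

x = torch.randn(8, 4)
y = torch.rand(8, 1)

loss = nn.functional.binary_cross_entropy(torch.sigmoid(fc(x * mask)), y)
loss.backward()
optimizer.step()

print(mask)  # no longer all zeros: the optimizer has updated it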

The new code snippet isn’t consistent with the previous one, as you were previously using an int and not a mask.

        self.drop = DropFeatures(input_dim)
        self.fc1 = nn.Linear(input_dim, 1)

This indicates that input_dim is an int, not a tensor.

No, since you are not checking the self.training attribute to switch between training and evaluation mode.
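For reference, this is roughly what such a check looks like (a sketch; here the mask is kept as a buffer and, as an assumption, applied only in training mode):

import torch
import torch.nn as nn

class DropFeatures(nn.Module):
    def __init__(self, input_dim):
        super(DropFeatures, self).__init__()
        self.register_buffer("mask", torch.ones(1, input_dim))

    def forward(self, x):
        # self.training is toggled by model.train() / model.eval()
        if self.training:
            return x * self.mask  # apply the mask during training
        return x                  # pass the input through unchanged in eval mode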

I don’t know where you are setting this, but yes, if you are using a mask with all zeros, the output of the masking layer will also be all zeros.

I just took new_mask outside of the class so I could change it at my convenience; the original code was:

class DropFeatures(nn.Module):
    def __init__(self, input_dim, *args, **kwargs):
        super(DropFeatures, self).__init__()
        # all-zero mask: every input feature is zeroed out
        self.weights = torch.zeros(1, input_dim)

    def forward(self, x):
        return x * self.weights

class drop_logit(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(drop_logit, self).__init__()
        self.drop = DropFeatures(input_dim)
        self.fc1 = nn.Linear(input_dim, 1)
        self.fc2 = nn.Sigmoid()

    def forward(self, x):
        x = self.drop(x)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

model = drop_logit(input_dim=69, output_dim=1)

I do not see why you are telling me that self.weights is an int, when it is a tensor if I print it. Of course, in this example all the features are zeroed out, so I expect the loss not to decrease.