Based on your code snippet, it seems self.weights is an int and will not be trained, so I'm unsure how the masking is performed, as it seems you are scaling the activation instead.
If you would like to train the self.weights parameter, create it as an nn.Parameter and pass it to the optimizer.
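A minimal sketch of what that change could look like, reusing the DropFeatures module from this thread (initializing the weights to ones rather than zeros is just an illustrative choice so the layer starts as a pass-through):

import torch
import torch.nn as nn

class DropFeatures(nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        # nn.Parameter registers the tensor, so it shows up in parameters()
        # and will be updated by the optimizer
        self.weights = nn.Parameter(torch.ones(1, input_dim))

    def forward(self, x):
        return x * self.weights

layer = DropFeatures(input_dim=69)
optimizer = torch.optim.SGD(layer.parameters(), lr=1e-2)
print(list(layer.parameters()))  # now contains the trainable (1, 69) weights tensor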
The problem is that if I train the model with new_mask = torch.zeros(1, 69), the loss actually decreases. It should not, since all the features are zeroed; correct me if I am wrong.
I just moved new_mask outside so I can change it conveniently; the original code was:
import torch
import torch.nn as nn

class DropFeatures(nn.Module):
    def __init__(self, input_dim, *args, **kwargs):
        super(DropFeatures, self).__init__()
        # plain tensor (not an nn.Parameter), so it is not returned by parameters()
        self.weights = torch.zeros(1, input_dim)

    def forward(self, x):
        # element-wise mask/scale of the input features
        return x * self.weights

class drop_logit(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(drop_logit, self).__init__()
        self.drop = DropFeatures(input_dim)
        self.fc1 = nn.Linear(input_dim, 1)
        self.fc2 = nn.Sigmoid()

    def forward(self, x):
        x = self.drop(x)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

model = drop_logit(input_dim=69, output_dim=1)
I do not see why you are telling me that self.weights is an int; if I print it, it is a tensor. Of course, in this example all the features are zeroed, so I expect the loss not to decrease.
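For reference, a minimal sketch to reproduce that check, reusing the drop_logit class above (the toy data, loss, and learning rate here are my own assumptions, not from your setup). Note that nn.Linear keeps a trainable bias by default, so even with every feature multiplied by zero the loss can still fall while sigmoid(bias) moves toward the positive-class rate of the targets:

import torch
import torch.nn as nn

torch.manual_seed(0)

# hypothetical toy data: 256 samples, 69 features, ~80% positive targets
x = torch.randn(256, 69)
y = (torch.rand(256, 1) < 0.8).float()

model = drop_logit(input_dim=69, output_dim=1)  # DropFeatures zeroes every input
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(100):
    optimizer.zero_grad()
    out = model(x)            # equals sigmoid(bias) for every sample, since x * 0 == 0
    loss = criterion(out, y)
    loss.backward()           # only fc1's bias receives a non-zero gradient
    optimizer.step()
    if epoch % 20 == 0:
        print(epoch, loss.item())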