Trainable mask threshold?

Hi. I’d like to implement a custom Conv2d layer that blacks out images (2D tensors) in the incoming feature map (4D tensor) whose mean activation is below a preset threshold, and I want that threshold to be trainable during training.

In code…

import torch
import torch.nn as nn


class MyConv2d(nn.Module):
    def __init__(self, threshold: float = 0.1, **kwargs):
        super().__init__()  # must run before parameters or submodules are assigned
        self.threshold = nn.Parameter(torch.tensor(threshold))
        self.conv = nn.Conv2d(**kwargs)  # initialized with given kwargs

    def forward(self, x: torch.Tensor):
        assert x.dim() == 4
        # mean activation of each image (one per sample and channel), shape (N, C)
        activation_of_each_image = torch.mean(x, dim=[2, 3])
        mask = torch.ge(activation_of_each_image, self.threshold)
        mask = mask.unsqueeze(2).unsqueeze(3)  # reshape to (N, C, 1, 1) so it broadcasts against x
        x = x.masked_fill(~mask, 0)  # black out (fill with 0) images whose activation is below the threshold
        y = self.conv(x)
        return y

I thought that self.threshold would be updated automatically through backpropagation, but I found that it was not trained at all: its value never changed during training.
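To make the symptom concrete, here is a minimal sketch that reproduces it outside the module. The tensor shapes and the Conv2d arguments are arbitrary values chosen for illustration, not the ones from my real model. It checks whether the mask produced by torch.ge is still attached to the autograd graph and whether the threshold receives a gradient after backward().

```python
import torch
import torch.nn as nn

# Illustrative values only: shapes and conv arguments are assumptions.
threshold = nn.Parameter(torch.tensor(0.1))
x = torch.randn(2, 3, 4, 4)
conv = nn.Conv2d(3, 5, kernel_size=3, padding=1)

# The comparison yields a boolean tensor, which cannot carry gradients.
mask = torch.ge(x.mean(dim=[2, 3]), threshold)
print(mask.dtype, mask.requires_grad)

# Same masking step as in forward(), with the mask broadcast to (N, C, 1, 1).
y = conv(x.masked_fill(~mask[:, :, None, None], 0))
y.sum().backward()

print(threshold.grad)                # no gradient reaches the threshold
print(conv.weight.grad is not None)  # the conv weights do get a gradient
```

So the conv weights are trained as usual, while the threshold is cut off from the graph at the comparison.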

How can I make my threshold trainable? Is it impossible to compute a gradient for the threshold through the ge comparison?
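One direction I am considering, and I am not sure it is the right one, is to replace the hard comparison with a smooth approximation so that the threshold stays in the graph. The sketch below uses a sigmoid as a soft mask. The class name SoftMaskConv2d and the temperature argument are hypothetical names I made up for this example, and note that it changes the behavior: images near the threshold are scaled down rather than zeroed out completely.

```python
import torch
import torch.nn as nn


class SoftMaskConv2d(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, threshold: float = 0.1, temperature: float = 0.05, **kwargs):
        super().__init__()
        self.threshold = nn.Parameter(torch.tensor(float(threshold)))
        self.temperature = temperature  # assumed hyperparameter: smaller means closer to a hard mask
        self.conv = nn.Conv2d(**kwargs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        assert x.dim() == 4
        # Mean activation per image, kept as (N, C, 1, 1) so it broadcasts against x.
        activation = x.mean(dim=[2, 3], keepdim=True)
        # Smooth stand-in for torch.ge: close to 1 above the threshold, close to 0 below it.
        soft_mask = torch.sigmoid((activation - self.threshold) / self.temperature)
        return self.conv(x * soft_mask)


layer = SoftMaskConv2d(in_channels=3, out_channels=5, kernel_size=3, padding=1)
out = layer(torch.randn(2, 3, 8, 8))
out.sum().backward()
print(layer.threshold.grad)  # a tensor now, instead of None
```

With this version the threshold does receive a gradient, but I do not know whether a soft mask like this is an acceptable substitute for the hard black-out.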