Alternate form of Dropout


In my current project, I want to randomly drop some information before a custom classification module (MIL related), as my training set is relatively small. However, I do not want to set the activations to 0, because the zeros might be picked up as useful information by the final module, which selects the top activations.
My solution to that is to select “by hand” the desired amount of activations as follows:

        self.drop = drop

    def forward(self, input: torch.Tensor):
        N, C, H, W = input.size()
        activs = input.view(N, C, H * W)

        if self.drop and self.training:  # only drop during training
            # number of activations to keep per (n, c) slice
            keep = round(H * W * (1 - self.drop))
            # sorting uniform noise gives an independent random permutation
            # per slice; keep the first `keep` indices of each permutation
            selector = torch.rand_like(activs).sort(-1)[1][:, :, :keep]
            mask = torch.zeros_like(activs).scatter_(-1, selector, 1).bool()
            activs = torch.masked_select(activs, mask).view(N, C, -1)

        return activs
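For reference, here is the same selection trick as a standalone sketch outside the module, with made-up shapes and drop rate, to check what happens to the output shape:

```python
import torch

# Hypothetical shapes and drop rate, just to illustrate the trick
N, C, H, W = 2, 3, 4, 4
drop = 0.25
x = torch.arange(N * C * H * W, dtype=torch.float32).view(N, C, H, W)

activs = x.view(N, C, H * W)
keep = round(H * W * (1 - drop))  # activations kept per (n, c) slice
# sorting uniform noise yields an independent random permutation per slice
selector = torch.rand_like(activs).argsort(-1)[:, :, :keep]
mask = torch.zeros_like(activs).scatter_(-1, selector, 1).bool()
kept = torch.masked_select(activs, mask).view(N, C, -1)
print(kept.shape)  # torch.Size([2, 3, 12]) — the spatial dimension shrinks from 16 to 12
```

Each (n, c) slice keeps exactly `keep` elements, which is why the final `view(N, C, -1)` works, but the H × W spatial layout is lost.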

However, this solution is a bit dirty and does not preserve the shape of the input. Does anyone have an idea for a cleaner alternative?

EDIT: randperm is not efficient here, as it is not implemented on the GPU; rand followed by sorting works faster.
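To illustrate the point in the edit: sorting uniform noise along a dimension produces one independent random permutation per row, i.e. a batched randperm. A quick sanity check:

```python
import torch

# argsort over uniform noise = one independent permutation of 0..K-1 per row
perms = torch.rand(4, 6).argsort(-1)
print(perms.shape)  # torch.Size([4, 6])
# every row is a permutation of 0..5
assert all(sorted(row.tolist()) == list(range(6)) for row in perms)
```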