I am working on Semantic Role Labeling. I have tensors which are the result of concatenating the BERT embedding of each word with a bit that indicates where the predicate is in the sentence, so the tensors have shape (batch_size, sequence_len, bert_embedding_dim + 1). I pass these to a dropout layer, but zeroing out the indicator bit is not desirable. Is there a way to tell dropout not to apply to that index?
Thanks in advance.
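For context, the input described above could be built roughly like this (a sketch with made-up sizes; the predicate position and dimensions are illustrative assumptions):

```python
import torch

# Hypothetical shapes for illustration: batch of 2 sentences, 5 tokens,
# 768-dim BERT embeddings plus one predicate-indicator bit.
batch_size, seq_len, bert_dim = 2, 5, 768
bert_emb = torch.randn(batch_size, seq_len, bert_dim)
pred_indicator = torch.zeros(batch_size, seq_len, 1)
pred_indicator[:, 2, :] = 1.0  # assume the predicate is the third token
x = torch.cat((bert_emb, pred_indicator), dim=-1)
print(x.shape)  # torch.Size([2, 5, 769])
```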
There is no such option for the
nn.Dropout layer or the functional
F.dropout API.
However, you could simply create a mask using
torch.bernoulli and write a custom dropout layer.
Here is a simple example:
import torch

x = torch.randn(10)
p = 0.5  # drop probability, as in nn.Dropout
# per-element keep probability: 1.0 for the first 2 values, 1 - p for the rest
probs = torch.cat((torch.ones(2), torch.full((8,), 1 - p)))  # don't zero out first 2 values
mask = torch.bernoulli(probs)  # 1 = keep, 0 = drop
out = x * mask
Don’t forget to add the scaling during training, as is done for vanilla dropout. Otherwise the expected values will differ when you disable your dropout at evaluation time, and the model might perform badly.
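Putting the mask and the scaling together, a custom module could look like the sketch below. The names PartialDropout and n_keep are made up for this example; it assumes the protected features sit at the end of the last dimension:

```python
import torch
import torch.nn as nn


class PartialDropout(nn.Module):
    """Inverted dropout that never zeroes the last `n_keep` features.

    All other features are dropped with probability `p` and the survivors
    are scaled by 1/(1-p) during training, like nn.Dropout; the protected
    features pass through unchanged. (Sketch; names are made up.)
    """

    def __init__(self, p=0.5, n_keep=1):
        super().__init__()
        self.p = p
        self.n_keep = n_keep

    def forward(self, x):
        if not self.training or self.p == 0.0:
            return x
        # keep probability: 1.0 for the protected features, 1 - p elsewhere
        keep_prob = x.new_full(x.shape[-1:], 1.0 - self.p)
        keep_prob[-self.n_keep:] = 1.0
        mask = torch.bernoulli(keep_prob.expand_as(x))
        # scale only the droppable features; protected ones stay unscaled
        scale = x.new_full(x.shape[-1:], 1.0 / (1.0 - self.p))
        scale[-self.n_keep:] = 1.0
        return x * mask * scale


drop = PartialDropout(p=0.5, n_keep=1)
x = torch.randn(2, 5, 769)
out = drop(x)  # the last feature of every token is preserved exactly
```

In eval mode (`drop.eval()`) the module becomes the identity, matching how nn.Dropout behaves.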
regarding the scaling, is that something that is done in the nn.Dropout layer?
Yes, this is done during training with a scale factor of
1/(1-p) as seen here:
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.8)
x = torch.ones(10)
out = drop(x)
> tensor([0., 0., 0., 0., 5., 5., 0., 0., 0., 0.])
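As a quick sanity check, the scaling keeps the expected value of the output roughly equal to that of the input when averaged over a large tensor:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.8)
x = torch.ones(1_000_000)
out = drop(x)
# survivors are scaled by 1/(1-0.8) = 5, so the mean stays close to 1
print(out.mean())
```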
The dropout paper explains the scaling in section 10.