Monte Carlo dropout and independent dropout masks within mini-batches?

Has anyone already implemented Monte Carlo dropout, as described by Gal & Ghahramani '15 and in Gal's blog post "What my model doesn't know", in PyTorch for estimating a model's confidence in its predictions? I know it would be fairly trivial to implement, but if someone has already done it, that's even better.

One could parallelize MC dropout inference by putting multiple copies of a given item into one mini-batch. However, for that to work, the dropout masks have to be independent across all the members of the mini-batch. Normally it would be faster to re-use a mask across members, so I'm curious how it's done in PyTorch.


In PyTorch, dropout masks are independent for all the samples in a mini-batch (a single mask the size of the whole input, batch dimension included, is generated).
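
Given that, the batched approach from the question should work. Here is a minimal sketch of how it might look; the toy model, the function name `mc_dropout_predict`, and the choice of mean/std as the summary statistics are all my own assumptions, not anything from the thread:

```python
import torch
import torch.nn as nn

# Hypothetical toy model; any network containing nn.Dropout works the same way.
model = nn.Sequential(
    nn.Linear(10, 50),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(50, 1),
)

def mc_dropout_predict(model, x, n_samples=100):
    """Run n_samples stochastic forward passes for a single input x
    by replicating it along the batch dimension, relying on PyTorch
    drawing an independent dropout mask for every batch element."""
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        # expand creates a broadcasted view: (n_samples, *x.shape)
        batch = x.unsqueeze(0).expand(n_samples, *x.shape)
        preds = model(batch)
    # Predictive mean and a spread-based uncertainty estimate
    return preds.mean(dim=0), preds.std(dim=0)

x = torch.randn(10)
mean, std = mc_dropout_predict(model, x)
```

Note that calling `model.train()` also switches other mode-dependent layers (e.g. batch norm) into training mode; for a model that has such layers, you would instead call `.train()` only on the dropout modules.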

Good to hear. Thanks.