Dropout for long tensor?

Currently it raises `"fused_dropout" not implemented for 'Long'`, whereas this is not a problem in Theano.

Is this a feature we should wait a while for, or is it a bug?

Neither.
The problem is that modern dropout scales the kept outputs by 1/keep_prob (i.e. 1/(1-p) when p is the drop probability). For an integer tensor this only works when that factor is itself an integer, i.e. keep_prob = 1/n for some integer n. Also, you don't get autograd with long tensors. As such, the use case seems too limited to include it in PyTorch, given that `torch.empty_like(a).bernoulli_(keep_prob) * a` gives you an (unscaled) dropout for long tensors with very little code.
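For example, a minimal sketch of that workaround; the tensor shape, value range, and keep_prob here are just placeholders for illustration:

```python
import torch

a = torch.randint(0, 10, (4, 5))   # example LongTensor
keep_prob = 0.8                    # probability of keeping each element

# Unscaled dropout: multiply by a 0/1 Bernoulli mask of the same shape and dtype.
mask = torch.empty_like(a).bernoulli_(keep_prob)
dropped = mask * a
```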

Best regards

Thomas

There is not much sense in making the scaling in Dropout mandatory; this should be optional, controlled by an input argument.

Anyway, thanks for replying.

I'm not so sure about this claim: without the scaling, the expected values of the activations during training will not match those at test time, which will most likely result in bad validation and test performance, as described in the Dropout paper.
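To illustrate the point, here is a minimal sketch (using constant unit activations as a stand-in) showing how an unscaled mask shifts the expected value and how the usual 1/keep_prob scaling restores it:

```python
import torch

torch.manual_seed(0)
a = torch.ones(1_000_000)
keep_prob = 0.8

# Unscaled dropout: the mean of the masked activations is keep_prob * E[a],
# so the training-time expectation no longer matches the test-time one.
unscaled = torch.empty_like(a).bernoulli_(keep_prob) * a
print(unscaled.mean())   # ~0.8 instead of 1.0

# Inverted (scaled) dropout divides by keep_prob to restore the expectation.
scaled = unscaled / keep_prob
print(scaled.mean())     # ~1.0, matching a.mean()
```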
