Hi!

I am currently implementing Monte Carlo sampling in order to calculate not only the output of the network but also its variance. To do so, I need to run the CNN with dropout enabled and feed it the same image about 20 times (which is, of course, 20 times slower than a single forward pass). These outputs are then processed to obtain the mean and variance of the prediction. Right now the idea works, but it takes 0.5 s/image with the following execution:

outputs = [net(inputs) for i in range(20)]
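
For context, the whole procedure looks roughly like the sketch below. The small network, the input shape, and the names `n_samples`, `mean`, and `variance` are placeholders for illustration only, not my real model; it also assumes there are no BatchNorm layers, since `net.train()` would switch those to training mode too.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder CNN standing in for the real network (assumption).
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
net.train()  # keeps dropout active during the forward passes

inputs = torch.randn(1, 3, 32, 32)  # one dummy image
n_samples = 20

# Run the same image through the network n_samples times.
with torch.no_grad():
    outputs = torch.stack([net(inputs) for _ in range(n_samples)])  # (20, 1, 10)

# Reduce over the sample dimension.
mean = outputs.mean(dim=0)      # (1, 10)
variance = outputs.var(dim=0)   # (1, 10), unbiased estimate
```
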

Is there a faster way to execute this? I thought about building batches with the same image repeated 20 times, but I guess the dropout mask will be the same across each batch, so I would get the same output 20 times per batch.
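
A quick way to check that guess might be a sketch like the following. It only tests a standalone `nn.Dropout` layer on a dummy tensor, so the shapes and the name `rows_identical` are assumptions for illustration, and it says nothing about other dropout variants or the rest of the network.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.5)
drop.train()  # dropout is only applied in training mode

# One dummy "image", repeated 20 times along the batch dimension.
image = torch.ones(1, 3, 4, 4)
batch = image.repeat(20, 1, 1, 1)  # (20, 3, 4, 4)

out = drop(batch)

# True only if every row of the batch received exactly the same mask.
rows_identical = all(torch.equal(out[0], out[i]) for i in range(1, 20))
print(rows_identical)
```

If this prints `False`, the rows received different masks, and a repeated batch would be one way to draw all 20 samples in a single forward pass.
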

Or maybe something related to parallelism?