Hello,

**Backstory:**

I’ve taken some inspiration from this post on the fast.ai forums:

to build in dropout at evaluation time as a way of attempting to measure the uncertainty of a prediction.

I also used this post as a basis for `.apply()`-ing a function at `.eval()` time:

**The way I understand these techniques:**

By applying dropout at evaluation time and running many forward passes (10-100+), each pass uses a different random dropout mask, so you effectively get predictions from a variety of different models.

What you can then do with these predictions is measure how much they differ (get the `.var()` of your 100 different samples).

With this spread, you can then see which samples the model is ‘uncertain’ about (the ones with high variance).
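As a rough sketch of the idea described above: the toy model, the random inputs, and the helper names `enable_dropout` and `mc_dropout_probs` are all hypothetical and only there for illustration, and using the variance of the predicted class's probability as the uncertainty score is just one possible choice.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def enable_dropout(model: nn.Module) -> None:
    """Put only the dropout layers back into training mode."""
    for module in model.modules():
        if isinstance(module, (nn.Dropout, nn.Dropout2d)):
            module.train()


def mc_dropout_probs(model: nn.Module, x: torch.Tensor, T: int = 50) -> torch.Tensor:
    """Return softmax probabilities from T stochastic passes, shape (T, N, C)."""
    model.eval()            # everything in eval mode first
    enable_dropout(model)   # then switch dropout back on
    with torch.no_grad():
        return torch.stack([F.softmax(model(x), dim=1) for _ in range(T)])


# Hypothetical toy classifier and random inputs, for illustration only
torch.manual_seed(0)
toy_model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(64, 10)
)
x = torch.randn(8, 20)

probs = mc_dropout_probs(toy_model, x, T=50)  # (50, 8, 10)
mean_probs = probs.mean(dim=0)                # (8, 10) averaged prediction
pred = mean_probs.argmax(dim=1)               # (8,) predicted class per sample

# Variance across passes of the probability assigned to the predicted class
var_of_pred = probs.var(dim=0).gather(1, pred.unsqueeze(1)).squeeze(1)  # (8,)

# Sample indices ordered from most to least uncertain
most_uncertain = var_of_pred.argsort(descending=True)
```

Samples at the front of `most_uncertain` are the ones whose predictions moved the most between dropout masks.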

**Use case example:**

Uber seems to be using this technique for some of their predictions: https://eng.uber.com/tag/monte-carlo-dropout/

**My main question (more of a sound check…):**

I’ve put together an example pipeline using MNIST but I’m unsure of some of the custom functions I’ve created/taken from code examples online.

Has anyone had experience with Monte Carlo Dropout or another method of measuring uncertainty they can share?

**My code (critiques/advice welcome):**

```
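import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F


# Create function to apply to model at eval() time.
# Only nn.Dropout2d modules are switched back on; extend the check
# (e.g. with nn.Dropout) if the model uses other dropout layers.
def apply_dropout(m):
    if isinstance(m, nn.Dropout2d):
        m.train()


# Func to predict MNIST class (with dropout active)
def predict_class(model, X):
    model = model.eval()
    model.apply(apply_dropout)  # apply dropout at pred time (see func above)
    with torch.no_grad():  # replaces the deprecated Variable wrapper
        outputs = model(X)
    _, pred = torch.max(outputs, 1)
    return pred.cpu().numpy()


# Run for T times and get list_of_preds for measuring variance
def predict(model, X, T=100):
    # predict_class re-enables dropout, so this "standard" prediction is
    # itself stochastic, and dropout stays on for the loop below.
    standard_pred = predict_class(model, X)
    y1 = []  # raw model outputs, one entry per pass
    y2 = []  # softmax probabilities, one entry per pass
    list_of_preds = []  # predicted classes, one entry per pass
    for _ in range(T):
        with torch.no_grad():
            _y1 = model(X)
            _y2 = F.softmax(_y1, dim=1)
        y1.append(_y1.cpu().numpy())
        y2.append(_y2.cpu().numpy())
        # Note: this is a separate forward pass with a different dropout mask
        list_of_preds.append(predict_class(model, X))  # predict T times
    return standard_pred, np.array(y1), np.array(y2), np.array(list_of_preds)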
```
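One possible way to turn the arrays returned by `predict` into per-sample uncertainty scores is sketched below. It assumes `probs` has the shape of `y2`, i.e. `(T, N, C)`, and `preds` has the shape of `list_of_preds`, i.e. `(T, N)`. The helper name `summarize_mc_outputs` and the hand-made arrays are hypothetical, used here only so the snippet runs without a trained model.

```python
import numpy as np


def summarize_mc_outputs(probs: np.ndarray, preds: np.ndarray):
    """probs: (T, N, C) softmax outputs; preds: (T, N) predicted classes."""
    mean_probs = probs.mean(axis=0)        # (N, C) averaged prediction
    mean_pred = mean_probs.argmax(axis=1)  # (N,) class with highest mean prob

    # Variance across passes of the probability given to the mean prediction
    rows = np.arange(probs.shape[1])
    var_of_pred = probs.var(axis=0)[rows, mean_pred]  # (N,)

    # Fraction of passes that voted for the most frequently predicted class
    n_classes = probs.shape[2]
    votes = np.stack(
        [(preds == c).sum(axis=0) for c in range(n_classes)], axis=1
    )  # (N, C)
    agreement = votes.max(axis=1) / preds.shape[0]  # (N,)
    return mean_pred, var_of_pred, agreement


# Hand-made example: T=4 passes, N=2 samples, C=3 classes.
# Sample 0 gets the same output every pass; sample 1 flips between two classes.
stable = [0.9, 0.05, 0.05]
probs = np.array([
    [stable, [0.8, 0.1, 0.1]],
    [stable, [0.2, 0.7, 0.1]],
    [stable, [0.8, 0.1, 0.1]],
    [stable, [0.2, 0.7, 0.1]],
])
preds = probs.argmax(axis=2)  # (4, 2)

mean_pred, var_of_pred, agreement = summarize_mc_outputs(probs, preds)
```

In this toy case the stable sample has zero variance and full agreement, while the flipping sample has a clearly non-zero variance and only half of the passes agreeing, which is the kind of sample you would flag as uncertain.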