Hello,
Backstory:
I’ve taken some inspiration from this post on the fast.ai forums:
to build dropout into evaluation time as a way of estimating the uncertainty of a prediction.
I also used this post as the basis for .apply()-ing a function at .eval() time:
The way I understand these techniques: by applying dropout at evaluation time and running many forward passes (10–100+), you get predictions from a variety of slightly different models.
You can then measure how much these predictions differ (e.g. take the .var() of your 100 samples).
Samples with high variance are the ones the model is ‘uncertain’ about.
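To sanity-check my own understanding, here is a minimal, self-contained sketch of the idea on a hypothetical toy model (not the MNIST net below): put the model in .eval() mode, switch only the dropout layers back to .train() so they stay stochastic, run T forward passes, and use the variance across passes as the uncertainty signal.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical toy classifier with a dropout layer (stand-in for a real net)
model = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(16, 3)
)

model.eval()  # eval mode for everything else...
for m in model.modules():
    if isinstance(m, nn.Dropout):
        m.train()  # ...but keep dropout stochastic (Monte Carlo Dropout)

x = torch.randn(4, 8)  # batch of 4 fake inputs
with torch.no_grad():
    # T=100 stochastic forward passes -> shape (T, batch, n_classes)
    probs = torch.stack([F.softmax(model(x), dim=1) for _ in range(100)])

mean_probs = probs.mean(dim=0)              # averaged prediction per sample
uncertainty = probs.var(dim=0).sum(dim=1)   # per-sample variance, summed over classes
```

Higher values in uncertainty mean the T dropout-perturbed models disagree more on that sample.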
Use case example:
Uber seems to be using this technique for some of their predictions: https://eng.uber.com/tag/monte-carlo-dropout/
My main question (more of a sound check…):
I’ve put together an example pipeline using MNIST, but I’m unsure about some of the custom functions I’ve created/taken from code examples online.
Has anyone had experience with Monte Carlo Dropout or another method of measuring uncertainty they can share?
My code (critiques/advice welcome):
# Imports the functions below rely on
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

# Create function to apply to model at eval() time:
# switch Dropout2d layers back to train mode so they stay stochastic
def apply_dropout(m):
    if type(m) == nn.Dropout2d:
        m.train()

# Func to predict MNIST class
def predict_class(model, X):
    model = model.eval()
    model.apply(apply_dropout)  # apply dropout at pred time (see func above)
    outputs = model(Variable(X))
    _, pred = torch.max(outputs.data, 1)
    return pred.numpy()

# Run for T times and get list_of_preds for measuring variance
def predict(model, X, T=100):
    list_of_preds = []
    standard_pred = predict_class(model, X)
    y1 = []  # raw logits from each pass
    y2 = []  # softmax probabilities from each pass
    for _ in range(T):
        _y1 = model(Variable(X))
        _y2 = F.softmax(_y1, dim=1)
        y1.append(_y1.data.numpy())
        y2.append(_y2.data.numpy())
        list_of_preds.append(predict_class(model, X))  # predict T times
    return standard_pred, np.array(y1), np.array(y2), np.array(list_of_preds)
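For what it’s worth, once predict() returns, this is roughly how I turn the stacked softmax outputs (the y2 array, shape (T, batch, n_classes)) into a per-sample uncertainty score. The random Dirichlet data here is just a stand-in with the right shapes (rows sum to 1 like softmax outputs); in practice you’d feed in y2 from predict().

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for predict()'s y2 array: (T, batch, n_classes) with rows summing to 1
T, batch, n_classes = 100, 5, 10
y2 = rng.dirichlet(np.ones(n_classes), size=(T, batch))

mean_probs = y2.mean(axis=0)              # (batch, n_classes) averaged prediction
pred_class = mean_probs.argmax(axis=1)    # final label per sample
# Variance of the softmax output across the T passes, summed over classes:
uncertainty = y2.var(axis=0).sum(axis=1)  # (batch,); higher = less certain
```

You could then flag the samples whose uncertainty exceeds some threshold for review, which is the part of the pipeline I’m least sure about.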