Loaded Model Returns Different Predictions

I performed transfer learning on a VGG-16 and re-trained the classifier portion. Training worked and predictions work. However, I am puzzled as to why, when I repeatedly run a prediction on the same image file, the prediction oscillates between two classifications that are very similar to each other. I must have forgotten to change a setting, but I’m not sure which one.

from PIL import Image
from torchvision import transforms

def predict_breed_transfer(img_path, prob=None):
    # load the image and return the predicted breed
    
    img = Image.open(img_path)
    
    myResize         = transforms.Resize((255,255))
    myCenterCrop     = transforms.CenterCrop(224)
    myRandomRotation = transforms.RandomRotation(35)
    myToTensor       = transforms.ToTensor()
    myNormalize      = transforms.Normalize(mean=(0.485,0.456,0.406), 
                                            std=(0.229,0.224,0.225))
    
    transform  = transforms.Compose([myResize, 
                                     myCenterCrop,
                                     # myRandomRotation,
                                     myToTensor, 
                                     myNormalize])
    
    img = transform(img)
    img = img.unsqueeze(0)
    img = img.cuda()
    
    prediction = model_transfer(img)
    prediction = prediction.cpu()
    
    predicted_class_idx = prediction.data.numpy().argmax()
    predicted_class_str = class_names[predicted_class_idx]
    
    if prob is None:
        return predicted_class_str
    else:
        return prediction.data.numpy().max()

You might have forgotten to call model_transfer.eval(), which disables dropout and makes batchnorm layers use their running stats instead of the current batch stats.
Let me know if that’s the case or if you are still seeing shaky outputs.
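For example, something along these lines should give repeatable outputs (a minimal sketch using your model_transfer and predict_breed_transfer; the image path is just a placeholder):

model_transfer.eval()  # switch dropout/batchnorm to inference behavior

with torch.no_grad():  # optional: also skips gradient bookkeeping during inference
    breed = predict_breed_transfer('path/to/image.jpg')
    print(breed)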

You are correct. Calling .eval() stopped the oscillation of predicted outputs. Many thanks!

Hi again, I was thinking about this some more. If I am simply calling my prediction function to run an image through the net, meaning I am not backpropagating or updating gradients, then I still don’t understand why the output would change?

The gradient calculation is independent of model.train()/eval().
train() and eval() change the behavior of some modules, such as dropout and batch norm layers.
The gradient calculation (frozen parameters or not) is not affected by these calls.
So even if you wrap the code in a with torch.no_grad() block, which disables the gradient calculation, the batch norm running stats will still be updated if the module is still in training mode.
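
A small standalone demonstration of this point (not your model, just a single BatchNorm2d layer fed random data):

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)
print(bn.running_mean)             # starts at zeros

bn.train()
with torch.no_grad():              # no gradients are computed here
    bn(torch.randn(8, 3, 24, 24))
print(bn.running_mean)             # running stats changed anyway (train mode)

bn.eval()
with torch.no_grad():
    bn(torch.randn(8, 3, 24, 24))
print(bn.running_mean)             # unchanged now (eval mode uses the stored stats)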