Gradients not returned by register_hook upon loading a saved model dict

Here’s my classifier -

class Classifier(nn.Module):
    def __init__(self, num_classes):
        super(Classifier, self).__init__()
        
        
        self.base_conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, stride=1, padding=2)
        
        self.conv_block_1 = ConvBlock(in_channels=16, out_channels=32)
        self.conv_block_2 = ConvBlock(in_channels=32, out_channels=64)
                
        self.lastcnn = nn.Conv2d(in_channels=64, out_channels=num_classes, kernel_size=28, stride=1, padding=0)
        
        self.gradients = None
        
        
    def forward(self, x):
        
        x = self.base_conv(x)
        
        x = self.conv_block_1(x)
        x = self.conv_block_2(x)
                                       
        # The hook fires during the backward pass and stores the gradient of x
        if x.requires_grad:
            x.register_hook(self.store_gradients)

        x = self.lastcnn(x)
        
        return x
      
      
    def store_gradients(self, grad):
        self.gradients = grad    
    
    def get_gradients(self):
        return self.gradients

After training this model, I save its state dict using torch.save(model.state_dict(), PATH).

After loading the model, when I call its get_gradients method as follows, no gradients are returned.

model.eval()
pred = model(single_image.to(device))
pred = torch.softmax(pred, dim = 1)
pred_idx = pred.argmax(dim=1).item()

gradients = model.get_gradients()
gradients  # this is empty (None)

On the flip side, if I run the exact same code right after training (instead of saving and loading), gradients are returned.
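For context, a hook registered with register_hook only fires when a backward pass reaches that tensor, which is why gradients exist right after training but not after a fresh load. A minimal, self-contained sketch (the conv layer here is illustrative, not the model above):

```python
import torch
import torch.nn as nn

stored = {}

conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)
x = torch.randn(1, 1, 8, 8)

out = conv(x)
# Register a tensor hook, exactly like register_hook in the forward above
out.register_hook(lambda grad: stored.update(grad=grad))

# No backward pass yet, so the hook has not fired and nothing is stored
assert "grad" not in stored

out.sum().backward()  # the backward pass triggers the hook
assert stored["grad"].shape == out.shape
```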

I also tried saving the whole model using torch.save() and loading it again; that works. So the problem is that these extra attributes don't get saved in the state dict. Is there any way to store such additional tensors when doing torch.save(model.state_dict(), PATH)?

Saving and loading the state dict doesn’t restore gradients or tensors assigned as plain attributes. It saves and restores parameters (nn.Parameter) and buffers. Since your gradient is neither, you could use register_buffer or similar.
Whether or not saving and restoring the gradient is a good idea in general is another topic, but as long as it’s code for yourself, you’re the arbiter of this. I would probably advise against doing so for things you want to put into a library or somesuch.
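A minimal sketch of the register_buffer idea, under the assumption that the gradient’s shape is known up front so a placeholder buffer can be allocated (the tiny model and shapes here are illustrative):

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)
        # A buffer is part of state_dict(), unlike a plain attribute.
        # Its shape must match the gradient we intend to store.
        self.register_buffer("gradients", torch.zeros(1, 4, 8, 8))

    def forward(self, x):
        x = self.conv(x)
        if x.requires_grad:
            # Copy the incoming gradient into the buffer during backward
            x.register_hook(lambda g: self.gradients.copy_(g))
        return x

model = TinyNet()
out = model(torch.randn(1, 1, 8, 8))
out.sum().backward()  # the hook copies the gradient into the buffer

# The buffer travels with the state dict, so
# torch.save(model.state_dict(), PATH) now persists the gradient too.
assert "gradients" in model.state_dict()
```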

Best regards

Thomas

I’m using it for Grad-CAM. I train multiple models => save them => load all of them in a new script and run a forward pass => get gradients and plot Grad-CAM heatmaps.

That’s why I need those parameters. Is there a better way to do this?

Also, how would I go about saving the gradients I extract with register_hook into a buffer created with register_buffer? And once stored, how would I go about loading my model?
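For the Grad-CAM use case specifically, an alternative that avoids persisting gradients at all is to recompute them after loading: run a forward pass on the image, then call backward() on the predicted class score so the hook fires again. A hedged sketch with an illustrative tiny model (TinyClassifier and the in-memory buffer are assumptions, not the original code):

```python
import io
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)
        self.head = nn.Linear(4 * 8 * 8, num_classes)
        self.gradients = None

    def forward(self, x):
        x = self.conv(x)
        if x.requires_grad:
            # Same pattern as the Classifier above: stash the gradient
            x.register_hook(lambda g: setattr(self, "gradients", g))
        return self.head(x.flatten(1))

model = TinyClassifier()
buf = io.BytesIO()  # stands in for a file PATH
torch.save(model.state_dict(), buf)

buf.seek(0)
restored = TinyClassifier()
restored.load_state_dict(torch.load(buf))
restored.eval()  # eval mode does not disable autograd

pred = restored(torch.randn(1, 1, 8, 8))
cls = pred.argmax(dim=1).item()
pred[0, cls].backward()  # backward on the class score fires the hook

assert restored.gradients is not None  # gradients recomputed after loading
```

Note that the usual model.eval() + forward pass alone never triggers the hook; the explicit backward() on the class score is what populates the gradients for the heatmap.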