Gradient of the output wrt activation

Ok, I have been able to solve the problem. It turns out that, the way my model class was written, calling get_activations() returned a new tensor, so backward() didn't compute the gradient of the outputs with respect to those activations. I fixed it by registering a hook on the activation tensor inside the forward method, like this:

import torch.nn as nn
from torchvision.models import vgg19


class VGG(nn.Module):
    def __init__(self):
        super(VGG, self).__init__()
        
        # get the pretrained VGG19 network
        self.vgg = vgg19(pretrained=True)
        
        # disect the network to access its last convolutional layer
        self.features_conv = self.vgg.features[:35]
        
        # get the relu and the max pool of the features stem
        self.relu_max_pool_features = self.vgg.features[35:37]
        
        # get the classifier of the vgg19
        self.classifier = self.vgg.classifier
        
        # placeholder for the gradients
        self.gradients = None
        
    def activations_hook(self, grad):
        self.gradients = grad
        
    def forward(self, x):
        x = self.features_conv(x)
        
        # register the hook
        x.register_hook(self.activations_hook)
        
        x = self.relu_max_pool_features(x)
        x = x.view((1, -1))  # flatten for the classifier (assumes a batch size of 1)
        x = self.classifier(x)
        return x
    
    def get_activations_gradient(self):
        return self.gradients
    
    # the activation extraction referenced above
    def get_activations(self, x):
        return self.features_conv(x)

This way the gradient is returned as expected when calling vgg.get_activations_gradient().
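
In case it's useful, here is a minimal usage sketch of how I call it. Assumptions on my side: img is a preprocessed image tensor of shape (1, 3, 224, 224), and I backpropagate the score of the top predicted class to trigger the hook:

vgg = VGG()
vgg.eval()

# img is assumed to be a preprocessed (1, 3, 224, 224) image tensor
pred = vgg(img)

# backpropagate the score of the top predicted class; this fires the registered hook
pred[0, pred.argmax()].backward()

# gradients of that score w.r.t. the last conv activations,
# shape (1, 512, 14, 14) for a 224x224 input
gradients = vgg.get_activations_gradient()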
