OK, I have been able to solve the problem. It turns out that, the way my model class was written, calling get_activations() returned a new tensor, so backward() did not compute the gradient of the outputs with respect to the activations. I fixed it by registering a hook on the activation tensor inside the forward method, like this:
```python
import torch.nn as nn
from torchvision.models import vgg19


class VGG(nn.Module):
    def __init__(self):
        super(VGG, self).__init__()

        # get the pretrained VGG19 network
        self.vgg = vgg19(pretrained=True)

        # dissect the network to access its last convolutional layer
        self.features_conv = self.vgg.features[:35]

        # get the relu and the max pool that close the features stem
        self.relu_max_pool_features = self.vgg.features[35:37]

        # get the classifier of the vgg19
        self.classifier = self.vgg.classifier

        # placeholder for the gradients
        self.gradients = None

    def activations_hook(self, grad):
        # called during backward(); stores the gradient of the output
        # with respect to the last convolutional activations
        self.gradients = grad

    def forward(self, x):
        x = self.features_conv(x)

        # register the hook on the activation tensor
        x.register_hook(self.activations_hook)

        x = self.relu_max_pool_features(x)
        x = x.view((1, -1))
        x = self.classifier(x)
        return x

    def get_activations_gradient(self):
        return self.gradients
```
This way the gradient is returned as expected when calling vgg.get_activations_gradient().
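For reference, here is a minimal usage sketch showing how the hook gets populated during backward(). The input tensor `img` and the choice of target class are just placeholders for illustration:

```python
import torch

vgg = VGG()
vgg.eval()

# stand-in for a preprocessed, normalized (1, 3, 224, 224) image
img = torch.randn(1, 3, 224, 224)

# forward pass, then backward pass on the score of the predicted class
out = vgg(img)
pred_class = out.argmax(dim=1).item()
out[0, pred_class].backward()

# the hook has now stored d(score)/d(activations) of the last conv layer
gradients = vgg.get_activations_gradient()
print(gradients.shape)  # torch.Size([1, 512, 14, 14]) for a 224x224 input
```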