Unable to find gradient for Tensor

Omroth · January 10, 2020, 12:00pm

Hello.

I’m looking to visualise which nodes in my network influence a particular prediction. In order to learn how to do that, I’m starting from the beginning. I read this - https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html - and am just trying to look at the gradient for a tensor that has gone through the network, but the tensor’s grad member is always None. Here is my code:

import pprint

import numpy as np
import torch
import torch.nn as nn


class Test_Network(nn.Module):
    def __init__(self):
        super(Test_Network, self).__init__()
        
        self.gradients = None

        #

        layer_list = []

        #

        layer = nn.Linear(4, 6)
        layer_list.append(layer)

        #

        layer = nn.Linear(6, 1)
        layer_list.append(layer)

        #

        self.layer_list = nn.ModuleList(layer_list)


        
    def to(self, device):
        model = super(Test_Network, self).to(device)

        for i, layer in enumerate(model.layer_list):
            model.layer_list[i] = layer.to(device)

        return model


    def save_gradient(self, grad):
        self.gradients = grad


    def forward(self, x):
        x = x.float()

        for i, layer in enumerate(self.layer_list):
            if i == len(self.layer_list) - 1:
                x.register_hook(self.save_gradient)

            x = layer(x)    

        return x


    

#

def scratch():
    model = Test_Network()

    #

    datum = [0.9,0.1,0.2,0.4]

    batch = np.array([datum])

    batch_tensor = torch.from_numpy(batch)

    prediction = model(batch_tensor)
    prediction.backward()

    pprint.pprint(prediction[0].grad)

To be specific - I’d expect prediction[0].grad to be the gradient object for the tensor’s path through the model, but it is None.

(You may notice that I’ve also tried to create a hook in the network, but have not received a gradient from that either.)

Any help would be appreciated, thank you.
Ian
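
For readers hitting the same None: a minimal sketch of the likely cause, assuming standard autograd behaviour. By default, .grad is only populated on leaf tensors that have requires_grad=True. Intermediate (non-leaf) tensors such as the network output keep their gradient only if retain_grad() is called on them before backward(), and indexing with prediction[0] creates yet another new tensor. The nn.Sequential model and the variable names here are illustrative stand-ins for the Test_Network above, not the original code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for Test_Network: the same two Linear layers.
model = nn.Sequential(nn.Linear(4, 6), nn.Linear(6, 1))

# Leaf tensor: created directly by the user, with requires_grad set explicitly.
batch = torch.tensor([[0.9, 0.1, 0.2, 0.4]], requires_grad=True)

hidden = model[0](batch)       # non-leaf: produced by an operation
hidden.retain_grad()           # ask autograd to keep this intermediate gradient
prediction = model[1](hidden)  # non-leaf as well
prediction.retain_grad()

# backward() without arguments works because prediction has exactly one element.
prediction.backward()

print(batch.grad.shape)    # torch.Size([1, 4])
print(hidden.grad.shape)   # torch.Size([1, 6])
print(prediction.grad)     # tensor([[1.]])

# Indexing builds a new non-leaf tensor, so its .grad is None
# (PyTorch also emits a UserWarning about accessing .grad on a non-leaf).
print(prediction[0].grad)  # None
```

The gradient of the prediction with respect to itself is just 1, so the more informative values for a visualisation are batch.grad and hidden.grad.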

Omroth · January 10, 2020, 12:35pm

Ok, I’ve got the gradients - I hope - via setting requires_grad to True, and adding hooks at each layer:

import pprint

import numpy as np
import torch
import torch.nn as nn


class Test_Network(nn.Module):
    def __init__(self):
        super(Test_Network, self).__init__()
        
        self.gradients = None

        #

        layer_list = []

        #

        layer = nn.Linear(4, 6)
        layer_list.append(layer)

        #

        layer = nn.Linear(6, 1)
        layer_list.append(layer)

        #

        self.layer_list = nn.ModuleList(layer_list)


        
    def to(self, device):
        model = super(Test_Network, self).to(device)

        for i, layer in enumerate(model.layer_list):
            model.layer_list[i] = layer.to(device)

        return model


    def save_gradient(self, grad):
        print(grad)

        self.gradients = grad


    def forward(self, x):
        x = x.float()

        for i, layer in enumerate(self.layer_list):
            x.register_hook(self.save_gradient)

            x = layer(x)    

        return x


    

#

def scratch():
    model = Test_Network()

    #

    datum = [0.9,0.1,0.2,0.4]

    batch = np.array([datum])

    batch_tensor = torch.from_numpy(batch)

    batch_tensor.requires_grad = True

    prediction = model(batch_tensor)
    prediction.backward()

    pprint.pprint(batch_tensor.grad)

I’m now a bit lost in trying to transfer these gradients into a heatmap of which nodes of each layer were important to the prediction.

ptrblck · January 10, 2020, 11:17pm

You could store the gradients in save_gradient in a list or dict and try to visualize them.
However, since the shapes will most likely differ, I assume you would like to visualize the gradients of each parameter separately?
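
A rough sketch of that suggestion, assuming the same two-layer network: each hook writes into a dict keyed by layer index instead of overwriting a single attribute, and the per-parameter gradients are read from named_parameters() after backward(). The names HookedNetwork, make_hook and param_grads are hypothetical, chosen only for this illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class HookedNetwork(nn.Module):  # hypothetical name, mirrors Test_Network
    def __init__(self):
        super().__init__()
        self.layer_list = nn.ModuleList([nn.Linear(4, 6), nn.Linear(6, 1)])
        # layer index -> gradient of the output w.r.t. that layer's input
        self.gradients = {}

    def make_hook(self, index):
        # A closure per layer, so every hook stores under its own key.
        def hook(grad):
            self.gradients[index] = grad.detach().clone()
        return hook

    def forward(self, x):
        x = x.float()
        self.gradients.clear()
        for i, layer in enumerate(self.layer_list):
            # register_hook raises on tensors that do not require grad.
            if x.requires_grad:
                x.register_hook(self.make_hook(i))
            x = layer(x)
        return x


model = HookedNetwork()
batch = torch.tensor([[0.9, 0.1, 0.2, 0.4]], requires_grad=True)
model(batch).backward()

# Gradients flowing into each layer (shapes differ per layer).
for index in sorted(model.gradients):
    print(f"input to layer {index}:", tuple(model.gradients[index].shape))

# Gradients of each parameter, keyed by parameter name.
param_grads = {name: p.grad for name, p in model.named_parameters()}
for name, grad in param_grads.items():
    print(name, tuple(grad.shape))
```

Because the shapes differ, each entry would need its own plot; the dict keys give a natural label for each one.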

Omroth · January 13, 2020, 10:03am

I would - what’s the best way of selecting one parameter or CNN feature and seeing how it has impacted the classification decision?

ptrblck · January 13, 2020, 3:40pm

I’m not sure what the current state-of-the-art algorithms are, but I would suggest having a look at Captum, which provides different algorithms for model interpretation.
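
For a first look without any extra library, here is a minimal sketch of one of the simplest attribution ideas, gradient times activation, written in plain PyTorch. This is not Captum's API and not necessarily what Captum would compute; it is only a stand-in to show the idea on the two-layer network from this thread. The names input_scores and hidden_scores are made up for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Same layout as the network in this thread: 4 -> 6 -> 1, no nonlinearity.
model = nn.Sequential(nn.Linear(4, 6), nn.Linear(6, 1))
batch = torch.tensor([[0.9, 0.1, 0.2, 0.4]], requires_grad=True)

hidden = model[0](batch)
hidden.retain_grad()  # keep the gradient of this intermediate activation
prediction = model[1](hidden)
prediction.backward()

# Score per node: activation value times the gradient of the prediction
# with respect to that activation.
input_scores = (batch * batch.grad).detach()     # shape (1, 4)
hidden_scores = (hidden * hidden.grad).detach()  # shape (1, 6)

print("input scores: ", input_scores)
print("hidden scores:", hidden_scores)

# Sanity check for the last (purely linear) layer: the hidden scores
# sum to the prediction minus that layer's bias.
print(hidden_scores.sum().item(), (prediction - model[1].bias).item())
```

Each score tensor has one value per node, so it could be passed to a plotting routine (for example matplotlib's imshow) to get one heatmap row per layer. With nonlinear activations or convolutional features, the dedicated methods in a library such as Captum are likely a better fit than this sketch.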