Variable gradient not updating

Hi,

I’ve trained a CNN for regression on 3D images without any gradient issues. Now I would like to visualise the gradients with respect to an input image in order to identify the most important pixels. The output of my network is a tensor of size 1, so I call backward on this output directly.
The issue is that the gradients I get out are all zero, no matter what I do. Can anyone help me understand why this is?

    image_tensor = torch.Tensor(image_tensor)  # convert numpy or list to tensor
    if cuda:
        image_tensor = image_tensor.cuda()
    X = Variable(image_tensor[None], requires_grad=True)  # add dimension to simulate batch
    output = model(X)

    # Backward pass.
    model.zero_grad()
    output.backward()  # output has a single element, so no gradient argument is needed
    relevance_map = X.grad.cpu().numpy()[0]  # gradient w.r.t. the input; [0] drops the batch dim

    return relevance_map

FYI, the model is a sequence of 3D conv blocks and ReLUs, followed by a sequence of 3 linear layers + ReLUs reducing the dimension down to 1.
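
Roughly, something like this (the layer sizes and pooling here are made up for illustration, not the exact ones I use):

    import torch.nn as nn

    # Illustrative sketch of the architecture described above.
    model = nn.Sequential(
        nn.Conv3d(1, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Conv3d(8, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool3d(1),  # collapse the spatial dims
        nn.Flatten(),             # -> (batch, 16)
        nn.Linear(16, 8),
        nn.ReLU(),
        nn.Linear(8, 4),
        nn.ReLU(),
        nn.Linear(4, 1),          # single regression output per sample
    )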

Thanks a lot! :smile:

I would recommend removing the usage of Variables, as they have been deprecated since PyTorch 0.4, but your code should generally work.
Here is a small example using a simple model, showing that gradients are accumulated in the input:

    import torch
    import torch.nn as nn

    x = torch.randn(1, 10, requires_grad=True)
    model = nn.Linear(10, 10)
    output = model(x)

    # Backward pass.
    model.zero_grad()
    output.mean().backward()  # reduce the output to a scalar before calling backward
    relevance_map = x.grad.cpu().numpy()[0]
    print(relevance_map)
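
The same check also works for your 3D use case once the Variable wrapper is gone; you can just call requires_grad_() on the input tensor. Here is a sketch, where the tiny conv model is only a stand-in for your trained CNN:

    import torch
    import torch.nn as nn

    # Stand-in model: substitute your trained 3D CNN here.
    model = nn.Sequential(
        nn.Conv3d(1, 4, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool3d(1),
        nn.Flatten(),
        nn.Linear(4, 1),
    )

    image_tensor = torch.randn(1, 16, 16, 16)  # (channels, depth, height, width)
    x = image_tensor[None]  # add a dimension to simulate a batch
    x.requires_grad_()      # track gradients w.r.t. the input

    output = model(x)
    model.zero_grad()
    output.backward()       # single-element output, so no gradient argument needed
    relevance_map = x.grad.cpu().numpy()[0]
    print(relevance_map.shape)  # (1, 16, 16, 16): one value per input voxel

Note that requires_grad_() works here because x is still a leaf tensor (image_tensor itself does not require gradients), so x.grad will be populated by the backward pass.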