I’ve trained a CNN for regression on 3D images with no gradient issues during training. My goal now is to visualise the gradients of the output with respect to an input image, to identify the most important pixels. The output of my network is a tensor of size 1, so I call backward directly on this output.
The issue is that the gradients that I get out are all zero no matter what I do. Can anyone help me understand why this is?
    image_tensor = torch.Tensor(image_tensor)  # convert numpy or list to tensor
    if cuda:
        image_tensor = image_tensor.cuda()
    X = Variable(image_tensor[None], requires_grad=True)  # add dimension to simulate batch
    output = model(X)
    # Backward pass.
    model.zero_grad()
    output.backward()
    relevance_map = X.grad.cpu().numpy()
    return relevance_map
FYI, the model is a sequence of 3D conv blocks and ReLUs, followed by a sequence of 3 linear layers + ReLUs that reduce the dimension down to 1.
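To make the setup easier to reproduce, here is a minimal self-contained sketch of the same procedure on a toy model. `TinyNet`, `input_gradient`, the input shape and the forced bias are all made up for illustration and are not the real network. It only shows one situation that produces all-zero input gradients: if a ReLU sits after the last linear layer and the output is clamped to 0, nothing flows back to the input. Whether that applies to the model above is an assumption that would need checking, for example by printing `output`.

```python
import torch
import torch.nn as nn


def input_gradient(model, image):
    """Return (output, d(output)/d(input)) for one image without a batch dim."""
    model.eval()
    # Add a batch dimension and track gradients on the input itself.
    x = image.clone().detach().unsqueeze(0).requires_grad_(True)
    output = model(x)
    model.zero_grad()
    output.sum().backward()  # output has a single element, sum() makes it a scalar
    return output.detach(), x.grad.detach().squeeze(0)


class TinyNet(nn.Module):
    """Hypothetical stand-in: one 3D conv + ReLU, then one linear layer."""

    def __init__(self, final_relu):
        super().__init__()
        self.conv = nn.Conv3d(1, 2, kernel_size=3, padding=1)
        self.fc = nn.Linear(2 * 4 * 4 * 4, 1)
        self.final_relu = final_relu

    def forward(self, x):
        h = torch.relu(self.conv(x)).flatten(1)
        out = self.fc(h)
        return torch.relu(out) if self.final_relu else out


torch.manual_seed(0)
image = torch.randn(1, 4, 4, 4)  # (channels, depth, height, width)

net = TinyNet(final_relu=True)
with torch.no_grad():
    net.fc.bias.fill_(-100.0)  # force a negative pre-activation so the last ReLU outputs 0

out_clamped, grad_clamped = input_gradient(net, image)
print("with final ReLU:   ", out_clamped.item(), grad_clamped.abs().max().item())

net.final_relu = False  # same weights, but no ReLU on the output
out_linear, grad_linear = input_gradient(net, image)
print("without final ReLU:", out_linear.item(), grad_linear.abs().max().item())
```

In this toy case the gradient is exactly zero whenever the final ReLU clamps the output, and becomes non-zero once that ReLU is skipped, so comparing the two runs is a quick way to rule this cause in or out.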
Thanks a lot!