Is there a way to visualize the gradient path of backpropagation through the entire network?


My network has two branches: one is the standard ResNet50, and the other forks off from the third convolution block of ResNet50. The second branch contains several extra operations, one of which is shown below.


    # adapted from
    # (x - y)^2 = x^2 - 2*x*y + y^2
    def get_knn_indices(self, batch_mat, k):
        r = torch.bmm(batch_mat, batch_mat.permute(0, 2, 1)) 
        N = r.size()[0]
        HW = r.size()[1]
        if self.use_gpu:
            batch_indices = torch.zeros((N, HW, k)).cuda()
        else:
            batch_indices = torch.zeros((N, HW, k))
        for idx, val in enumerate(r):
            # get the diagonal elements
            diag = val.diag().unsqueeze(0)
            diag = diag.expand_as(val)
            # compute the distance matrix
            D = (diag + diag.t() - 2 * val).sqrt()
            topk, indices = torch.topk(D, k=k, largest=False)
            batch_indices[idx] = indices
        return batch_indices

I think get_knn_indices is non-differentiable, which may prevent the parameters of this branch from being updated during backpropagation.
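One way to probe this suspicion in isolation is to look at what torch.topk returns. The snippet below is a minimal sketch on a made-up 4x4 tensor (the tensor x and its shape are assumptions for illustration, not part of the network above): the returned values stay attached to the autograd graph, while the integer indices carry no grad_fn, so nothing can backpropagate through them.

```python
import torch

# Toy stand-in for the distance matrix D; the shape is arbitrary.
x = torch.rand(4, 4, requires_grad=True)
values, indices = torch.topk(x, k=2, largest=False)

# The values are part of the autograd graph.
print(values.requires_grad, values.grad_fn is not None)

# The indices are an integer tensor with no gradient history.
print(indices.dtype, indices.requires_grad, indices.grad_fn)

# Gradients reach x only through the values, at the selected positions.
values.sum().backward()
print(x.grad)
```

Whether the branch as a whole still receives gradients then depends on how the indices are used afterwards, for example whether they only select entries from a tensor that is itself differentiable.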

So, I have some questions:

  1. How can I tell if my thoughts are correct?
  2. Is there a way to visualize the gradient path of backpropagation through the entire network? If there is, I expect it would easily answer my first question.

Meh, there are some tools that let you plot the graph, but the graphs can get massive.
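For a plain-text view of the graph without any plotting tool, one option is to walk the grad_fn chain by hand. This is only a sketch: walk_graph is a hypothetical helper written for this post, the tiny nn.Sequential model stands in for the real network, and the node names it prints vary between PyTorch versions.

```python
import torch
import torch.nn as nn


def walk_graph(fn, depth=0, seen=None, lines=None):
    """Collect an indented listing of the autograd nodes reachable from fn."""
    if seen is None:
        seen = set()
    if lines is None:
        lines = []
    # None marks an input that does not require grad; skip nodes already listed.
    if fn is None or fn in seen:
        return lines
    seen.add(fn)
    lines.append("  " * depth + type(fn).__name__)
    for next_fn, _ in fn.next_functions:
        walk_graph(next_fn, depth + 1, seen, lines)
    return lines


# Small stand-in model; replace with the real network and loss.
model = nn.Sequential(nn.Linear(3, 4), nn.ReLU(), nn.Linear(4, 1))
loss = model(torch.rand(2, 3)).sum()

lines = walk_graph(loss.grad_fn)
print("\n".join(lines))
```

Each AccumulateGrad entry corresponds to a leaf tensor (typically a parameter) that will receive a gradient. A parameter whose AccumulateGrad node never shows up in the listing is not reachable from the loss.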

Why don’t you just visualize the gradients? If there is a detach somewhere, you will see None instead of a number.
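A minimal sketch of that check, assuming a made-up two-branch module (TwoBranch and its layer names are hypothetical, chosen only to mimic a branch that is cut off from the loss): after one backward pass, any parameter whose .grad is still None did not receive a gradient.

```python
import torch
import torch.nn as nn


class TwoBranch(nn.Module):
    """Toy model with a main branch and a side branch."""

    def __init__(self):
        super().__init__()
        self.main = nn.Linear(8, 4)
        self.side = nn.Linear(8, 4)

    def forward(self, x):
        # The side branch is deliberately cut off from the graph with detach().
        return self.main(x) + self.side(x).detach()


model = TwoBranch()
model(torch.rand(2, 8)).sum().backward()

# Parameters that never received a gradient still have .grad set to None.
no_grad = [name for name, p in model.named_parameters() if p.grad is None]
print(no_grad)  # ['side.weight', 'side.bias']
```

Running the same list comprehension on the real network after loss.backward() would list the parameters of any branch that backpropagation does not reach.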


Thanks for your reply.

Following your advice, I found that tensorboardX can do it:

I’ll try it. Thank you again.

I am working on implementing this as well. At what point during the training should you check for the gradient?

Currently, I check at the end of each epoch by iterating through my model's parameters and reading each one's .grad attribute, as shown in the code below. However, for some reason, when I visualize it in TensorBoard all my layers have zero gradients, even though the histograms show that the weights and biases are changing.

    for tag, parm in model.named_parameters:
        writer.add_histogram(tag, parm.grad, epoch)

I think you’re saying that this method is not accurate for observing gradients.
I hope I get it right.

I haven’t tried it yet, but the issue you raise is really interesting. So the question is: how should we observe the gradient?

Perhaps you should be calling add_histogram before .zero_grad() … Calling it after, I’d expect to get zeros!
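To make the ordering concrete, here is a small sketch of a training loop that reads the gradients after backward() and before they are cleared again. The model, the random data, and the grad_norms dictionary are all assumptions for illustration; the dictionary only stands in for the point where a writer.add_histogram call would go.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
grad_norms = {}  # stand-in for the logging call

for step in range(3):
    x, y = torch.rand(16, 4), torch.rand(16, 1)
    loss = nn.functional.mse_loss(model(x), y)

    optimizer.zero_grad()  # clear gradients left over from the previous step
    loss.backward()        # populate .grad for this step

    # Read the gradients here, while they are still populated.
    for name, p in model.named_parameters():
        grad_norms.setdefault(name, []).append(p.grad.norm().item())

    optimizer.step()

print(grad_norms)
```

If the loop instead calls zero_grad() right after step(), the gradients are already gone by the end of the epoch. Depending on the PyTorch version and the set_to_none argument, .grad is then either a tensor of zeros or None, which would match the all-zero histograms described above.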


@drb12 you have a typo in your code.


named_parameters is a method, so to iterate over the parameters you should call it: model.named_parameters()