Is it possible to calculate the differential of P using something like P.backward()?
I have registered a hook on a middle layer. The final goal is to calculate the derivative \partial P / \partial (\text{middle layer output}).
I tried, but got all 0s.

You can use PyTorch's autograd to compute the differential of P. Here's a link to the documentation: https://pytorch.org/docs/stable/generated/torch.autograd.grad.html
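A minimal sketch of that approach, assuming a toy two-layer setup where `middle` stands in for your middle-layer output and `P` is a scalar computed from it (the network here is hypothetical, not your model):

```python
import torch

# Hypothetical tiny network: `middle` is the intermediate activation,
# P is a scalar quantity computed from it.
x = torch.randn(4, 3)
w1 = torch.randn(3, 5, requires_grad=True)
w2 = torch.randn(5, 1, requires_grad=True)

middle = x @ w1          # middle-layer output (non-leaf tensor)
P = (middle @ w2).sum()  # scalar P

# dP/d(middle), computed directly, no .backward() needed
(grad_middle,) = torch.autograd.grad(P, middle)
print(grad_middle.shape)  # same shape as `middle`
```

Note that `torch.autograd.grad` returns the gradient directly instead of accumulating it into a `.grad` attribute, which is convenient for one-off computations like this.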

If this is a repeated operation, I would recommend registering a hook (as you say you've done) on the tensor; the hook is called every time a gradient with respect to that tensor is computed. Then, when loss.backward() is called, it will trigger your hook and you will have your required gradient (assuming that's what you needed in the first place).
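A sketch of the hook-based variant, again on a hypothetical toy network (your actual model and loss will differ):

```python
import torch

x = torch.randn(4, 3)
w1 = torch.randn(3, 5, requires_grad=True)
w2 = torch.randn(5, 1, requires_grad=True)

captured = {}

middle = x @ w1
# Tensor hook: fires whenever a gradient w.r.t. `middle` is computed.
middle.register_hook(lambda grad: captured.setdefault("grad", grad))

loss = (middle @ w2).sum()
loss.backward()  # backprop reaches `middle`, so the hook runs

print(captured["grad"].shape)  # matches middle's shape
```

The hook receives the gradient as its argument each time backward passes through that tensor, so stashing it (as above) or logging it is enough for repeated training-loop use.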

I would have to see your code for the hook, and perhaps the training loop, to pinpoint why your gradient is all 0s even with the hook. One possible explanation is that your hook is never called: tensor hooks fire ONLY when a gradient with respect to that tensor is computed somewhere in your computational graph. As a starting step, try adding logging inside the hook to verify that it actually runs.
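One way to do that logging check, sketched on a toy tensor: count hook invocations, and note that a break in the graph (for example, an accidental `.detach()` somewhere upstream, which is just one hypothetical cause) means backward never reaches the tensor and the hook never fires.

```python
import torch

x = torch.randn(4, 3)
w = torch.randn(3, 5, requires_grad=True)

calls = []

middle = x @ w
# Log every invocation so we can tell whether the hook ever runs.
middle.register_hook(lambda g: calls.append(g.norm().item()))

# If the graph were broken, e.g. via middle.detach(), backward would
# never reach `middle` and `calls` would stay empty.

loss = (middle ** 2).sum()
loss.backward()
print(len(calls))  # the hook ran once
```

If the counter stays at zero in your setup, the gradient path to that tensor is broken somewhere between it and the loss.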