Difference between MulBackward, requires_grad and SumBackward

I am trying to compute a contrastive loss by passing an instance of the input to the criterion:
loss = criterion(*input_loss)
where criterion is a contrastive loss.

The following code updates the loss as expected:
loss_value = 0.5 * (target.float() * distances + (1 + -1 * target).float() * F.relu(self.margin - (distances + self.eps).sqrt()).pow(2))
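
For context, this is roughly where that line lives in my criterion (a minimal sketch; the class name and the margin/eps defaults are assumed, following the usual contrastive-loss formulation):

import torch.nn as nn
import torch.nn.functional as F

class ContrastiveLoss(nn.Module):
    # Sketch of the criterion I am using; margin/eps defaults are assumed
    def __init__(self, margin=1.0, eps=1e-9):
        super().__init__()
        self.margin = margin
        self.eps = eps

    def forward(self, output1, output2, target):
        # Squared Euclidean distance per row, computed with torch ops so it stays in the autograd graph
        distances = (output2 - output1).pow(2).sum(1)
        loss_value = 0.5 * (target.float() * distances + (1 + -1 * target).float() * F.relu(self.margin - (distances + self.eps).sqrt()).pow(2))
        return loss_value.mean()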

loss_value here is a tensor whose grad_fn is a MulBackward node, and distances is a tensor whose grad_fn is a SumBackward node.
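
This is how I check that (small sketch; the printed names/addresses in the comments are only indicative):

print(loss_value.grad_fn)        # e.g. <MulBackward0 object at 0x...>
print(distances.grad_fn)         # e.g. <SumBackward0 object at 0x...>
print(loss_value.requires_grad)  # True, since it was built from tracked torch ops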

But when I detach the outputs and compute the loss with NumPy instead, the loss no longer gets updated:

# Detach the network outputs and move them to NumPy
out1 = output1
out2 = output2
out1 = out1.detach().numpy()
out2 = out2.detach().numpy()

# Squared Euclidean distance per row, computed in NumPy
dist = np.sum(np.power(out2 - out1, 2), axis=1)

# Same contrastive-loss formula as above, but with NumPy ops
loss_value = 0.5 * (target.float() * dist + (1 + -1 * target).float() * np.power(np.maximum(0, self.margin - np.sqrt(dist + self.eps)), 2))

# Try to turn gradients back on afterwards
loss_value.requires_grad_()
dist = torch.from_numpy(dist)
dist.requires_grad_()
out1.requires_grad_()
out2.requires_grad_()
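
For completeness, this is roughly how I check whether anything receives gradients afterwards (a sketch; model here stands for my network that produces output1 and output2, and loss_value comes from the NumPy snippet above):

loss_value.mean().backward()  # reduce to a scalar and call backward()
for p in model.parameters():
    print(p.grad)             # these stay None / unchanged, which matches the loss not updating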

I also tried dist.requires_grad_().sum(), but it doesn't help. Can someone please help me understand the issue?