In my work, I need to back-propagate different gradient values to different inputs of a single operation (e.g. add, matmul, conv).
For example, for y = torch.matmul(A, X), I have two upstream gradients, grad_y_a and grad_y_x, and I want to backprop grad_y_a to A and grad_y_x to X.
For now, I can get around it by repeating the operation:
```python
y_a = torch.matmul(A, X.detach())  # gradient flows only to A
y_x = torch.matmul(A.detach(), X)  # gradient flows only to X
```
But this approach wastes a lot of computation by running the forward pass twice. Is there a better way to do this?
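One possible way to avoid the duplicated forward pass is to compute y once and then call torch.autograd.grad twice on the same graph, passing a different grad_outputs for each input. This is a minimal sketch, not a definitive answer; the random grad_y_a and grad_y_x here are stand-ins for whatever upstream gradients you actually have:

```python
import torch

A = torch.randn(3, 4, requires_grad=True)
X = torch.randn(4, 5, requires_grad=True)

# Single forward pass
y = torch.matmul(A, X)

# Stand-ins for the two different upstream gradients
grad_y_a = torch.randn_like(y)
grad_y_x = torch.randn_like(y)

# Route grad_y_a only to A, and grad_y_x only to X,
# reusing the same forward graph for both backward calls.
grad_A, = torch.autograd.grad(y, A, grad_outputs=grad_y_a, retain_graph=True)
grad_X, = torch.autograd.grad(y, X, grad_outputs=grad_y_x)

# These match the gradients the detach-based workaround would produce:
# grad_A == grad_y_a @ X.T   and   grad_X == A.T @ grad_y_x
```

The forward matmul runs once; each autograd.grad call only evaluates the backward formula for the requested input, so the total backward cost is the same as in the detach-based version. If the gradients should end up in A.grad and X.grad (e.g. for an optimizer step), you can assign them manually afterwards.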